Issue with Enable LTR while pcie_aspm off

Message ID CABe79T47vUjq20HuMrJZxhZWMr0q3uXMF6xL1TgkqhqA8qi1wg@mail.gmail.com
State Not Applicable
Headers show
Series
  • Issue with Enable LTR while pcie_aspm off
Related show

Commit Message

Srinath Mannam April 13, 2018, 6:46 a.m.
Hi Bjorn,

In our platform we disable aspm using boot_arg "pcie_aspm=off".
But with the below patch, LTR is enabled which is part of ASPM.
even we keep disable aspm using "pcie_aspm=off" then why we need to
enable LTR. This is causing issues with few NVMe cards.

Please advice us how can we proceed in this scenario.

commit c46fd358070f22ba68d6e74c22016a33b914c20a
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Nov 28 16:43:50 2017 -0600

    PCI/ASPM: Enable Latency Tolerance Reporting when supported

    Enable Latency Tolerance Reporting (LTR).  Note that LTR must be enabled in
    the Root Port first, and must not be enabled in any downstream device
    unless the Root Port and all intermediate Switches also support LTR.
    See PCIe r3.1, sec 6.18.

    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Vidya Sagar <vidyas@nvidia.com>

        struct hotplug_params hpp;
@@ -1883,6 +1915,7 @@ static void pci_configure_device(struct pci_dev *dev)
        pci_configure_mps(dev);
        pci_configure_extended_tags(dev, NULL);
        pci_configure_relaxed_ordering(dev);


Regards,
Srinath.

Comments

Bjorn Helgaas April 13, 2018, 1:56 p.m. | #1
Hi Srinath,

On Fri, Apr 13, 2018 at 12:16:46PM +0530, Srinath Mannam wrote:
> Hi Bjorn,
> 
> In our platform we disable aspm using boot_arg "pcie_aspm=off".
> But with the below patch, LTR is enabled which is part of ASPM.
> even we keep disable aspm using "pcie_aspm=off" then why we need to
> enable LTR. This is causing issues with few NVMe cards.

Strictly speaking, the spec does not define LTR as "part of" ASPM,
although as far as I can tell, the only references to *using* the LTR
information are related to ASPM L1 substates.

I think the reason I made this patch enable LTR regardless of the
initial ASPM settings is because the ASPM settings can be changed at
run-time via sysfs (see pcie_aspm_set_policy()).

If we enable L1 substates at run-time, LTR has to be enabled somehow,
either at enumeration-time (as we currently do) or by some similar
code in the pcie_aspm_set_policy() path.

LTR consumes some PCIe bandwidth, so it's not free, and I do agree
that it seems pointless to enable it if nothing is going to use the
information.

What sort of issues are you seeing?  Are the issues caused by hardware
defects, e.g., things that don't conform to the PCIe spec, or are they
caused by a software defect, e.g., something done wrong in the patch
below, or maybe some requirement that LTR not be enabled unless ASPM
L1 substates are enabled?

If the issues are related to hardware defects with LTR itself, maybe a
quirk would be the best way to approach it.  That would make it clear
what the actual problem is and may allow us to do a better job of
working around it.  For example, maybe we could enable some ASPM
states, but not those requiring LTR.  Or maybe we could enable ASPM in
some parts of the system, but not for the devices that have LTR
issues.

Also, it could potentially remove the need for you to use the
"pcie_aspm=off" parameter, which would improve the user experience.

> Please advice us how can we proceed in this scenario.
> 
> commit c46fd358070f22ba68d6e74c22016a33b914c20a
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Nov 28 16:43:50 2017 -0600
> 
>     PCI/ASPM: Enable Latency Tolerance Reporting when supported
> 
>     Enable Latency Tolerance Reporting (LTR).  Note that LTR must be enabled in
>     the Root Port first, and must not be enabled in any downstream device
>     unless the Root Port and all intermediate Switches also support LTR.
>     See PCIe r3.1, sec 6.18.
> 
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 14e0ea1..3761b13 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1875,6 +1875,38 @@ static void
> pci_configure_relaxed_ordering(struct pci_dev *dev)
>         }
>  }
> 
> +static void pci_configure_ltr(struct pci_dev *dev)
> +{
> +#ifdef CONFIG_PCIEASPM
> +       u32 cap;
> +       struct pci_dev *bridge;
> +
> +       if (!pci_is_pcie(dev))
> +               return;
> +
> +       pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
> +       if (!(cap & PCI_EXP_DEVCAP2_LTR))
> +               return;
> +
> +       /*
> +        * Software must not enable LTR in an Endpoint unless the Root
> +        * Complex and all intermediate Switches indicate support for LTR.
> +        * PCIe r3.1, sec 6.18.
> +        */
> +       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> +               dev->ltr_path = 1;
> +       else {
> +               bridge = pci_upstream_bridge(dev);
> +               if (bridge && bridge->ltr_path)
> +                       dev->ltr_path = 1;
> +       }
> +
> +       if (dev->ltr_path)
> +               pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> +                                        PCI_EXP_DEVCTL2_LTR_EN);
> +#endif
> +}
> +
>  static void pci_configure_device(struct pci_dev *dev)
>  {
>         struct hotplug_params hpp;
> @@ -1883,6 +1915,7 @@ static void pci_configure_device(struct pci_dev *dev)
>         pci_configure_mps(dev);
>         pci_configure_extended_tags(dev, NULL);
>         pci_configure_relaxed_ordering(dev);
> 
> 
> Regards,
> Srinath.
Srinath Mannam April 13, 2018, 3:43 p.m. | #2
Hi Bjorn,

Thank you very much for quick response.

I am using samsung NVMe card as EP connected to our platform. As per
link capabilities device does not support L1.s.
But LTR is enabled in DEVCAP2.
In my observation after setting LTR without enabling ASPM (common
clock and link retraining) in software, device stopped responding.
I have no Idea, How endpoint should behave as per spec in this case.


From your explanation, I understand that LTR configuration we can't
keep it in the part of ASPM cap init function in aspm.c where we
enable common clock.
and L1.ss capabilities.

Also, as you said we use sysfs functions to set ASPM policies. that to
ASPM support enabled which configure aspm_capabilities and common
clock.

ASPM_CONFIG is depend on PCIE_CONFIG, so there may be cases where
people want to enable PCIE and donot want ASPM.
In that case pcie_aspm boot args will help to disable ASPM.

Is there any such cases where we don't need ASPM and need to set LTR.
If not we can move LTR configuration to aspm.c file.

I think as you said it will be better to add LTR configuration also
part of sysfs interface while we enable L1.ss feature.



Regards,
Srinath.

On Fri, Apr 13, 2018 at 7:26 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> Hi Srinath,
>
> On Fri, Apr 13, 2018 at 12:16:46PM +0530, Srinath Mannam wrote:
>> Hi Bjorn,
>>
>> In our platform we disable aspm using boot_arg "pcie_aspm=off".
>> But with the below patch, LTR is enabled which is part of ASPM.
>> even we keep disable aspm using "pcie_aspm=off" then why we need to
>> enable LTR. This is causing issues with few NVMe cards.
>
> Strictly speaking, the spec does not define LTR as "part of" ASPM,
> although as far as I can tell, the only references to *using* the LTR
> information are related to ASPM L1 substates.
>
> I think the reason I made this patch enable LTR regardless of the
> initial ASPM settings is because the ASPM settings can be changed at
> run-time via sysfs (see pcie_aspm_set_policy()).
>
> If we enable L1 substates at run-time, LTR has to be enabled somehow,
> either at enumeration-time (as we currently do) or by some similar
> code in the pcie_aspm_set_policy() path.
>
> LTR consumes some PCIe bandwidth, so it's not free, and I do agree
> that it seems pointless to enable it if nothing is going to use the
> information.
>
> What sort of issues are you seeing?  Are the issues caused by hardware
> defects, e.g., things that don't conform to the PCIe spec, or are they
> caused by a software defect, e.g., something done wrong in the patch
> below, or maybe some requirement that LTR not be enabled unless ASPM
> L1 substates are enabled?
>
> If the issues are related to hardware defects with LTR itself, maybe a
> quirk would be the best way to approach it.  That would make it clear
> what the actual problem is and may allow us to do a better job of
> working around it.  For example, maybe we could enable some ASPM
> states, but not those requiring LTR.  Or maybe we could enable ASPM in
> some parts of the system, but not for the devices that have LTR
> issues.
>
> Also, it could potentially remove the need for you to use the
> "pcie_aspm=off" parameter, which would improve the user experience.
>
>> Please advice us how can we proceed in this scenario.
>>
>> commit c46fd358070f22ba68d6e74c22016a33b914c20a
>> Author: Bjorn Helgaas <bhelgaas@google.com>
>> Date:   Tue Nov 28 16:43:50 2017 -0600
>>
>>     PCI/ASPM: Enable Latency Tolerance Reporting when supported
>>
>>     Enable Latency Tolerance Reporting (LTR).  Note that LTR must be enabled in
>>     the Root Port first, and must not be enabled in any downstream device
>>     unless the Root Port and all intermediate Switches also support LTR.
>>     See PCIe r3.1, sec 6.18.
>>
>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>     Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index 14e0ea1..3761b13 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -1875,6 +1875,38 @@ static void
>> pci_configure_relaxed_ordering(struct pci_dev *dev)
>>         }
>>  }
>>
>> +static void pci_configure_ltr(struct pci_dev *dev)
>> +{
>> +#ifdef CONFIG_PCIEASPM
>> +       u32 cap;
>> +       struct pci_dev *bridge;
>> +
>> +       if (!pci_is_pcie(dev))
>> +               return;
>> +
>> +       pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
>> +       if (!(cap & PCI_EXP_DEVCAP2_LTR))
>> +               return;
>> +
>> +       /*
>> +        * Software must not enable LTR in an Endpoint unless the Root
>> +        * Complex and all intermediate Switches indicate support for LTR.
>> +        * PCIe r3.1, sec 6.18.
>> +        */
>> +       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
>> +               dev->ltr_path = 1;
>> +       else {
>> +               bridge = pci_upstream_bridge(dev);
>> +               if (bridge && bridge->ltr_path)
>> +                       dev->ltr_path = 1;
>> +       }
>> +
>> +       if (dev->ltr_path)
>> +               pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
>> +                                        PCI_EXP_DEVCTL2_LTR_EN);
>> +#endif
>> +}
>> +
>>  static void pci_configure_device(struct pci_dev *dev)
>>  {
>>         struct hotplug_params hpp;
>> @@ -1883,6 +1915,7 @@ static void pci_configure_device(struct pci_dev *dev)
>>         pci_configure_mps(dev);
>>         pci_configure_extended_tags(dev, NULL);
>>         pci_configure_relaxed_ordering(dev);
>>
>>
>> Regards,
>> Srinath.
Rajat Jain April 13, 2018, 7:04 p.m. | #3
Hello Srinath,

One question for you.

On Fri, Apr 13, 2018 at 8:43 AM, Srinath Mannam
<srinath.mannam@broadcom.com> wrote:
> Hi Bjorn,
>
> Thank you very much for quick response.
>
> I am using samsung NVMe card as EP connected to our platform. As per
> link capabilities device does not support L1.s.
> But LTR is enabled in DEVCAP2.
> In my observation after setting LTR without enabling ASPM (common
> clock and link retraining) in software, device stopped responding.


Can you elaborate a little on what do you mean by "stop responding"?

1) Is the system still usable but the PCI hierarchy not responding?
e.g. can you still talk to the other PCI devices in the system? If you
have a PCIe switch, can you talk to the other devices on other ports
of the switch?

2) When the device "stops responding", do you mean it stops responding
to memory accesses (i.e. no reads/writes by the driver to the device
registers are working)? Or does it also stop responding to
configuration cycles? e.g. does the lspci -vvv -s <device address>
still work?

Thanks,

Rajat


> I have no Idea, How endpoint should behave as per spec in this case.
>
>
> From your explanation, I understand that LTR configuration we can't
> keep it in the part of ASPM cap init function in aspm.c where we
> enable common clock.
> and L1.ss capabilities.
>
> Also, as you said we use sysfs functions to set ASPM policies. that to
> ASPM support enabled which configure aspm_capabilities and common
> clock.
>
> ASPM_CONFIG is depend on PCIE_CONFIG, so there may be cases where
> people want to enable PCIE and donot want ASPM.
> In that case pcie_aspm boot args will help to disable ASPM.
>
> Is there any such cases where we don't need ASPM and need to set LTR.
> If not we can move LTR configuration to aspm.c file.
>
> I think as you said it will be better to add LTR configuration also
> part of sysfs interface while we enable L1.ss feature.
>
>
>
> Regards,
> Srinath.
>
> On Fri, Apr 13, 2018 at 7:26 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> Hi Srinath,
>>
>> On Fri, Apr 13, 2018 at 12:16:46PM +0530, Srinath Mannam wrote:
>>> Hi Bjorn,
>>>
>>> In our platform we disable aspm using boot_arg "pcie_aspm=off".
>>> But with the below patch, LTR is enabled which is part of ASPM.
>>> even we keep disable aspm using "pcie_aspm=off" then why we need to
>>> enable LTR. This is causing issues with few NVMe cards.
>>
>> Strictly speaking, the spec does not define LTR as "part of" ASPM,
>> although as far as I can tell, the only references to *using* the LTR
>> information are related to ASPM L1 substates.
>>
>> I think the reason I made this patch enable LTR regardless of the
>> initial ASPM settings is because the ASPM settings can be changed at
>> run-time via sysfs (see pcie_aspm_set_policy()).
>>
>> If we enable L1 substates at run-time, LTR has to be enabled somehow,
>> either at enumeration-time (as we currently do) or by some similar
>> code in the pcie_aspm_set_policy() path.
>>
>> LTR consumes some PCIe bandwidth, so it's not free, and I do agree
>> that it seems pointless to enable it if nothing is going to use the
>> information.
>>
>> What sort of issues are you seeing?  Are the issues caused by hardware
>> defects, e.g., things that don't conform to the PCIe spec, or are they
>> caused by a software defect, e.g., something done wrong in the patch
>> below, or maybe some requirement that LTR not be enabled unless ASPM
>> L1 substates are enabled?
>>
>> If the issues are related to hardware defects with LTR itself, maybe a
>> quirk would be the best way to approach it.  That would make it clear
>> what the actual problem is and may allow us to do a better job of
>> working around it.  For example, maybe we could enable some ASPM
>> states, but not those requiring LTR.  Or maybe we could enable ASPM in
>> some parts of the system, but not for the devices that have LTR
>> issues.
>>
>> Also, it could potentially remove the need for you to use the
>> "pcie_aspm=off" parameter, which would improve the user experience.
>>
>>> Please advice us how can we proceed in this scenario.
>>>
>>> commit c46fd358070f22ba68d6e74c22016a33b914c20a
>>> Author: Bjorn Helgaas <bhelgaas@google.com>
>>> Date:   Tue Nov 28 16:43:50 2017 -0600
>>>
>>>     PCI/ASPM: Enable Latency Tolerance Reporting when supported
>>>
>>>     Enable Latency Tolerance Reporting (LTR).  Note that LTR must be enabled in
>>>     the Root Port first, and must not be enabled in any downstream device
>>>     unless the Root Port and all intermediate Switches also support LTR.
>>>     See PCIe r3.1, sec 6.18.
>>>
>>>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>>>     Reviewed-by: Vidya Sagar <vidyas@nvidia.com>
>>>
>>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>>> index 14e0ea1..3761b13 100644
>>> --- a/drivers/pci/probe.c
>>> +++ b/drivers/pci/probe.c
>>> @@ -1875,6 +1875,38 @@ static void
>>> pci_configure_relaxed_ordering(struct pci_dev *dev)
>>>         }
>>>  }
>>>
>>> +static void pci_configure_ltr(struct pci_dev *dev)
>>> +{
>>> +#ifdef CONFIG_PCIEASPM
>>> +       u32 cap;
>>> +       struct pci_dev *bridge;
>>> +
>>> +       if (!pci_is_pcie(dev))
>>> +               return;
>>> +
>>> +       pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
>>> +       if (!(cap & PCI_EXP_DEVCAP2_LTR))
>>> +               return;
>>> +
>>> +       /*
>>> +        * Software must not enable LTR in an Endpoint unless the Root
>>> +        * Complex and all intermediate Switches indicate support for LTR.
>>> +        * PCIe r3.1, sec 6.18.
>>> +        */
>>> +       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
>>> +               dev->ltr_path = 1;
>>> +       else {
>>> +               bridge = pci_upstream_bridge(dev);
>>> +               if (bridge && bridge->ltr_path)
>>> +                       dev->ltr_path = 1;
>>> +       }
>>> +
>>> +       if (dev->ltr_path)
>>> +               pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
>>> +                                        PCI_EXP_DEVCTL2_LTR_EN);
>>> +#endif
>>> +}
>>> +
>>>  static void pci_configure_device(struct pci_dev *dev)
>>>  {
>>>         struct hotplug_params hpp;
>>> @@ -1883,6 +1915,7 @@ static void pci_configure_device(struct pci_dev *dev)
>>>         pci_configure_mps(dev);
>>>         pci_configure_extended_tags(dev, NULL);
>>>         pci_configure_relaxed_ordering(dev);
>>>
>>>
>>> Regards,
>>> Srinath.
Bjorn Helgaas April 13, 2018, 9:16 p.m. | #4
On Fri, Apr 13, 2018 at 09:13:42PM +0530, Srinath Mannam wrote:
> Hi Bjorn,
> 
> Thank you very much for quick response.
> 
> I am using samsung NVMe card as EP connected to our platform. As per
> link capabilities device does not support L1.s.
> But LTR is enabled in DEVCAP2.

Interesting that the device advertises LTR support, but not L1.2
support.  As far as I can tell, that's legal per spec, but I don't
know what the LTR info would be used for.  The only use I know about
is comparison with LTR_L1.2_THRESHOLD, as in PCIe r4.0, sec 5.5.1.

Can you attach lspci -vv output for the NVMe device and all the
devices in the path leading to it?

> In my observation after setting LTR without enabling ASPM (common
> clock and link retraining) in software, device stopped responding.

From sec 5.5.1, it seems clear that LTR would have to be enabled
before L1.2 is enabled, because the decision to enter L1.2 depends on
information from LTR messages.

Since the device advertised LTR support, it seems like a hardware
defect if enabling it breaks something.  But I can imagine that if it
doesn't advertise L1.2 support, nobody expected LTR to be used.  Can
you try the patch below?  It should make it so we don't enable LTR
unless we also support L1.2.  Maybe that's enough to avoid the issue.

Bjorn


diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index f76eb7704f64..9e2212889e7e 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -400,6 +400,16 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
 		info->l1ss_cap = 0;
 		return;
 	}
+
+	/*
+	 * If we don't have LTR for the entire path from the Root Complex
+	 * to this device, we can't use L1.2.  PCIe r4.0, secs 5.5.4, 6.18.
+	 */
+	if (!pdev->ltr_path) {
+		info->l1ss_cap &= ~PCI_L1SS_CTL1_PCIPM_L1_2;
+		info->l1ss_cap &= ~PCI_L1SS_CTL1_ASPM_L1_2;
+	}
+
 	pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL1,
 			      &info->l1ss_ctl1);
 	pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL2,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6fd0bcd..9a224612b3f8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1954,12 +1954,29 @@ static void pci_configure_relaxed_ordering(struct pci_dev *dev)
 static void pci_configure_ltr(struct pci_dev *dev)
 {
 #ifdef CONFIG_PCIEASPM
-	u32 cap;
+	u32 cap, l1ss_cap, l1ss;
 	struct pci_dev *bridge;
 
 	if (!pci_is_pcie(dev))
 		return;
 
+	/*
+	 * Per PCIe r4.0, sec 5.5.4, we must program LTR_L1.2_THRESHOLD if
+	 * either L1.2 enable bit is set.  That suggests that LTR must be
+	 * enabled before L1.2.  If the device (and the entire path leading
+	 * to it) advertises L1.2 and LTR support, enable LTR.
+	 */
+	l1ss_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
+	if (!l1ss_cap)
+		return;
+
+	pci_read_config_dword(dev, l1ss_cap + PCI_L1SS_CAP, &l1ss);
+	if (!(l1ss & PCI_L1SS_CAP_L1_PM_SS))
+		return;
+
+	if (!(l1ss & (PCI_L1SS_CAP_PCIPM_L1_2 | PCI_L1SS_CAP_ASPM_L1_2)))
+		return;
+
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
 	if (!(cap & PCI_EXP_DEVCAP2_LTR))
 		return;
@@ -1967,7 +1984,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/*
 	 * Software must not enable LTR in an Endpoint unless the Root
 	 * Complex and all intermediate Switches indicate support for LTR.
-	 * PCIe r3.1, sec 6.18.
+	 * PCIe r4.0, sec 6.18.
 	 */
 	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
 		dev->ltr_path = 1;
Srinath Mannam April 14, 2018, 3:34 a.m. | #5
Hi Bjorn,

Thank you for the patch. I will try on monday and let you know the results.

I am sorry, in previous mail by mistake I have written L1.s support is
not there, Actually I wanted to write L0s support is not there.
L1 and L1ss support is there.

But In our platform we required to disable ASPM.

RC in our platform supports upto Gen3 only.

I dnot have setup access now. I will share lspci output and results of
your patch on monday.

Hi Rajat,

Device stop responding means, completion timeout for config read.


Regards,
Srinath.

On Sat, Apr 14, 2018 at 2:46 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Fri, Apr 13, 2018 at 09:13:42PM +0530, Srinath Mannam wrote:
>> Hi Bjorn,
>>
>> Thank you very much for quick response.
>>
>> I am using samsung NVMe card as EP connected to our platform. As per
>> link capabilities device does not support L1.s.
>> But LTR is enabled in DEVCAP2.
>
> Interesting that the device advertises LTR support, but not L1.2
> support.  As far as I can tell, that's legal per spec, but I don't
> know what the LTR info would be used for.  The only use I know about
> is comparison with LTR_L1.2_THRESHOLD, as in PCIe r4.0, sec 5.5.1.
>
> Can you attach lspci -vv output for the NVMe device and all the
> devices in the path leading to it?
>
>> In my observation after setting LTR without enabling ASPM (common
>> clock and link retraining) in software, device stopped responding.
>
> From sec 5.5.1, it seems clear that LTR would have to be enabled
> before L1.2 is enabled, because the decision to enter L1.2 depends on
> information from LTR messages.
>
> Since the device advertised LTR support, it seems like a hardware
> defect if enabling it breaks something.  But I can imagine that if it
> doesn't advertise L1.2 support, nobody expected LTR to be used.  Can
> you try the patch below?  It should make it so we don't enable LTR
> unless we also support L1.2.  Maybe that's enough to avoid the issue.
>
> Bjorn
>
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index f76eb7704f64..9e2212889e7e 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -400,6 +400,16 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
>                 info->l1ss_cap = 0;
>                 return;
>         }
> +
> +       /*
> +        * If we don't have LTR for the entire path from the Root Complex
> +        * to this device, we can't use L1.2.  PCIe r4.0, secs 5.5.4, 6.18.
> +        */
> +       if (!pdev->ltr_path) {
> +               info->l1ss_cap &= ~PCI_L1SS_CTL1_PCIPM_L1_2;
> +               info->l1ss_cap &= ~PCI_L1SS_CTL1_ASPM_L1_2;
> +       }
> +
>         pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL1,
>                               &info->l1ss_ctl1);
>         pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL2,
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ac91b6fd0bcd..9a224612b3f8 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1954,12 +1954,29 @@ static void pci_configure_relaxed_ordering(struct pci_dev *dev)
>  static void pci_configure_ltr(struct pci_dev *dev)
>  {
>  #ifdef CONFIG_PCIEASPM
> -       u32 cap;
> +       u32 cap, l1ss_cap, l1ss;
>         struct pci_dev *bridge;
>
>         if (!pci_is_pcie(dev))
>                 return;
>
> +       /*
> +        * Per PCIe r4.0, sec 5.5.4, we must program LTR_L1.2_THRESHOLD if
> +        * either L1.2 enable bit is set.  That suggests that LTR must be
> +        * enabled before L1.2.  If the device (and the entire path leading
> +        * to it) advertises L1.2 and LTR support, enable LTR.
> +        */
> +       l1ss_cap = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
> +       if (!l1ss_cap)
> +               return;
> +
> +       pci_read_config_dword(dev, l1ss_cap + PCI_L1SS_CAP, &l1ss);
> +       if (!(l1ss & PCI_L1SS_CAP_L1_PM_SS))
> +               return;
> +
> +       if (!(l1ss & (PCI_L1SS_CAP_PCIPM_L1_2 | PCI_L1SS_CAP_ASPM_L1_2)))
> +               return;
> +
>         pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
>         if (!(cap & PCI_EXP_DEVCAP2_LTR))
>                 return;
> @@ -1967,7 +1984,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
>         /*
>          * Software must not enable LTR in an Endpoint unless the Root
>          * Complex and all intermediate Switches indicate support for LTR.
> -        * PCIe r3.1, sec 6.18.
> +        * PCIe r4.0, sec 6.18.
>          */
>         if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
>                 dev->ltr_path = 1;
Bjorn Helgaas April 14, 2018, 4:09 p.m. | #6
On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
> I am sorry, in previous mail by mistake I have written L1.s support is
> not there, Actually I wanted to write L0s support is not there.
> L1 and L1ss support is there.

If your endpoint (and everything in the path) advertise both LTR and
L1ss support, that patch probably won't make a difference.

It *might* make a difference if only part of the path supports both,
because my reading of the spec is that L1ss requires LTR and LTR
requires the entire path to support LTR, and we currently don't
enforce that "entire path" part before enabling L1ss.

> But In our platform we required to disable ASPM.

We're trying to figure out exactly *why* you must disable ASPM.  If
it's because of a hardware defect, e.g., the device advertises ASPM
support but it's actually broken, we probably need to add a quirk.
Given the complexity of ASPM, it's surprising we don't have similar
quirks already.

> RC in our platform supports upto Gen3 only.

I don't think there's a connection between ASPM or LTR and the link
speed (Gen2, Gen3, etc), is there?

Bjorn
Srinath Mannam April 16, 2018, 3:33 p.m. | #7
Hi Bjorn,

Please find my comments inline.

On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
>> I am sorry, in previous mail by mistake I have written L1.s support is
>> not there, Actually I wanted to write L0s support is not there.
>> L1 and L1ss support is there.
>
> If your endpoint (and everything in the path) advertise both LTR and
> L1ss support, that patch probably won't make a difference.
>
> It *might* make a difference if only part of the path supports both,
> because my reading of the spec is that L1ss requires LTR and LTR
> requires the entire path to support LTR, and we currently don't
> enforce that "entire path" part before enabling L1ss.
>
Yes, this patch did not work.
>> But In our platform we required to disable ASPM.
>
> We're trying to figure out exactly *why* you must disable ASPM.  If
> it's because of a hardware defect, e.g., the device advertises ASPM
> support but it's actually broken, we probably need to add a quirk.
> Given the complexity of ASPM, it's surprising we don't have similar
> quirks already.
>
We see issues with ASPM enabled. Some link issues observed so for time being
we are using with aspm disabled until we fix that issue.
with LTR enabled also we observed some problem, that after LTR messages received
from EP, we see completion timeout with config write.
So I thought If LTR configuration function also part of aspm file, as
it was under CONFIG_ASPM.
using pcie_aspm boot arg I can disable both ASPM and LTR.
If this is not possible, then I will go for alternative solution of
quirk implementation as you suggest.
>> RC in our platform supports upto Gen3 only.
>
> I don't think there's a connection between ASPM or LTR and the link
> speed (Gen2, Gen3, etc), is there?
>
Here is the lspci output of RC and EP. EP is directly connected to our
RC, nothing in between.
RC:
0000:00:00.0 PCI bridge: Broadcom Limited Device d714 (prog-if 00
[Normal decode])
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 116
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        I/O behind bridge: 00000000-00000fff [size=4K]
        Memory behind bridge: 80000000-800fffff [size=1M]
        Prefetchable memory behind bridge:
00000000fff00000-00000000000fffff [empty]
        Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- <SERR- <PERR-
        BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
                PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
        Capabilities: [48] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [ac] Express (v2) Root Port (Slot-), MSI 00
                DevCap: MaxPayload 512 bytes, PhantFunc 0
                        ExtTag+ RBE+
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
Unsupported+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr+ NoSnoop+
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1,
Exit Latency L0s <1us, L1 <2us
                        ClockPM+ Surprise- LLActRep- BwNot+ ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt+ ABWMgmt-
                RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal-
PMEIntEna+ CRSVisible+
                RootCap: CRSVisible-
                RootSta: PME ReqID 0000, PMEStatus- PMEPending-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+,
LTR+, OBFF Via WAKE# ARIFwd+
                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms,
TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
                         AtomicOpsCtl: ReqEn- EgressBlck-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -3.5dB,
EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+,
LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+
ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
                RootCmd: CERptEn+ NFERptEn+ FERptEn+
                RootSta: CERcvd- MultCERcvd- UERcvd- MultUERcvd-
                         FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
                ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
        Capabilities: [180 v1] Vendor Specific Information: ID=0000
Rev=0 Len=028 <?>
        Capabilities: [240 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+
ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=8us PortTPowerOnTime=10us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=1us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us
        Capabilities: [300 v1] #19
        Kernel driver in use: pcieport

EP:
0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co
Ltd NVMe SSD Controller SM961/PM961 (prog-if 02 [NVM Express])
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961
        Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 255
        Region 0: Memory at 400000000 (64-bit, non-prefetchable)
[disabled] [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA
PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
unlimited, L1 unlimited
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE+
FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal-
Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 256 bytes, MaxReadReq 256 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq-
AuxPwr+ TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit
Latency L1 <64us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+
DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+,
LTR+, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms,
TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range,
EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB,
EqualizationComplete+, EqualizationPhase1+
                         EqualizationPhase2+, EqualizationPhase3+,
LinkEqualizationRequest-
        Capabilities: [b0] MSI-X: Enable- Count=8 Masked-
                Vector table: BAR=0 offset=00003000
                PBA: BAR=0 offset=00002000
        Capabilities: [100 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt-
UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+
ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [158 v1] Power Budgeting <?>
        Capabilities: [168 v1] #19
        Capabilities: [188 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [190 v1] L1 PM Substates
                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+
ASPM_L1.1+ L1_PM_Substates+
                          PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
                           T_CommonMode=0us LTR1.2_Threshold=0ns
                L1SubCtl2: T_PwrOn=10us



> Bjorn
Bjorn Helgaas April 16, 2018, 9:35 p.m. | #8
[+cc Keith, linux-nvme, LKML; another possible ASPM issue with Samsung
NVMe SSD 960 EVO]

On Mon, Apr 16, 2018 at 09:03:33PM +0530, Srinath Mannam wrote:
> On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
> >> I am sorry, in previous mail by mistake I have written L1.s support is
> >> not there, Actually I wanted to write L0s support is not there.
> >> L1 and L1ss support is there.
> >
> > If your endpoint (and everything in the path) advertise both LTR and
> > L1ss support, that patch probably won't make a difference.
> >
> > It *might* make a difference if only part of the path supports both,
> > because my reading of the spec is that L1ss requires LTR and LTR
> > requires the entire path to support LTR, and we currently don't
> > enforce that "entire path" part before enabling L1ss.
> >
> Yes, this patch did not work.

OK, thanks for checking.  Since there's only one link in the path and
both ends advertise L1SS and LTR support, I wouldn't expect it to make
a difference.

> >> But In our platform we required to disable ASPM.
>
> > We're trying to figure out exactly *why* you must disable ASPM.  If
> > it's because of a hardware defect, e.g., the device advertises ASPM
> > support but it's actually broken, we probably need to add a quirk.
> > Given the complexity of ASPM, it's surprising we don't have similar
> > quirks already.
>
> We see issues with ASPM enabled. Some link issues observed so for
> time being we are using with aspm disabled until we fix that issue.

I see other reports of ASPM issues with that Samsung 960 PRO NVMe SSD.
Maybe they're related?

  https://lkml.kernel.org/r/20171214184701.GA6322@libmpq.org
  https://forums.lenovo.com/t5/ThinkCentre-A-E-M-S-Series/M900-Tiny-UEFI-Bug-M-2-NVMe-SSD-amp-8260-WiFi-ASPM-disabled-Much/td-p/3570469

You might try setting NVME_QUIRK_NO_APST to see if that's related.
There are some quirks that sound similar:

  8427bbc22486 ("nvme-pci: disable APST on Samsung SSD 960 EVO + ASUS PRIME B350M-A")
  467c77d4cbef ("nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A")

Keith, et al, here's the relevant part of Srinath's lspci.  Both ends
of the link claim to support ASPM including L1SS and LTR, but Srinath
has to disable ASPM to get the SSD to work reliably.  Just FYI.

> 0000:00:00.0 PCI bridge: Broadcom Limited Device d714 (prog-if 00 [Normal decode])
>        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>        Capabilities: [ac] Express (v2) Root Port (Slot-), MSI 00
>                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <2us
>                        ClockPM+ Surprise- LLActRep- BwNot+ ASPMOptComp+
>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd+
>                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
>                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
>                         AtomicOpsCtl: ReqEn- EgressBlck-
>        Capabilities: [240 v1] L1 PM Substates
>                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>                          PortCommonModeRestoreTime=8us PortTPowerOnTime=10us
>                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>                           T_CommonMode=1us LTR1.2_Threshold=0ns

> 0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 (prog-if 02 [NVM Express])
>        Capabilities: [70] Express (v2) Endpoint, MSI 00
>                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
>                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
>                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>                         AtomicOpsCtl: ReqEn- 
>        Capabilities: [190 v1] L1 PM Substates
>                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>                          PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
>                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>                           T_CommonMode=0us LTR1.2_Threshold=0ns


> with LTR enabled also we observed some problem, that after LTR
> messages received from EP, we see completion timeout with config
> write.

> So I thought If LTR configuration function also part of aspm file,
> as it was under CONFIG_ASPM.  using pcie_aspm boot arg I can disable
> both ASPM and LTR.

> If this is not possible, then I will go for alternative solution of
> quirk implementation as you suggest.

Is this platform a lab prototype or is it already shipping?  If it's
already shipping, you probably need some sort of upstream solution
like an NVMe or PCIe quirk, but if not, maybe you can just hack your
bringup kernel to disable ASPM and LTR until you fix the root cause.  

Bjorn
Srinath Mannam April 17, 2018, 9:03 a.m. | #9
Hi Bjorn,

Thank you for more insight you have given about the problem.

For us the issue comes before we disable apst feature.
on APST quirk set, NVMe driver disable apst by send a command to NVMe
controller.
We see issue at the time of NVMe initialization only.

So APST quirk did not helped.



On Tue, Apr 17, 2018 at 3:05 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> [+cc Keith, linux-nvme, LKML; another possible ASPM issue with Samsung
> NVMe SSD 960 EVO]
>
> On Mon, Apr 16, 2018 at 09:03:33PM +0530, Srinath Mannam wrote:
>> On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
>> >> I am sorry, in previous mail by mistake I have written L1.s support is
>> >> not there, Actually I wanted to write L0s support is not there.
>> >> L1 and L1ss support is there.
>> >
>> > If your endpoint (and everything in the path) advertise both LTR and
>> > L1ss support, that patch probably won't make a difference.
>> >
>> > It *might* make a difference if only part of the path supports both,
>> > because my reading of the spec is that L1ss requires LTR and LTR
>> > requires the entire path to support LTR, and we currently don't
>> > enforce that "entire path" part before enabling L1ss.
>> >
>> Yes, this patch did not work.
>
> OK, thanks for checking.  Since there's only one link in the path and
> both ends advertise L1SS and LTR support, I wouldn't expect it to make
> a difference.
>
>> >> But In our platform we required to disable ASPM.
>>
>> > We're trying to figure out exactly *why* you must disable ASPM.  If
>> > it's because of a hardware defect, e.g., the device advertises ASPM
>> > support but it's actually broken, we probably need to add a quirk.
>> > Given the complexity of ASPM, it's surprising we don't have similar
>> > quirks already.
>>
>> We see issues with ASPM enabled. Some link issues observed so for
>> time being we are using with aspm disabled until we fix that issue.
>
> I see other reports of ASPM issues with that Samsung 960 PRO NVMe SSD.
> Maybe they're related?
>
>   https://lkml.kernel.org/r/20171214184701.GA6322@libmpq.org
>   https://forums.lenovo.com/t5/ThinkCentre-A-E-M-S-Series/M900-Tiny-UEFI-Bug-M-2-NVMe-SSD-amp-8260-WiFi-ASPM-disabled-Much/td-p/3570469
>
> You might try setting NVME_QUIRK_NO_APST to see if that's related.
> There are some quirks that sound similar:
>
>   8427bbc22486 ("nvme-pci: disable APST on Samsung SSD 960 EVO + ASUS PRIME B350M-A")
>   467c77d4cbef ("nvme-pci: disable APST for Samsung NVMe SSD 960 EVO + ASUS PRIME Z370-A")
>
> Keith, et al, here's the relevant part of Srinath's lspci.  Both ends
> of the link claim to support ASPM including L1SS and LTR, but Srinath
> has to disable ASPM to get the SSD to work reliably.  Just FYI.
>
>> 0000:00:00.0 PCI bridge: Broadcom Limited Device d714 (prog-if 00 [Normal decode])
>>        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
>>        Capabilities: [ac] Express (v2) Root Port (Slot-), MSI 00
>>                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <1us, L1 <2us
>>                        ClockPM+ Surprise- LLActRep- BwNot+ ASPMOptComp+
>>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd+
>>                         AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
>>                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
>>                         AtomicOpsCtl: ReqEn- EgressBlck-
>>        Capabilities: [240 v1] L1 PM Substates
>>                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>                          PortCommonModeRestoreTime=8us PortTPowerOnTime=10us
>>                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>                           T_CommonMode=1us LTR1.2_Threshold=0ns
>
>> 0000:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 (prog-if 02 [NVM Express])
>>        Capabilities: [70] Express (v2) Endpoint, MSI 00
>>                LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
>>                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
>>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
>>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR+, OBFF Not Supported
>>                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>>                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
>>                         AtomicOpsCtl: ReqEn-
>>        Capabilities: [190 v1] L1 PM Substates
>>                L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>>                          PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
>>                L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
>>                           T_CommonMode=0us LTR1.2_Threshold=0ns
>
>
>> with LTR enabled also we observed some problem, that after LTR
>> messages received from EP, we see completion timeout with config
>> write.
>
>> So I thought If LTR configuration function also part of aspm file,
>> as it was under CONFIG_ASPM.  using pcie_aspm boot arg I can disable
>> both ASPM and LTR.
>
>> If this is not possible, then I will go for alternative solution of
>> quirk implementation as you suggest.
>
> Is this platform a lab prototype or is it already shipping?  If it's
> already shipping, you probably need some sort of upstream solution
> like an NVMe or PCIe quirk, but if not, maybe you can just hack your
> bringup kernel to disable ASPM and LTR until you fix the root cause.
>
we are at evolution stage so we need to fix this ASAP.
As you said earlier, Can I add sysfs interface to enable LTR same as
we do L1SS or in the part of aspm cap init function.

> Bjorn

Regards,
Srinath.
Bjorn Helgaas April 17, 2018, 5:11 p.m. | #10
On Tue, Apr 17, 2018 at 02:33:52PM +0530, Srinath Mannam wrote:
> Hi Bjorn,
> 
> Thank you for more insight you have given about the problem.
> 
> For us the issue comes before we disable apst feature.
> on APST quirk set, NVMe driver disable apst by send a command to NVMe
> controller.
> We see issue at the time of NVMe initialization only.
> 
> So APST quirk did not helped.

OK, thanks for checking that.

> On Tue, Apr 17, 2018 at 3:05 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, Apr 16, 2018 at 09:03:33PM +0530, Srinath Mannam wrote:
> >> On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> >> > On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:

> >> >> But In our platform we required to disable ASPM.
> >>
> >> > We're trying to figure out exactly *why* you must disable ASPM.  If
> >> > it's because of a hardware defect, e.g., the device advertises ASPM
> >> > support but it's actually broken, we probably need to add a quirk.
> >> > Given the complexity of ASPM, it's surprising we don't have similar
> >> > quirks already.
> >>
> >> We see issues with ASPM enabled. Some link issues observed so for
> >> time being we are using with aspm disabled until we fix that issue.
> >> ...

> >> with LTR enabled also we observed some problem, that after LTR
> >> messages received from EP, we see completion timeout with config
> >> write.
> >
> >> So I thought If LTR configuration function also part of aspm file,
> >> as it was under CONFIG_ASPM.  using pcie_aspm boot arg I can disable
> >> both ASPM and LTR.
> >
> >> If this is not possible, then I will go for alternative solution of
> >> quirk implementation as you suggest.
> >
> > Is this platform a lab prototype or is it already shipping?  If it's
> > already shipping, you probably need some sort of upstream solution
> > like an NVMe or PCIe quirk, but if not, maybe you can just hack your
> > bringup kernel to disable ASPM and LTR until you fix the root cause.
> >
> we are at evolution stage so we need to fix this ASAP.
> As you said earlier, Can I add sysfs interface to enable LTR same as
> we do L1SS or in the part of aspm cap init function.

I don't know what evolution stage means, but it sounds like you do
need an upstream fix.

I don't understand the root cause of the problem yet, so I don't know
what the best solution for you is.  Turning off LTR and ASPM
completely is a pretty big hammer.  Ideally we could isolate this to
certain devices or certain pieces of ASPM so we could still get some
of the benefit.

  - Do you need to disable LTR on the entire system, or just for the
    Samsung NVMe device?

  - Do you need to disable LTR if you replace the Samsung device with
    something else?

  - LTR is only needed for the ASPM L1.2 substate.  Do you need to
    completely disable ASPM, or is it sufficient to disable LTR and
    the ASPM L1.2 substate?

Here are some patches you can try.  These require some tweaking based
on what the root cause of the problem is.

  1  PCI/ACPI: Request control of LTR from the platform
  2  PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR
  3  PCI: Enable LTR only if ASPM is enabled
  4  PCI: Disable LTR for Samsung NVMe SSD Controller SM961/PM961

If LTR is completely broken for all devices on your platform, and
disabling it and the ASPM L1.2 substate is sufficient, you could use
patches 1 and 2 along with a PCI or DMI quirk to clear
host->native_ltr.  That should prevent use of LTR and the ASPM L1.2
substate.

Patch 3 should disable LTR as well as all of ASPM (not just the ASPM
L1.2 substate) if you boot with "pcie_aspm=off".  I don't like this
very well because the user experience is poor -- you need a release
note or something telling the user to boot with "pcie_aspm=off", which
is a really ugly solution.

If the problem is some sort of interaction between the Samsung SSD and
your platform, you could use patches 2 and 4 to disable LTR and ASPM
L1.2 just for that device in your system.  Of course, you would need
to add a DMI or similar check in the quirk, because we can't disable 
LTR and ASPM L1.2 for the Samsung SSD in *all* systems.

I think patches 1-3 are candidates for the mainline kernel, regardless
of whether you need them yourself.


commit 0e78bd8bd5a99f47282b1ab5dbd679d28ff3b460
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Apr 17 10:58:09 2018 -0500

    PCI/ACPI: Request control of LTR from the platform
    
    Per the PCI Firmware spec r3.2, sec 4.5, the OS should request control of
    Latency Tolerance Reporting (LTR) before using it.  LTR is only required
    for the ASPM L1.2 Substate.
    
    If we support ASPM, request control of LTR.  If the platform does not grant
    us control of LTR, don't use it.

diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
index 0da18bde6a16..f32d767e8e3b 100644
--- a/drivers/acpi/pci_root.c
+++ b/drivers/acpi/pci_root.c
@@ -153,6 +153,7 @@ static struct pci_osc_bit_struct pci_osc_control_bit[] = {
 	{ OSC_PCI_EXPRESS_PME_CONTROL, "PME" },
 	{ OSC_PCI_EXPRESS_AER_CONTROL, "AER" },
 	{ OSC_PCI_EXPRESS_CAPABILITY_CONTROL, "PCIeCapability" },
+	{ OSC_PCI_EXPRESS_LTR_CONTROL, "LTR" },
 };
 
 static void decode_osc_bits(struct acpi_pci_root *root, char *msg, u32 word,
@@ -475,6 +476,10 @@ static void negotiate_os_control(struct acpi_pci_root *root, int *no_aspm)
 		| OSC_PCI_EXPRESS_NATIVE_HP_CONTROL
 		| OSC_PCI_EXPRESS_PME_CONTROL;
 
+#ifdef CONFIG_PCIEASPM
+	control |= OSC_PCI_EXPRESS_LTR_CONTROL;
+#endif
+
 	if (pci_aer_available()) {
 		if (aer_acpi_firmware_first())
 			dev_info(&device->dev,
@@ -905,6 +910,8 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
 		host_bridge->native_aer = 0;
 	if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
 		host_bridge->native_pme = 0;
+	if (!(root->osc_control_set & OSC_PCI_EXPRESS_LTR_CONTROL))
+		host_bridge->native_ltr = 0;
 
 	pci_scan_child_bus(bus);
 	pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index ac91b6fd0bcd..cc1688d75664 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -554,6 +554,7 @@ struct pci_host_bridge *pci_alloc_host_bridge(size_t priv)
 	bridge->native_aer = 1;
 	bridge->native_hotplug = 1;
 	bridge->native_pme = 1;
+	bridge->native_ltr = 1;
 
 	return bridge;
 }
@@ -1954,9 +1955,13 @@ static void pci_configure_relaxed_ordering(struct pci_dev *dev)
 static void pci_configure_ltr(struct pci_dev *dev)
 {
 #ifdef CONFIG_PCIEASPM
+	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	u32 cap;
 	struct pci_dev *bridge;
 
+	if (!host->native_ltr)
+		return;
+
 	if (!pci_is_pcie(dev))
 		return;
 
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 15bfb15c2fa5..49f63c67a9d1 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -506,7 +506,8 @@ extern bool osc_pc_lpi_support_confirmed;
 #define OSC_PCI_EXPRESS_PME_CONTROL		0x00000004
 #define OSC_PCI_EXPRESS_AER_CONTROL		0x00000008
 #define OSC_PCI_EXPRESS_CAPABILITY_CONTROL	0x00000010
-#define OSC_PCI_CONTROL_MASKS			0x0000001f
+#define OSC_PCI_EXPRESS_LTR_CONTROL		0x00000020
+#define OSC_PCI_CONTROL_MASKS			0x0000003f
 
 #define ACPI_GSB_ACCESS_ATTRIB_QUICK		0x00000002
 #define ACPI_GSB_ACCESS_ATTRIB_SEND_RCV         0x00000004
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 73178a2fcee0..d0149c01996d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -473,6 +473,7 @@ struct pci_host_bridge {
 	unsigned int	native_aer:1;		/* OS may use PCIe AER */
 	unsigned int	native_hotplug:1;	/* OS may use PCIe hotplug */
 	unsigned int	native_pme:1;		/* OS may use PCIe PME */
+	unsigned int	native_ltr:1;		/* OS may use PCIe LTR */
 	/* Resource alignment requirements */
 	resource_size_t (*align_resource)(struct pci_dev *dev,
 			const struct resource *res,

commit 4c65e86ad91284f4ceac1a8cbde86855829f3db8
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Apr 17 11:25:51 2018 -0500

    PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR
    
    When in the ASPM L1.0 state (but not the PCI-PM L1.0 state), the most
    recent LTR value and the LTR_L1.2_THRESHOLD determines whether the link
    enters the L1.2 substate.
    
    If we don't have LTR enabled, prevent the use of ASPM L1.2.
    
    PCI-PM L1.2 may still be used because it doesn't depend on
    LTR_L1.2_THRESHOLD (see PCIe r4.0, sec 5.5.1).

diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index f76eb7704f64..938ced5bbbfc 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -400,6 +400,15 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
 		info->l1ss_cap = 0;
 		return;
 	}
+
+	/*
+	 * If we don't have LTR for the entire path from the Root Complex
+	 * to this device, we can't use ASPM L1.2 because it relies on the
+	 * LTR_L1.2_THRESHOLD.  See PCIe r4.0, secs 5.5.4, 6.18.
+	 */
+	if (!pdev->ltr_path)
+		info->l1ss_cap &= ~PCI_L1SS_CTL1_ASPM_L1_2;
+
 	pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL1,
 			      &info->l1ss_ctl1);
 	pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL2,

commit a4229ad9e9d9c8acf1f1e43444f93372a4fbc8c2
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Apr 17 11:54:06 2018 -0500

    PCI: Enable LTR only if ASPM is enabled
    
    The only time we need LTR is when we enable the ASPM L1.2 substate.  If
    ASPM is disabled, don't bother enabling LTR.

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cc1688d75664..988876a2868f 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1962,6 +1962,9 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	if (!host->native_ltr)
 		return;
 
+	if (!pcie_aspm_support_enabled())
+		return;
+
 	if (!pci_is_pcie(dev))
 		return;
 

commit c62d4a5e5006ed71e154fe49ed64e78c16ffaebc
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Tue Apr 17 10:27:26 2018 -0500

    PCI: Disable LTR for Samsung NVMe SSD Controller SM961/PM961
    
    This is just for testing purposes.  Obviously we can't disable LTR (and
    hence ASPM L1 Substates) for this device in *all* systems, because it works
    in most cases.

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 988876a2868f..25dcf4817e8a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1962,6 +1962,9 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	if (!host->native_ltr)
 		return;
 
+	if (dev->no_ltr)
+		return;
+
 	if (!pcie_aspm_support_enabled())
 		return;
 
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 2990ad1e7c99..ae0a088373e9 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4741,3 +4741,9 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMD, PCI_ANY_ID,
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
 DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
 			      PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
+
+static void quirk_samsung_ssd(struct pci_dev *dev)
+{
+	dev->no_ltr = 1;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SAMSUNG, 0xa804, quirk_samsung_ssd);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d0149c01996d..443802bc5663 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -346,6 +346,7 @@ struct pci_dev {
 
 #ifdef CONFIG_PCIEASPM
 	struct pcie_link_state	*link_state;	/* ASPM link state */
+	unsigned int	no_ltr:1;	/* May not use LTR */
 	unsigned int	ltr_path:1;	/* Latency Tolerance Reporting
 					   supported from root to here */
 #endif
Srinath Mannam April 18, 2018, 9:05 a.m. | #11
Hi Bjorn,

Thank you very much for you time and solution.

On Tue, Apr 17, 2018 at 10:41 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Tue, Apr 17, 2018 at 02:33:52PM +0530, Srinath Mannam wrote:
>> Hi Bjorn,
>>
>> Thank you for more insight you have given about the problem.
>>
>> For us the issue comes before we disable apst feature.
>> on APST quirk set, NVMe driver disable apst by send a command to NVMe
>> controller.
>> We see issue at the time of NVMe initialization only.
>>
>> So APST quirk did not helped.
>
> OK, thanks for checking that.
>
>> On Tue, Apr 17, 2018 at 3:05 AM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> > On Mon, Apr 16, 2018 at 09:03:33PM +0530, Srinath Mannam wrote:
>> >> On Sat, Apr 14, 2018 at 9:39 PM, Bjorn Helgaas <helgaas@kernel.org> wrote:
>> >> > On Sat, Apr 14, 2018 at 09:04:05AM +0530, Srinath Mannam wrote:
>
>> >> >> But In our platform we required to disable ASPM.
>> >>
>> >> > We're trying to figure out exactly *why* you must disable ASPM.  If
>> >> > it's because of a hardware defect, e.g., the device advertises ASPM
>> >> > support but it's actually broken, we probably need to add a quirk.
>> >> > Given the complexity of ASPM, it's surprising we don't have similar
>> >> > quirks already.
>> >>
>> >> We see issues with ASPM enabled. Some link issues observed so for
>> >> time being we are using with aspm disabled until we fix that issue.
>> >> ...
>
>> >> with LTR enabled also we observed some problem, that after LTR
>> >> messages received from EP, we see completion timeout with config
>> >> write.
>> >
>> >> So I thought If LTR configuration function also part of aspm file,
>> >> as it was under CONFIG_ASPM.  using pcie_aspm boot arg I can disable
>> >> both ASPM and LTR.
>> >
>> >> If this is not possible, then I will go for alternative solution of
>> >> quirk implementation as you suggest.
>> >
>> > Is this platform a lab prototype or is it already shipping?  If it's
>> > already shipping, you probably need some sort of upstream solution
>> > like an NVMe or PCIe quirk, but if not, maybe you can just hack your
>> > bringup kernel to disable ASPM and LTR until you fix the root cause.
>> >
>> we are at evolution stage so we need to fix this ASAP.
>> As you said earlier, Can I add sysfs interface to enable LTR same as
>> we do L1SS or in the part of aspm cap init function.
>
> I don't know what evolution stage means, but it sounds like you do
> need an upstream fix.
>
> I don't understand the root cause of the problem yet, so I don't know
> what the best solution for you is.  Turning off LTR and ASPM
> completely is a pretty big hammer.  Ideally we could isolate this to
> certain devices or certain pieces of ASPM so we could still get some
> of the benefit.
>
>   - Do you need to disable LTR on the entire system, or just for the
>     Samsung NVMe device?
>
We see both ASPM and LTR issues with samsung device SM961/PM961.
But the same device works fine with multiple other platforms, means we
have to debug our RC.

>   - Do you need to disable LTR if you replace the Samsung device with
>     something else?
>
So far we see issue with samsung device only. Need to explore more.

>   - LTR is only needed for the ASPM L1.2 substate.  Do you need to
>     completely disable ASPM, or is it sufficient to disable LTR and
>     the ASPM L1.2 substate?

We need to disable common clock configuration which is done in the part of ASPM.
If we enable common clock, we see issues with samsung device.
>
> Here are some patches you can try.  These require some tweaking based
> on what the root cause of the problem is.
>
>   1  PCI/ACPI: Request control of LTR from the platform
>   2  PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR
>   3  PCI: Enable LTR only if ASPM is enabled
>   4  PCI: Disable LTR for Samsung NVMe SSD Controller SM961/PM961
>
> If LTR is completely broken for all devices on your platform, and
> disabling it and the ASPM L1.2 substate is sufficient, you could use
> patches 1 and 2 along with a PCI or DMI quirk to clear
> host->native_ltr.  That should prevent use of LTR and the ASPM L1.2
> substate.
>
> Patch 3 should disable LTR as well as all of ASPM (not just the ASPM
> L1.2 substate) if you boot with "pcie_aspm=off".  I don't like this
> very well because the user experience is poor -- you need a release
> note or something telling the user to boot with "pcie_aspm=off", which
> is a really ugly solution.
>
> If the problem is some sort of interaction between the Samsung SSD and
> your platform, you could use patches 2 and 4 to disable LTR and ASPM
> L1.2 just for that device in your system.  Of course, you would need
> to add a DMI or similar check in the quirk, because we can't disable
> LTR and ASPM L1.2 for the Samsung SSD in *all* systems.
>
> I think patches 1-3 are candidates for the mainline kernel, regardless
> of whether you need them yourself.
>
As you suggested I am using patch 2 and 4 which are suitable to our requirement.
Until we resolve the HW issue, we will continue to use these patches.

>
> commit 0e78bd8bd5a99f47282b1ab5dbd679d28ff3b460
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Apr 17 10:58:09 2018 -0500
>
>     PCI/ACPI: Request control of LTR from the platform
>
>     Per the PCI Firmware spec r3.2, sec 4.5, the OS should request control of
>     Latency Tolerance Reporting (LTR) before using it.  LTR is only required
>     for the ASPM L1.2 Substate.
>
>     If we support ASPM, request control of LTR.  If the platform does not grant
>     us control of LTR, don't use it.
>
> diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> index 0da18bde6a16..f32d767e8e3b 100644
> --- a/drivers/acpi/pci_root.c
> +++ b/drivers/acpi/pci_root.c
> @@ -153,6 +153,7 @@ static struct pci_osc_bit_struct pci_osc_control_bit[] = {
>         { OSC_PCI_EXPRESS_PME_CONTROL, "PME" },
>         { OSC_PCI_EXPRESS_AER_CONTROL, "AER" },
>         { OSC_PCI_EXPRESS_CAPABILITY_CONTROL, "PCIeCapability" },
> +       { OSC_PCI_EXPRESS_LTR_CONTROL, "LTR" },
>  };
>
>  static void decode_osc_bits(struct acpi_pci_root *root, char *msg, u32 word,
> @@ -475,6 +476,10 @@ static void negotiate_os_control(struct acpi_pci_root *root, int *no_aspm)
>                 | OSC_PCI_EXPRESS_NATIVE_HP_CONTROL
>                 | OSC_PCI_EXPRESS_PME_CONTROL;
>
> +#ifdef CONFIG_PCIEASPM
> +       control |= OSC_PCI_EXPRESS_LTR_CONTROL;
> +#endif
> +
>         if (pci_aer_available()) {
>                 if (aer_acpi_firmware_first())
>                         dev_info(&device->dev,
> @@ -905,6 +910,8 @@ struct pci_bus *acpi_pci_root_create(struct acpi_pci_root *root,
>                 host_bridge->native_aer = 0;
>         if (!(root->osc_control_set & OSC_PCI_EXPRESS_PME_CONTROL))
>                 host_bridge->native_pme = 0;
> +       if (!(root->osc_control_set & OSC_PCI_EXPRESS_LTR_CONTROL))
> +               host_bridge->native_ltr = 0;
>
>         pci_scan_child_bus(bus);
>         pci_set_host_bridge_release(host_bridge, acpi_pci_root_release_info,
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index ac91b6fd0bcd..cc1688d75664 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -554,6 +554,7 @@ struct pci_host_bridge *pci_alloc_host_bridge(size_t priv)
>         bridge->native_aer = 1;
>         bridge->native_hotplug = 1;
>         bridge->native_pme = 1;
> +       bridge->native_ltr = 1;
>
>         return bridge;
>  }
> @@ -1954,9 +1955,13 @@ static void pci_configure_relaxed_ordering(struct pci_dev *dev)
>  static void pci_configure_ltr(struct pci_dev *dev)
>  {
>  #ifdef CONFIG_PCIEASPM
> +       struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
>         u32 cap;
>         struct pci_dev *bridge;
>
> +       if (!host->native_ltr)
> +               return;
> +
>         if (!pci_is_pcie(dev))
>                 return;
>
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 15bfb15c2fa5..49f63c67a9d1 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -506,7 +506,8 @@ extern bool osc_pc_lpi_support_confirmed;
>  #define OSC_PCI_EXPRESS_PME_CONTROL            0x00000004
>  #define OSC_PCI_EXPRESS_AER_CONTROL            0x00000008
>  #define OSC_PCI_EXPRESS_CAPABILITY_CONTROL     0x00000010
> -#define OSC_PCI_CONTROL_MASKS                  0x0000001f
> +#define OSC_PCI_EXPRESS_LTR_CONTROL            0x00000020
> +#define OSC_PCI_CONTROL_MASKS                  0x0000003f
>
>  #define ACPI_GSB_ACCESS_ATTRIB_QUICK           0x00000002
>  #define ACPI_GSB_ACCESS_ATTRIB_SEND_RCV         0x00000004
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 73178a2fcee0..d0149c01996d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -473,6 +473,7 @@ struct pci_host_bridge {
>         unsigned int    native_aer:1;           /* OS may use PCIe AER */
>         unsigned int    native_hotplug:1;       /* OS may use PCIe hotplug */
>         unsigned int    native_pme:1;           /* OS may use PCIe PME */
> +       unsigned int    native_ltr:1;           /* OS may use PCIe LTR */
>         /* Resource alignment requirements */
>         resource_size_t (*align_resource)(struct pci_dev *dev,
>                         const struct resource *res,
>
> commit 4c65e86ad91284f4ceac1a8cbde86855829f3db8
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Apr 17 11:25:51 2018 -0500
>
>     PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR
>
>     When in the ASPM L1.0 state (but not the PCI-PM L1.0 state), the most
>     recent LTR value and the LTR_L1.2_THRESHOLD determines whether the link
>     enters the L1.2 substate.
>
>     If we don't have LTR enabled, prevent the use of ASPM L1.2.
>
>     PCI-PM L1.2 may still be used because it doesn't depend on
>     LTR_L1.2_THRESHOLD (see PCIe r4.0, sec 5.5.1).
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index f76eb7704f64..938ced5bbbfc 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -400,6 +400,15 @@ static void pcie_get_aspm_reg(struct pci_dev *pdev,
>                 info->l1ss_cap = 0;
>                 return;
>         }
> +
> +       /*
> +        * If we don't have LTR for the entire path from the Root Complex
> +        * to this device, we can't use ASPM L1.2 because it relies on the
> +        * LTR_L1.2_THRESHOLD.  See PCIe r4.0, secs 5.5.4, 6.18.
> +        */
> +       if (!pdev->ltr_path)
> +               info->l1ss_cap &= ~PCI_L1SS_CTL1_ASPM_L1_2;
> +
>         pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL1,
>                               &info->l1ss_ctl1);
>         pci_read_config_dword(pdev, info->l1ss_cap_ptr + PCI_L1SS_CTL2,
>
> commit a4229ad9e9d9c8acf1f1e43444f93372a4fbc8c2
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Apr 17 11:54:06 2018 -0500
>
>     PCI: Enable LTR only if ASPM is enabled
>
>     The only time we need LTR is when we enable the ASPM L1.2 substate.  If
>     ASPM is disabled, don't bother enabling LTR.
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index cc1688d75664..988876a2868f 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1962,6 +1962,9 @@ static void pci_configure_ltr(struct pci_dev *dev)
>         if (!host->native_ltr)
>                 return;
>
> +       if (!pcie_aspm_support_enabled())
> +               return;
> +
>         if (!pci_is_pcie(dev))
>                 return;
>
>
> commit c62d4a5e5006ed71e154fe49ed64e78c16ffaebc
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Tue Apr 17 10:27:26 2018 -0500
>
>     PCI: Disable LTR for Samsung NVMe SSD Controller SM961/PM961
>
>     This is just for testing purposes.  Obviously we can't disable LTR (and
>     hence ASPM L1 Substates) for this device in *all* systems, because it works
>     in most cases.
>
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index 988876a2868f..25dcf4817e8a 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -1962,6 +1962,9 @@ static void pci_configure_ltr(struct pci_dev *dev)
>         if (!host->native_ltr)
>                 return;
>
> +       if (dev->no_ltr)
> +               return;
> +
>         if (!pcie_aspm_support_enabled())
>                 return;
>
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 2990ad1e7c99..ae0a088373e9 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4741,3 +4741,9 @@ DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_AMD, PCI_ANY_ID,
>                               PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
>  DECLARE_PCI_FIXUP_CLASS_FINAL(PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
>                               PCI_CLASS_MULTIMEDIA_HD_AUDIO, 8, quirk_gpu_hda);
> +
> +static void quirk_samsung_ssd(struct pci_dev *dev)
> +{
> +       dev->no_ltr = 1;
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SAMSUNG, 0xa804, quirk_samsung_ssd);
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d0149c01996d..443802bc5663 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -346,6 +346,7 @@ struct pci_dev {
>
>  #ifdef CONFIG_PCIEASPM
>         struct pcie_link_state  *link_state;    /* ASPM link state */
> +       unsigned int    no_ltr:1;       /* May not use LTR */
>         unsigned int    ltr_path:1;     /* Latency Tolerance Reporting
>                                            supported from root to here */
>  #endif


Thank you again,

Regards,
Srinath.

Patch

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 14e0ea1..3761b13 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1875,6 +1875,38 @@  static void
pci_configure_relaxed_ordering(struct pci_dev *dev)
        }
 }

+static void pci_configure_ltr(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCIEASPM
+       u32 cap;
+       struct pci_dev *bridge;
+
+       if (!pci_is_pcie(dev))
+               return;
+
+       pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
+       if (!(cap & PCI_EXP_DEVCAP2_LTR))
+               return;
+
+       /*
+        * Software must not enable LTR in an Endpoint unless the Root
+        * Complex and all intermediate Switches indicate support for LTR.
+        * PCIe r3.1, sec 6.18.
+        */
+       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
+               dev->ltr_path = 1;
+       else {
+               bridge = pci_upstream_bridge(dev);
+               if (bridge && bridge->ltr_path)
+                       dev->ltr_path = 1;
+       }
+
+       if (dev->ltr_path)
+               pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+                                        PCI_EXP_DEVCTL2_LTR_EN);
+#endif
+}
+
 static void pci_configure_device(struct pci_dev *dev)
 {