diff mbox series

[v5] PCI/ASPM: Add back L1 PM Substate save and restore

Message ID 20231221011250.191599-1-david.e.box@linux.intel.com
State New
Headers show
Series [v5] PCI/ASPM: Add back L1 PM Substate save and restore | expand

Commit Message

David E. Box Dec. 21, 2023, 1:12 a.m. UTC
Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
due to a regression that caused resume from suspend to fail on certain
systems. However, we never added this capability back and this is now
causing systems fail to enter low power CPU states, drawing more power
from the battery.

The original revert mentioned that we restore L1 PM substate configuration
even though ASPM L1 may already be enabled. This is due the fact that
the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state().

Try to enable this functionality again following PCIe r6.0.1, sec 5.5.4
more closely by:

  1) Do not restore ASPM configuration in pci_restore_pcie_state() but
     do that after PCIe capability is restored in pci_restore_aspm_state()
     following PCIe r6.0, sec 5.5.4.

  2) If BIOS reenables L1SS on us, particularly L1.2, we need to clear the
     enables in the right order, downstream before upstream. Defer restoring
     the L1SS config until we are at the downstream component. Then update
     the config for both ends of the link in the prescribed order.

  3) Program ASPM L1 PM substate configuration before L1 enables.

  4) Program ASPM L1 PM substate enables last after rest of the fields
     in the capability are programmed.

Reported-by: Koba Ko <koba.ko@canonical.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Cc: Tasev Nikola <tasev.stefanoska@skynet.be>
Cc: Mark Enriquez <enriquezmark36@gmail.com>
Cc: Thomas Witt <kernel@witt.link>
Cc: Werner Sembach <wse@tuxedocomputers.com>
Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Co-developed-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
---
Hi all,

Previous versions of the patch can be found:

v4: https://lore.kernel.org/linux-pci/20231002070044.2299644-1-mika.westerberg@linux.intel.com/
v3: https://lore.kernel.org/linux-pci/20230925074636.2893747-1-mika.westerberg@linux.intel.com/
v2: https://lore.kernel.org/linux-pci/20230911073352.3472918-1-mika.westerberg@linux.intel.com/
v1: https://lore.kernel.org/linux-pci/20230627062442.54008-1-mika.westerberg@linux.intel.com/ 

Changes from v4 (David):
  - Defer restoring upstream component until at downstream component and
    handle both together in the prescribed order:
       - Disable L1SS on child then parent
       - Restore configuration on both without enables
       - Enable L1SS on parent then child
    This fixes a similar hang observed on my system but still needs
    reporters to test.
  - Removed denylist and associated code.
  - Removed Reviewed-by and Tested-by since I've modified Mika's patch and
    the changes need to be retested. Added Co-developed-by.

Changes from v3 (Mika):

  - Use pcie_capability_set_word() instead.

  - Add tag from Ilpo.

Changes from v2 (Mika):

  - Added tested by tag from Kai-Heng Feng.

  - Dropped the two unneeded (u32 *) casts.

  - Dropped unnecessary comment.

Changes from v1 (Mika):

  - We move ASPM enables from pci_restore_pcie_state() into
    pci_restore_aspm_state() to make sure they are clear when L1SS bits
    are programmed (as per PCIe spec).

  - The denylist includes the TUXEDO system as well but only if suspend
    is done via BIOS (e.g mem_sleep=deep is forced by user). This way
    the PCIe devices should continue working after S3 resume, and at the
    same time allow better power savings. If the default s2idle is used
    then we restore L1SS to allow the CPU enter lower power states. This
    is the best I was able to come up to make everyone happy.

 drivers/pci/pci.c       |  18 ++++++-
 drivers/pci/pci.h       |   4 ++
 drivers/pci/pcie/aspm.c | 116 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 136 insertions(+), 2 deletions(-)

Comments

Mika Westerberg Dec. 21, 2023, 10:43 a.m. UTC | #1
Hi David,

On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> for suspend/resume"") reverted saving and restoring of ASPM L1 Substates
> due to a regression that caused resume from suspend to fail on certain
> systems. However, we never added this capability back and this is now
> causing systems fail to enter low power CPU states, drawing more power
> from the battery.
> 
> The original revert mentioned that we restore L1 PM substate configuration
> even though ASPM L1 may already be enabled. This is due the fact that
> the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state().
> 
> Try to enable this functionality again following PCIe r6.0.1, sec 5.5.4
> more closely by:
> 
>   1) Do not restore ASPM configuration in pci_restore_pcie_state() but
>      do that after PCIe capability is restored in pci_restore_aspm_state()
>      following PCIe r6.0, sec 5.5.4.
> 
>   2) If BIOS reenables L1SS on us, particularly L1.2, we need to clear the
>      enables in the right order, downstream before upstream. Defer restoring
>      the L1SS config until we are at the downstream component. Then update
>      the config for both ends of the link in the prescribed order.
> 
>   3) Program ASPM L1 PM substate configuration before L1 enables.
> 
>   4) Program ASPM L1 PM substate enables last after rest of the fields
>      in the capability are programmed.
> 
> Reported-by: Koba Ko <koba.ko@canonical.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
> Cc: Tasev Nikola <tasev.stefanoska@skynet.be>
> Cc: Mark Enriquez <enriquezmark36@gmail.com>
> Cc: Thomas Witt <kernel@witt.link>
> Cc: Werner Sembach <wse@tuxedocomputers.com>
> Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Co-developed-by: David E. Box <david.e.box@linux.intel.com>
> Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> ---
> Hi all,
> 
> Previous versions of the patch can be found:
> 
> v4: https://lore.kernel.org/linux-pci/20231002070044.2299644-1-mika.westerberg@linux.intel.com/
> v3: https://lore.kernel.org/linux-pci/20230925074636.2893747-1-mika.westerberg@linux.intel.com/
> v2: https://lore.kernel.org/linux-pci/20230911073352.3472918-1-mika.westerberg@linux.intel.com/
> v1: https://lore.kernel.org/linux-pci/20230627062442.54008-1-mika.westerberg@linux.intel.com/ 
> 
> Changes from v4 (David):
>   - Defer restoring upstream component until at downstream component and
>     handle both together in the prescribed order:
>        - Disable L1SS on child then parent
>        - Restore configuration on both without enables
>        - Enable L1SS on parent then child
>     This fixes a similar hang observed on my system but still needs
>     reporters to test.
>   - Removed denylist and associated code.
>   - Removed Reviewed-by and Tested-by since I've modified Mika's patch and
>     the changes need to be retested. Added Co-developed-by.

Thanks a lot for following up and figuring out the root cause!

I have two tiny stylistic comments, the code otherwise looks good.

> Changes from v3 (Mika):
> 
>   - Use pcie_capability_set_word() instead.
> 
>   - Add tag from Ilpo.
> 
> Changes from v2 (Mika):
> 
>   - Added tested by tag from Kai-Heng Feng.
> 
>   - Dropped the two unneeded (u32 *) casts.
> 
>   - Dropped unnecessary comment.
> 
> Changes from v1 (Mika):
> 
>   - We move ASPM enables from pci_restore_pcie_state() into
>     pci_restore_aspm_state() to make sure they are clear when L1SS bits
>     are programmed (as per PCIe spec).
> 
>   - The denylist includes the TUXEDO system as well but only if suspend
>     is done via BIOS (e.g mem_sleep=deep is forced by user). This way
>     the PCIe devices should continue working after S3 resume, and at the
>     same time allow better power savings. If the default s2idle is used
>     then we restore L1SS to allow the CPU enter lower power states. This
>     is the best I was able to come up to make everyone happy.
> 
>  drivers/pci/pci.c       |  18 ++++++-
>  drivers/pci/pci.h       |   4 ++
>  drivers/pci/pcie/aspm.c | 116 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 136 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 55bc3576a985..3c4b2647b4ca 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>  {
>  	int i = 0;
>  	struct pci_cap_saved_state *save_state;
> -	u16 *cap;
> +	u16 *cap, val;
>  
>  	save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
>  	if (!save_state)
> @@ -1594,7 +1594,14 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>  
>  	cap = (u16 *)&save_state->cap.data[0];
>  	pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
> -	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);
> +	/*
> +	 * Restoring ASPM L1 substates has special requirements
> +	 * according to the PCIe spec 6.0. So we restore here only the
> +	 * LNKCTL register with the ASPM control field clear. ASPM will
> +	 * be restored in pci_restore_aspm_state().
> +	 */
> +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
>  	pcie_capability_write_word(dev, PCI_EXP_SLTCTL, cap[i++]);
>  	pcie_capability_write_word(dev, PCI_EXP_RTCTL, cap[i++]);
>  	pcie_capability_write_word(dev, PCI_EXP_DEVCTL2, cap[i++]);
> @@ -1705,6 +1712,7 @@ int pci_save_state(struct pci_dev *dev)
>  	pci_save_ltr_state(dev);
>  	pci_save_dpc_state(dev);
>  	pci_save_aer_state(dev);
> +	pci_save_aspm_state(dev);
>  	pci_save_ptm_state(dev);
>  	return pci_save_vc_state(dev);
>  }
> @@ -1817,6 +1825,7 @@ void pci_restore_state(struct pci_dev *dev)
>  	pci_restore_rebar_state(dev);
>  	pci_restore_dpc_state(dev);
>  	pci_restore_ptm_state(dev);
> +	pci_restore_aspm_state(dev);
>  
>  	pci_aer_clear_status(dev);
>  	pci_restore_aer_state(dev);
> @@ -3509,6 +3518,11 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
>  	if (error)
>  		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
>  
> +	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
> +					    2 * sizeof(u32));
> +	if (error)
> +		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
> +
>  	pci_allocate_vc_save_buffers(dev);
>  }
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 5ecbcf041179..2d0f9ae3d9b6 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -570,10 +570,14 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
>  void pcie_aspm_init_link_state(struct pci_dev *pdev);
>  void pcie_aspm_exit_link_state(struct pci_dev *pdev);
>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> +void pci_save_aspm_state(struct pci_dev *pdev);
> +void pci_restore_aspm_state(struct pci_dev *pdev);
>  #else
>  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> +static inline void pci_save_aspm_state(struct pci_dev *pdev) { }
> +static inline void pci_restore_aspm_state(struct pci_dev *pdev) { }
>  #endif
>  
>  #ifdef CONFIG_PCIE_ECRC
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 50b04ae5c394..dc5ea0ff2e07 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1027,6 +1027,122 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
>  	up_read(&pci_bus_sem);
>  }
>  
> +void pci_save_aspm_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u16 l1ss = pdev->l1ss;
> +	u32 *cap;
> +
> +	/*
> +	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
> +	 * is already saved in pci_save_pcie_state().
> +	 */
> +	if (!l1ss)
> +		return;
> +
> +	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> +	if (!save_state)
> +		return;
> +
> +	cap = &save_state->cap.data[0];
> +	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
> +	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
> +}
> +
> +static void pcie_restore_aspm_l1ss(struct pci_dev *child)
> +{
> +	struct pci_cap_saved_state *pl_save_state, *cl_save_state;
> +	struct pci_dev *parent = child->bus->self;
> +	struct pcie_link_state *link;
> +	u32 *cap, pl_ctl1, pl_ctl2, pl_l1_2_enable;
> +	u32 cl_ctl1, cl_ctl2, cl_l1_2_enable;
> +
> +	/*
> +	 * In case BIOS enabled L1.2 after resume, we need to disable it first
> +	 * on the downstream component before the upstream. So, don't attempt to
> +	 * restore either until we are at the downstream component.
> +	 */
> +	if (child->link_state || !parent || !child->l1ss || !parent->l1ss)
> +		return;
> +
> +	link = parent->link_state;
> +	if (!link)
> +		return;
> +
> +	cl_save_state = pci_find_saved_ext_cap(child, PCI_EXT_CAP_ID_L1SS);
> +	pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
> +	if (!cl_save_state || !pl_save_state)
> +		return;
> +
> +	cap = &cl_save_state->cap.data[0];
> +	cl_ctl2 = *cap++;
> +	cl_ctl1 = *cap;
> +	cap = &pl_save_state->cap.data[0];
> +	pl_ctl2 = *cap++;
> +	pl_ctl1 = *cap;
> +
> +

Drop the other empty line.

> +	/*
> +	 * Disable L1.2 on this downstream endpoint device first, followed
> +	 * by the upstream
> +	 */
> +	pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
> +				PCI_L1SS_CTL1_L1_2_MASK, 0);
> +	pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
> +				PCI_L1SS_CTL1_L1_2_MASK, 0);
> +
> +	/*
> +	 * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD
> +	 * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2
> +	 * enable bits, even though they're all in PCI_L1SS_CTL1.
> +	 */
> +	pl_l1_2_enable = pl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
> +	pl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
> +	cl_l1_2_enable = cl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
> +	cl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
> +
> +	/* Write back without enables first (above we cleared them in ctl1) */
> +	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, pl_ctl2);
> +	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, cl_ctl2);
> +	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pl_ctl1);
> +	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1, cl_ctl1);
> +
> +

Ditto.

> +	/* Then write back the enables */
> +	if (pl_l1_2_enable || cl_l1_2_enable) {
> +		pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
> +				       pl_ctl1 | pl_l1_2_enable);
> +		pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
> +				       cl_ctl1 | cl_l1_2_enable);
> +	}
> +}
> +
> +void pci_restore_aspm_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u16 *cap, val;
> +
> +	save_state = pci_find_saved_cap(pdev, PCI_CAP_ID_EXP);
> +
> +	if (!save_state)
> +		return;
> +
> +	cap = (u16 *)&save_state->cap.data[0];
> +	/* Must match the ordering in pci_save/restore_pcie_state() */
> +	val = cap[1] & PCI_EXP_LNKCTL_ASPMC;
> +	if (!val)
> +		return;
> +
> +	/*
> +	 * We restore L1 substate configuration first before enabling L1
> +	 * as the PCIe spec 6.0 sec 5.5.4 suggests.
> +	 */
> +	pcie_restore_aspm_l1ss(pdev);
> +
> +	/* Re-enable L0s/L1 */
> +	pcie_capability_set_word(pdev, PCI_EXP_LNKCTL, val);
> +}
> +
>  static struct pcie_link_state *pcie_aspm_get_link(struct pci_dev *pdev)
>  {
>  	struct pci_dev *bridge;
> -- 
> 2.34.1
Bjorn Helgaas Dec. 28, 2023, 10:31 p.m. UTC | #2
This is great.  Thanks so much for really pushing through this issue!

On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> Commit a7152be79b62 ("Revert "PCI/ASPM: Save L1 PM Substates Capability
> for suspend/resume"")

I would refer to 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates
Capability for suspend/resume") first because that's really the
important piece that we're putting back in place.  a7152be79b62 was
just a bump in the road that delayed things.

> ... reverted saving and restoring of ASPM L1 Substates
> due to a regression that caused resume from suspend to fail on certain
> systems. However, we never added this capability back and this is now
> causing systems fail to enter low power CPU states, drawing more power
> from the battery.
> 
> The original revert mentioned that we restore L1 PM substate configuration
> even though ASPM L1 may already be enabled. This is due the fact that
> the pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state().
> 
> Try to enable this functionality again following PCIe r6.0.1, sec 5.5.4
> more closely by:
> 
>   1) Do not restore ASPM configuration in pci_restore_pcie_state() but
>      do that after PCIe capability is restored in pci_restore_aspm_state()
>      following PCIe r6.0, sec 5.5.4.
> 
>   2) If BIOS reenables L1SS on us, particularly L1.2, we need to clear the
>      enables in the right order, downstream before upstream. Defer restoring
>      the L1SS config until we are at the downstream component. Then update
>      the config for both ends of the link in the prescribed order.
> 
>   3) Program ASPM L1 PM substate configuration before L1 enables.
> 
>   4) Program ASPM L1 PM substate enables last after rest of the fields
>      in the capability are programmed.
> 
> Reported-by: Koba Ko <koba.ko@canonical.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
> Cc: Tasev Nikola <tasev.stefanoska@skynet.be>
> Cc: Mark Enriquez <enriquezmark36@gmail.com>
> Cc: Thomas Witt <kernel@witt.link>
> Cc: Werner Sembach <wse@tuxedocomputers.com>

I know you cc'd Vidya above, but I would include him here so it's
obvious in the git log that he's involved:

  Cc: Vidya Sagar <vidyas@nvidia.com>

> Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Co-developed-by: David E. Box <david.e.box@linux.intel.com>
> Signed-off-by: David E. Box <david.e.box@linux.intel.com>
> ---
> Hi all,
> 
> Previous versions of the patch can be found:
> 
> v4: https://lore.kernel.org/linux-pci/20231002070044.2299644-1-mika.westerberg@linux.intel.com/
> v3: https://lore.kernel.org/linux-pci/20230925074636.2893747-1-mika.westerberg@linux.intel.com/
> v2: https://lore.kernel.org/linux-pci/20230911073352.3472918-1-mika.westerberg@linux.intel.com/
> v1: https://lore.kernel.org/linux-pci/20230627062442.54008-1-mika.westerberg@linux.intel.com/ 
> 
> Changes from v4 (David):
>   - Defer restoring upstream component until at downstream component and
>     handle both together in the prescribed order:
>        - Disable L1SS on child then parent
>        - Restore configuration on both without enables
>        - Enable L1SS on parent then child
>     This fixes a similar hang observed on my system but still needs
>     reporters to test.
>   - Removed denylist and associated code.
>   - Removed Reviewed-by and Tested-by since I've modified Mika's patch and
>     the changes need to be retested. Added Co-developed-by.
> 
> Changes from v3 (Mika):
> 
>   - Use pcie_capability_set_word() instead.
> 
>   - Add tag from Ilpo.
> 
> Changes from v2 (Mika):
> 
>   - Added tested by tag from Kai-Heng Feng.
> 
>   - Dropped the two unneeded (u32 *) casts.
> 
>   - Dropped unnecessary comment.
> 
> Changes from v1 (Mika):
> 
>   - We move ASPM enables from pci_restore_pcie_state() into
>     pci_restore_aspm_state() to make sure they are clear when L1SS bits
>     are programmed (as per PCIe spec).
> 
>   - The denylist includes the TUXEDO system as well but only if suspend
>     is done via BIOS (e.g mem_sleep=deep is forced by user). This way
>     the PCIe devices should continue working after S3 resume, and at the
>     same time allow better power savings. If the default s2idle is used
>     then we restore L1SS to allow the CPU enter lower power states. This
>     is the best I was able to come up to make everyone happy.
> 
>  drivers/pci/pci.c       |  18 ++++++-
>  drivers/pci/pci.h       |   4 ++
>  drivers/pci/pcie/aspm.c | 116 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 136 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 55bc3576a985..3c4b2647b4ca 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>  {
>  	int i = 0;
>  	struct pci_cap_saved_state *save_state;
> -	u16 *cap;
> +	u16 *cap, val;
>  
>  	save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
>  	if (!save_state)
> @@ -1594,7 +1594,14 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
>  
>  	cap = (u16 *)&save_state->cap.data[0];
>  	pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
> -	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);

Add blank line before the comment here.

> +	/*
> +	 * Restoring ASPM L1 substates has special requirements
> +	 * according to the PCIe spec 6.0. 

This would need a section number and a hint about exactly what the
special requirement is, but I think this sentence could be removed
completely because the details are in pci_restore_aspm_state(), and
you already refer to that, so we don't need to duplicate the details
here.

> +        So we restore here only the
> +	 * LNKCTL register with the ASPM control field clear. ASPM will
> +	 * be restored in pci_restore_aspm_state().
> +	 */
> +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);

When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
restore it.  I don't know if this ever happens.  Do we need to worry
about this?  Might firmware restore ASPMC itself before we get here?
What do we want to happen in this case?

Since ASPM is intertwined with the PCIe Capability, can we call
pci_restore_aspm_state() from here instead of from
pci_restore_state()?

Calling it here would make it easier to see the required ordering
(LNKCTL with ASPMC cleared, restore ASPM L1SS, restore ASPMC) and
it would be obvious that none of the other stuff in
pci_restore_state() is relevant (PASID, PRI, ATS, VC, etc).

If that could be done, I think it would make sense to do the same with
pci_save_aspm_state() even though it's a little more independent.

>  	pcie_capability_write_word(dev, PCI_EXP_SLTCTL, cap[i++]);
>  	pcie_capability_write_word(dev, PCI_EXP_RTCTL, cap[i++]);
>  	pcie_capability_write_word(dev, PCI_EXP_DEVCTL2, cap[i++]);
> @@ -1705,6 +1712,7 @@ int pci_save_state(struct pci_dev *dev)
>  	pci_save_ltr_state(dev);
>  	pci_save_dpc_state(dev);
>  	pci_save_aer_state(dev);
> +	pci_save_aspm_state(dev);
>  	pci_save_ptm_state(dev);
>  	return pci_save_vc_state(dev);
>  }
> @@ -1817,6 +1825,7 @@ void pci_restore_state(struct pci_dev *dev)
>  	pci_restore_rebar_state(dev);
>  	pci_restore_dpc_state(dev);
>  	pci_restore_ptm_state(dev);
> +	pci_restore_aspm_state(dev);
>  
>  	pci_aer_clear_status(dev);
>  	pci_restore_aer_state(dev);
> @@ -3509,6 +3518,11 @@ void pci_allocate_cap_save_buffers(struct pci_dev *dev)
>  	if (error)
>  		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
>  
> +	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
> +					    2 * sizeof(u32));
> +	if (error)
> +		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
> +
>  	pci_allocate_vc_save_buffers(dev);
>  }
>  
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 5ecbcf041179..2d0f9ae3d9b6 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -570,10 +570,14 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
>  void pcie_aspm_init_link_state(struct pci_dev *pdev);
>  void pcie_aspm_exit_link_state(struct pci_dev *pdev);
>  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> +void pci_save_aspm_state(struct pci_dev *pdev);
> +void pci_restore_aspm_state(struct pci_dev *pdev);
>  #else
>  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
>  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> +static inline void pci_save_aspm_state(struct pci_dev *pdev) { }
> +static inline void pci_restore_aspm_state(struct pci_dev *pdev) { }
>  #endif
>  
>  #ifdef CONFIG_PCIE_ECRC
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 50b04ae5c394..dc5ea0ff2e07 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -1027,6 +1027,122 @@ void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
>  	up_read(&pci_bus_sem);
>  }
>  
> +void pci_save_aspm_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u16 l1ss = pdev->l1ss;
> +	u32 *cap;
> +
> +	/*
> +	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
> +	 * is already saved in pci_save_pcie_state().
> +	 */
> +	if (!l1ss)
> +		return;
> +
> +	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> +	if (!save_state)
> +		return;
> +
> +	cap = &save_state->cap.data[0];
> +	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
> +	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
> +}
> +
> +static void pcie_restore_aspm_l1ss(struct pci_dev *child)
> +{
> +	struct pci_cap_saved_state *pl_save_state, *cl_save_state;
> +	struct pci_dev *parent = child->bus->self;
> +	struct pcie_link_state *link;
> +	u32 *cap, pl_ctl1, pl_ctl2, pl_l1_2_enable;
> +	u32 cl_ctl1, cl_ctl2, cl_l1_2_enable;
> +
> +	/*
> +	 * In case BIOS enabled L1.2 after resume, we need to disable it first
> +	 * on the downstream component before the upstream. So, don't attempt to
> +	 * restore either until we are at the downstream component.
> +	 */
> +	if (child->link_state || !parent || !child->l1ss || !parent->l1ss)
> +		return;

I guess child->link_state is non-zero if child is the upstream end of
a Link, right?  And we want to do nothing at the upstream end?

I think I would name it "pdev" instead of "child" because "child"
implies something that is often not true.

And maybe split the tests differently so it's easier to think about
what each part is for, e.g.,

  if (!parent)
    return;

  if (!parent->link_state || pdev->link_state)
    return;

  if (!pdev->l1ss || !parent->l1ss)
    return;

> +	link = parent->link_state;
> +	if (!link)
> +		return;
> +
> +	cl_save_state = pci_find_saved_ext_cap(child, PCI_EXT_CAP_ID_L1SS);
> +	pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
> +	if (!cl_save_state || !pl_save_state)
> +		return;
> +
> +	cap = &cl_save_state->cap.data[0];
> +	cl_ctl2 = *cap++;
> +	cl_ctl1 = *cap;
> +	cap = &pl_save_state->cap.data[0];
> +	pl_ctl2 = *cap++;
> +	pl_ctl1 = *cap;
> +
> +
> +	/*
> +	 * Disable L1.2 on this downstream endpoint device first, followed
> +	 * by the upstream
> +	 */
> +	pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
> +				PCI_L1SS_CTL1_L1_2_MASK, 0);
> +	pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
> +				PCI_L1SS_CTL1_L1_2_MASK, 0);
> +
> +	/*
> +	 * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD
> +	 * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2
> +	 * enable bits, even though they're all in PCI_L1SS_CTL1.
> +	 */
> +	pl_l1_2_enable = pl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
> +	pl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
> +	cl_l1_2_enable = cl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
> +	cl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
> +
> +	/* Write back without enables first (above we cleared them in ctl1) */
> +	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, pl_ctl2);
> +	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, cl_ctl2);
> +	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pl_ctl1);
> +	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1, cl_ctl1);
> +
> +
> +	/* Then write back the enables */
> +	if (pl_l1_2_enable || cl_l1_2_enable) {
> +		pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
> +				       pl_ctl1 | pl_l1_2_enable);
> +		pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
> +				       cl_ctl1 | cl_l1_2_enable);
> +	}
> +}
> +
> +void pci_restore_aspm_state(struct pci_dev *pdev)
> +{
> +	struct pci_cap_saved_state *save_state;
> +	u16 *cap, val;
> +
> +	save_state = pci_find_saved_cap(pdev, PCI_CAP_ID_EXP);
> +
> +	if (!save_state)
> +		return;
> +
> +	cap = (u16 *)&save_state->cap.data[0];
> +	/* Must match the ordering in pci_save/restore_pcie_state() */
> +	val = cap[1] & PCI_EXP_LNKCTL_ASPMC;
> +	if (!val)
> +		return;
> +
> +	/*
> +	 * We restore L1 substate configuration first before enabling L1
> +	 * as the PCIe spec 6.0 sec 5.5.4 suggests.
> +	 */
> +	pcie_restore_aspm_l1ss(pdev);
> +
> +	/* Re-enable L0s/L1 */
> +	pcie_capability_set_word(pdev, PCI_EXP_LNKCTL, val);
> +}
> +
>  static struct pcie_link_state *pcie_aspm_get_link(struct pci_dev *pdev)
>  {
>  	struct pci_dev *bridge;
> -- 
> 2.34.1
>
Bjorn Helgaas Dec. 29, 2023, 12:30 a.m. UTC | #3
[+cc Michael]

On Thu, Dec 28, 2023 at 04:31:12PM -0600, Bjorn Helgaas wrote:
> On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> ...

> > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > index 55bc3576a985..3c4b2647b4ca 100644
> > --- a/drivers/pci/pci.c
> > +++ b/drivers/pci/pci.c

> > @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev *dev)
> >  {
> > ...

> > +        So we restore here only the
> > +	 * LNKCTL register with the ASPM control field clear. ASPM will
> > +	 * be restored in pci_restore_aspm_state().
> > +	 */
> > +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> > +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
> 
> When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
> restore it.  I don't know if this ever happens.  Do we need to worry
> about this?  Might firmware restore ASPMC itself before we get here?
> What do we want to happen in this case?
> 
> Since ASPM is intertwined with the PCIe Capability, can we call
> pci_restore_aspm_state() from here instead of from
> pci_restore_state()?
> 
> Calling it here would make it easier to see the required ordering
> (LNKCTL with ASPMC cleared, restore ASPM L1SS, restore ASPMC) and
> it would be obvious that none of the other stuff in
> pci_restore_state() is relevant (PASID, PRI, ATS, VC, etc).
> 
> If that could be done, I think it would make sense to do the same with
> pci_save_aspm_state() even though it's a little more independent.

The lspci output in Michael's report at
https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
reminded me that LTR is important for L1.2, and we currently have
this:

  pci_restore_state
    pci_restore_ltr_state
    pci_restore_pcie_state

I wonder if pci_restore_ltr_state() should be called from
pci_restore_pcie_state() as well?  It's intimately connected to ASPM,
and that connection isn't very clear in the current code.

Bjorn
David E. Box Jan. 10, 2024, 3:24 p.m. UTC | #4
On Thu, 2023-12-28 at 18:30 -0600, Bjorn Helgaas wrote:
> [+cc Michael]
> 
> On Thu, Dec 28, 2023 at 04:31:12PM -0600, Bjorn Helgaas wrote:
> > On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> > ...
> 
> > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > index 55bc3576a985..3c4b2647b4ca 100644
> > > --- a/drivers/pci/pci.c
> > > +++ b/drivers/pci/pci.c
> 
> > > @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev
> > > *dev)
> > >  {
> > > ...
> 
> > > +        So we restore here only the
> > > +	 * LNKCTL register with the ASPM control field clear. ASPM will
> > > +	 * be restored in pci_restore_aspm_state().
> > > +	 */
> > > +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> > > +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
> > 
> > When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
> > restore it.  I don't know if this ever happens.  Do we need to worry
> > about this?  Might firmware restore ASPMC itself before we get here?
> > What do we want to happen in this case?

I just checked this. L1 does get disabled which we don't want. We need to save
and restore the BIOS ASPM configuration even when CONFIG_PCIEASPM is not set.

> > 
> > Since ASPM is intertwined with the PCIe Capability, can we call
> > pci_restore_aspm_state() from here instead of from
> > pci_restore_state()?
> > 
> > Calling it here would make it easier to see the required ordering
> > (LNKCTL with ASPMC cleared, restore ASPM L1SS, restore ASPMC) and
> > it would be obvious that none of the other stuff in
> > pci_restore_state() is relevant (PASID, PRI, ATS, VC, etc).
> > 
> > If that could be done, I think it would make sense to do the same with
> > pci_save_aspm_state() even though it's a little more independent.
> 

Makes sense

> The lspci output in Michael's report at
> https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
> reminded me that LTR is important for L1.2, and we currently have
> this:
> 
>   pci_restore_state
>     pci_restore_ltr_state
>     pci_restore_pcie_state
> 
> I wonder if pci_restore_ltr_state() should be called from
> pci_restore_pcie_state() as well?  It's intimately connected to ASPM,
> and that connection isn't very clear in the current code.

Make sense too since LTR is a required capability for L1.2. I'll send updated
patches after the merge window.

David
Bjorn Helgaas Jan. 10, 2024, 6:46 p.m. UTC | #5
On Wed, Jan 10, 2024 at 07:24:31AM -0800, David E. Box wrote:
> On Thu, 2023-12-28 at 18:30 -0600, Bjorn Helgaas wrote:
> > On Thu, Dec 28, 2023 at 04:31:12PM -0600, Bjorn Helgaas wrote:
> > > On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:

> > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > index 55bc3576a985..3c4b2647b4ca 100644
> > > > --- a/drivers/pci/pci.c
> > > > +++ b/drivers/pci/pci.c
> > 
> > > > @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev
> > > > *dev)
> > > >  {
> > > > ...
> > 
> > > > +        So we restore here only the
> > > > +	 * LNKCTL register with the ASPM control field clear. ASPM will
> > > > +	 * be restored in pci_restore_aspm_state().
> > > > +	 */
> > > > +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> > > > +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
> > > 
> > > When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
> > > restore it.  I don't know if this ever happens.  Do we need to worry
> > > about this?  Might firmware restore ASPMC itself before we get here?
> > > What do we want to happen in this case?
> 
> I just checked this. L1 does get disabled which we don't want. We
> need to save and restore the BIOS ASPM configuration even when
> CONFIG_PCIEASPM is not set.

There's some other ASPM stuff that we want even when CONFIG_PCIEASPM
is not set.  I think some of that code is currently in probe.c and
pci.c.

I can't find it right now, but we had some discussion about moving
that code into aspm.c, compiling aspm.c unconditionally, and adding
CONFIG_PCIEASPM ifdefs inside it for these cases.  Maybe this is the
time do to that?  If so, probably a preliminary patch or two to do
the code movement without any functional changes, followed by the
actual fixes.

> > > Since ASPM is intertwined with the PCIe Capability, can we call
> > > pci_restore_aspm_state() from here instead of from
> > > pci_restore_state()?
> > > 
> > > Calling it here would make it easier to see the required ordering
> > > (LNKCTL with ASPMC cleared, restore ASPM L1SS, restore ASPMC) and
> > > it would be obvious that none of the other stuff in
> > > pci_restore_state() is relevant (PASID, PRI, ATS, VC, etc).
> > > 
> > > If that could be done, I think it would make sense to do the same with
> > > pci_save_aspm_state() even though it's a little more independent.
> 
> Makes sense
> 
> > The lspci output in Michael's report at
> > https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
> > reminded me that LTR is important for L1.2, and we currently have
> > this:
> > 
> >   pci_restore_state
> >     pci_restore_ltr_state
> >     pci_restore_pcie_state
> > 
> > I wonder if pci_restore_ltr_state() should be called from
> > pci_restore_pcie_state() as well?  It's intimately connected to ASPM,
> > and that connection isn't very clear in the current code.
> 
> Make sense too since LTR is a required capability for L1.2. I'll
> send updated patches after the merge window.

Sounds good, will look for them :)

Bjorn
Ilpo Järvinen Jan. 11, 2024, 12:28 p.m. UTC | #6
On Wed, 10 Jan 2024, Bjorn Helgaas wrote:

> On Wed, Jan 10, 2024 at 07:24:31AM -0800, David E. Box wrote:
> > On Thu, 2023-12-28 at 18:30 -0600, Bjorn Helgaas wrote:
> > > On Thu, Dec 28, 2023 at 04:31:12PM -0600, Bjorn Helgaas wrote:
> > > > On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> 
> > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > > index 55bc3576a985..3c4b2647b4ca 100644
> > > > > --- a/drivers/pci/pci.c
> > > > > +++ b/drivers/pci/pci.c
> > > 
> > > > > @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct pci_dev
> > > > > *dev)
> > > > >  {
> > > > > ...
> > > 
> > > > > +        So we restore here only the
> > > > > +	 * LNKCTL register with the ASPM control field clear. ASPM will
> > > > > +	 * be restored in pci_restore_aspm_state().
> > > > > +	 */
> > > > > +	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> > > > > +	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
> > > > 
> > > > When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
> > > > restore it.  I don't know if this ever happens.  Do we need to worry
> > > > about this?  Might firmware restore ASPMC itself before we get here?
> > > > What do we want to happen in this case?
> > 
> > I just checked this. L1 does get disabled which we don't want. We
> > need to save and restore the BIOS ASPM configuration even when
> > CONFIG_PCIEASPM is not set.
> 
> There's some other ASPM stuff that we want even when CONFIG_PCIEASPM
> is not set.  I think some of that code is currently in probe.c and
> pci.c.
> 
> I can't find it right now, but we had some discussion about moving
> that code into aspm.c, compiling aspm.c unconditionally, and adding
> CONFIG_PCIEASPM ifdefs inside it for these cases.  Maybe this is the
> time do to that?  If so, probably a preliminary patch or two to do
> the code movement without any functional changes, followed by the
> actual fixes.

Hi,

It's here:

https://lore.kernel.org/linux-pci/20231011200442.GA1040348@bhelgaas/

I'm still not even half way done with the update for that patch series 
because I got stuck while attempting to do the rtsx driver changes the 
new way and moved to work on other stuff for a while.

So far I've only tentatively placed some #ifdef into aspm.c but it wasn't 
all thought nor well arranged yet to limit the number of needed #ifdefs so
nothing ready to be posted yet. And therefore not much effort is lost if 
David wants to take a look at it instead in the meantime.
David E. Box Jan. 13, 2024, 12:36 a.m. UTC | #7
On Thu, 2024-01-11 at 14:28 +0200, Ilpo Järvinen wrote:
> On Wed, 10 Jan 2024, Bjorn Helgaas wrote:
> 
> > On Wed, Jan 10, 2024 at 07:24:31AM -0800, David E. Box wrote:
> > > On Thu, 2023-12-28 at 18:30 -0600, Bjorn Helgaas wrote:
> > > > On Thu, Dec 28, 2023 at 04:31:12PM -0600, Bjorn Helgaas wrote:
> > > > > On Wed, Dec 20, 2023 at 05:12:50PM -0800, David E. Box wrote:
> > 
> > > > > > diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> > > > > > index 55bc3576a985..3c4b2647b4ca 100644
> > > > > > --- a/drivers/pci/pci.c
> > > > > > +++ b/drivers/pci/pci.c
> > > > 
> > > > > > @@ -1579,7 +1579,7 @@ static void pci_restore_pcie_state(struct
> > > > > > pci_dev
> > > > > > *dev)
> > > > > >  {
> > > > > > ...
> > > > 
> > > > > > +        So we restore here only the
> > > > > > +        * LNKCTL register with the ASPM control field clear. ASPM
> > > > > > will
> > > > > > +        * be restored in pci_restore_aspm_state().
> > > > > > +        */
> > > > > > +       val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
> > > > > > +       pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
> > > > > 
> > > > > When CONFIG_PCIEASPM is not set, we will clear ASPMC here and never
> > > > > restore it.  I don't know if this ever happens.  Do we need to worry
> > > > > about this?  Might firmware restore ASPMC itself before we get here?
> > > > > What do we want to happen in this case?
> > > 
> > > I just checked this. L1 does get disabled which we don't want. We
> > > need to save and restore the BIOS ASPM configuration even when
> > > CONFIG_PCIEASPM is not set.
> > 
> > There's some other ASPM stuff that we want even when CONFIG_PCIEASPM
> > is not set.  I think some of that code is currently in probe.c and
> > pci.c.
> > 
> > I can't find it right now, but we had some discussion about moving
> > that code into aspm.c, compiling aspm.c unconditionally, and adding
> > CONFIG_PCIEASPM ifdefs inside it for these cases.  Maybe this is the
> > time do to that?  If so, probably a preliminary patch or two to do
> > the code movement without any functional changes, followed by the
> > actual fixes.
> 
> Hi,
> 
> It's here:
> 
> https://lore.kernel.org/linux-pci/20231011200442.GA1040348@bhelgaas/
> 
> I'm still not even half way done with the update for that patch series 
> because I got stuck while attempting to do the rtsx driver changes the 
> new way and moved to work on other stuff for a while.
> 
> So far I've only tentatively placed some #ifdef into aspm.c but it wasn't 
> all thought nor well arranged yet to limit the number of needed #ifdefs so
> nothing ready to be posted yet. And therefore not much effort is lost if 
> David wants to take a look at it instead in the meantime.

I'll take a stab at it.

David
diff mbox series

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 55bc3576a985..3c4b2647b4ca 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1579,7 +1579,7 @@  static void pci_restore_pcie_state(struct pci_dev *dev)
 {
 	int i = 0;
 	struct pci_cap_saved_state *save_state;
-	u16 *cap;
+	u16 *cap, val;
 
 	save_state = pci_find_saved_cap(dev, PCI_CAP_ID_EXP);
 	if (!save_state)
@@ -1594,7 +1594,14 @@  static void pci_restore_pcie_state(struct pci_dev *dev)
 
 	cap = (u16 *)&save_state->cap.data[0];
 	pcie_capability_write_word(dev, PCI_EXP_DEVCTL, cap[i++]);
-	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, cap[i++]);
+	/*
+	 * Restoring ASPM L1 substates has special requirements
+	 * according to the PCIe spec 6.0. So we restore here only the
+	 * LNKCTL register with the ASPM control field clear. ASPM will
+	 * be restored in pci_restore_aspm_state().
+	 */
+	val = cap[i++] & ~PCI_EXP_LNKCTL_ASPMC;
+	pcie_capability_write_word(dev, PCI_EXP_LNKCTL, val);
 	pcie_capability_write_word(dev, PCI_EXP_SLTCTL, cap[i++]);
 	pcie_capability_write_word(dev, PCI_EXP_RTCTL, cap[i++]);
 	pcie_capability_write_word(dev, PCI_EXP_DEVCTL2, cap[i++]);
@@ -1705,6 +1712,7 @@  int pci_save_state(struct pci_dev *dev)
 	pci_save_ltr_state(dev);
 	pci_save_dpc_state(dev);
 	pci_save_aer_state(dev);
+	pci_save_aspm_state(dev);
 	pci_save_ptm_state(dev);
 	return pci_save_vc_state(dev);
 }
@@ -1817,6 +1825,7 @@  void pci_restore_state(struct pci_dev *dev)
 	pci_restore_rebar_state(dev);
 	pci_restore_dpc_state(dev);
 	pci_restore_ptm_state(dev);
+	pci_restore_aspm_state(dev);
 
 	pci_aer_clear_status(dev);
 	pci_restore_aer_state(dev);
@@ -3509,6 +3518,11 @@  void pci_allocate_cap_save_buffers(struct pci_dev *dev)
 	if (error)
 		pci_err(dev, "unable to allocate suspend buffer for LTR\n");
 
+	error = pci_add_ext_cap_save_buffer(dev, PCI_EXT_CAP_ID_L1SS,
+					    2 * sizeof(u32));
+	if (error)
+		pci_err(dev, "unable to allocate suspend buffer for ASPM-L1SS\n");
+
 	pci_allocate_vc_save_buffers(dev);
 }
 
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 5ecbcf041179..2d0f9ae3d9b6 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -570,10 +570,14 @@  int pcie_retrain_link(struct pci_dev *pdev, bool use_lt);
 void pcie_aspm_init_link_state(struct pci_dev *pdev);
 void pcie_aspm_exit_link_state(struct pci_dev *pdev);
 void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
+void pci_save_aspm_state(struct pci_dev *pdev);
+void pci_restore_aspm_state(struct pci_dev *pdev);
 #else
 static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
 static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
+static inline void pci_save_aspm_state(struct pci_dev *pdev) { }
+static inline void pci_restore_aspm_state(struct pci_dev *pdev) { }
 #endif
 
 #ifdef CONFIG_PCIE_ECRC
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 50b04ae5c394..dc5ea0ff2e07 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -1027,6 +1027,122 @@  void pcie_aspm_powersave_config_link(struct pci_dev *pdev)
 	up_read(&pci_bus_sem);
 }
 
+void pci_save_aspm_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u16 l1ss = pdev->l1ss;
+	u32 *cap;
+
+	/*
+	 * Save L1 substate configuration. The ASPM L0s/L1 configuration
+	 * is already saved in pci_save_pcie_state().
+	 */
+	if (!l1ss)
+		return;
+
+	save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
+	if (!save_state)
+		return;
+
+	cap = &save_state->cap.data[0];
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
+	pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
+}
+
+static void pcie_restore_aspm_l1ss(struct pci_dev *child)
+{
+	struct pci_cap_saved_state *pl_save_state, *cl_save_state;
+	struct pci_dev *parent = child->bus->self;
+	struct pcie_link_state *link;
+	u32 *cap, pl_ctl1, pl_ctl2, pl_l1_2_enable;
+	u32 cl_ctl1, cl_ctl2, cl_l1_2_enable;
+
+	/*
+	 * In case BIOS enabled L1.2 after resume, we need to disable it first
+	 * on the downstream component before the upstream. So, don't attempt to
+	 * restore either until we are at the downstream component.
+	 */
+	if (child->link_state || !parent || !child->l1ss || !parent->l1ss)
+		return;
+
+	link = parent->link_state;
+	if (!link)
+		return;
+
+	cl_save_state = pci_find_saved_ext_cap(child, PCI_EXT_CAP_ID_L1SS);
+	pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
+	if (!cl_save_state || !pl_save_state)
+		return;
+
+	cap = &cl_save_state->cap.data[0];
+	cl_ctl2 = *cap++;
+	cl_ctl1 = *cap;
+	cap = &pl_save_state->cap.data[0];
+	pl_ctl2 = *cap++;
+	pl_ctl1 = *cap;
+
+
+	/*
+	 * Disable L1.2 on this downstream endpoint device first, followed
+	 * by the upstream
+	 */
+	pci_clear_and_set_dword(child, child->l1ss + PCI_L1SS_CTL1,
+				PCI_L1SS_CTL1_L1_2_MASK, 0);
+	pci_clear_and_set_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
+				PCI_L1SS_CTL1_L1_2_MASK, 0);
+
+	/*
+	 * In addition, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD
+	 * in PCI_L1SS_CTL1 must be programmed *before* setting the L1.2
+	 * enable bits, even though they're all in PCI_L1SS_CTL1.
+	 */
+	pl_l1_2_enable = pl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
+	pl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
+	cl_l1_2_enable = cl_ctl1 & PCI_L1SS_CTL1_L1_2_MASK;
+	cl_ctl1 &= ~PCI_L1SS_CTL1_L1_2_MASK;
+
+	/* Write back without enables first (above we cleared them in ctl1) */
+	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, pl_ctl2);
+	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL2, cl_ctl2);
+	pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, pl_ctl1);
+	pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1, cl_ctl1);
+
+
+	/* Then write back the enables */
+	if (pl_l1_2_enable || cl_l1_2_enable) {
+		pci_write_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1,
+				       pl_ctl1 | pl_l1_2_enable);
+		pci_write_config_dword(child, child->l1ss + PCI_L1SS_CTL1,
+				       cl_ctl1 | cl_l1_2_enable);
+	}
+}
+
+void pci_restore_aspm_state(struct pci_dev *pdev)
+{
+	struct pci_cap_saved_state *save_state;
+	u16 *cap, val;
+
+	save_state = pci_find_saved_cap(pdev, PCI_CAP_ID_EXP);
+
+	if (!save_state)
+		return;
+
+	cap = (u16 *)&save_state->cap.data[0];
+	/* Must match the ordering in pci_save/restore_pcie_state() */
+	val = cap[1] & PCI_EXP_LNKCTL_ASPMC;
+	if (!val)
+		return;
+
+	/*
+	 * We restore L1 substate configuration first before enabling L1
+	 * as the PCIe spec 6.0 sec 5.5.4 suggests.
+	 */
+	pcie_restore_aspm_l1ss(pdev);
+
+	/* Re-enable L0s/L1 */
+	pcie_capability_set_word(pdev, PCI_EXP_LNKCTL, val);
+}
+
 static struct pcie_link_state *pcie_aspm_get_link(struct pci_dev *pdev)
 {
 	struct pci_dev *bridge;