diff mbox series

[RESEND,v3,2/2] PCI/PTM: fix to maintain pci_dev->ptm_enabled

Message ID 20220830104913.1620539-2-rajvi.jingar@linux.intel.com
State New
Headers show
Series [RESEND,v3,1/2] PCI/PM: refactor pci_pm_suspend_noirq() | expand

Commit Message

Rajvi Jingar Aug. 30, 2022, 10:49 a.m. UTC
pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
state of the device. In pci_ptm_disable(), clear ptm_enabled from
'struct pci_dev' on disabling PTM state for the device.
In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
PTM state of the device.

In pci_ptm_disable(), perform ptm_enabled check to avoid config space
access in case if PTM is already disabled for the device. ptm_enabled
won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
needed anymore.

Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 v1->v2:
   - add ptm_enabled check in pci_ptm_disable().
   - set the dev->ptm_enabled value in pci_restore_ptm_state().
 v2->v3:
   - remove pci_is_pcie(dev) check in pci_ptm_disable().
   - add Reviewed-by tag in commit message
---
 drivers/pci/pcie/ptm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Bjorn Helgaas Aug. 30, 2022, 4:25 p.m. UTC | #1
[+cc Kai-Heng]

On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> state of the device. In pci_ptm_disable(), clear ptm_enabled from
> 'struct pci_dev' on disabling PTM state for the device.
> In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> PTM state of the device.
> 
> In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> access in case if PTM is already disabled for the device. ptm_enabled
> won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> needed anymore.

This one sounds like it's supposed to fix something, but I'm not clear
exactly what.

I have a vague memory of config accesses messing up a low power state.
But this is still completely magical and unmaintainable since AFAIK
there is nothing in the PCIe spec about avoiding config accesses when
PTM is disabled.

At the very least, we would need more details in the commit log and
a hint in the code about this.

> Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  v1->v2:
>    - add ptm_enabled check in pci_ptm_disable().
>    - set the dev->ptm_enabled value in pci_restore_ptm_state().
>  v2->v3:
>    - remove pci_is_pcie(dev) check in pci_ptm_disable().
>    - add Reviewed-by tag in commit message
> ---
>  drivers/pci/pcie/ptm.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> index 368a254e3124..1ce241d4538f 100644
> --- a/drivers/pci/pcie/ptm.c
> +++ b/drivers/pci/pcie/ptm.c
> @@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
>  	int ptm;
>  	u16 ctrl;
>  
> -	if (!pci_is_pcie(dev))
> +	if (!dev->ptm_enabled)
>  		return;

This will conflict with a change Kai-Heng Feng and I have been working
on, but I can resolve it when applying.

>  	ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
>  	pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
>  	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
>  	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> +	dev->ptm_enabled = 0;
>  }
>  
>  void pci_save_ptm_state(struct pci_dev *dev)
> @@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
>  
>  	cap = (u16 *)&save_state->cap.data[0];
>  	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
> +	dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
>  }
>  
>  void pci_ptm_init(struct pci_dev *dev)
> -- 
> 2.25.1
>
Rafael J. Wysocki Aug. 30, 2022, 4:58 p.m. UTC | #2
On Tue, Aug 30, 2022 at 6:25 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> [+cc Kai-Heng]
>
> On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> > pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> > state of the device. In pci_ptm_disable(), clear ptm_enabled from
> > 'struct pci_dev' on disabling PTM state for the device.
> > In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> > PTM state of the device.
> >
> > In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> > access in case if PTM is already disabled for the device. ptm_enabled
> > won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> > needed anymore.
>
> This one sounds like it's supposed to fix something, but I'm not clear
> exactly what.
>
> I have a vague memory of config accesses messing up a low power state.
> But this is still completely magical and unmaintainable since AFAIK
> there is nothing in the PCIe spec about avoiding config accesses when
> PTM is disabled.

Because ptm_enabled is expected to always reflect the hardware state,
pci_disable_ptm() needs to be amended to clear it.  Also it is prudent
to explicitly make it reflect the new hardware state in
pci_restore_ptm_state().

Then, pci_disable_ptm() can be made bail out if ptm_enabled is clear,
because it has nothing to do then and the pci_is_pcie() check in there
is not necessary, because ptm_enabled will never be set for devices
that are not PCIe.

> At the very least, we would need more details in the commit log and
> a hint in the code about this.
>
> > Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  v1->v2:
> >    - add ptm_enabled check in pci_ptm_disable().
> >    - set the dev->ptm_enabled value in pci_restore_ptm_state().
> >  v2->v3:
> >    - remove pci_is_pcie(dev) check in pci_ptm_disable().
> >    - add Reviewed-by tag in commit message
> > ---
> >  drivers/pci/pcie/ptm.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> > index 368a254e3124..1ce241d4538f 100644
> > --- a/drivers/pci/pcie/ptm.c
> > +++ b/drivers/pci/pcie/ptm.c
> > @@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> >       int ptm;
> >       u16 ctrl;
> >
> > -     if (!pci_is_pcie(dev))
> > +     if (!dev->ptm_enabled)
> >               return;
>
> This will conflict with a change Kai-Heng Feng and I have been working
> on, but I can resolve it when applying.
>
> >       ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> > @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> >       pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
> >       ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> > +     dev->ptm_enabled = 0;
> >  }
> >
> >  void pci_save_ptm_state(struct pci_dev *dev)
> > @@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
> >
> >       cap = (u16 *)&save_state->cap.data[0];
> >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
> > +     dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
> >  }
> >
> >  void pci_ptm_init(struct pci_dev *dev)
> > --
> > 2.25.1
> >
Bjorn Helgaas Aug. 30, 2022, 5:37 p.m. UTC | #3
On Tue, Aug 30, 2022 at 06:58:20PM +0200, Rafael J. Wysocki wrote:
> On Tue, Aug 30, 2022 at 6:25 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> > > pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> > > state of the device. In pci_ptm_disable(), clear ptm_enabled from
> > > 'struct pci_dev' on disabling PTM state for the device.
> > > In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> > > PTM state of the device.
> > >
> > > In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> > > access in case if PTM is already disabled for the device. ptm_enabled
> > > won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> > > needed anymore.
> >
> > This one sounds like it's supposed to fix something, but I'm not clear
> > exactly what.
> >
> > I have a vague memory of config accesses messing up a low power state.
> > But this is still completely magical and unmaintainable since AFAIK
> > there is nothing in the PCIe spec about avoiding config accesses when
> > PTM is disabled.

I'm remembering this, which seemed like an ancestor of this patch:
https://lore.kernel.org/r/CAJZ5v0gNy6YJA+RNTEyHBdoJK-jqKN60oU_k_LX4=cTuyoO2mg@mail.gmail.com

This patch is queued up and does something similar (disabling PTM on
all devices before suspend):
https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?id=d878400c7d98

Is d878400c7d98 enough to solve the functional issue, and this patch
is basically a cleanup?  I think it's a nice cleanup and worth doing.

I'm just trying to figure out the "avoid config space access" in the
commit log.  If avoiding config space access is necessary, it needs
more explanation.

> Because ptm_enabled is expected to always reflect the hardware state,
> pci_disable_ptm() needs to be amended to clear it.  Also it is prudent
> to explicitly make it reflect the new hardware state in
> pci_restore_ptm_state().
>
> Then, pci_disable_ptm() can be made bail out if ptm_enabled is clear,
> because it has nothing to do then and the pci_is_pcie() check in there
> is not necessary, because ptm_enabled will never be set for devices
> that are not PCIe.
>
> > At the very least, we would need more details in the commit log and
> > a hint in the code about this.
> >
> > > Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  v1->v2:
> > >    - add ptm_enabled check in pci_ptm_disable().
> > >    - set the dev->ptm_enabled value in pci_restore_ptm_state().
> > >  v2->v3:
> > >    - remove pci_is_pcie(dev) check in pci_ptm_disable().
> > >    - add Reviewed-by tag in commit message
> > > ---
> > >  drivers/pci/pcie/ptm.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> > > index 368a254e3124..1ce241d4538f 100644
> > > --- a/drivers/pci/pcie/ptm.c
> > > +++ b/drivers/pci/pcie/ptm.c
> > > @@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> > >       int ptm;
> > >       u16 ctrl;
> > >
> > > -     if (!pci_is_pcie(dev))
> > > +     if (!dev->ptm_enabled)
> > >               return;
> >
> > This will conflict with a change Kai-Heng Feng and I have been working
> > on, but I can resolve it when applying.
> >
> > >       ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> > > @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> > >       pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
> > >       ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> > >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> > > +     dev->ptm_enabled = 0;
> > >  }
> > >
> > >  void pci_save_ptm_state(struct pci_dev *dev)
> > > @@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
> > >
> > >       cap = (u16 *)&save_state->cap.data[0];
> > >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
> > > +     dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
> > >  }
> > >
> > >  void pci_ptm_init(struct pci_dev *dev)
> > > --
> > > 2.25.1
> > >
Rafael J. Wysocki Aug. 30, 2022, 6:03 p.m. UTC | #4
On Tue, Aug 30, 2022 at 7:37 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Aug 30, 2022 at 06:58:20PM +0200, Rafael J. Wysocki wrote:
> > On Tue, Aug 30, 2022 at 6:25 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> > > > pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> > > > state of the device. In pci_ptm_disable(), clear ptm_enabled from
> > > > 'struct pci_dev' on disabling PTM state for the device.
> > > > In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> > > > PTM state of the device.
> > > >
> > > > In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> > > > access in case if PTM is already disabled for the device. ptm_enabled
> > > > won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> > > > needed anymore.
> > >
> > > This one sounds like it's supposed to fix something, but I'm not clear
> > > exactly what.
> > >
> > > I have a vague memory of config accesses messing up a low power state.
> > > But this is still completely magical and unmaintainable since AFAIK
> > > there is nothing in the PCIe spec about avoiding config accesses when
> > > PTM is disabled.
>
> I'm remembering this, which seemed like an ancestor of this patch:
> https://lore.kernel.org/r/CAJZ5v0gNy6YJA+RNTEyHBdoJK-jqKN60oU_k_LX4=cTuyoO2mg@mail.gmail.com
>
> This patch is queued up and does something similar (disabling PTM on
> all devices before suspend):
> https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?id=d878400c7d98
>
> Is d878400c7d98 enough to solve the functional issue, and this patch
> is basically a cleanup?  I think it's a nice cleanup and worth doing.

This patch is independent of d878400c7d98.  We've been working on a
d878400c7d98 counterpart on top of this patch.

There is a problem with d878400c7d98 that disabling PTM from
pci_prepare_to_sleep() is not enough, because that function is not
called for some endpoints where we also want to disable PTM on system
suspend.

IMV the most suitable place to disable PTM (temporarily) on
system-wide suspend is in pci_pm_suspend_noirq(, because it is the
last piece of generic PCI code running for all PCI devices regardless
of what their drivers do.

> I'm just trying to figure out the "avoid config space access" in the
> commit log.  If avoiding config space access is necessary, it needs
> more explanation.
>
> > Because ptm_enabled is expected to always reflect the hardware state,
> > pci_disable_ptm() needs to be amended to clear it.  Also it is prudent
> > to explicitly make it reflect the new hardware state in
> > pci_restore_ptm_state().
> >
> > Then, pci_disable_ptm() can be made bail out if ptm_enabled is clear,
> > because it has nothing to do then and the pci_is_pcie() check in there
> > is not necessary, because ptm_enabled will never be set for devices
> > that are not PCIe.
> >
> > > At the very least, we would need more details in the commit log and
> > > a hint in the code about this.
> > >
> > > > Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> > > > Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  v1->v2:
> > > >    - add ptm_enabled check in pci_ptm_disable().
> > > >    - set the dev->ptm_enabled value in pci_restore_ptm_state().
> > > >  v2->v3:
> > > >    - remove pci_is_pcie(dev) check in pci_ptm_disable().
> > > >    - add Reviewed-by tag in commit message
> > > > ---
> > > >  drivers/pci/pcie/ptm.c | 4 +++-
> > > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> > > > index 368a254e3124..1ce241d4538f 100644
> > > > --- a/drivers/pci/pcie/ptm.c
> > > > +++ b/drivers/pci/pcie/ptm.c
> > > > @@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> > > >       int ptm;
> > > >       u16 ctrl;
> > > >
> > > > -     if (!pci_is_pcie(dev))
> > > > +     if (!dev->ptm_enabled)
> > > >               return;
> > >
> > > This will conflict with a change Kai-Heng Feng and I have been working
> > > on, but I can resolve it when applying.
> > >
> > > >       ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> > > > @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
> > > >       pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
> > > >       ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> > > >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> > > > +     dev->ptm_enabled = 0;
> > > >  }
> > > >
> > > >  void pci_save_ptm_state(struct pci_dev *dev)
> > > > @@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
> > > >
> > > >       cap = (u16 *)&save_state->cap.data[0];
> > > >       pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
> > > > +     dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
> > > >  }
> > > >
> > > >  void pci_ptm_init(struct pci_dev *dev)
> > > > --
> > > > 2.25.1
> > > >
Bjorn Helgaas Aug. 30, 2022, 7:46 p.m. UTC | #5
On Tue, Aug 30, 2022 at 08:03:41PM +0200, Rafael J. Wysocki wrote:
> On Tue, Aug 30, 2022 at 7:37 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Aug 30, 2022 at 06:58:20PM +0200, Rafael J. Wysocki wrote:
> > > On Tue, Aug 30, 2022 at 6:25 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> > > > > pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> > > > > state of the device. In pci_ptm_disable(), clear ptm_enabled from
> > > > > 'struct pci_dev' on disabling PTM state for the device.
> > > > > In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> > > > > PTM state of the device.
> > > > >
> > > > > In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> > > > > access in case if PTM is already disabled for the device. ptm_enabled
> > > > > won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> > > > > needed anymore.
> > > >
> > > > This one sounds like it's supposed to fix something, but I'm not clear
> > > > exactly what.
> > > >
> > > > I have a vague memory of config accesses messing up a low power state.
> > > > But this is still completely magical and unmaintainable since AFAIK
> > > > there is nothing in the PCIe spec about avoiding config accesses when
> > > > PTM is disabled.
> >
> > I'm remembering this, which seemed like an ancestor of this patch:
> > https://lore.kernel.org/r/CAJZ5v0gNy6YJA+RNTEyHBdoJK-jqKN60oU_k_LX4=cTuyoO2mg@mail.gmail.com
> >
> > This patch is queued up and does something similar (disabling PTM on
> > all devices before suspend):
> > https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?id=d878400c7d98
> >
> > Is d878400c7d98 enough to solve the functional issue, and this patch
> > is basically a cleanup?  I think it's a nice cleanup and worth doing.
> 
> This patch is independent of d878400c7d98.  We've been working on a
> d878400c7d98 counterpart on top of this patch.
> 
> There is a problem with d878400c7d98 that disabling PTM from
> pci_prepare_to_sleep() is not enough, because that function is not
> called for some endpoints where we also want to disable PTM on system
> suspend.
> 
> IMV the most suitable place to disable PTM (temporarily) on
> system-wide suspend is in pci_pm_suspend_noirq(, because it is the
> last piece of generic PCI code running for all PCI devices regardless
> of what their drivers do.
>
> > I'm just trying to figure out the "avoid config space access" in the
> > commit log.  If avoiding config space access is necessary, it needs
> > more explanation.

AFAICT, these are great cleanups but neither of these patches is
really a functional change.  If that's true, the "avoid config space
access" should be removed from the commit log.

I dropped d878400c7d98 for now and updated my pci/pm branch with just
these two patches with updates as attached below.

Then we can rework d878400c7d98 so it disables PTM unconditionally in
pci_pm_suspend_noirq() instead of in pci_prepare_to_sleep().

Currently, or with d878400c7d98, we only call pci_prepare_to_sleep()
when (!skip_bus_pm && !state_saved && pci_power_manageable()), so if a
driver saves its own state, we won't disable PTM, which seems like a
problem.

I assume we also want to move the pci_disable_ptm() from
pci_finish_runtime_suspend() to pci_pm_runtime_suspend() to keep it
parallel with pci_pm_suspend_noirq().

Is this making sense?

> > > Because ptm_enabled is expected to always reflect the hardware state,
> > > pci_disable_ptm() needs to be amended to clear it.  Also it is prudent
> > > to explicitly make it reflect the new hardware state in
> > > pci_restore_ptm_state().
> > >
> > > Then, pci_disable_ptm() can be made bail out if ptm_enabled is clear,
> > > because it has nothing to do then and the pci_is_pcie() check in there
> > > is not necessary, because ptm_enabled will never be set for devices
> > > that are not PCIe.
> > >
> > > > At the very least, we would need more details in the commit log and
> > > > a hint in the code about this.

commit 22b07d9ddd02 ("PCI/PTM: Update pdev->ptm_enabled to track hardware state")
Author: Rajvi Jingar <rajvi.jingar@linux.intel.com>
Date:   Tue Aug 30 03:49:13 2022 -0700

    PCI/PTM: Update pdev->ptm_enabled to track hardware state
    
    Update pdev->ptm_enabled to track hardware state when we disable or restore
    PTM state.
    
    No functional change intended, since 'ptm_enabled' was previously only
    tested during enumeration and pci_disable_ptm() is only used during
    suspend.
    
    [bhelgaas: commit log]
    Link: https://lore.kernel.org/r/20220830104913.1620539-2-rajvi.jingar@linux.intel.com
    Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
index 368a254e3124..1ce241d4538f 100644
--- a/drivers/pci/pcie/ptm.c
+++ b/drivers/pci/pcie/ptm.c
@@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
 	int ptm;
 	u16 ctrl;
 
-	if (!pci_is_pcie(dev))
+	if (!dev->ptm_enabled)
 		return;
 
 	ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
@@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
 	pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
 	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
+	dev->ptm_enabled = 0;
 }
 
 void pci_save_ptm_state(struct pci_dev *dev)
@@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
 
 	cap = (u16 *)&save_state->cap.data[0];
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
+	dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
 }
 
 void pci_ptm_init(struct pci_dev *dev)

commit 6e594f65471b ("PCI/PM: Simplify pci_pm_suspend_noirq()")
Author: Rajvi Jingar <rajvi.jingar@linux.intel.com>
Date:   Tue Aug 30 03:49:12 2022 -0700

    PCI/PM: Simplify pci_pm_suspend_noirq()
    
    We always want to save the device state unless the driver has already done
    it.  Rearrange the checking in pci_pm_suspend_noirq() to make this more
    clear.  No functional change intended.
    
    [bhelgaas: commit log, rewrap comment]
    Link: https://lore.kernel.org/r/20220830104913.1620539-1-rajvi.jingar@linux.intel.com
    Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 49238ddd39ee..2815922ac525 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -867,20 +867,15 @@ static int pci_pm_suspend_noirq(struct device *dev)
 		}
 	}
 
-	if (pci_dev->skip_bus_pm) {
+	if (!pci_dev->state_saved) {
+		pci_save_state(pci_dev);
+
 		/*
-		 * Either the device is a bridge with a child in D0 below it, or
-		 * the function is running for the second time in a row without
-		 * going through full resume, which is possible only during
-		 * suspend-to-idle in a spurious wakeup case.  The device should
-		 * be in D0 at this point, but if it is a bridge, it may be
-		 * necessary to save its state.
+		 * If the device is a bridge with a child in D0 below it,
+		 * it needs to stay in D0, so check skip_bus_pm to avoid
+		 * putting it into a low-power state in that case.
 		 */
-		if (!pci_dev->state_saved)
-			pci_save_state(pci_dev);
-	} else if (!pci_dev->state_saved) {
-		pci_save_state(pci_dev);
-		if (pci_power_manageable(pci_dev))
+		if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
 			pci_prepare_to_sleep(pci_dev);
 	}
Rafael J. Wysocki Aug. 31, 2022, 5:53 p.m. UTC | #6
On Tue, Aug 30, 2022 at 9:46 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Aug 30, 2022 at 08:03:41PM +0200, Rafael J. Wysocki wrote:
> > On Tue, Aug 30, 2022 at 7:37 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Tue, Aug 30, 2022 at 06:58:20PM +0200, Rafael J. Wysocki wrote:
> > > > On Tue, Aug 30, 2022 at 6:25 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Tue, Aug 30, 2022 at 03:49:13AM -0700, Rajvi Jingar wrote:
> > > > > > pci_dev->ptm_enabled needs to be maintained to reflect the current PTM
> > > > > > state of the device. In pci_ptm_disable(), clear ptm_enabled from
> > > > > > 'struct pci_dev' on disabling PTM state for the device.
> > > > > > In pci_restore_ptm_state(), set dev->ptm_enabled based on the restored
> > > > > > PTM state of the device.
> > > > > >
> > > > > > In pci_ptm_disable(), perform ptm_enabled check to avoid config space
> > > > > > access in case if PTM is already disabled for the device. ptm_enabled
> > > > > > won't be set for non-PCIe devices so pci_is_pcie(dev) check is not
> > > > > > needed anymore.
> > > > >
> > > > > This one sounds like it's supposed to fix something, but I'm not clear
> > > > > exactly what.
> > > > >
> > > > > I have a vague memory of config accesses messing up a low power state.
> > > > > But this is still completely magical and unmaintainable since AFAIK
> > > > > there is nothing in the PCIe spec about avoiding config accesses when
> > > > > PTM is disabled.
> > >
> > > I'm remembering this, which seemed like an ancestor of this patch:
> > > https://lore.kernel.org/r/CAJZ5v0gNy6YJA+RNTEyHBdoJK-jqKN60oU_k_LX4=cTuyoO2mg@mail.gmail.com
> > >
> > > This patch is queued up and does something similar (disabling PTM on
> > > all devices before suspend):
> > > https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/commit/?id=d878400c7d98
> > >
> > > Is d878400c7d98 enough to solve the functional issue, and this patch
> > > is basically a cleanup?  I think it's a nice cleanup and worth doing.
> >
> > This patch is independent of d878400c7d98.  We've been working on a
> > d878400c7d98 counterpart on top of this patch.
> >
> > There is a problem with d878400c7d98 that disabling PTM from
> > pci_prepare_to_sleep() is not enough, because that function is not
> > called for some endpoints where we also want to disable PTM on system
> > suspend.
> >
> > IMV the most suitable place to disable PTM (temporarily) on
> > system-wide suspend is in pci_pm_suspend_noirq(, because it is the
> > last piece of generic PCI code running for all PCI devices regardless
> > of what their drivers do.
> >
> > > I'm just trying to figure out the "avoid config space access" in the
> > > commit log.  If avoiding config space access is necessary, it needs
> > > more explanation.
>
> AFAICT, these are great cleanups but neither of these patches is
> really a functional change.  If that's true,

It is true.

> the "avoid config space access" should be removed from the commit log.

Agreed.

> I dropped d878400c7d98 for now and updated my pci/pm branch with just
> these two patches with updates as attached below.

They look good to me.

> Then we can rework d878400c7d98 so it disables PTM unconditionally in
> pci_pm_suspend_noirq() instead of in pci_prepare_to_sleep().

In the meantime, I recalled that nvme wanted to leave the device in D0
and program it into an internal low-power state in some cases which
would be disturbed by disabling PTM later on (a config space write
would kick the device out of the internal low-power state).

So it looks like it would be better to disable PTM as the first thing
in pci_pm_suspend() before calling the driver's suspend callback
(which may be nvme suspend), but then we'd need to save the original
PTM status and restore it during the subsequent resume.  That could be
done as early as in pci_pm_resume_noirq(), but I think the cleanest
way would be to add a new bit to struct pci_device for that.
Alternatively, we can drop the $subject patch, so ptm_enabled still
only means that it has been enabled during enumeration and it can be
used to restore the original PTM status during resume.

> Currently, or with d878400c7d98, we only call pci_prepare_to_sleep()
> when (!skip_bus_pm && !state_saved && pci_power_manageable()), so if a
> driver saves its own state, we won't disable PTM, which seems like a
> problem.

Agreed.

> I assume we also want to move the pci_disable_ptm() from
> pci_finish_runtime_suspend() to pci_pm_runtime_suspend() to keep it
> parallel with pci_pm_suspend_noirq().

I think so.

> Is this making sense?

Yes, it is, modulo the above observations.

> > > > Because ptm_enabled is expected to always reflect the hardware state,
> > > > pci_disable_ptm() needs to be amended to clear it.  Also it is prudent
> > > > to explicitly make it reflect the new hardware state in
> > > > pci_restore_ptm_state().
> > > >
> > > > Then, pci_disable_ptm() can be made bail out if ptm_enabled is clear,
> > > > because it has nothing to do then and the pci_is_pcie() check in there
> > > > is not necessary, because ptm_enabled will never be set for devices
> > > > that are not PCIe.
> > > >
> > > > > At the very least, we would need more details in the commit log and
> > > > > a hint in the code about this.
>
> commit 22b07d9ddd02 ("PCI/PTM: Update pdev->ptm_enabled to track hardware state")
> Author: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> Date:   Tue Aug 30 03:49:13 2022 -0700
>
>     PCI/PTM: Update pdev->ptm_enabled to track hardware state
>
>     Update pdev->ptm_enabled to track hardware state when we disable or restore
>     PTM state.
>
>     No functional change intended, since 'ptm_enabled' was previously only
>     tested during enumeration and pci_disable_ptm() is only used during
>     suspend.
>
>     [bhelgaas: commit log]
>     Link: https://lore.kernel.org/r/20220830104913.1620539-2-rajvi.jingar@linux.intel.com
>     Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> index 368a254e3124..1ce241d4538f 100644
> --- a/drivers/pci/pcie/ptm.c
> +++ b/drivers/pci/pcie/ptm.c
> @@ -34,7 +34,7 @@ void pci_disable_ptm(struct pci_dev *dev)
>         int ptm;
>         u16 ctrl;
>
> -       if (!pci_is_pcie(dev))
> +       if (!dev->ptm_enabled)
>                 return;
>
>         ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
> @@ -44,6 +44,7 @@ void pci_disable_ptm(struct pci_dev *dev)
>         pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
>         ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
>         pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
> +       dev->ptm_enabled = 0;
>  }
>
>  void pci_save_ptm_state(struct pci_dev *dev)
> @@ -83,6 +84,7 @@ void pci_restore_ptm_state(struct pci_dev *dev)
>
>         cap = (u16 *)&save_state->cap.data[0];
>         pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
> +       dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
>  }
>
>  void pci_ptm_init(struct pci_dev *dev)
>
> commit 6e594f65471b ("PCI/PM: Simplify pci_pm_suspend_noirq()")
> Author: Rajvi Jingar <rajvi.jingar@linux.intel.com>
> Date:   Tue Aug 30 03:49:12 2022 -0700
>
>     PCI/PM: Simplify pci_pm_suspend_noirq()
>
>     We always want to save the device state unless the driver has already done
>     it.  Rearrange the checking in pci_pm_suspend_noirq() to make this more
>     clear.  No functional change intended.
>
>     [bhelgaas: commit log, rewrap comment]
>     Link: https://lore.kernel.org/r/20220830104913.1620539-1-rajvi.jingar@linux.intel.com
>     Signed-off-by: Rajvi Jingar <rajvi.jingar@linux.intel.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 49238ddd39ee..2815922ac525 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -867,20 +867,15 @@ static int pci_pm_suspend_noirq(struct device *dev)
>                 }
>         }
>
> -       if (pci_dev->skip_bus_pm) {
> +       if (!pci_dev->state_saved) {
> +               pci_save_state(pci_dev);
> +
>                 /*
> -                * Either the device is a bridge with a child in D0 below it, or
> -                * the function is running for the second time in a row without
> -                * going through full resume, which is possible only during
> -                * suspend-to-idle in a spurious wakeup case.  The device should
> -                * be in D0 at this point, but if it is a bridge, it may be
> -                * necessary to save its state.
> +                * If the device is a bridge with a child in D0 below it,
> +                * it needs to stay in D0, so check skip_bus_pm to avoid
> +                * putting it into a low-power state in that case.
>                  */
> -               if (!pci_dev->state_saved)
> -                       pci_save_state(pci_dev);
> -       } else if (!pci_dev->state_saved) {
> -               pci_save_state(pci_dev);
> -               if (pci_power_manageable(pci_dev))
> +               if (!pci_dev->skip_bus_pm && pci_power_manageable(pci_dev))
>                         pci_prepare_to_sleep(pci_dev);
>         }
>
Bjorn Helgaas Sept. 1, 2022, 9:26 p.m. UTC | #7
On Wed, Aug 31, 2022 at 07:53:05PM +0200, Rafael J. Wysocki wrote:
> ...

> In the meantime, I recalled that nvme wanted to leave the device in D0
> and program it into an internal low-power state in some cases which
> would be disturbed by disabling PTM later on (a config space write
> would kick the device out of the internal low-power state).
> 
> So it looks like it would be better to disable PTM as the first thing
> in pci_pm_suspend() before calling the driver's suspend callback
> (which may be nvme suspend), 

Yes, I was thinking the same thing.  There's no reason we need to wait
until interrupts are disabled to disable PTM.

> but then we'd need to save the original PTM status and restore it
> during the subsequent resume.  That could be done as early as in
> pci_pm_resume_noirq(), but I think the cleanest way would be to add
> a new bit to struct pci_device for that.

> Alternatively, we can drop the $subject patch, so ptm_enabled still
> only means that it has been enabled during enumeration and it can be
> used to restore the original PTM status during resume.

I like this second idea of dropping this "PCI/PTM: fix to maintain
pci_dev->ptm_enabled" patch and using "dev->ptm_enabled" to set the
PTM Enable bit on restore, as in the patches below.  Then we don't
need to do anything explicit to re-enable PTM.

If this makes sense, I'll add a few cleanups on top and post as a
formal series.

Bjorn


commit 73690aa361a7 ("PCI/PM: Always disable PTM for all devices during suspend")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Sep 1 16:14:45 2022 -0500

    PCI/PM: Always disable PTM for all devices during suspend
    
    We want to disable PTM on Root Ports because that allows some chips, e.g.,
    Intel mobile chips since Coffee Lake, to enter a lower-power PM state.
    
    That means we also have to disable PTM on downstream devices because PCIe
    r6.0, sec 2.2.8, strongly recommends that functions support generation of
    messages in non-D0 states, so we assume Switch Upstream Ports or Endpoints
    may send PTM Requests while in D1, D2, and D3hot.  A PTM message received
    by a Downstream Port (including a Root Port) with PTM disabled must be
    treated as an Unsupported Request (sec 6.21.3).
    
    PTM was previously disabled only for Root Ports, and it was disabled in
    pci_prepare_to_sleep(), which is not called at all if a driver supports
    legacy PM or does its own state saving.
    
    Instead, disable PTM early in pci_pm_suspend() and pci_pm_runtime_suspend()
    so we do it in all cases.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 2815922ac525..f07399a94807 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -772,6 +772,12 @@ static int pci_pm_suspend(struct device *dev)
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
+	/*
+	 * Disabling PTM allows some systems, e.g., Intel mobile chips
+	 * since Coffee Lake, to enter a lower-power PM state.
+	 */
+	pci_disable_ptm(pci_dev);
+
 	pci_dev->skip_bus_pm = false;
 
 	if (pci_has_legacy_pm_support(pci_dev))
@@ -1269,6 +1275,8 @@ static int pci_pm_runtime_suspend(struct device *dev)
 	pci_power_t prev = pci_dev->current_state;
 	int error;
 
+	pci_disable_ptm(pci_dev);
+
 	/*
 	 * If pci_dev->driver is not set (unbound), we leave the device in D0,
 	 * but it may go to D3cold when the bridge above it runtime suspends.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 95bc329e74c0..b0e2968c8cca 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -2706,16 +2706,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
 	if (target_state == PCI_POWER_ERROR)
 		return -EIO;
 
-	/*
-	 * There are systems (for example, Intel mobile chips since Coffee
-	 * Lake) where the power drawn while suspended can be significantly
-	 * reduced by disabling PTM on PCIe root ports as this allows the
-	 * port to enter a lower-power PM state and the SoC to reach a
-	 * lower-power idle state as a whole.
-	 */
-	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
-		pci_disable_ptm(dev);
-
 	pci_enable_wake(dev, target_state, wakeup);
 
 	error = pci_set_power_state(dev, target_state);
@@ -2764,16 +2754,6 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
 	if (target_state == PCI_POWER_ERROR)
 		return -EIO;
 
-	/*
-	 * There are systems (for example, Intel mobile chips since Coffee
-	 * Lake) where the power drawn while suspended can be significantly
-	 * reduced by disabling PTM on PCIe root ports as this allows the
-	 * port to enter a lower-power PM state and the SoC to reach a
-	 * lower-power idle state as a whole.
-	 */
-	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
-		pci_disable_ptm(dev);
-
 	__pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
 
 	error = pci_set_power_state(dev, target_state);

commit f84a7e954e37 ("PCI/PTM: Enable PTM when restoring state")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Sep 1 15:51:23 2022 -0500

    PCI/PTM: Enable PTM when restoring state
    
    The suspend path may disable PTM before saving config state, which means
    the PCI_PTM_CTRL_ENABLE bit in the saved state may be cleared even though
    we want PTM to be enabled when resuming.
    
    If "dev->ptm_enabled" is set, it means PTM should be enabled, so make sure
    PCI_PTM_CTRL_ENABLE is set when restoring the PTM state.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
index b6a417247ce3..3115601a85ef 100644
--- a/drivers/pci/pcie/ptm.c
+++ b/drivers/pci/pcie/ptm.c
@@ -82,6 +82,14 @@ void pci_restore_ptm_state(struct pci_dev *dev)
 		return;
 
 	cap = (u16 *)&save_state->cap.data[0];
+
+	/*
+	 * The suspend path may disable PTM before saving config state.
+	 * Make sure PCI_PTM_CTRL_ENABLE is set if PTM should be enabled.
+	 */
+	if (dev->ptm_enabled)
+		*cap |= PCI_PTM_CTRL_ENABLE;
+
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
 }
 

commit 1d7d32a35df0 ("PCI/PTM: Preserve PTM Root Select")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Sep 1 15:54:15 2022 -0500

    PCI/PTM: Preserve PTM Root Select
    
    When disabling PTM, there's no need to clear the Root Select bit.  We
    disable PTM during suspend, and we want to re-enable it during resume.
    Clearing Root Select here makes re-enabling more complicated.
    
    Per PCIe r6.0, sec 7.9.15.3, "When set, if the PTM Enable bit is also Set,
    this Time Source is the PTM Root," so if PTM Enable is cleared, the value
    of Root Select should be irrelevant.
    
    Preserve Root Select to simplify re-enabling PTM.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    Cc: David E. Box <david.e.box@linux.intel.com>

diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
index 368a254e3124..b6a417247ce3 100644
--- a/drivers/pci/pcie/ptm.c
+++ b/drivers/pci/pcie/ptm.c
@@ -42,7 +42,7 @@ void pci_disable_ptm(struct pci_dev *dev)
 		return;
 
 	pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
-	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
+	ctrl &= ~PCI_PTM_CTRL_ENABLE;
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
 }
Rafael J. Wysocki Sept. 3, 2022, 5:03 p.m. UTC | #8
On Thu, Sep 1, 2022 at 11:26 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Wed, Aug 31, 2022 at 07:53:05PM +0200, Rafael J. Wysocki wrote:
> > ...
>
> > In the meantime, I recalled that nvme wanted to leave the device in D0
> > and program it into an internal low-power state in some cases which
> > would be disturbed by disabling PTM later on (a config space write
> > would kick the device out of the internal low-power state).
> >
> > So it looks like it would be better to disable PTM as the first thing
> > in pci_pm_suspend() before calling the driver's suspend callback
> > (which may be nvme suspend),
>
> Yes, I was thinking the same thing.  There's no reason we need to wait
> until interrupts are disabled to disable PTM.
>
> > but then we'd need to save the original PTM status and restore it
> > during the subsequent resume.  That could be done as early as in
> > pci_pm_resume_noirq(), but I think the cleanest way would be to add
> > a new bit to struct pci_device for that.
>
> > Alternatively, we can drop the $subject patch, so ptm_enabled still
> > only means that it has been enabled during enumeration and it can be
> > used to restore the original PTM status during resume.
>
> I like this second idea of dropping this "PCI/PTM: fix to maintain
> pci_dev->ptm_enabled" patch and using "dev->ptm_enabled" to set the
> PTM Enable bit on restore, as in the patches below.  Then we don't
> need to do anything explicit to re-enable PTM.
>
> If this makes sense, I'll add a few cleanups on top and post as a
> formal series.

It does to me.

Thanks for taking care of this!

> commit 73690aa361a7 ("PCI/PM: Always disable PTM for all devices during suspend")
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu Sep 1 16:14:45 2022 -0500
>
>     PCI/PM: Always disable PTM for all devices during suspend
>
>     We want to disable PTM on Root Ports because that allows some chips, e.g.,
>     Intel mobile chips since Coffee Lake, to enter a lower-power PM state.
>
>     That means we also have to disable PTM on downstream devices because PCIe
>     r6.0, sec 2.2.8, strongly recommends that functions support generation of
>     messages in non-D0 states, so we assume Switch Upstream Ports or Endpoints
>     may send PTM Requests while in D1, D2, and D3hot.  A PTM message received
>     by a Downstream Port (including a Root Port) with PTM disabled must be
>     treated as an Unsupported Request (sec 6.21.3).
>
>     PTM was previously disabled only for Root Ports, and it was disabled in
>     pci_prepare_to_sleep(), which is not called at all if a driver supports
>     legacy PM or does its own state saving.
>
>     Instead, disable PTM early in pci_pm_suspend() and pci_pm_runtime_suspend()
>     so we do it in all cases.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 2815922ac525..f07399a94807 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -772,6 +772,12 @@ static int pci_pm_suspend(struct device *dev)
>         struct pci_dev *pci_dev = to_pci_dev(dev);
>         const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>
> +       /*
> +        * Disabling PTM allows some systems, e.g., Intel mobile chips
> +        * since Coffee Lake, to enter a lower-power PM state.
> +        */
> +       pci_disable_ptm(pci_dev);
> +
>         pci_dev->skip_bus_pm = false;
>
>         if (pci_has_legacy_pm_support(pci_dev))
> @@ -1269,6 +1275,8 @@ static int pci_pm_runtime_suspend(struct device *dev)
>         pci_power_t prev = pci_dev->current_state;
>         int error;
>
> +       pci_disable_ptm(pci_dev);
> +
>         /*
>          * If pci_dev->driver is not set (unbound), we leave the device in D0,
>          * but it may go to D3cold when the bridge above it runtime suspends.
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 95bc329e74c0..b0e2968c8cca 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -2706,16 +2706,6 @@ int pci_prepare_to_sleep(struct pci_dev *dev)
>         if (target_state == PCI_POWER_ERROR)
>                 return -EIO;
>
> -       /*
> -        * There are systems (for example, Intel mobile chips since Coffee
> -        * Lake) where the power drawn while suspended can be significantly
> -        * reduced by disabling PTM on PCIe root ports as this allows the
> -        * port to enter a lower-power PM state and the SoC to reach a
> -        * lower-power idle state as a whole.
> -        */
> -       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> -               pci_disable_ptm(dev);
> -
>         pci_enable_wake(dev, target_state, wakeup);
>
>         error = pci_set_power_state(dev, target_state);
> @@ -2764,16 +2754,6 @@ int pci_finish_runtime_suspend(struct pci_dev *dev)
>         if (target_state == PCI_POWER_ERROR)
>                 return -EIO;
>
> -       /*
> -        * There are systems (for example, Intel mobile chips since Coffee
> -        * Lake) where the power drawn while suspended can be significantly
> -        * reduced by disabling PTM on PCIe root ports as this allows the
> -        * port to enter a lower-power PM state and the SoC to reach a
> -        * lower-power idle state as a whole.
> -        */
> -       if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT)
> -               pci_disable_ptm(dev);
> -
>         __pci_enable_wake(dev, target_state, pci_dev_run_wake(dev));
>
>         error = pci_set_power_state(dev, target_state);
>
> commit f84a7e954e37 ("PCI/PTM: Enable PTM when restoring state")
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu Sep 1 15:51:23 2022 -0500
>
>     PCI/PTM: Enable PTM when restoring state
>
>     The suspend path may disable PTM before saving config state, which means
>     the PCI_PTM_CTRL_ENABLE bit in the saved state may be cleared even though
>     we want PTM to be enabled when resuming.
>
>     If "dev->ptm_enabled" is set, it means PTM should be enabled, so make sure
>     PCI_PTM_CTRL_ENABLE is set when restoring the PTM state.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> index b6a417247ce3..3115601a85ef 100644
> --- a/drivers/pci/pcie/ptm.c
> +++ b/drivers/pci/pcie/ptm.c
> @@ -82,6 +82,14 @@ void pci_restore_ptm_state(struct pci_dev *dev)
>                 return;
>
>         cap = (u16 *)&save_state->cap.data[0];
> +
> +       /*
> +        * The suspend path may disable PTM before saving config state.
> +        * Make sure PCI_PTM_CTRL_ENABLE is set if PTM should be enabled.
> +        */
> +       if (dev->ptm_enabled)
> +               *cap |= PCI_PTM_CTRL_ENABLE;
> +
>         pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
>  }
>
>
> commit 1d7d32a35df0 ("PCI/PTM: Preserve PTM Root Select")
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Thu Sep 1 15:54:15 2022 -0500
>
>     PCI/PTM: Preserve PTM Root Select
>
>     When disabling PTM, there's no need to clear the Root Select bit.  We
>     disable PTM during suspend, and we want to re-enable it during resume.
>     Clearing Root Select here makes re-enabling more complicated.
>
>     Per PCIe r6.0, sec 7.9.15.3, "When set, if the PTM Enable bit is also Set,
>     this Time Source is the PTM Root," so if PTM Enable is cleared, the value
>     of Root Select should be irrelevant.
>
>     Preserve Root Select to simplify re-enabling PTM.
>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>     Cc: David E. Box <david.e.box@linux.intel.com>
>
> diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
> index 368a254e3124..b6a417247ce3 100644
> --- a/drivers/pci/pcie/ptm.c
> +++ b/drivers/pci/pcie/ptm.c
> @@ -42,7 +42,7 @@ void pci_disable_ptm(struct pci_dev *dev)
>                 return;
>
>         pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
> -       ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
> +       ctrl &= ~PCI_PTM_CTRL_ENABLE;
>         pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
>  }
>
diff mbox series

Patch

diff --git a/drivers/pci/pcie/ptm.c b/drivers/pci/pcie/ptm.c
index 368a254e3124..1ce241d4538f 100644
--- a/drivers/pci/pcie/ptm.c
+++ b/drivers/pci/pcie/ptm.c
@@ -34,7 +34,7 @@  void pci_disable_ptm(struct pci_dev *dev)
 	int ptm;
 	u16 ctrl;
 
-	if (!pci_is_pcie(dev))
+	if (!dev->ptm_enabled)
 		return;
 
 	ptm = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_PTM);
@@ -44,6 +44,7 @@  void pci_disable_ptm(struct pci_dev *dev)
 	pci_read_config_word(dev, ptm + PCI_PTM_CTRL, &ctrl);
 	ctrl &= ~(PCI_PTM_CTRL_ENABLE | PCI_PTM_CTRL_ROOT);
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, ctrl);
+	dev->ptm_enabled = 0;
 }
 
 void pci_save_ptm_state(struct pci_dev *dev)
@@ -83,6 +84,7 @@  void pci_restore_ptm_state(struct pci_dev *dev)
 
 	cap = (u16 *)&save_state->cap.data[0];
 	pci_write_config_word(dev, ptm + PCI_PTM_CTRL, *cap);
+	dev->ptm_enabled = !!(*cap & PCI_PTM_CTRL_ENABLE);
 }
 
 void pci_ptm_init(struct pci_dev *dev)