diff mbox series

PCI/sysfs: Protect driver's D3cold preference from user space

Message ID b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de
State New
Headers show
Series PCI/sysfs: Protect driver's D3cold preference from user space | expand

Commit Message

Lukas Wunner Sept. 18, 2023, 12:48 p.m. UTC
struct pci_dev contains two flags which govern whether the device may
suspend to D3cold:

* no_d3cold provides an opt-out for drivers (e.g. if a device is known
  to not wake from D3cold)

* d3cold_allowed provides an opt-out for user space (default is true,
  user space may set to false)

Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
the user space setting overwrites the driver setting.  Essentially user
space is trusted to know better than the driver whether D3cold is
working.

That feels unsafe and wrong.  Assume that the change was introduced
inadvertently and do not overwrite no_d3cold when d3cold_allowed is
modified.  Instead, consider d3cold_allowed in addition to no_d3cold
when choosing a suspend state for the device.

That way, user space may opt out of D3cold if the driver hasn't, but it
may no longer force an opt in if the driver has opted out.

Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.8+
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/pci-acpi.c  | 2 +-
 drivers/pci/pci-sysfs.c | 5 +----
 2 files changed, 2 insertions(+), 5 deletions(-)

Comments

Mika Westerberg Sept. 18, 2023, 1:07 p.m. UTC | #1
Hi Lukas,

On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> struct pci_dev contains two flags which govern whether the device may
> suspend to D3cold:
> 
> * no_d3cold provides an opt-out for drivers (e.g. if a device is known
>   to not wake from D3cold)
> 
> * d3cold_allowed provides an opt-out for user space (default is true,
>   user space may set to false)
> 
> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
> the user space setting overwrites the driver setting.  Essentially user
> space is trusted to know better than the driver whether D3cold is
> working.
> 
> That feels unsafe and wrong.  Assume that the change was introduced
> inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> when choosing a suspend state for the device.
> 
> That way, user space may opt out of D3cold if the driver hasn't, but it
> may no longer force an opt in if the driver has opted out.

Makes sense. I just wonder should the sysfs write fail from userspace
perspective if the driver has opted out and userspace tries to force it?
Or it does that already?
Mario Limonciello Sept. 18, 2023, 1:14 p.m. UTC | #2
On 9/18/2023 08:07, Mika Westerberg wrote:
> Hi Lukas,
> 
> On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
>> struct pci_dev contains two flags which govern whether the device may
>> suspend to D3cold:
>>
>> * no_d3cold provides an opt-out for drivers (e.g. if a device is known
>>    to not wake from D3cold)
>>
>> * d3cold_allowed provides an opt-out for user space (default is true,
>>    user space may set to false)
>>
>> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
>> the user space setting overwrites the driver setting.  Essentially user
>> space is trusted to know better than the driver whether D3cold is
>> working.
>>
>> That feels unsafe and wrong.  Assume that the change was introduced
>> inadvertently and do not overwrite no_d3cold when d3cold_allowed is
>> modified.  Instead, consider d3cold_allowed in addition to no_d3cold
>> when choosing a suspend state for the device.
>>
>> That way, user space may opt out of D3cold if the driver hasn't, but it
>> may no longer force an opt in if the driver has opted out.
> 
> Makes sense. I just wonder should the sysfs write fail from userspace
> perspective if the driver has opted out and userspace tries to force it?
> Or it does that already?

What's the history behind why userspace is allowed to opt a device out 
of D3cold in the first place?

It feels like it should have been a debugging only thing to me.
Lukas Wunner Sept. 18, 2023, 1:24 p.m. UTC | #3
On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote:
> On 9/18/2023 08:07, Mika Westerberg wrote:
> > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> > > struct pci_dev contains two flags which govern whether the device may
> > > suspend to D3cold:
> > > 
> > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known
> > >    to not wake from D3cold)
> > > 
> > > * d3cold_allowed provides an opt-out for user space (default is true,
> > >    user space may set to false)
> > > 
> > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
> > > the user space setting overwrites the driver setting.  Essentially user
> > > space is trusted to know better than the driver whether D3cold is
> > > working.
> > > 
> > > That feels unsafe and wrong.  Assume that the change was introduced
> > > inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> > > modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> > > when choosing a suspend state for the device.
> > > 
> > > That way, user space may opt out of D3cold if the driver hasn't, but it
> > > may no longer force an opt in if the driver has opted out.
> > 
> > Makes sense. I just wonder should the sysfs write fail from userspace
> > perspective if the driver has opted out and userspace tries to force it?
> > Or it does that already?
> 
> What's the history behind why userspace is allowed to opt a device out of
> D3cold in the first place?
> 
> It feels like it should have been a debugging only thing to me.

That's a fair question.

Apparently the default for d3cold_allowed was originally "false"
and user space could opt in to D3cold.  Then commit 4f9c1397e2e8
("PCI/PM: Enable D3/D3cold by default for most devices") changed
the default to "true".  That was 11 years ago.

I agree that today this should all work automatically and a
user space option to disable D3cold on a per-device basis only
really makes sense as a debugging aid, hence belongs in debugfs.

Thanks,

Lukas
Mario Limonciello Sept. 18, 2023, 1:28 p.m. UTC | #4
On 9/18/2023 08:24, Lukas Wunner wrote:
> On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote:
>> On 9/18/2023 08:07, Mika Westerberg wrote:
>>> On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
>>>> struct pci_dev contains two flags which govern whether the device may
>>>> suspend to D3cold:
>>>>
>>>> * no_d3cold provides an opt-out for drivers (e.g. if a device is known
>>>>     to not wake from D3cold)
>>>>
>>>> * d3cold_allowed provides an opt-out for user space (default is true,
>>>>     user space may set to false)
>>>>
>>>> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
>>>> the user space setting overwrites the driver setting.  Essentially user
>>>> space is trusted to know better than the driver whether D3cold is
>>>> working.
>>>>
>>>> That feels unsafe and wrong.  Assume that the change was introduced
>>>> inadvertently and do not overwrite no_d3cold when d3cold_allowed is
>>>> modified.  Instead, consider d3cold_allowed in addition to no_d3cold
>>>> when choosing a suspend state for the device.
>>>>
>>>> That way, user space may opt out of D3cold if the driver hasn't, but it
>>>> may no longer force an opt in if the driver has opted out.
>>>
>>> Makes sense. I just wonder should the sysfs write fail from userspace
>>> perspective if the driver has opted out and userspace tries to force it?
>>> Or it does that already?
>>
>> What's the history behind why userspace is allowed to opt a device out of
>> D3cold in the first place?
>>
>> It feels like it should have been a debugging only thing to me.
> 
> That's a fair question.
> 
> Apparently the default for d3cold_allowed was originally "false"
> and user space could opt in to D3cold.  Then commit 4f9c1397e2e8
> ("PCI/PM: Enable D3/D3cold by default for most devices") changed
> the default to "true".  That was 11 years ago.
> 
> I agree that today this should all work automatically and a
> user space option to disable D3cold on a per-device basis only
> really makes sense as a debugging aid, hence belongs in debugfs.
> 

Thanks.  Then perhaps as part of moving it to debugfs it makes sense to 
simplify the logic.

IE also drop the d3cold_allowed member from struct pci_dev and instead 
make the debugfs item reflect the "no_d3cold" member.
Lukas Wunner Sept. 18, 2023, 2:26 p.m. UTC | #5
On Mon, Sep 18, 2023 at 08:28:51AM -0500, Mario Limonciello wrote:
> On 9/18/2023 08:24, Lukas Wunner wrote:
> > On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote:
> > > What's the history behind why userspace is allowed to opt a device out of
> > > D3cold in the first place?
> > > 
> > > It feels like it should have been a debugging only thing to me.
> > 
> > That's a fair question.
> > 
> > Apparently the default for d3cold_allowed was originally "false"
> > and user space could opt in to D3cold.  Then commit 4f9c1397e2e8
> > ("PCI/PM: Enable D3/D3cold by default for most devices") changed
> > the default to "true".  That was 11 years ago.
> > 
> > I agree that today this should all work automatically and a
> > user space option to disable D3cold on a per-device basis only
> > really makes sense as a debugging aid, hence belongs in debugfs.
> > 
> 
> Thanks.  Then perhaps as part of moving it to debugfs it makes sense to
> simplify the logic.

d3cold_allowed is documented in Documentation/ABI/testing/sysfs-bus-pci,
so it's user space ABI which we're not allowed to break.  We'd have to
declare it deprecated, emit a warning when it's used and slowly phase
it out over the years.

Until then, the $SUBJECT_PATCH probably still makes sense to reinstate
the behavior we had until 2016...

Thanks,

Lukas
Mario Limonciello Sept. 18, 2023, 2:52 p.m. UTC | #6
On 9/18/2023 09:26, Lukas Wunner wrote:
> On Mon, Sep 18, 2023 at 08:28:51AM -0500, Mario Limonciello wrote:
>> On 9/18/2023 08:24, Lukas Wunner wrote:
>>> On Mon, Sep 18, 2023 at 08:14:21AM -0500, Mario Limonciello wrote:
>>>> What's the history behind why userspace is allowed to opt a device out of
>>>> D3cold in the first place?
>>>>
>>>> It feels like it should have been a debugging only thing to me.
>>>
>>> That's a fair question.
>>>
>>> Apparently the default for d3cold_allowed was originally "false"
>>> and user space could opt in to D3cold.  Then commit 4f9c1397e2e8
>>> ("PCI/PM: Enable D3/D3cold by default for most devices") changed
>>> the default to "true".  That was 11 years ago.
>>>
>>> I agree that today this should all work automatically and a
>>> user space option to disable D3cold on a per-device basis only
>>> really makes sense as a debugging aid, hence belongs in debugfs.
>>>
>>
>> Thanks.  Then perhaps as part of moving it to debugfs it makes sense to
>> simplify the logic.
> 
> d3cold_allowed is documented in Documentation/ABI/testing/sysfs-bus-pci,
> so it's user space ABI which we're not allowed to break.  We'd have to
> declare it deprecated, emit a warning when it's used and slowly phase
> it out over the years.

What about as part of deprecating it to make it always return -EINVAL no 
matter the input?  The ABI isn't broken, but it allows for the optimization.

> 
> Until then, the $SUBJECT_PATCH probably still makes sense to reinstate
> the behavior we had until 2016...
> 
> Thanks,
> 
> Lukas
Bjorn Helgaas Sept. 28, 2023, 10:36 p.m. UTC | #7
On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> struct pci_dev contains two flags which govern whether the device may
> suspend to D3cold:
> 
> * no_d3cold provides an opt-out for drivers (e.g. if a device is known
>   to not wake from D3cold)
> 
> * d3cold_allowed provides an opt-out for user space (default is true,
>   user space may set to false)
> 
> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
> the user space setting overwrites the driver setting.  Essentially user
> space is trusted to know better than the driver whether D3cold is
> working.
> 
> That feels unsafe and wrong.  Assume that the change was introduced
> inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> when choosing a suspend state for the device.
> 
> That way, user space may opt out of D3cold if the driver hasn't, but it
> may no longer force an opt in if the driver has opted out.
> 
> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v4.8+
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>

Mika and Mario, you both commented on this, but I *think* you were
both OK with it as-is for now?  If so, can you give a Reviewed-by?
I don't want to go ahead if you have any concerns.

Since we're touching this area, is there anything we can do to clarify
the comments?

        unsigned int    no_d3cold:1;    /* D3cold is forbidden */
        unsigned int    bridge_d3:1;    /* Allow D3 for bridge */
        unsigned int    d3cold_allowed:1;       /* D3cold is allowed by user */

These all have to do with D3, but:

  - For no_d3cold, "D3cold is forbidden" doesn't capture the notion
    that "driver knows the device doesn't wake from D3cold"

  - It's not clear whether "bridge_d3" applies to D3hot, D3cold, or
    both.

  - The computation of "bridge_d3" is ... complicated.  It depends on
    all the device/platform behavior assumptions in
    pci_bridge_d3_possible().  And *also* (I think) the user-space
    "d3cold_allowed" knob, via pci_dev_check_d3cold() in
    pci_bridge_d3_update().

    So it's kind of a mix of hardware characteristics and
    administrative controls.

Any comment updates could be a separate patch so as not to complicate
this one.

> @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
>  {
>  	int acpi_state, d_max;
>  
> -	if (pdev->no_d3cold)
> +	if (pdev->no_d3cold || !pdev->d3cold_allowed)

Haha, looks good.  Too bad the senses are opposite ("no_d3cold" and
"!d3cold_allowed"), but that's for another day.

>  		d_max = ACPI_STATE_D3_HOT;
>  	else
>  		d_max = ACPI_STATE_D3_COLD;
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index ba38fc4..e18ccd2 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -530,10 +530,7 @@ static ssize_t d3cold_allowed_store(struct device *dev,
>  		return -EINVAL;
>  
>  	pdev->d3cold_allowed = !!val;
> -	if (pdev->d3cold_allowed)
> -		pci_d3cold_enable(pdev);
> -	else
> -		pci_d3cold_disable(pdev);
> +	pci_bridge_d3_update(pdev);

This is great.  D3 is still a tangle, but this is a significant
improvement.

>  	pm_runtime_resume(dev);

Bjorn
Mika Westerberg Sept. 29, 2023, 4:44 a.m. UTC | #8
On Thu, Sep 28, 2023 at 05:36:30PM -0500, Bjorn Helgaas wrote:
> On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> > struct pci_dev contains two flags which govern whether the device may
> > suspend to D3cold:
> > 
> > * no_d3cold provides an opt-out for drivers (e.g. if a device is known
> >   to not wake from D3cold)
> > 
> > * d3cold_allowed provides an opt-out for user space (default is true,
> >   user space may set to false)
> > 
> > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
> > the user space setting overwrites the driver setting.  Essentially user
> > space is trusted to know better than the driver whether D3cold is
> > working.
> > 
> > That feels unsafe and wrong.  Assume that the change was introduced
> > inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> > modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> > when choosing a suspend state for the device.
> > 
> > That way, user space may opt out of D3cold if the driver hasn't, but it
> > may no longer force an opt in if the driver has opted out.
> > 
> > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > Cc: stable@vger.kernel.org # v4.8+
> > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> 
> Mika and Mario, you both commented on this, but I *think* you were
> both OK with it as-is for now?  If so, can you give a Reviewed-by?
> I don't want to go ahead if you have any concerns.

No concerns from me,

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Mario Limonciello Sept. 29, 2023, 7:03 p.m. UTC | #9
[Public]

> On Thu, Sep 28, 2023 at 05:36:30PM -0500, Bjorn Helgaas wrote:
> > On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> > > struct pci_dev contains two flags which govern whether the device may
> > > suspend to D3cold:
> > >
> > > * no_d3cold provides an opt-out for drivers (e.g. if a device is known
> > >   to not wake from D3cold)
> > >
> > > * d3cold_allowed provides an opt-out for user space (default is true,
> > >   user space may set to false)
> > >
> > > Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during
> suspend"),
> > > the user space setting overwrites the driver setting.  Essentially user
> > > space is trusted to know better than the driver whether D3cold is
> > > working.
> > >
> > > That feels unsafe and wrong.  Assume that the change was introduced
> > > inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> > > modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> > > when choosing a suspend state for the device.
> > >
> > > That way, user space may opt out of D3cold if the driver hasn't, but it
> > > may no longer force an opt in if the driver has opted out.
> > >
> > > Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> > > Signed-off-by: Lukas Wunner <lukas@wunner.de>
> > > Cc: stable@vger.kernel.org # v4.8+
> > > Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
> >
> > Mika and Mario, you both commented on this, but I *think* you were
> > both OK with it as-is for now?  If so, can you give a Reviewed-by?
> > I don't want to go ahead if you have any concerns.
>
> No concerns from me,
>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

It is a step in the right direction.  I do think a later enhancement to
fail the write from sysfs (to be ABI compatible but "deprecate it") and
internal optimization as a result is a good idea though.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Bjorn Helgaas Sept. 29, 2023, 10:48 p.m. UTC | #10
On Mon, Sep 18, 2023 at 02:48:01PM +0200, Lukas Wunner wrote:
> struct pci_dev contains two flags which govern whether the device may
> suspend to D3cold:
> 
> * no_d3cold provides an opt-out for drivers (e.g. if a device is known
>   to not wake from D3cold)
> 
> * d3cold_allowed provides an opt-out for user space (default is true,
>   user space may set to false)
> 
> Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"),
> the user space setting overwrites the driver setting.  Essentially user
> space is trusted to know better than the driver whether D3cold is
> working.
> 
> That feels unsafe and wrong.  Assume that the change was introduced
> inadvertently and do not overwrite no_d3cold when d3cold_allowed is
> modified.  Instead, consider d3cold_allowed in addition to no_d3cold
> when choosing a suspend state for the device.
> 
> That way, user space may opt out of D3cold if the driver hasn't, but it
> may no longer force an opt in if the driver has opted out.
> 
> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> Cc: stable@vger.kernel.org # v4.8+
> Cc: Mika Westerberg <mika.westerberg@linux.intel.com>

Applied with Reviewed-by from Mika and Mario to pci/pm for v6.7,
thanks!

> ---
>  drivers/pci/pci-acpi.c  | 2 +-
>  drivers/pci/pci-sysfs.c | 5 +----
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 7aa1c20..2f5eddf 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -911,7 +911,7 @@ pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
>  {
>  	int acpi_state, d_max;
>  
> -	if (pdev->no_d3cold)
> +	if (pdev->no_d3cold || !pdev->d3cold_allowed)
>  		d_max = ACPI_STATE_D3_HOT;
>  	else
>  		d_max = ACPI_STATE_D3_COLD;
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index ba38fc4..e18ccd2 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -530,10 +530,7 @@ static ssize_t d3cold_allowed_store(struct device *dev,
>  		return -EINVAL;
>  
>  	pdev->d3cold_allowed = !!val;
> -	if (pdev->d3cold_allowed)
> -		pci_d3cold_enable(pdev);
> -	else
> -		pci_d3cold_disable(pdev);
> +	pci_bridge_d3_update(pdev);
>  
>  	pm_runtime_resume(dev);
>  
> -- 
> 2.39.2
>
diff mbox series

Patch

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 7aa1c20..2f5eddf 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -911,7 +911,7 @@  pci_power_t acpi_pci_choose_state(struct pci_dev *pdev)
 {
 	int acpi_state, d_max;
 
-	if (pdev->no_d3cold)
+	if (pdev->no_d3cold || !pdev->d3cold_allowed)
 		d_max = ACPI_STATE_D3_HOT;
 	else
 		d_max = ACPI_STATE_D3_COLD;
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index ba38fc4..e18ccd2 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -530,10 +530,7 @@  static ssize_t d3cold_allowed_store(struct device *dev,
 		return -EINVAL;
 
 	pdev->d3cold_allowed = !!val;
-	if (pdev->d3cold_allowed)
-		pci_d3cold_enable(pdev);
-	else
-		pci_d3cold_disable(pdev);
+	pci_bridge_d3_update(pdev);
 
 	pm_runtime_resume(dev);