[v2,1/2] pci: prevent sk hynix nvme from entering D3

Message ID 20181106071214.12745-1-acelan.kao@canonical.com
State Changes Requested
Delegated to: Bjorn Helgaas
Commit Message

AceLan Kao Nov. 6, 2018, 7:12 a.m. UTC
Putting the SK hynix NVMe into D3 and then entering s2idle raises power
consumption to 2.2W during s2idle, whereas it consumes less than 1W
during long idle.
According to an SK hynix FE, MS Windows doesn't put the NVMe into D3 and
relies on the device's own APST feature for power management.
To leverage APST during s2idle, we also must not disable the NVMe
device while suspending.

Note that preventing it from entering D3 increases power consumption by
about 0.13W ~ 0.15W during short/long idle, and power consumption during
s2idle becomes 0.77W.

Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
---
 drivers/pci/quirks.c    | 1 +
 include/linux/pci_ids.h | 2 ++
 2 files changed, 3 insertions(+)

Comments

Bjorn Helgaas Nov. 9, 2018, 12:21 a.m. UTC | #1
On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
> It leads to the power consumption raises to 2.2W during s2idle, while
> it consumes less than 1W during long idle if put SK hynix nvme to D3
> and then enter s2idle.
> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
> APST feature to do the power management.
> To leverage its APST feature during s2idle, we can't disable nvme
> device while suspending, too.

I don't know how APST works, but it sounds like you want to disable D3
if you're using APST.  But that's not what this patch does; this
disables it always.

I'm not sure we want a quirk for this at all, since as Christoph
points out, it doesn't fix a functional issue as the other uses of
quirk_no_ata_d3() do.

From your emails with Christoph, it sounds like this quirk is a
workaround for a firmware defect.  If we *do* end up wanting a quirk,
the changelog should at least mention the firmware defect and maybe
check whether it has been fixed.

> BTW, prevent it from entering D3 will increase the power consumption around
> 0.13W ~ 0.15W during short/long idle, and the power consumption during
> s2idle becomes 0.77W.
> 
> Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
> ---
>  drivers/pci/quirks.c    | 1 +
>  include/linux/pci_ids.h | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 4700d24e5d55..b7e6492e8311 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -1332,6 +1332,7 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
>     occur when mode detecting */
>  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
>  				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);
>  
>  /*
>   * This was originally an Alpha-specific thing, but it really fits here.
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 69f0abe1ba1a..5f5adda07de0 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -3090,4 +3090,6 @@
>  
>  #define PCI_VENDOR_ID_NCUBE		0x10ff
>  
> +#define PCI_VENDOR_ID_SK_HYNIX		0x1c5c
> +
>  #endif /* _LINUX_PCI_IDS_H */
> -- 
> 2.17.1
Kai-Heng Feng Nov. 15, 2018, 7:16 a.m. UTC | #2
Hi,

> On Nov 9, 2018, at 08:21, Bjorn Helgaas <helgaas@kernel.org> wrote:
> 
> On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
>> It leads to the power consumption raises to 2.2W during s2idle, while
>> it consumes less than 1W during long idle if put SK hynix nvme to D3
>> and then enter s2idle.
>> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
>> APST feature to do the power management.
>> To leverage its APST feature during s2idle, we can't disable nvme
>> device while suspending, too.

We have a new Intel NVMe [8086:f1a6] that has this “new” behavior.

> 
> I don't know how APST works, but it sounds like you want to disable D3
> if you're using APST.  But that's not what this patch does; this
> disables it always.

Ok, will work on a new patch that only disables D3 when APST is enabled.

> 
> I'm not sure we want a quirk for this at all, since as Christoph
> points out, it doesn't fix a functional issue as the other uses of
> quirk_no_ata_d3() do.
> 
> From your emails with Christoph, it sounds like this quirk is a
> workaround for a firmware defect.  If we *do* end up wanting a quirk,
> the changelog should at least mention the firmware defect and maybe
> check whether it has been fixed.

According to SK Hynix folks and new evidence on the new Intel NVMe
we have, this is something we are going to see more often.

Kai-Heng

> 
>> BTW, prevent it from entering D3 will increase the power consumption around
>> 0.13W ~ 0.15W during short/long idle, and the power consumption during
>> s2idle becomes 0.77W.
>> 
>> Signed-off-by: AceLan Kao <acelan.kao@canonical.com>
>> ---
>> drivers/pci/quirks.c    | 1 +
>> include/linux/pci_ids.h | 2 ++
>> 2 files changed, 3 insertions(+)
>> 
>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>> index 4700d24e5d55..b7e6492e8311 100644
>> --- a/drivers/pci/quirks.c
>> +++ b/drivers/pci/quirks.c
>> @@ -1332,6 +1332,7 @@ DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
>>    occur when mode detecting */
>> DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
>> 				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
>> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);
>> 
>> /*
>>  * This was originally an Alpha-specific thing, but it really fits here.
>> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
>> index 69f0abe1ba1a..5f5adda07de0 100644
>> --- a/include/linux/pci_ids.h
>> +++ b/include/linux/pci_ids.h
>> @@ -3090,4 +3090,6 @@
>> 
>> #define PCI_VENDOR_ID_NCUBE		0x10ff
>> 
>> +#define PCI_VENDOR_ID_SK_HYNIX		0x1c5c
>> +
>> #endif /* _LINUX_PCI_IDS_H */
>> -- 
>> 2.17.1
Bjorn Helgaas Nov. 15, 2018, 2:58 p.m. UTC | #3
On Thu, Nov 15, 2018 at 03:16:29PM +0800, Kai Heng Feng wrote:
> > On Nov 9, 2018, at 08:21, Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Tue, Nov 06, 2018 at 03:12:13PM +0800, AceLan Kao wrote:
> >> It leads to the power consumption raises to 2.2W during s2idle, while
> >> it consumes less than 1W during long idle if put SK hynix nvme to D3
> >> and then enter s2idle.
> >> From SK hynix FE, MS Windows doesn't put nvme to D3, and uses its own
> >> APST feature to do the power management.
> >> To leverage its APST feature during s2idle, we can't disable nvme
> >> device while suspending, too.
> 
> We have a new Intel NVMe [8086:f1a6] that has this “new” behavior.
> 
> > I don't know how APST works, but it sounds like you want to disable D3
> > if you're using APST.  But that's not what this patch does; this
> > disables it always.
> 
> Ok, will work on a new patch that only disables D3 when APST is enabled.

My comment was that the changelog didn't match the code.  I don't know
which one is wrong, so I wasn't trying to suggest that you change the
code.  If the code is right and the changelog is wrong, just change
the changelog.

> > I'm not sure we want a quirk for this at all, since as Christoph
> > points out, it doesn't fix a functional issue as the other uses of
> > quirk_no_ata_d3() do.
> > 
> > From your emails with Christoph, it sounds like this quirk is a
> > workaround for a firmware defect.  If we *do* end up wanting a quirk,
> > the changelog should at least mention the firmware defect and maybe
> > check whether it has been fixed.
> 
> According to SK Hynix folks and new evidence on the new Intel NVMe
> we have, this is something we are going to see more often.

Hmmm, are you suggesting that if we went this quirk route, we'd be
updating the quirk frequently to add new devices?

I'm opposed to that as a strategy because it makes needless work.  You
have to update the quirk, backport it to older kernels, re-release
distro kernels, etc.

If this situation is going to happen frequently, it would be better to
(a) fix the firmware defect (if that's what this is) or (b) pursue
some APST or other spec change so there's a generic documented way to
handle this without requiring device-specific quirks.

Bjorn
Bjorn Helgaas Nov. 15, 2018, 5:30 p.m. UTC | #4
On Thu, Nov 15, 2018 at 08:58:09AM -0600, Bjorn Helgaas wrote:
> On Thu, Nov 15, 2018 at 03:16:29PM +0800, Kai Heng Feng wrote:
> > On Nov 9, 2018, at 08:21, Bjorn Helgaas <helgaas@kernel.org> wrote:

> > > I'm not sure we want a quirk for this at all, since as Christoph
> > > points out, it doesn't fix a functional issue as the other uses of
> > > quirk_no_ata_d3() do.
> > > 
> > > From your emails with Christoph, it sounds like this quirk is a
> > > workaround for a firmware defect.  If we *do* end up wanting a quirk,
> > > the changelog should at least mention the firmware defect and maybe
> > > check whether it has been fixed.
> > 
> > According to SK Hynix folks and new evidence on the new Intel NVMe
> > we have, this is something we are going to see more often.
> 
> Hmmm, are you suggesting that if we went this quirk route, we'd be
> updating the quirk frequently to add new devices?
> 
> I'm opposed to that as a strategy because it makes needless work.  You
> have to update the quirk, backport it to older kernels, re-release
> distro kernels, etc.

But I guess you have to do this anyway just to add the vendor/device
ID to the driver, so maybe this isn't a big deal to you.  If you can
do a quirk like this in the driver, it would be invisible to me and I
wouldn't care.  I just don't want to deal with ongoing tweaks like
this in the PCI core :)

Bjorn
Christoph Hellwig Nov. 16, 2018, 7:49 a.m. UTC | #5
On Thu, Nov 15, 2018 at 11:30:15AM -0600, Bjorn Helgaas wrote:
> 
> But I guess you have to do this anyway just to add the vendor/device
> ID to the driver, so maybe this isn't a big deal to you.  If you can
> do a quirk like this in the driver, it would be invisible to me and I
> wouldn't care.  I just don't want to deal with ongoing tweaks like
> this in the PCI core :)

No, NVMe is a spec with a class code, and a specification that is
vendor independent.  NVMe devices declare individual features based
on common fields.

APST is an optional feature with all kinds of parameters, but there
is absolutely no language anywhere in the NVMe spec saying that a host
should not put the device into D3 if APST is supported, and such
behavior is also rather counterintuitive.  If SK Hynix thinks this
is sensible behavior they should bring it up in the NVMe technical
working group.  I've pinged a contact there to see what this whole
story is about.
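
For reference, here is a minimal sketch of how APST support shows up in the
"common fields" mentioned above. It assumes the caller already holds the
4096-byte Identify Controller data structure in a buffer; the function and
macro names are hypothetical and are not taken from the nvme driver. The
only spec fact it relies on is that APSTA lives at byte 265 and that bit 0
reports support for autonomous power state transitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the Identify Controller data structure. */
#define NVME_IDENTIFY_CTRL_SIZE   4096

/* Byte offset of APSTA (macro name is hypothetical, offset is per spec). */
#define NVME_ID_CTRL_APSTA_OFFSET 265

/*
 * Return true if the controller advertises support for autonomous power
 * state transitions (APSTA bit 0). A buffer shorter than the full
 * Identify Controller structure is treated as "not supported".
 */
bool nvme_id_ctrl_supports_apst(const uint8_t *id_ctrl, size_t len)
{
	assert(id_ctrl != NULL);

	if (len < NVME_IDENTIFY_CTRL_SIZE)
		return false;

	return (id_ctrl[NVME_ID_CTRL_APSTA_OFFSET] & 0x1) != 0;
}
```

This only answers "does the device support APST"; whether APST is currently
enabled is a separate Get Features query that the sketch does not attempt.
Nothing in this field says anything about D3, which is the point made above.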

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4700d24e5d55..b7e6492e8311 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -1332,6 +1332,7 @@  DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_AL, PCI_ANY_ID,
    occur when mode detecting */
 DECLARE_PCI_FIXUP_CLASS_EARLY(PCI_VENDOR_ID_VIA, PCI_ANY_ID,
 				PCI_CLASS_STORAGE_IDE, 8, quirk_no_ata_d3);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SK_HYNIX, 0x1527, quirk_no_ata_d3);
 
 /*
  * This was originally an Alpha-specific thing, but it really fits here.
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 69f0abe1ba1a..5f5adda07de0 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -3090,4 +3090,6 @@ 
 
 #define PCI_VENDOR_ID_NCUBE		0x10ff
 
+#define PCI_VENDOR_ID_SK_HYNIX		0x1c5c
+
 #endif /* _LINUX_PCI_IDS_H */
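
To illustrate why the review says this patch disables D3 "always", the
following is a small userspace model of ID-based fixup matching. It is a
sketch, not kernel code: every model_* name and MODEL_* constant is made up
for illustration, and the hook only mirrors what quirk_no_ata_d3() appears
to do in mainline (setting a "no D3" device flag). The IDs 0x1c5c:0x1527 and
8086:f1a6 are the ones quoted in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PCI_ANY_ID 0xffffu   /* wildcard, stands in for PCI_ANY_ID */
#define MODEL_FLAG_NO_D3 (1u << 1) /* stands in for PCI_DEV_FLAGS_NO_D3 */

/* Just enough of a PCI device for the model. */
struct model_pci_dev {
	uint16_t vendor;
	uint16_t device;
	unsigned int dev_flags;
};

/* One fixup table entry: IDs to match and the hook to run. */
struct model_fixup {
	uint16_t vendor;
	uint16_t device;
	void (*hook)(struct model_pci_dev *dev);
};

/* Marks the device so the (modelled) core never puts it into D3. */
static void model_quirk_no_d3(struct model_pci_dev *dev)
{
	dev->dev_flags |= MODEL_FLAG_NO_D3;
}

/* The entry the patch adds, expressed in the model's table. */
static const struct model_fixup model_fixups[] = {
	{ 0x1c5c, 0x1527, model_quirk_no_d3 },
};

/* Run every hook whose vendor/device IDs match the device. */
void model_apply_fixups(struct model_pci_dev *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < sizeof(model_fixups) / sizeof(model_fixups[0]); i++) {
		const struct model_fixup *f = &model_fixups[i];

		if ((f->vendor == MODEL_PCI_ANY_ID || f->vendor == dev->vendor) &&
		    (f->device == MODEL_PCI_ANY_ID || f->device == dev->device))
			f->hook(dev);
	}
}
```

The match depends only on the vendor and device IDs, so the flag is set
whether or not APST is in use. It also shows the maintenance concern raised
in the review: the Intel device [8086:f1a6] mentioned in the thread is not
matched until someone adds another table entry.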