diff mbox

[Precise,SRU,3/3] hwmon: fam15h_power: fix bogus values with current BIOSes

Message ID 1339620776-8097-3-git-send-email-tim.gardner@canonical.com
State New
Headers show

Commit Message

Tim Gardner June 13, 2012, 8:52 p.m. UTC
From: Andre Przywara <andre.przywara@amd.com>

BugLink: http://bugs.launchpad.net/bugs/1009086

Newer BKDG[1] versions recommend a different initialization value for
the running average range register in the northbridge. This improves
the power reading by avoiding counter saturations resulting in bogus
values for anything below about 80% of TDP power consumption.
Updated BIOSes will have this new value set up from the beginning,
but meanwhile we correct this value ourselves.
This needs to be done on all northbridges, even on those where the
driver itself does not register at.

This fixes the driver on all current machines to provide proper
values for idle load.

[1]
http://support.amd.com/us/Processor_TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
Chapter 3.8: D18F5xE0 Processor TDP Running Average (p. 452)

Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Acked-by: Jean Delvare <khali@linux-fr.org>
[guenter.roeck@ericsson.com: Removed unnecessary return statement]
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
Cc: stable@vger.kernel.org # 3.0+
(cherry picked from commit 00250ec90963b7ef6678438888f3244985ecde14)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 drivers/hwmon/fam15h_power.c |   39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

Comments

Leann Ogasawara June 13, 2012, 9:12 p.m. UTC | #1
On 06/13/2012 01:52 PM, Tim Gardner wrote:
> From: Andre Przywara <andre.przywara@amd.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1009086
>
> Newer BKDG[1] versions recommend a different initialization value for
> the running average range register in the northbridge. This improves
> the power reading by avoiding counter saturations resulting in bogus
> values for anything below about 80% of TDP power consumption.
> Updated BIOSes will have this new value set up from the beginning,
> but meanwhile we correct this value ourselves.
> This needs to be done on all northbridges, even on those where the
> driver itself does not register at.
>
> This fixes the driver on all current machines to provide proper
> values for idle load.
>
> [1]
> http://support.amd.com/us/Processor_TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
> Chapter 3.8: D18F5xE0 Processor TDP Running Average (p. 452)
>
> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Acked-by: Jean Delvare <khali@linux-fr.org>
> [guenter.roeck@ericsson.com: Removed unnecessary return statement]
> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
> Cc: stable@vger.kernel.org # 3.0+
> (cherry picked from commit 00250ec90963b7ef6678438888f3244985ecde14)
>
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>

Again hardware specific and low risk of regression.  Also already CC'd
to upstream stable.

Acked-by: Leann Ogasawara <leann.ogasawara@canonical.com>

> ---
>  drivers/hwmon/fam15h_power.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index b7494af..37a8fc9 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -122,6 +122,38 @@ static bool __devinit fam15h_power_is_internal_node0(struct pci_dev *f4)
>  	return true;
>  }
>  
> +/*
> + * Newer BKDG versions have an updated recommendation on how to properly
> + * initialize the running average range (was: 0xE, now: 0x9). This avoids
> + * counter saturations resulting in bogus power readings.
> + * We correct this value ourselves to cope with older BIOSes.
> + */
> +static void __devinit tweak_runavg_range(struct pci_dev *pdev)
> +{
> +	u32 val;
> +	const struct pci_device_id affected_device = {
> +		PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_NB_F4) };
> +
> +	/*
> +	 * let this quirk apply only to the current version of the
> +	 * northbridge, since future versions may change the behavior
> +	 */
> +	if (!pci_match_id(&affected_device, pdev))
> +		return;
> +
> +	pci_bus_read_config_dword(pdev->bus,
> +		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
> +		REG_TDP_RUNNING_AVERAGE, &val);
> +	if ((val & 0xf) != 0xe)
> +		return;
> +
> +	val &= ~0xf;
> +	val |=  0x9;
> +	pci_bus_write_config_dword(pdev->bus,
> +		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
> +		REG_TDP_RUNNING_AVERAGE, val);
> +}
> +
>  static void __devinit fam15h_power_init_data(struct pci_dev *f4,
>  					     struct fam15h_power_data *data)
>  {
> @@ -155,6 +187,13 @@ static int __devinit fam15h_power_probe(struct pci_dev *pdev,
>  	struct device *dev;
>  	int err;
>  
> +	/*
> +	 * though we ignore every other northbridge, we still have to
> +	 * do the tweaking on _each_ node in MCM processors as the counters
> +	 * are working hand-in-hand
> +	 */
> +	tweak_runavg_range(pdev);
> +
>  	if (!fam15h_power_is_internal_node0(pdev)) {
>  		err = -ENODEV;
>  		goto exit;
Brad Figg June 13, 2012, 10:50 p.m. UTC | #2
On 06/13/2012 01:52 PM, Tim Gardner wrote:
> From: Andre Przywara <andre.przywara@amd.com>
> 
> BugLink: http://bugs.launchpad.net/bugs/1009086
> 
> Newer BKDG[1] versions recommend a different initialization value for
> the running average range register in the northbridge. This improves
> the power reading by avoiding counter saturations resulting in bogus
> values for anything below about 80% of TDP power consumption.
> Updated BIOSes will have this new value set up from the beginning,
> but meanwhile we correct this value ourselves.
> This needs to be done on all northbridges, even on those where the
> driver itself does not register at.
> 
> This fixes the driver on all current machines to provide proper
> values for idle load.
> 
> [1]
> http://support.amd.com/us/Processor_TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
> Chapter 3.8: D18F5xE0 Processor TDP Running Average (p. 452)
> 
> Signed-off-by: Andre Przywara <andre.przywara@amd.com>
> Acked-by: Jean Delvare <khali@linux-fr.org>
> [guenter.roeck@ericsson.com: Removed unnecessary return statement]
> Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
> Cc: stable@vger.kernel.org # 3.0+
> (cherry picked from commit 00250ec90963b7ef6678438888f3244985ecde14)
> 
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  drivers/hwmon/fam15h_power.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index b7494af..37a8fc9 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -122,6 +122,38 @@ static bool __devinit fam15h_power_is_internal_node0(struct pci_dev *f4)
>  	return true;
>  }
>  
> +/*
> + * Newer BKDG versions have an updated recommendation on how to properly
> + * initialize the running average range (was: 0xE, now: 0x9). This avoids
> + * counter saturations resulting in bogus power readings.
> + * We correct this value ourselves to cope with older BIOSes.
> + */
> +static void __devinit tweak_runavg_range(struct pci_dev *pdev)
> +{
> +	u32 val;
> +	const struct pci_device_id affected_device = {
> +		PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_NB_F4) };
> +
> +	/*
> +	 * let this quirk apply only to the current version of the
> +	 * northbridge, since future versions may change the behavior
> +	 */
> +	if (!pci_match_id(&affected_device, pdev))
> +		return;
> +
> +	pci_bus_read_config_dword(pdev->bus,
> +		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
> +		REG_TDP_RUNNING_AVERAGE, &val);
> +	if ((val & 0xf) != 0xe)
> +		return;
> +
> +	val &= ~0xf;
> +	val |=  0x9;
> +	pci_bus_write_config_dword(pdev->bus,
> +		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
> +		REG_TDP_RUNNING_AVERAGE, val);
> +}
> +
>  static void __devinit fam15h_power_init_data(struct pci_dev *f4,
>  					     struct fam15h_power_data *data)
>  {
> @@ -155,6 +187,13 @@ static int __devinit fam15h_power_probe(struct pci_dev *pdev,
>  	struct device *dev;
>  	int err;
>  
> +	/*
> +	 * though we ignore every other northbridge, we still have to
> +	 * do the tweaking on _each_ node in MCM processors as the counters
> +	 * are working hand-in-hand
> +	 */
> +	tweak_runavg_range(pdev);
> +
>  	if (!fam15h_power_is_internal_node0(pdev)) {
>  		err = -ENODEV;
>  		goto exit;
Herton Ronaldo Krzesinski June 13, 2012, 10:54 p.m. UTC | #3
On Wed, Jun 13, 2012 at 02:52:56PM -0600, Tim Gardner wrote:
> (cherry picked from commit 00250ec90963b7ef6678438888f3244985ecde14)
> 
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  drivers/hwmon/fam15h_power.c |   39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
> 
> diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
> index b7494af..37a8fc9 100644
> --- a/drivers/hwmon/fam15h_power.c
> +++ b/drivers/hwmon/fam15h_power.c
> @@ -122,6 +122,38 @@ static bool __devinit fam15h_power_is_internal_node0(struct pci_dev *f4)
>  	return true;
>  }
>  
> +/*
> + * Newer BKDG versions have an updated recommendation on how to properly
> + * initialize the running average range (was: 0xE, now: 0x9). This avoids
> + * counter saturations resulting in bogus power readings.
> + * We correct this value ourselves to cope with older BIOSes.
> + */
> +static void __devinit tweak_runavg_range(struct pci_dev *pdev)
> +{
> +	u32 val;
> +	const struct pci_device_id affected_device = {
> +		PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_NB_F4) };

This original change introduced a problem, we should also cherry-pick 
commit c3e40a9972428d6e2d8e287ed0233a57a218c30f, ack for the 3 patches
plus the fix.
diff mbox

Patch

diff --git a/drivers/hwmon/fam15h_power.c b/drivers/hwmon/fam15h_power.c
index b7494af..37a8fc9 100644
--- a/drivers/hwmon/fam15h_power.c
+++ b/drivers/hwmon/fam15h_power.c
@@ -122,6 +122,38 @@  static bool __devinit fam15h_power_is_internal_node0(struct pci_dev *f4)
 	return true;
 }
 
+/*
+ * Newer BKDG versions have an updated recommendation on how to properly
+ * initialize the running average range (was: 0xE, now: 0x9). This avoids
+ * counter saturations resulting in bogus power readings.
+ * We correct this value ourselves to cope with older BIOSes.
+ */
+static void __devinit tweak_runavg_range(struct pci_dev *pdev)
+{
+	u32 val;
+	const struct pci_device_id affected_device = {
+		PCI_VDEVICE(AMD, PCI_DEVICE_ID_AMD_15H_NB_F4) };
+
+	/*
+	 * let this quirk apply only to the current version of the
+	 * northbridge, since future versions may change the behavior
+	 */
+	if (!pci_match_id(&affected_device, pdev))
+		return;
+
+	pci_bus_read_config_dword(pdev->bus,
+		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
+		REG_TDP_RUNNING_AVERAGE, &val);
+	if ((val & 0xf) != 0xe)
+		return;
+
+	val &= ~0xf;
+	val |=  0x9;
+	pci_bus_write_config_dword(pdev->bus,
+		PCI_DEVFN(PCI_SLOT(pdev->devfn), 5),
+		REG_TDP_RUNNING_AVERAGE, val);
+}
+
 static void __devinit fam15h_power_init_data(struct pci_dev *f4,
 					     struct fam15h_power_data *data)
 {
@@ -155,6 +187,13 @@  static int __devinit fam15h_power_probe(struct pci_dev *pdev,
 	struct device *dev;
 	int err;
 
+	/*
+	 * though we ignore every other northbridge, we still have to
+	 * do the tweaking on _each_ node in MCM processors as the counters
+	 * are working hand-in-hand
+	 */
+	tweak_runavg_range(pdev);
+
 	if (!fam15h_power_is_internal_node0(pdev)) {
 		err = -ENODEV;
 		goto exit;