diff mbox

r8169: keep firmware in memory.

Message ID 20110113230753.GA2750@electric-eye.fr.zoreil.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Francois Romieu Jan. 13, 2011, 11:07 p.m. UTC
The firmware agent is not available during resume. Loading the firmware
during open() (see eee3a96c6368f47df8df5bd4ed1843600652b337) is not
enough.

close() is run during resume through rtl8169_reset_task(), whence the
mildly natural release of firmware in the driver removal method instead.

It will help with http://bugs.debian.org/609538. It will not avoid
the 60 seconds delay when:
- there is no firmware
- the driver is loaded and the device is not up before a suspend/resume

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Jarek Kamiński <jarek@vilo.eu.org>
Cc: Hayes <hayeswang@realtek.com>
Cc: Ben Hutchings <benh@debian.org>
---
 drivers/net/r8169.c |   43 +++++++++++++++++++++++++++++++------------
 1 files changed, 31 insertions(+), 12 deletions(-)

Comments

David Miller Jan. 14, 2011, 5:50 a.m. UTC | #1
From: Francois Romieu <romieu@fr.zoreil.com>
Date: Fri, 14 Jan 2011 00:07:53 +0100

> The firmware agent is not available during resume. Loading the firmware
> during open() (see eee3a96c6368f47df8df5bd4ed1843600652b337) is not
> enough.
> 
> close() is run during resume through rtl8169_reset_task(), whence the
> mildly natural release of firmware in the driver removal method instead.
> 
> It will help with http://bugs.debian.org/609538. It will not avoid
> the 60 seconds delay when:
> - there is no firmware
> - the driver is loaded and the device is not up before a suspend/resume
> 
> Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
> Tested-by: Jarek Kamiński <jarek@vilo.eu.org>
> Cc: Hayes <hayeswang@realtek.com>
> Cc: Ben Hutchings <benh@debian.org>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Tokarev Jan. 14, 2011, 6:52 a.m. UTC | #2
14.01.2011 02:07, Francois Romieu wrote:
> The firmware agent is not available during resume. Loading the firmware
> during open() (see eee3a96c6368f47df8df5bd4ed1843600652b337) is not
> enough.
> 
> close() is run during resume through rtl8169_reset_task(), whence the
> mildly natural release of firmware in the driver removal method instead.
> 
> It will help with http://bugs.debian.org/609538. It will not avoid
> the 60 seconds delay when:
> - there is no firmware
> - the driver is loaded and the device is not up before a suspend/resume

Given all this I think this is somewhat clumsy still.  How
does other NIC drivers handles this situation - e.g. tg3?
Maybe this needs to be a generic solution instead of per-driver?

/mjt
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds Jan. 14, 2011, 4:05 p.m. UTC | #3
[ background for new people in discussion: once more, a driver resume
didn't get a working firmware load. ]

On Thu, Jan 13, 2011 at 10:52 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>
> Given all this I think this is somewhat clumsy still.  How
> does other NIC drivers handles this situation - e.g. tg3?
> Maybe this needs to be a generic solution instead of per-driver?

We've had this bug several times - and not just for network drivers
either. It's almost universal that drivers just load their firmware at
initialization, and then don't realize that they need to do it at
resume too and have no filesystems accessible. I think the firmware
access interface is partly to blame, because I suspect both the
documentation and the code (and probably any examples: as mentioned,
this is very common) tends to be about that initial one-shot "bring
the device up" rather than thinking about "oh, resume is a bootup too,
and unlike initial boot, it needs to come up in a configured state".

Maybe the firmware loader could be made more useful to automatically
handle the caching (it already knows about built-in firmware) for the
case where CONFIG_PM is enabled. So that drivers _could_ just
basically do "request_firmware()" in their resume functions, and it
would get satisfied from RAM for the re-request case.

Or maybe we could have some honking big comments, at least ;)

Added some device core/fw-class people to the discussion for comments.
Clearly the fw loading _works_, but it does end up being a common bug
that it's messed up at resume. This time it was the "oh, I moved the
firmware loading around so now it's done at a different stage", but
quite often it's literally just "oh, it never worked, I had just
forgotten about resume".

                       Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Jan. 14, 2011, 4:30 p.m. UTC | #4
On Fri, Jan 14, 2011 at 08:05:22AM -0800, Linus Torvalds wrote:
> [ background for new people in discussion: once more, a driver resume
> didn't get a working firmware load. ]
> 
> On Thu, Jan 13, 2011 at 10:52 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> >
> > Given all this I think this is somewhat clumsy still.  How
> > does other NIC drivers handles this situation - e.g. tg3?
> > Maybe this needs to be a generic solution instead of per-driver?
> 
> We've had this bug several times - and not just for network drivers
> either.
[...]
> Maybe the firmware loader could be made more useful to automatically
> handle the caching (it already knows about built-in firmware) for the
> case where CONFIG_PM is enabled. So that drivers _could_ just
> basically do "request_firmware()" in their resume functions, and it
> would get satisfied from RAM for the re-request case.
[...]
 
This is something I started to implement, but never got finished.  I
don't think it can be done without an API change, though, as we need
to know when to drop firmware from the cache.  But perhaps this could
be done with a hook in the device-driver binding code.

Ben.
Linus Torvalds Jan. 14, 2011, 5:21 p.m. UTC | #5
On Fri, Jan 14, 2011 at 8:30 AM, Ben Hutchings <benh@debian.org> wrote:
>
> This is something I started to implement, but never got finished.  I
> don't think it can be done without an API change, though, as we need
> to know when to drop firmware from the cache.  But perhaps this could
> be done with a hook in the device-driver binding code.

Or just associate the firmware with a module?

So if the firmware gets loaded, it stays in memory until the module is unloaded?

And this all would only be the case if CONFIG_PM is set, so you'd not
waste memory unnecessarily.

                Linus
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael J. Wysocki Jan. 14, 2011, 9:44 p.m. UTC | #6
On Friday, January 14, 2011, Linus Torvalds wrote:
> On Fri, Jan 14, 2011 at 8:30 AM, Ben Hutchings <benh@debian.org> wrote:
> >
> > This is something I started to implement, but never got finished.  I
> > don't think it can be done without an API change, though, as we need
> > to know when to drop firmware from the cache.  But perhaps this could
> > be done with a hook in the device-driver binding code.
> 
> Or just associate the firmware with a module?
> 
> So if the firmware gets loaded, it stays in memory until the module is unloaded?
> 
> And this all would only be the case if CONFIG_PM is set, so you'd not
> waste memory unnecessarily.

Alternatively, a suspend/hibernate notifier can be used for that I think.

They are called before the freezing and after the thawing of user space, so the
the PM_POST_SUSPEND or PM_POST_RESTORE notification can easily cause the
firmare(s) to be dropped from memory.

Thanks,
Rafael
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index bb8645a..bde7d61 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -554,6 +554,8 @@  struct rtl8169_private {
 	struct mii_if_info mii;
 	struct rtl8169_counters counters;
 	u32 saved_wolopts;
+
+	const struct firmware *fw;
 };
 
 MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
@@ -1766,6 +1768,29 @@  rtl_phy_write_fw(struct rtl8169_private *tp, const struct firmware *fw)
 	}
 }
 
+static void rtl_release_firmware(struct rtl8169_private *tp)
+{
+	release_firmware(tp->fw);
+	tp->fw = NULL;
+}
+
+static int rtl_apply_firmware(struct rtl8169_private *tp, const char *fw_name)
+{
+	const struct firmware **fw = &tp->fw;
+	int rc = !*fw;
+
+	if (rc) {
+		rc = request_firmware(fw, fw_name, &tp->pci_dev->dev);
+		if (rc < 0)
+			goto out;
+	}
+
+	/* TODO: release firmware once rtl_phy_write_fw signals failures. */
+	rtl_phy_write_fw(tp, *fw);
+out:
+	return rc;
+}
+
 static void rtl8169s_hw_phy_config(struct rtl8169_private *tp)
 {
 	static const struct phy_reg phy_reg_init[] = {
@@ -2139,7 +2164,6 @@  static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 		{ 0x0d, 0xf880 }
 	};
 	void __iomem *ioaddr = tp->mmio_addr;
-	const struct firmware *fw;
 
 	rtl_writephy_batch(tp, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
@@ -2203,11 +2227,8 @@  static void rtl8168d_1_hw_phy_config(struct rtl8169_private *tp)
 
 	rtl_writephy(tp, 0x1f, 0x0005);
 	rtl_writephy(tp, 0x05, 0x001b);
-	if (rtl_readphy(tp, 0x06) == 0xbf00 &&
-	    request_firmware(&fw, FIRMWARE_8168D_1, &tp->pci_dev->dev) == 0) {
-		rtl_phy_write_fw(tp, fw);
-		release_firmware(fw);
-	} else {
+	if ((rtl_readphy(tp, 0x06) != 0xbf00) ||
+	    (rtl_apply_firmware(tp, FIRMWARE_8168D_1) < 0)) {
 		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
 	}
 
@@ -2257,7 +2278,6 @@  static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 		{ 0x0d, 0xf880 }
 	};
 	void __iomem *ioaddr = tp->mmio_addr;
-	const struct firmware *fw;
 
 	rtl_writephy_batch(tp, phy_reg_init_0, ARRAY_SIZE(phy_reg_init_0));
 
@@ -2312,11 +2332,8 @@  static void rtl8168d_2_hw_phy_config(struct rtl8169_private *tp)
 
 	rtl_writephy(tp, 0x1f, 0x0005);
 	rtl_writephy(tp, 0x05, 0x001b);
-	if (rtl_readphy(tp, 0x06) == 0xb300 &&
-	    request_firmware(&fw, FIRMWARE_8168D_2, &tp->pci_dev->dev) == 0) {
-		rtl_phy_write_fw(tp, fw);
-		release_firmware(fw);
-	} else {
+	if ((rtl_readphy(tp, 0x06) != 0xb300) ||
+	    (rtl_apply_firmware(tp, FIRMWARE_8168D_2) < 0)) {
 		netif_warn(tp, probe, tp->dev, "unable to apply firmware patch\n");
 	}
 
@@ -3200,6 +3217,8 @@  static void __devexit rtl8169_remove_one(struct pci_dev *pdev)
 
 	cancel_delayed_work_sync(&tp->task);
 
+	rtl_release_firmware(tp);
+
 	unregister_netdev(dev);
 
 	if (pci_dev_run_wake(pdev))