diff mbox

[2/2,v2] sierra_net: fix issues with SYNC/RESTART messages and interrupt pipe setup

Message ID 1360176176.11742.16.camel@dcbw.foobar.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Dan Williams Feb. 6, 2013, 6:42 p.m. UTC
As part of the initialization sequence, the driver sends a SYNC message
via the control pipe to the firmware, which appears to request a
firmware restart.  The firmware responds with an indication via the
interrupt pipe set up by usbnet.  If the driver does not receive a
RESTART indication within a certain amount of time, it will periodically
send additional SYNC messages until it receives the RESTART indication.

Unfortunately, the interrupt URB is only submitted while the netdev
is open, which is usually not the case during initialization, and thus
the firmware's RESTART indication is lost.  So the driver continues
sending SYNC messages, and eventually the firmware crashes when it
receives too many.  This leads to a wedged netdev.

To ensure the firmware's RESTART indications can be received by the
driver, request that usbnet keep the interrupt URB active via
FLAG_INTR_ALWAYS.

Second, move the code that sends the SYNC message out of the
bind() hook and after usbnet_probe() to ensure the interrupt URB
is set up before trying to use it.

Signed-off-by: Dan Williams <dcbw@redhat.com>
---
 drivers/net/usb/sierra_net.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

Comments

Oliver Neukum Feb. 6, 2013, 8:17 p.m. UTC | #1
On Wednesday 06 February 2013 12:42:56 Dan Williams wrote:
> As part of the initialization sequence, the driver sends a SYNC message
> via the control pipe to the firmware, which appears to request a
> firmware restart.  The firmware responds with an indication via the
> interrupt pipe set up by usbnet.  If the driver does not receive a
> RESTART indication within a certain amount of time, it will periodically
> send additional SYNC messages until it receives the RESTART indication.
> 
> Unfortunately, the interrupt URB is only submitted while the netdev
> is open, which is usually not the case during initialization, and thus
> the firmware's RESTART indication is lost.  So the driver continues
> sending SYNC messages, and eventually the firmware crashes when it
> receives too many.  This leads to a wedged netdev.

If I understand this correctly we should stop the interrupt pipe once
RESTART has been recieved. I am afraid this patch is a bit inefficient.

	Regards
		Oliver

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjørn Mork Feb. 6, 2013, 9:11 p.m. UTC | #2
Dan Williams <dcbw@redhat.com> writes:

> As part of the initialization sequence, the driver sends a SYNC message
> via the control pipe to the firmware, which appears to request a
> firmware restart.  The firmware responds with an indication via the
> interrupt pipe set up by usbnet.  If the driver does not receive a
> RESTART indication within a certain amount of time, it will periodically
> send additional SYNC messages until it receives the RESTART indication.
>
> Unfortunately, the interrupt URB is only submitted while the netdev
> is open, which is usually not the case during initialization, and thus
> the firmware's RESTART indication is lost.  So the driver continues
> sending SYNC messages, and eventually the firmware crashes when it
> receives too many.  This leads to a wedged netdev.
>
> To ensure the firmware's RESTART indications can be received by the
> driver, request that usbnet keep the interrupt URB active via
> FLAG_INTR_ALWAYS.
>
> Second, move the code that sends the SYNC message out of the
> bind() hook and after usbnet_probe() to ensure the interrupt URB
> is set up before trying to use it.

Given this description I am wondering if you couldn't just move the
whole SYNC thing to a new reset() hook, using some private flag to make
sure it only runs once?  Does it really need to start at probe time?


Bjørn


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams Feb. 7, 2013, 5:06 p.m. UTC | #3
On Wed, 2013-02-06 at 22:11 +0100, Bjørn Mork wrote:
> Dan Williams <dcbw@redhat.com> writes:
> 
> > As part of the initialization sequence, the driver sends a SYNC message
> > via the control pipe to the firmware, which appears to request a
> > firmware restart.  The firmware responds with an indication via the
> > interrupt pipe set up by usbnet.  If the driver does not receive a
> > RESTART indication within a certain amount of time, it will periodically
> > send additional SYNC messages until it receives the RESTART indication.
> >
> > Unfortunately, the interrupt URB is only submitted while the netdev
> > is open, which is usually not the case during initialization, and thus
> > the firmware's RESTART indication is lost.  So the driver continues
> > sending SYNC messages, and eventually the firmware crashes when it
> > receives too many.  This leads to a wedged netdev.
> >
> > To ensure the firmware's RESTART indications can be received by the
> > driver, request that usbnet keep the interrupt URB active via
> > FLAG_INTR_ALWAYS.
> >
> > Second, move the code that sends the SYNC message out of the
> > bind() hook and after usbnet_probe() to ensure the interrupt URB
> > is set up before trying to use it.
> 
> Given this description I am wondering if you couldn't just move the
> whole SYNC thing to a new reset() hook, using some private flag to make
> sure it only runs once?  Does it really need to start at probe time?

It doesn't need to run exactly at probe, but it appears to need to be
the first thing the driver does when communicating with the firmware to
ensure clear state and whatnot.  Possibly like the QMI SYNC message that
clears all the client IDs and resets the internal stack.  (the driver
also sends a "shutdown" message to the firmware when unbinding).

So I do think that somewhere around probe() is the best time to do this,
because it's best to initialize the device when the driver binds to it
and react to errors as soon as possible, rather than trying to set
everything up on open/IFF_UP and then fail right before you want to
actually use the device.  Late-fail is quite unhelpful for applications.

I don't really care if it happens in probe() or somewhere else right
after the driver is bound to the device, but it should be part of the
initialization process.

Dan

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams Feb. 7, 2013, 9:09 p.m. UTC | #4
On Wed, 2013-02-06 at 21:17 +0100, Oliver Neukum wrote:
> On Wednesday 06 February 2013 12:42:56 Dan Williams wrote:
> > As part of the initialization sequence, the driver sends a SYNC message
> > via the control pipe to the firmware, which appears to request a
> > firmware restart.  The firmware responds with an indication via the
> > interrupt pipe set up by usbnet.  If the driver does not receive a
> > RESTART indication within a certain amount of time, it will periodically
> > send additional SYNC messages until it receives the RESTART indication.
> > 
> > Unfortunately, the interrupt URB is only submitted while the netdev
> > is open, which is usually not the case during initialization, and thus
> > the firmware's RESTART indication is lost.  So the driver continues
> > sending SYNC messages, and eventually the firmware crashes when it
> > receives too many.  This leads to a wedged netdev.
> 
> If I understand this correctly we should stop the interrupt pipe once
> RESTART has been recieved. I am afraid this patch is a bit inefficient.

The firmware may send other indications to the host over the interrupt
pipe at any time, apparently.  However, I have not seen firmware send
these indications in practice, so maybe we can ignore this and do as you
suggest.

So how would you suggest to handle this given the constraints?

Basically, we need to allow sierra_net to submit dev->interrupt at
probe/bind and then after a period of time kill it when it's no longer
needed.  The problem is that if the interface is set IFF_UP right after
bind but before sierra_net has finished its SYNC/Restart dance, we need
to ensure that sierra_net doesn't kill the URB unconditionally.

One way to do this would be a new usbnet function to subdrivers could
call to submit the URB which would be a NOP if the URB was already
submitted.  It would be paired with a function to kill the URB which
would be a NOP if (URB already killed || (IFF_UP and subdriver has
status() hook)).  Kinda like refcounting the interrupt URB submission
but not.

Does that sound like a worthwhile approach?

Dan

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjørn Mork Feb. 13, 2013, 11:44 a.m. UTC | #5
Dan Williams <dcbw@redhat.com> writes:

> It doesn't need to run exactly at probe, but it appears to need to be
> the first thing the driver does when communicating with the firmware to
> ensure clear state and whatnot.  Possibly like the QMI SYNC message that
> clears all the client IDs and resets the internal stack.  (the driver
> also sends a "shutdown" message to the firmware when unbinding).
>
> So I do think that somewhere around probe() is the best time to do this,
> because it's best to initialize the device when the driver binds to it
> and react to errors as soon as possible, rather than trying to set
> everything up on open/IFF_UP and then fail right before you want to
> actually use the device.  Late-fail is quite unhelpful for applications.
>
> I don't really care if it happens in probe() or somewhere else right
> after the driver is bound to the device, but it should be part of the
> initialization process.

I was looking for something else in the rndis_host driver right now and
noticed that it has the same sort of problem.  The generic_rndis_bind()
function will call rndis_command() which does

         usb_control_msg(.., USB_CDC_SEND_ENCAPSULATED_COMMAND, ..);
         usb_interrupt_msg(.., dev->status->desc.bEndpointAddress, .. );
         for (count = 0; count < 10; count++) {
             usb_control_msg(.., USB_CDC_GET_ENCAPSULATED_RESPONSE, ..);
             if (ok)
                return 0;
             msleep(20);
         }

Somewhat ugly, but it is a way to send and receive commands while
probing.  This driver also sends a command in unbind().

Looks like your patch would allow this to be solved a lot cleaner,
replacing the loop over USB_CDC_GET_ENCAPSULATED_RESPONSE with proper
interrupt endpoint handling.


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams Feb. 13, 2013, 4:34 p.m. UTC | #6
On Wed, 2013-02-13 at 12:44 +0100, Bjørn Mork wrote:
> Dan Williams <dcbw@redhat.com> writes:
> 
> > It doesn't need to run exactly at probe, but it appears to need to be
> > the first thing the driver does when communicating with the firmware to
> > ensure clear state and whatnot.  Possibly like the QMI SYNC message that
> > clears all the client IDs and resets the internal stack.  (the driver
> > also sends a "shutdown" message to the firmware when unbinding).
> >
> > So I do think that somewhere around probe() is the best time to do this,
> > because it's best to initialize the device when the driver binds to it
> > and react to errors as soon as possible, rather than trying to set
> > everything up on open/IFF_UP and then fail right before you want to
> > actually use the device.  Late-fail is quite unhelpful for applications.
> >
> > I don't really care if it happens in probe() or somewhere else right
> > after the driver is bound to the device, but it should be part of the
> > initialization process.
> 
> I was looking for something else in the rndis_host driver right now and
> noticed that it has the same sort of problem.  The generic_rndis_bind()
> function will call rndis_command() which does
> 
>          usb_control_msg(.., USB_CDC_SEND_ENCAPSULATED_COMMAND, ..);
>          usb_interrupt_msg(.., dev->status->desc.bEndpointAddress, .. );
>          for (count = 0; count < 10; count++) {
>              usb_control_msg(.., USB_CDC_GET_ENCAPSULATED_RESPONSE, ..);
>              if (ok)
>                 return 0;
>              msleep(20);
>          }
> 
> Somewhat ugly, but it is a way to send and receive commands while
> probing.  This driver also sends a command in unbind().
> 
> Looks like your patch would allow this to be solved a lot cleaner,
> replacing the loop over USB_CDC_GET_ENCAPSULATED_RESPONSE with proper
> interrupt endpoint handling.

It would end up being more "correct" but more complicated, because you'd
need to have rndis_command() block on a semaphore or something until the
response was processed by a "status" handler, which neither rndis_host.c
or rndis_wlan.c actually implement.  The status handler would have to
know that something was waiting on it, and then package up the response
in some way that the rndis_command() (which is now blocking on the
status interrupt) can read it and return it to the caller.

More correct, more code, more complicated...  but probably still
worthwhile?

Dan

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c
index 18dd425..185ffec 100644
--- a/drivers/net/usb/sierra_net.c
+++ b/drivers/net/usb/sierra_net.c
@@ -427,6 +427,13 @@  static void sierra_net_dosync(struct usbnet *dev)
 
 	dev_dbg(&dev->udev->dev, "%s", __func__);
 
+	/* The SIERRA_NET_HIP_MSYNC_ID command appears to request that the
+	 * firmware restart itself.  After restarting, the modem will respond
+	 * with the SIERRA_NET_HIP_RESTART_ID indication.  The driver continues
+	 * sending MSYNC commands every few seconds until it receives the
+	 * RESTART event from the firmware
+	 */
+
 	/* tell modem we are ready */
 	status = sierra_net_send_sync(dev);
 	if (status < 0)
@@ -709,6 +716,9 @@  static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf)
 	/* set context index initially to 0 - prepares tx hdr template */
 	sierra_net_set_ctx_index(priv, 0);
 
+	/* prepare sync message template */
+	memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg));
+
 	/* decrease the rx_urb_size and max_tx_size to 4k on USB 1.1 */
 	dev->rx_urb_size  = SIERRA_NET_RX_URB_SIZE;
 	if (dev->udev->speed != USB_SPEED_HIGH)
@@ -744,11 +754,6 @@  static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf)
 		kfree(priv);
 		return -ENODEV;
 	}
-	/* prepare sync message from template */
-	memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg));
-
-	/* initiate the sync sequence */
-	sierra_net_dosync(dev);
 
 	return 0;
 }
@@ -905,7 +910,7 @@  static struct sk_buff *sierra_net_tx_fixup(struct usbnet *dev,
 
 static const struct driver_info sierra_net_info_direct_ip = {
 	.description = "Sierra Wireless USB-to-WWAN Modem",
-	.flags = FLAG_WWAN | FLAG_SEND_ZLP,
+	.flags = FLAG_WWAN | FLAG_SEND_ZLP | FLAG_INTR_ALWAYS,
 	.bind = sierra_net_bind,
 	.unbind = sierra_net_unbind,
 	.status = sierra_net_status,
@@ -913,6 +918,20 @@  static const struct driver_info sierra_net_info_direct_ip = {
 	.tx_fixup = sierra_net_tx_fixup,
 };
 
+static int
+sierra_net_probe (struct usb_interface *udev, const struct usb_device_id *prod)
+{
+	int ret;
+
+	ret = usbnet_probe (udev, prod);
+	if (ret == 0) {
+		/* Interrupt URB now set up; initiate sync sequence */
+		sierra_net_dosync(usb_get_intfdata(udev));
+	}
+
+	return ret;
+}
+
 #define DIRECT_IP_DEVICE(vend, prod) \
 	{USB_DEVICE_INTERFACE_NUMBER(vend, prod, 7), \
 	.driver_info = (unsigned long)&sierra_net_info_direct_ip}, \
@@ -935,7 +954,7 @@  MODULE_DEVICE_TABLE(usb, products);
 static struct usb_driver sierra_net_driver = {
 	.name = "sierra_net",
 	.id_table = products,
-	.probe = usbnet_probe,
+	.probe = sierra_net_probe,
 	.disconnect = usbnet_disconnect,
 	.suspend = usbnet_suspend,
 	.resume = usbnet_resume,