Message ID | 1360176176.11742.16.camel@dcbw.foobar.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Wednesday 06 February 2013 12:42:56 Dan Williams wrote: > As part of the initialization sequence, the driver sends a SYNC message > via the control pipe to the firmware, which appears to request a > firmware restart. The firmware responds with an indication via the > interrupt pipe set up by usbnet. If the driver does not receive a > RESTART indication within a certain amount of time, it will periodically > send additional SYNC messages until it receives the RESTART indication. > > Unfortunately, the interrupt URB is only submitted while the netdev > is open, which is usually not the case during initialization, and thus > the firmware's RESTART indication is lost. So the driver continues > sending SYNC messages, and eventually the firmware crashes when it > receives too many. This leads to a wedged netdev. If I understand this correctly we should stop the interrupt pipe once RESTART has been recieved. I am afraid this patch is a bit inefficient. Regards Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dan Williams <dcbw@redhat.com> writes: > As part of the initialization sequence, the driver sends a SYNC message > via the control pipe to the firmware, which appears to request a > firmware restart. The firmware responds with an indication via the > interrupt pipe set up by usbnet. If the driver does not receive a > RESTART indication within a certain amount of time, it will periodically > send additional SYNC messages until it receives the RESTART indication. > > Unfortunately, the interrupt URB is only submitted while the netdev > is open, which is usually not the case during initialization, and thus > the firmware's RESTART indication is lost. So the driver continues > sending SYNC messages, and eventually the firmware crashes when it > receives too many. This leads to a wedged netdev. > > To ensure the firmware's RESTART indications can be received by the > driver, request that usbnet keep the interrupt URB active via > FLAG_INTR_ALWAYS. > > Second, move the code that sends the SYNC message out of the > bind() hook and after usbnet_probe() to ensure the interrupt URB > is set up before trying to use it. Given this description I am wondering if you couldn't just move the whole SYNC thing to a new reset() hook, using some private flag to make sure it only runs once? Does it really need to start at probe time? Bjørn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-02-06 at 22:11 +0100, Bjørn Mork wrote: > Dan Williams <dcbw@redhat.com> writes: > > > As part of the initialization sequence, the driver sends a SYNC message > > via the control pipe to the firmware, which appears to request a > > firmware restart. The firmware responds with an indication via the > > interrupt pipe set up by usbnet. If the driver does not receive a > > RESTART indication within a certain amount of time, it will periodically > > send additional SYNC messages until it receives the RESTART indication. > > > > Unfortunately, the interrupt URB is only submitted while the netdev > > is open, which is usually not the case during initialization, and thus > > the firmware's RESTART indication is lost. So the driver continues > > sending SYNC messages, and eventually the firmware crashes when it > > receives too many. This leads to a wedged netdev. > > > > To ensure the firmware's RESTART indications can be received by the > > driver, request that usbnet keep the interrupt URB active via > > FLAG_INTR_ALWAYS. > > > > Second, move the code that sends the SYNC message out of the > > bind() hook and after usbnet_probe() to ensure the interrupt URB > > is set up before trying to use it. > > Given this description I am wondering if you couldn't just move the > whole SYNC thing to a new reset() hook, using some private flag to make > sure it only runs once? Does it really need to start at probe time? It doesn't need to run exactly at probe, but it appears to need to be the first thing the driver does when communicating with the firmware to ensure clear state and whatnot. Possibly like the QMI SYNC message that clears all the client IDs and resets the internal stack. (the driver also sends a "shutdown" message to the firmware when unbinding). So I do think that somewhere around probe() is the best time to do this, because it's best to initialize the device when the driver binds to it and react to errors as soon as possible, rather than trying to set everything up on open/IFF_UP and then fail right before you want to actually use the device. Late-fail is quite unhelpful for applications. I don't really care if it happens in probe() or somewhere else right after the driver is bound to the device, but it should be part of the initialization process. Dan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-02-06 at 21:17 +0100, Oliver Neukum wrote: > On Wednesday 06 February 2013 12:42:56 Dan Williams wrote: > > As part of the initialization sequence, the driver sends a SYNC message > > via the control pipe to the firmware, which appears to request a > > firmware restart. The firmware responds with an indication via the > > interrupt pipe set up by usbnet. If the driver does not receive a > > RESTART indication within a certain amount of time, it will periodically > > send additional SYNC messages until it receives the RESTART indication. > > > > Unfortunately, the interrupt URB is only submitted while the netdev > > is open, which is usually not the case during initialization, and thus > > the firmware's RESTART indication is lost. So the driver continues > > sending SYNC messages, and eventually the firmware crashes when it > > receives too many. This leads to a wedged netdev. > > If I understand this correctly we should stop the interrupt pipe once > RESTART has been recieved. I am afraid this patch is a bit inefficient. The firmware may send other indications to the host over the interrupt pipe at any time, apparently. However, I have not seen firmware send these indications in practice, so maybe we can ignore this and do as you suggest. So how would you suggest to handle this given the constraints? Basically, we need to allow sierra_net to submit dev->interrupt at probe/bind and then after a period of time kill it when it's no longer needed. The problem is that if the interface is set IFF_UP right after bind but before sierra_net has finished its SYNC/Restart dance, we need to ensure that sierra_net doesn't kill the URB unconditionally. One way to do this would be a new usbnet function to subdrivers could call to submit the URB which would be a NOP if the URB was already submitted. It would be paired with a function to kill the URB which would be a NOP if (URB already killed || (IFF_UP and subdriver has status() hook)). Kinda like refcounting the interrupt URB submission but not. Does that sound like a worthwhile approach? Dan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Dan Williams <dcbw@redhat.com> writes: > It doesn't need to run exactly at probe, but it appears to need to be > the first thing the driver does when communicating with the firmware to > ensure clear state and whatnot. Possibly like the QMI SYNC message that > clears all the client IDs and resets the internal stack. (the driver > also sends a "shutdown" message to the firmware when unbinding). > > So I do think that somewhere around probe() is the best time to do this, > because it's best to initialize the device when the driver binds to it > and react to errors as soon as possible, rather than trying to set > everything up on open/IFF_UP and then fail right before you want to > actually use the device. Late-fail is quite unhelpful for applications. > > I don't really care if it happens in probe() or somewhere else right > after the driver is bound to the device, but it should be part of the > initialization process. I was looking for something else in the rndis_host driver right now and noticed that it has the same sort of problem. The generic_rndis_bind() function will call rndis_command() which does usb_control_msg(.., USB_CDC_SEND_ENCAPSULATED_COMMAND, ..); usb_interrupt_msg(.., dev->status->desc.bEndpointAddress, .. ); for (count = 0; count < 10; count++) { usb_control_msg(.., USB_CDC_GET_ENCAPSULATED_RESPONSE, ..); if (ok) return 0; msleep(20); } Somewhat ugly, but it is a way to send and receive commands while probing. This driver also sends a command in unbind(). Looks like your patch would allow this to be solved a lot cleaner, replacing the loop over USB_CDC_GET_ENCAPSULATED_RESPONSE with proper interrupt endpoint handling. Bjørn -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-02-13 at 12:44 +0100, Bjørn Mork wrote: > Dan Williams <dcbw@redhat.com> writes: > > > It doesn't need to run exactly at probe, but it appears to need to be > > the first thing the driver does when communicating with the firmware to > > ensure clear state and whatnot. Possibly like the QMI SYNC message that > > clears all the client IDs and resets the internal stack. (the driver > > also sends a "shutdown" message to the firmware when unbinding). > > > > So I do think that somewhere around probe() is the best time to do this, > > because it's best to initialize the device when the driver binds to it > > and react to errors as soon as possible, rather than trying to set > > everything up on open/IFF_UP and then fail right before you want to > > actually use the device. Late-fail is quite unhelpful for applications. > > > > I don't really care if it happens in probe() or somewhere else right > > after the driver is bound to the device, but it should be part of the > > initialization process. > > I was looking for something else in the rndis_host driver right now and > noticed that it has the same sort of problem. The generic_rndis_bind() > function will call rndis_command() which does > > usb_control_msg(.., USB_CDC_SEND_ENCAPSULATED_COMMAND, ..); > usb_interrupt_msg(.., dev->status->desc.bEndpointAddress, .. ); > for (count = 0; count < 10; count++) { > usb_control_msg(.., USB_CDC_GET_ENCAPSULATED_RESPONSE, ..); > if (ok) > return 0; > msleep(20); > } > > Somewhat ugly, but it is a way to send and receive commands while > probing. This driver also sends a command in unbind(). > > Looks like your patch would allow this to be solved a lot cleaner, > replacing the loop over USB_CDC_GET_ENCAPSULATED_RESPONSE with proper > interrupt endpoint handling. It would end up being more "correct" but more complicated, because you'd need to have rndis_command() block on a semaphore or something until the response was processed by a "status" handler, which neither rndis_host.c or rndis_wlan.c actually implement. The status handler would have to know that something was waiting on it, and then package up the response in some way that the rndis_command() (which is now blocking on the status interrupt) can read it and return it to the caller. More correct, more code, more complicated... but probably still worthwhile? Dan -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/usb/sierra_net.c b/drivers/net/usb/sierra_net.c index 18dd425..185ffec 100644 --- a/drivers/net/usb/sierra_net.c +++ b/drivers/net/usb/sierra_net.c @@ -427,6 +427,13 @@ static void sierra_net_dosync(struct usbnet *dev) dev_dbg(&dev->udev->dev, "%s", __func__); + /* The SIERRA_NET_HIP_MSYNC_ID command appears to request that the + * firmware restart itself. After restarting, the modem will respond + * with the SIERRA_NET_HIP_RESTART_ID indication. The driver continues + * sending MSYNC commands every few seconds until it receives the + * RESTART event from the firmware + */ + /* tell modem we are ready */ status = sierra_net_send_sync(dev); if (status < 0) @@ -709,6 +716,9 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf) /* set context index initially to 0 - prepares tx hdr template */ sierra_net_set_ctx_index(priv, 0); + /* prepare sync message template */ + memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg)); + /* decrease the rx_urb_size and max_tx_size to 4k on USB 1.1 */ dev->rx_urb_size = SIERRA_NET_RX_URB_SIZE; if (dev->udev->speed != USB_SPEED_HIGH) @@ -744,11 +754,6 @@ static int sierra_net_bind(struct usbnet *dev, struct usb_interface *intf) kfree(priv); return -ENODEV; } - /* prepare sync message from template */ - memcpy(priv->sync_msg, sync_tmplate, sizeof(priv->sync_msg)); - - /* initiate the sync sequence */ - sierra_net_dosync(dev); return 0; } @@ -905,7 +910,7 @@ static struct sk_buff *sierra_net_tx_fixup(struct usbnet *dev, static const struct driver_info sierra_net_info_direct_ip = { .description = "Sierra Wireless USB-to-WWAN Modem", - .flags = FLAG_WWAN | FLAG_SEND_ZLP, + .flags = FLAG_WWAN | FLAG_SEND_ZLP | FLAG_INTR_ALWAYS, .bind = sierra_net_bind, .unbind = sierra_net_unbind, .status = sierra_net_status, @@ -913,6 +918,20 @@ static const struct driver_info sierra_net_info_direct_ip = { .tx_fixup = sierra_net_tx_fixup, }; +static int +sierra_net_probe (struct usb_interface *udev, const struct usb_device_id *prod) +{ + int ret; + + ret = usbnet_probe (udev, prod); + if (ret == 0) { + /* Interrupt URB now set up; initiate sync sequence */ + sierra_net_dosync(usb_get_intfdata(udev)); + } + + return ret; +} + #define DIRECT_IP_DEVICE(vend, prod) \ {USB_DEVICE_INTERFACE_NUMBER(vend, prod, 7), \ .driver_info = (unsigned long)&sierra_net_info_direct_ip}, \ @@ -935,7 +954,7 @@ MODULE_DEVICE_TABLE(usb, products); static struct usb_driver sierra_net_driver = { .name = "sierra_net", .id_table = products, - .probe = usbnet_probe, + .probe = sierra_net_probe, .disconnect = usbnet_disconnect, .suspend = usbnet_suspend, .resume = usbnet_resume,
As part of the initialization sequence, the driver sends a SYNC message via the control pipe to the firmware, which appears to request a firmware restart. The firmware responds with an indication via the interrupt pipe set up by usbnet. If the driver does not receive a RESTART indication within a certain amount of time, it will periodically send additional SYNC messages until it receives the RESTART indication. Unfortunately, the interrupt URB is only submitted while the netdev is open, which is usually not the case during initialization, and thus the firmware's RESTART indication is lost. So the driver continues sending SYNC messages, and eventually the firmware crashes when it receives too many. This leads to a wedged netdev. To ensure the firmware's RESTART indications can be received by the driver, request that usbnet keep the interrupt URB active via FLAG_INTR_ALWAYS. Second, move the code that sends the SYNC message out of the bind() hook and after usbnet_probe() to ensure the interrupt URB is set up before trying to use it. Signed-off-by: Dan Williams <dcbw@redhat.com> --- drivers/net/usb/sierra_net.c | 33 ++++++++++++++++++++++++++------- 1 file changed, 26 insertions(+), 7 deletions(-)