diff mbox

[net,stable] net: usb: Add HP hs2434 device to ZLP exception table

Message ID 20130825182913.2865aa1a8cefe329ab9c7a85@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Rob Gardner Aug. 26, 2013, 12:29 a.m. UTC
From 018343ce2e679d97283fb51da25c43aa876d087a Mon Sep 17 00:00:00 2001
From: Rob Gardner <robmatic@gmail.com>
Date: Sun, 25 Aug 2013 16:02:23 -0600
Subject: [PATCH net,stable] net: usb: Add HP hs2434 device to ZLP exception table

This patch adds another entry (HP hs2434 Mobile Broadband) to the list
of exceptional devices that require a zero length packet in order to
function properly. This list was added in commit 844e88f0. The hs2434
is manufactured by Sierra Wireless, who also produces the MC7710,
which the ZLP exception list was created for in the first place. So
hopefully it is just this one producer's devices that will need this
workaround.

Tested on a DM1-4310NR HP notebook, which does not function without this
change.

Signed-off-by: Rob Gardner <robmatic@gmail.com>
---
 drivers/net/usb/cdc_mbim.c |    4 ++++
 1 file changed, 4 insertions(+)

Comments

Bjørn Mork Aug. 26, 2013, 1:10 a.m. UTC | #1
Rob Gardner <robmatic@gmail.com> writes:

> From 018343ce2e679d97283fb51da25c43aa876d087a Mon Sep 17 00:00:00 2001
> From: Rob Gardner <robmatic@gmail.com>
> Date: Sun, 25 Aug 2013 16:02:23 -0600
> Subject: [PATCH net,stable] net: usb: Add HP hs2434 device to ZLP exception table
>
> This patch adds another entry (HP hs2434 Mobile Broadband) to the list
> of exceptional devices that require a zero length packet in order to
> function properly. This list was added in commit 844e88f0. The hs2434
> is manufactured by Sierra Wireless, who also produces the MC7710,
> which the ZLP exception list was created for in the first place. So
> hopefully it is just this one producer's devices that will need this
> workaround.
>
> Tested on a DM1-4310NR HP notebook, which does not function without this
> change.
>
> Signed-off-by: Rob Gardner <robmatic@gmail.com>

Acked-by: Bjørn Mork <bjorn@mork.no>

Thanks a lot for doing this!


As I warned about in http://www.spinics.net/lists/netdev/msg223742.html
I am now going to reconsider what do do about this issue. There are only
two devices in the exception list so far, but I would like to note that
this represents 100% of the Sierra Wireless MBIM devices we *know* are
tested with this driver.

Testing for this firmware bug and verifying the workaround on a new
device is currently hard, requiring more skills than most end users can
be expected to have. I therefore find it very likely that we will miss
(or quite possible have missed already) a number of devices which should
have been in this exception list. I believe we have to do something
about this.

Ideas are welcome.


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Aug. 28, 2013, 10:23 p.m. UTC | #2
From: Rob Gardner <robmatic@gmail.com>
Date: Sun, 25 Aug 2013 18:29:13 -0600

> From 018343ce2e679d97283fb51da25c43aa876d087a Mon Sep 17 00:00:00 2001
> From: Rob Gardner <robmatic@gmail.com>
> Date: Sun, 25 Aug 2013 16:02:23 -0600
> Subject: [PATCH net,stable] net: usb: Add HP hs2434 device to ZLP exception table
> 
> This patch adds another entry (HP hs2434 Mobile Broadband) to the list
> of exceptional devices that require a zero length packet in order to
> function properly. This list was added in commit 844e88f0. The hs2434
> is manufactured by Sierra Wireless, who also produces the MC7710,
> which the ZLP exception list was created for in the first place. So
> hopefully it is just this one producer's devices that will need this
> workaround.
> 
> Tested on a DM1-4310NR HP notebook, which does not function without this
> change.
> 
> Signed-off-by: Rob Gardner <robmatic@gmail.com>

Applied and queued up for -tablt, thanks.

On the topic of dealing with detection of this issue, how harmful is the
workaround on chips that don't need it?  Will the driver not work in that
case?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Gardner Aug. 28, 2013, 11:18 p.m. UTC | #3
On Wed, 28 Aug 2013 18:23:32 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Rob Gardner <robmatic@gmail.com>
> Date: Sun, 25 Aug 2013 18:29:13 -0600
> 
> > From 018343ce2e679d97283fb51da25c43aa876d087a Mon Sep 17 00:00:00 2001
> > From: Rob Gardner <robmatic@gmail.com>
> > Date: Sun, 25 Aug 2013 16:02:23 -0600
> > Subject: [PATCH net,stable] net: usb: Add HP hs2434 device to ZLP exception table
> > 
> > This patch adds another entry (HP hs2434 Mobile Broadband) to the list
> > of exceptional devices that require a zero length packet in order to
> > function properly. This list was added in commit 844e88f0. The hs2434
> > is manufactured by Sierra Wireless, who also produces the MC7710,
> > which the ZLP exception list was created for in the first place. So
> > hopefully it is just this one producer's devices that will need this
> > workaround.
> > 
> > Tested on a DM1-4310NR HP notebook, which does not function without this
> > change.
> > 
> > Signed-off-by: Rob Gardner <robmatic@gmail.com>
> 
> Applied and queued up for -tablt, thanks.
> 
> On the topic of dealing with detection of this issue, how harmful is the
> workaround on chips that don't need it?  Will the driver not work in that
> case?


Bjørn says it could have a minor performance impact, but should not
cause any loss of functionality. I don't think there's any hard data on
what the performance cost might be but I know he would eventually like
to get rid of the exception list and live with the possible performance
impact if it means functionality for more devices.
David Miller Aug. 29, 2013, 12:01 a.m. UTC | #4
From: Rob Gardner <robmatic@gmail.com>
Date: Wed, 28 Aug 2013 17:18:12 -0600

> Bjørn says it could have a minor performance impact, but should not
> cause any loss of functionality. I don't think there's any hard data on
> what the performance cost might be but I know he would eventually like
> to get rid of the exception list and live with the possible performance
> impact if it means functionality for more devices.

Therefore I would recommend turning the workaround on by default and
have a white list.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Gardner Aug. 29, 2013, 12:40 a.m. UTC | #5
On Wed, 28 Aug 2013 20:01:11 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Rob Gardner <robmatic@gmail.com>
> Date: Wed, 28 Aug 2013 17:18:12 -0600
> 
> > Bjørn says it could have a minor performance impact, but should not
> > cause any loss of functionality. I don't think there's any hard data on
> > what the performance cost might be but I know he would eventually like
> > to get rid of the exception list and live with the possible performance
> > impact if it means functionality for more devices.
> 
> Therefore I would recommend turning the workaround on by default and
> have a white list.

The exception list means "usb_device_id entries for specific devices
that are known to need the workaround." There are just two such entries.
There isn't even a separate list. So maybe we just have a nomenclature
problem, and we could simply be calling this a "white list" instead of
"exception list". The other possible meaning of whitelist (all devices
that *don't* need the workaround) is unthinkable, as that would be a huge
list and much more prone to be unmanageable than what we've got now.
David Miller Aug. 29, 2013, 4:33 a.m. UTC | #6
From: Rob Gardner <robmatic@gmail.com>
Date: Wed, 28 Aug 2013 18:40:22 -0600

> The exception list means "usb_device_id entries for specific devices
> that are known to need the workaround." There are just two such entries.
> There isn't even a separate list. So maybe we just have a nomenclature
> problem, and we could simply be calling this a "white list" instead of
> "exception list". The other possible meaning of whitelist (all devices
> that *don't* need the workaround) is unthinkable, as that would be a huge
> list and much more prone to be unmanageable than what we've got now.

Ok I misunderstood.  I thought we were discoving that all chips which
have been thoroughly investigated end up needing the workaround.

I guess it's only a specific family of these chips which seem to all
have the problem?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Gardner Aug. 29, 2013, 5:51 a.m. UTC | #7
On Thu, 29 Aug 2013 00:33:52 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Rob Gardner <robmatic@gmail.com>
> Date: Wed, 28 Aug 2013 18:40:22 -0600
> 
> > The exception list means "usb_device_id entries for specific devices
> > that are known to need the workaround." There are just two such entries.
> > There isn't even a separate list. So maybe we just have a nomenclature
> > problem, and we could simply be calling this a "white list" instead of
> > "exception list". The other possible meaning of whitelist (all devices
> > that *don't* need the workaround) is unthinkable, as that would be a huge
> > list and much more prone to be unmanageable than what we've got now.
> 
> Ok I misunderstood.  I thought we were discoving that all chips which
> have been thoroughly investigated end up needing the workaround.
> 
> I guess it's only a specific family of these chips which seem to all
> have the problem?

Yes. The first one was the Sierra MC7710, and the second is the one
that I discovered, the HP hs2434, which is actually manufactured by
Sierra. So it may be an issue specific to Sierra, but unfortunately the
hs2434 uses HP's manufacturer id (0x3f0) and not Sierra's manufacturer
id (0x1199) so we can't just discriminate based on that.
Bjørn Mork Aug. 29, 2013, 8:32 a.m. UTC | #8
David Miller <davem@davemloft.net> writes:

> From: Rob Gardner <robmatic@gmail.com>
> Date: Wed, 28 Aug 2013 18:40:22 -0600
>
>> The exception list means "usb_device_id entries for specific devices
>> that are known to need the workaround." There are just two such entries.
>> There isn't even a separate list. So maybe we just have a nomenclature
>> problem, and we could simply be calling this a "white list" instead of
>> "exception list". The other possible meaning of whitelist (all devices
>> that *don't* need the workaround) is unthinkable, as that would be a huge
>> list and much more prone to be unmanageable than what we've got now.
>
> Ok I misunderstood.  I thought we were discoving that all chips which
> have been thoroughly investigated end up needing the workaround.
>
> I guess it's only a specific family of these chips which seem to all
> have the problem?

This driver is a class driver, which means that it deals with a number
of completely independent chip designs from different vendors.  We have
so far seen and tested chips from ST-Ericsson, Qualcomm and Mediatek
with this driver.  Given that Microsoft requires MBIM, we are likely to
see implementations on all available 3G/LTE chips (if there are any
others?)

Only devices with Qualcomm chips have the bug in question, and so far we
have only observed it in modules made by Sierra Wireless. The firmware
in these modules has a base part made by Qualcomm and an application
part made by Sierra Wireless.  We do not know which part of the firmware
is responsible for the bug, but we do know that the Qualcomm base
firmware used by these modules support MBIM framing.  Based on this, I
am guessing that the chip vendor Qualcomm is responsible for the bug. If
correct, then the bug is likely to appear in products from many
different module vendors.  It would not be the first time we saw that...

I do have a Huawei MBIM device with a Qualcomm chip, but this module
does not have the Qualcomm firmware with MBIM support. The MBIM
implementation is therefore assumed to be made by Huawei and running as
a firmware application on the device.  This device does not have the bug.

Now, if we could just identify devices running a Qualcomm base firmware
with MBIM support then we could enable the workaround for all those. But
the driver does not have any way to identify them.  It must base its
decision on the USB descriptors, in practice only device ID and product
ID. A module vendor can use chips from different sources (Huawei does
this), and as Rob says: Laptop vendors will often have modules made with
their own vendor ID. Most laptop vendors will also use modules from
different vendors.  HP, as a current example, is known to use modules
both from Sierra Wireless (need the workaround) and Ericsson (does not
want the workaround).

This makes any sort of vendor matching difficult, and we end up with
device specific blacklisting or whitelisting as the only option.  If it
had been only the two devices we have now, or even only a few tens of
devices, this would not have been a big deal.  But I do expect that the
length of this exception list will be comparable to the list of devices
supported by the qmi_wwan driver, i.e. hundreds.  And, like that driver,
the list will probably never be complete. Given the difficulties
detecting the need for the workaround, the list is probably going to be
extremely incomplete. Only a small percentage of end users with affected
devices will be able to provide the necessary information.

For the question of the workaround impact on devices without bug: They
will work fine, and the effect will not be user noticable AFAICS.  But
it will have a negative impact on the device, most likely causing higher
power consumption. Alexey has explained this much better than I can, as
I do not fully understand the device firmware design and requirements:
http://www.spinics.net/lists/linux-usb/msg78078.html

Quoting part of his explanation (talking about the *device* CPU and
INT - we do not care about the host):

  If host decides to send ZLP after full NTB, CPU must handle
  additional INT per every full NTB instead of doing useful work.
  For FTP transfer with constantly full NTBs you get twice amount of
  Interrupts.

I do agree that this is unfortunate. And hurting standard compliant
devices to work around a standard violating bug in other devices does
not feel right.  But I do not see any other option.



Bjørn
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/usb/cdc_mbim.c b/drivers/net/usb/cdc_mbim.c
index 8728198..25ba7ec 100644
--- a/drivers/net/usb/cdc_mbim.c
+++ b/drivers/net/usb/cdc_mbim.c
@@ -399,8 +399,12 @@  static const struct usb_device_id mbim_devs[] = {
 	/* Sierra Wireless MC7710 need ZLPs */
 	{ USB_DEVICE_AND_INTERFACE_INFO(0x1199, 0x68a2, USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
 	  .driver_info = (unsigned long)&cdc_mbim_info_zlp,
 	},
+	/* HP hs2434 Mobile Broadband Module needs ZLPs */
+	{ USB_DEVICE_AND_INTERFACE_INFO(0x3f0, 0x4b1d, USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
+	  .driver_info = (unsigned long)&cdc_mbim_info_zlp,
+	},
 	{ USB_INTERFACE_INFO(USB_CLASS_COMM, USB_CDC_SUBCLASS_MBIM, USB_CDC_PROTO_NONE),
 	  .driver_info = (unsigned long)&cdc_mbim_info,
 	},
 	{