Patchwork [U-Boot] usb: fix usb start problem with SMSC USB hub and Toshiba USB stick

login
register
mail settings
Submitter Anatolij Gustschin
Date Aug. 31, 2010, 6:45 p.m.
Message ID <1283280314-10700-1-git-send-email-agust@denx.de>
Download mbox | patch
Permalink /patch/71835/
State Superseded
Delegated to: Remy Bohmer
Headers show

Comments

Anatolij Gustschin - Aug. 31, 2010, 6:45 p.m.
Checking the status field of the qTD token in the current code
do not take into acount cases where endpoint stall (halted) bit
is set together with some other status bits. As a result clearing
stall on an endpoint won't be done if other status bits were set.

E.g. 'usb start' often fails with Toshiba USB stick 0x930/0x6545
connected to the SMSC USB 2.0 hub 0x424/0x2514. Debugging the
issue showed that while bulk IN transfers with length of 13 or
18 the status field of the qTD token sometimes indicates trans-
action error (XactErr) and sometimes additionally endpoint halted
state. In the latter case resetting the USB device in error
recovery code fails as no clear stall request on the endpoint
will be done. The patch fixes status field checking code to
properly handle endpoint halted state.

However this fix is not enough to solve 'usb start' problem
with hub/stick combination mentioned above. Running with lot of
debug code in ehci_submit_async() I've never seen the problem
with usb stick recognition. After removing this debug code the
similar problem sometimes showed up again. Therefore the patch
also adds delay in ehci_submit_async() for above-mentioned
hub/stick combination. Even without this delay the fix is an
improvement since it fixes the problem with board freezy after
subsequent failed 'usb start/stop' cycles as it was observed
on mpc5121ads board.

Signed-off-by: Anatolij Gustschin <agust@denx.de>
---
 common/usb_storage.c        |    5 +++--
 drivers/usb/host/ehci-hcd.c |    8 ++++++++
 2 files changed, 11 insertions(+), 2 deletions(-)
Remy Bohmer - Sept. 1, 2010, 8:49 a.m.
Hi,

2010/8/31 Anatolij Gustschin <agust@denx.de>:
> Checking the status field of the qTD token in the current code
> do not take into acount cases where endpoint stall (halted) bit
> is set together with some other status bits. As a result clearing
> stall on an endpoint won't be done if other status bits were set.

Reading this description and this code:

>        /* special handling of STALL in DATA phase */
> -       if ((result < 0) && (us->pusb_dev->status & USB_ST_STALLED)) {
> +       if ((result < 0) &&
> +           (us->pusb_dev->status & (USB_ST_STALLED | USB_ST_CRC_ERR))) {
>                USB_STOR_PRINTF("DATA:stall\n");

Description and code do not seem to match. The old implementation
explicitly checked if the STALLED bit was set, and if so the endpoint
would be cleared. Now, it seems you are clearing the endpoint _also_
when the CRC_ERR bit is set.

There are a lot more reasons, at least on other host controllers that
set the status USB_ST_CRC_ERR, does this change not break the other
code? (A simple grep will show you the situations where it is
returned.)

> E.g. 'usb start' often fails with Toshiba USB stick 0x930/0x6545
> connected to the SMSC USB 2.0 hub 0x424/0x2514. Debugging the
> issue showed that while bulk IN transfers with length of 13 or
> 18 the status field of the qTD token sometimes indicates trans-
> action error (XactErr) and sometimes additionally endpoint halted
> state. In the latter case resetting the USB device in error
> recovery code fails as no clear stall request on the endpoint
> will be done.

OK

> However this fix is not enough to solve 'usb start' problem
> with hub/stick combination mentioned above. Running with lot of
> debug code in ehci_submit_async() I've never seen the problem
> with usb stick recognition. After removing this debug code the
> similar problem sometimes showed up again. Therefore the patch
> also adds delay in ehci_submit_async() for above-mentioned
> hub/stick combination. Even without this delay the fix is an

Why only for these USB sticks? Are these sticks special among the
thousands of different sticks out there?
Or could it be more reliable to always do this delay for all USB
sticks? Or even better, what are we missing here that the delay is
needed at all?

> diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
> index 37d056e..7463a75 100644
> --- a/drivers/usb/host/ehci-hcd.c
> +++ b/drivers/usb/host/ehci-hcd.c

Is this problem only valid for the EHCI code? I can imagine that the
other host controllers (like UHCI and OHCI and so on) code suffer the
same problem and need similar fixes...
(At least I know that I have here a couple of USB sticks that show
similar problems with OHCI ;-) )

Kind regards,

Remy
Wolfgang Denk - Oct. 12, 2010, 8:33 p.m.
Dear Remy,

In message <1283280314-10700-1-git-send-email-agust@denx.de> Anatolij
Gustschin wrote:
> Checking the status field of the qTD token in the current code
> do not take into acount cases where endpoint stall (halted) bit
> is set together with some other status bits. As a result clearing
> stall on an endpoint won't be done if other status bits were set.
> 
> E.g. 'usb start' often fails with Toshiba USB stick 0x930/0x6545
> connected to the SMSC USB 2.0 hub 0x424/0x2514. Debugging the
> issue showed that while bulk IN transfers with length of 13 or
> 18 the status field of the qTD token sometimes indicates trans-
> action error (XactErr) and sometimes additionally endpoint halted
> state. In the latter case resetting the USB device in error
> recovery code fails as no clear stall request on the endpoint
> will be done. The patch fixes status field checking code to
> properly handle endpoint halted state.
> 
> However this fix is not enough to solve 'usb start' problem
> with hub/stick combination mentioned above. Running with lot of
> debug code in ehci_submit_async() I've never seen the problem
> with usb stick recognition. After removing this debug code the
> similar problem sometimes showed up again. Therefore the patch
> also adds delay in ehci_submit_async() for above-mentioned
> hub/stick combination. Even without this delay the fix is an
> improvement since it fixes the problem with board freezy after
> subsequent failed 'usb start/stop' cycles as it was observed
> on mpc5121ads board.
> 
> Signed-off-by: Anatolij Gustschin <agust@denx.de>
> ---
>  common/usb_storage.c        |    5 +++--
>  drivers/usb/host/ehci-hcd.c |    8 ++++++++
>  2 files changed, 11 insertions(+), 2 deletions(-)

Any comments on this?

Best regards,

Wolfgang Denk
Wolfgang Denk - Oct. 23, 2010, 7:54 p.m.
Dear Anatolij,

In message <AANLkTikbnmdtJNCjd-pjeHWsw+Ng8sTF1iZT1utG6JS8@mail.gmail.com> Remy Bohmer wrote:
> Hi,
> 
> 2010/8/31 Anatolij Gustschin <agust@denx.de>:
> > Checking the status field of the qTD token in the current code
> > do not take into acount cases where endpoint stall (halted) bit
> > is set together with some other status bits. As a result clearing
> > stall on an endpoint won't be done if other status bits were set.
> 
> Reading this description and this code:
> 
> >        /* special handling of STALL in DATA phase */
> > -       if ((result < 0) && (us->pusb_dev->status & USB_ST_STALLED)) {
> > +       if ((result < 0) &&
> > +           (us->pusb_dev->status & (USB_ST_STALLED | USB_ST_CRC_ERR))) {
> >                USB_STOR_PRINTF("DATA:stall\n");
> 
> Description and code do not seem to match. The old implementation
> explicitly checked if the STALLED bit was set, and if so the endpoint
> would be cleared. Now, it seems you are clearing the endpoint _also_
> when the CRC_ERR bit is set.
> 
> There are a lot more reasons, at least on other host controllers that
> set the status USB_ST_CRC_ERR, does this change not break the other
> code? (A simple grep will show you the situations where it is
> returned.)
> 
> > E.g. 'usb start' often fails with Toshiba USB stick 0x930/0x6545
> > connected to the SMSC USB 2.0 hub 0x424/0x2514. Debugging the
> > issue showed that while bulk IN transfers with length of 13 or
> > 18 the status field of the qTD token sometimes indicates trans-
> > action error (XactErr) and sometimes additionally endpoint halted
> > state. In the latter case resetting the USB device in error
> > recovery code fails as no clear stall request on the endpoint
> > will be done.
> 
> OK
> 
> > However this fix is not enough to solve 'usb start' problem
> > with hub/stick combination mentioned above. Running with lot of
> > debug code in ehci_submit_async() I've never seen the problem
> > with usb stick recognition. After removing this debug code the
> > similar problem sometimes showed up again. Therefore the patch
> > also adds delay in ehci_submit_async() for above-mentioned
> > hub/stick combination. Even without this delay the fix is an
> 
> Why only for these USB sticks? Are these sticks special among the
> thousands of different sticks out there?
> Or could it be more reliable to always do this delay for all USB
> sticks? Or even better, what are we missing here that the delay is
> needed at all?
> 
> > diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
> > index 37d056e..7463a75 100644
> > --- a/drivers/usb/host/ehci-hcd.c
> > +++ b/drivers/usb/host/ehci-hcd.c
> 
> Is this problem only valid for the EHCI code? I can imagine that the
> other host controllers (like UHCI and OHCI and so on) code suffer the
> same problem and need similar fixes...
> (At least I know that I have here a couple of USB sticks that show
> similar problems with OHCI ;-) )


As far as I can tell you never replied to thesequestions, with the
result that this patch did not make it into mainline yet.

Could you please have a look?  Thanks.

Best regards,

Wolfgang Denk
Anatolij Gustschin - Oct. 27, 2010, 1:05 p.m.
Hi Remy,

On Wed, 1 Sep 2010 10:49:22 +0200
Remy Bohmer <linux@bohmer.net> wrote:
...
> 2010/8/31 Anatolij Gustschin <agust@denx.de>:
> > Checking the status field of the qTD token in the current code
> > do not take into acount cases where endpoint stall (halted) bit
> > is set together with some other status bits. As a result clearing
> > stall on an endpoint won't be done if other status bits were set.
> 
> Reading this description and this code:
> 
> >        /* special handling of STALL in DATA phase */
> > -       if ((result < 0) && (us->pusb_dev->status &
> > USB_ST_STALLED)) {
> > +       if ((result < 0) &&
> > +           (us->pusb_dev->status & (USB_ST_STALLED |
> > USB_ST_CRC_ERR))) { USB_STOR_PRINTF("DATA:stall\n");
> 
> Description and code do not seem to match. The old implementation
> explicitly checked if the STALLED bit was set, and if so the endpoint
> would be cleared. Now, it seems you are clearing the endpoint _also_
> when the CRC_ERR bit is set.

Yes. This was intentional but unfortunately not directly mentioned
in the patch description. The reason is the following:

in all cases where transaction error (XactErr) was reported with
additionally STALLED bit set, the handling of the STALL condition
successfully recovered from the error case and the device was
recognized.

in cases where only transaction error was reported (STALLED bit
not set) the recovery from the error case didn't succeed and
"Device NOT ready" (Request Sense returned 06 28 00) status was
finally reported. I didn't find the information in the EHCI spec
how the recovery from the XactErr should be done properly. Do
you know how to handle it?

> There are a lot more reasons, at least on other host controllers that
> set the status USB_ST_CRC_ERR, does this change not break the other
> code? (A simple grep will show you the situations where it is
> returned.)

This change will probably break other code. I'll drop it here and
additionally set USB_ST_STALLED flag in the ehci driver instead.

> > E.g. 'usb start' often fails with Toshiba USB stick 0x930/0x6545
> > connected to the SMSC USB 2.0 hub 0x424/0x2514. Debugging the
> > issue showed that while bulk IN transfers with length of 13 or
> > 18 the status field of the qTD token sometimes indicates trans-
> > action error (XactErr) and sometimes additionally endpoint halted
> > state. In the latter case resetting the USB device in error
> > recovery code fails as no clear stall request on the endpoint
> > will be done.
> 
> OK
> 
> > However this fix is not enough to solve 'usb start' problem
> > with hub/stick combination mentioned above. Running with lot of
> > debug code in ehci_submit_async() I've never seen the problem
> > with usb stick recognition. After removing this debug code the
> > similar problem sometimes showed up again. Therefore the patch
> > also adds delay in ehci_submit_async() for above-mentioned
> > hub/stick combination. Even without this delay the fix is an
> 
> Why only for these USB sticks? Are these sticks special among the
> thousands of different sticks out there?

I do not have the right equipment to analyze this problem properly.
This is the special USB stick-hub combination. The same USB stick
is always recognized properly if it is connected directly. Other
USB sticks also work with this USB hub.

> Or could it be more reliable to always do this delay for all USB
> sticks? Or even better, what are we missing here that the delay is
> needed at all?

Doing this delay for other USB sticks will negatively affect data
transfer speed.

> > diff --git a/drivers/usb/host/ehci-hcd.c
> > b/drivers/usb/host/ehci-hcd.c index 37d056e..7463a75 100644
> > --- a/drivers/usb/host/ehci-hcd.c
> > +++ b/drivers/usb/host/ehci-hcd.c
> 
> Is this problem only valid for the EHCI code? I can imagine that the
> other host controllers (like UHCI and OHCI and so on) code suffer the
> same problem and need similar fixes...
> (At least I know that I have here a couple of USB sticks that show
> similar problems with OHCI ;-) )

Maybe, maybe not. The problem shows up on SheevaPlug and mpc5121ads
boards, both using USB EHCI. I do not have the possibility to test
it on other host controllers currently.

Best regards,
Anatolij
Remy Bohmer - Oct. 29, 2010, 1:57 p.m.
Hi,

2010/10/27 Anatolij Gustschin <agust@denx.de>:
> Hi Remy,
>
> On Wed, 1 Sep 2010 10:49:22 +0200
> Remy Bohmer <linux@bohmer.net> wrote:
> ...
>> 2010/8/31 Anatolij Gustschin <agust@denx.de>:
>> > Checking the status field of the qTD token in the current code
>> > do not take into acount cases where endpoint stall (halted) bit
>> > is set together with some other status bits. As a result clearing
>> > stall on an endpoint won't be done if other status bits were set.
>>
>> Reading this description and this code:
>>
>> >        /* special handling of STALL in DATA phase */
>> > -       if ((result < 0) && (us->pusb_dev->status &
>> > USB_ST_STALLED)) {
>> > +       if ((result < 0) &&
>> > +           (us->pusb_dev->status & (USB_ST_STALLED |
>> > USB_ST_CRC_ERR))) { USB_STOR_PRINTF("DATA:stall\n");
>>
>> Description and code do not seem to match. The old implementation
>> explicitly checked if the STALLED bit was set, and if so the endpoint
>> would be cleared. Now, it seems you are clearing the endpoint _also_
>> when the CRC_ERR bit is set.
>
> Yes. This was intentional but unfortunately not directly mentioned
> in the patch description. The reason is the following:
>
> in all cases where transaction error (XactErr) was reported with
> additionally STALLED bit set, the handling of the STALL condition
> successfully recovered from the error case and the device was
> recognized.
>
> in cases where only transaction error was reported (STALLED bit
> not set) the recovery from the error case didn't succeed and
> "Device NOT ready" (Request Sense returned 06 28 00) status was
> finally reported. I didn't find the information in the EHCI spec
> how the recovery from the XactErr should be done properly. Do
> you know how to handle it?
>
>> There are a lot more reasons, at least on other host controllers that
>> set the status USB_ST_CRC_ERR, does this change not break the other
>> code? (A simple grep will show you the situations where it is
>> returned.)
>
> This change will probably break other code. I'll drop it here and
> additionally set USB_ST_STALLED flag in the ehci driver instead.

OK. I will ignore this version of this patch. I assume you will post a
new revision of this patch?

Kind regards,

Remy
Anatolij Gustschin - Oct. 29, 2010, 2:23 p.m.
Hi,

On Fri, 29 Oct 2010 15:57:54 +0200
Remy Bohmer <linux@bohmer.net> wrote:
...
> > in cases where only transaction error was reported (STALLED bit
> > not set) the recovery from the error case didn't succeed and
> > "Device NOT ready" (Request Sense returned 06 28 00) status was
> > finally reported. I didn't find the information in the EHCI spec
> > how the recovery from the XactErr should be done properly. Do
> > you know how to handle it?
> >
> >> There are a lot more reasons, at least on other host controllers that
> >> set the status USB_ST_CRC_ERR, does this change not break the other
> >> code? (A simple grep will show you the situations where it is
> >> returned.)
> >
> > This change will probably break other code. I'll drop it here and
> > additionally set USB_ST_STALLED flag in the ehci driver instead.
> 
> OK. I will ignore this version of this patch. I assume you will post a
> new revision of this patch?

Yes, my intention is to post v2 patch.

Thanks,
Anatolij

Patch

diff --git a/common/usb_storage.c b/common/usb_storage.c
index 76949b8..5ca92c3 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -680,7 +680,8 @@  int usb_stor_BBB_transport(ccb *srb, struct us_data *us)
 	result = usb_bulk_msg(us->pusb_dev, pipe, srb->pdata, srb->datalen,
 			      &data_actlen, USB_CNTL_TIMEOUT * 5);
 	/* special handling of STALL in DATA phase */
-	if ((result < 0) && (us->pusb_dev->status & USB_ST_STALLED)) {
+	if ((result < 0) &&
+	    (us->pusb_dev->status & (USB_ST_STALLED | USB_ST_CRC_ERR))) {
 		USB_STOR_PRINTF("DATA:stall\n");
 		/* clear the STALL on the endpoint */
 		result = usb_stor_BBB_clear_endpt_stall(us,
@@ -710,7 +711,7 @@  again:
 
 	/* special handling of STALL in STATUS phase */
 	if ((result < 0) && (retry < 1) &&
-	    (us->pusb_dev->status & USB_ST_STALLED)) {
+	    (us->pusb_dev->status & (USB_ST_STALLED | USB_ST_CRC_ERR))) {
 		USB_STOR_PRINTF("STATUS:stall\n");
 		/* clear the STALL on the endpoint */
 		result = usb_stor_BBB_clear_endpt_stall(us, us->ep_in);
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 37d056e..7463a75 100644
--- a/drivers/usb/host/ehci-hcd.c
+++ b/drivers/usb/host/ehci-hcd.c
@@ -430,6 +430,12 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 	usbsts = ehci_readl(&hcor->or_usbsts);
 	ehci_writel(&hcor->or_usbsts, (usbsts & 0x3f));
 
+	if (dev->descriptor.idVendor == 0x930 &&
+	    dev->descriptor.idProduct == 0x6545 &&
+	    dev->parent->descriptor.idVendor == 0x424 &&
+	    dev->parent->descriptor.idProduct == 0x2514)
+		wait_ms(10);
+
 	/* Enable async. schedule. */
 	cmd = ehci_readl(&hcor->or_usbcmd);
 	cmd |= CMD_ASE;
@@ -490,6 +496,8 @@  ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer,
 			break;
 		default:
 			dev->status = USB_ST_CRC_ERR;
+			if ((token & 0x40) == 0x40)
+				dev->status |= USB_ST_STALLED;
 			break;
 		}
 		dev->act_len = length - ((token >> 16) & 0x7fff);