diff mbox

[tegrarcm,v1,4/4] Increate USB timeout value

Message ID 1457135087-967-5-git-send-email-jimmzhang@nvidia.com
State Superseded, archived
Headers show

Commit Message

jimmzhang March 4, 2016, 11:44 p.m. UTC
It has been observed that some times USB time out could occure when a long (3ft)
usb cable is connected to recovery mode usb port

Signed-off-by: Jimmy Zhang <jimmzhang>
---
 src/usb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Allen Martin March 5, 2016, 1:46 a.m. UTC | #1
On Fri, Mar 04, 2016 at 03:44:47PM -0800, Jimmy Zhang wrote:
> It has been observed that some times USB time out could occure when a long (3ft)
> usb cable is connected to recovery mode usb port
> 

How about making it a command line argument instead, so next time
someone tries a 6ft cable we don't need another patch? :^)

-Allen
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jimmzhang March 5, 2016, 2:39 a.m. UTC | #2
> -----Original Message-----
> From: Allen Martin
> Sent: Friday, March 04, 2016 5:47 PM
> To: Jimmy Zhang
> Cc: Stephen Warren; alban.bedel@avionic-design.de; linux-
> tegra@vger.kernel.org
> Subject: Re: [tegrarcm PATCH v1 4/4] Increate USB timeout value
> 
> On Fri, Mar 04, 2016 at 03:44:47PM -0800, Jimmy Zhang wrote:
> > It has been observed that some times USB time out could occure when a
> > long (3ft) usb cable is connected to recovery mode usb port
> >
> 
> How about making it a command line argument instead, so next time
> someone tries a 6ft cable we don't need another patch? :^)
> 
Agree. Will make a change.

> -Allen
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren March 7, 2016, 7:39 p.m. UTC | #3
On 03/04/2016 04:44 PM, Jimmy Zhang wrote:
> It has been observed that some times USB time out could occure when a long (3ft)
> usb cable is connected to recovery mode usb port

This explanation makes no sense. 3ft is practically the shortest USB 
cable anyone would use. Equally, assuming a compliant correctly 
functioning cable, cable length doesn't affect the time it takes to 
execute transactions.

Is the issue more that if a low quality cable is used, then there are 
lots of transfer errors, and the time taken for retries is large? If so, 
there's not much guarantee that a larger timeout would help in general, 
since there's no guarantee of any upper bound on the time it takes for a 
packet not to get corrupted in this case. Still, I suppose it's fine to 
increase the timeout to account for when it accidentally works.

In summary: a commit description that more accurately represents the 
problem being solved is required.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jimmzhang March 9, 2016, 1:41 a.m. UTC | #4
> -----Original Message-----
> From: Stephen Warren [mailto:swarren@wwwdotorg.org]
> Sent: Monday, March 07, 2016 11:40 AM
> To: Jimmy Zhang
> Cc: Allen Martin; Stephen Warren; alban.bedel@avionic-design.de; linux-
> tegra@vger.kernel.org
> Subject: Re: [tegrarcm PATCH v1 4/4] Increate USB timeout value
> 
> On 03/04/2016 04:44 PM, Jimmy Zhang wrote:
> > It has been observed that some times USB time out could occure when a
> > long (3ft) usb cable is connected to recovery mode usb port
> 
> This explanation makes no sense. 3ft is practically the shortest USB cable
> anyone would use. Equally, assuming a compliant correctly functioning cable,
> cable length doesn't affect the time it takes to execute transactions.
> 
> Is the issue more that if a low quality cable is used, then there are lots of
> transfer errors, and the time taken for retries is large? If so, there's not much
> guarantee that a larger timeout would help in general, since there's no
> guarantee of any upper bound on the time it takes for a packet not to get
> corrupted in this case. Still, I suppose it's fine to increase the timeout to
> account for when it accidentally works.
> 
> In summary: a commit description that more accurately represents the
> problem being solved is required.

We have seen this problem since T30/T114. Initially we just retried by downloading again. It may work after a few tries. Later, we found a simple solution is replacing with a shorter cable. Then one day, for testing purpose, when the timeout value was replaced with 0, ie, unlimited timeout, all cables work fine. So, it may worth to figure out the root cause.
   
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren March 9, 2016, 5:41 p.m. UTC | #5
On 03/08/2016 06:41 PM, Jimmy Zhang wrote:
>
>
>> -----Original Message-----
>> From: Stephen Warren [mailto:swarren@wwwdotorg.org]
>> Sent: Monday, March 07, 2016 11:40 AM
>> To: Jimmy Zhang
>> Cc: Allen Martin; Stephen Warren; alban.bedel@avionic-design.de; linux-
>> tegra@vger.kernel.org
>> Subject: Re: [tegrarcm PATCH v1 4/4] Increate USB timeout value
>>
>> On 03/04/2016 04:44 PM, Jimmy Zhang wrote:
>>> It has been observed that some times USB time out could occure when a
>>> long (3ft) usb cable is connected to recovery mode usb port
>>
>> This explanation makes no sense. 3ft is practically the shortest USB cable
>> anyone would use. Equally, assuming a compliant correctly functioning cable,
>> cable length doesn't affect the time it takes to execute transactions.
>>
>> Is the issue more that if a low quality cable is used, then there are lots of
>> transfer errors, and the time taken for retries is large? If so, there's not much
>> guarantee that a larger timeout would help in general, since there's no
>> guarantee of any upper bound on the time it takes for a packet not to get
>> corrupted in this case. Still, I suppose it's fine to increase the timeout to
>> account for when it accidentally works.
>>
>> In summary: a commit description that more accurately represents the
>> problem being solved is required.
>
> We have seen this problem since T30/T114. Initially we just retried by downloading again. It may work after a few tries. Later, we found a simple solution is replacing with a shorter cable. Then one day, for testing purpose, when the timeout value was replaced with 0, ie, unlimited timeout, all cables work fine. So, it may worth to figure out the root cause.

There's obviously some variability here, since you mention that the 
communication works sometimes and not others.

When you tested the varying cable lengths, did you test at least 10x 
more times than the percentage of times that pass or fail with the 
original cable you used? Without testing a large number of times, you 
don't know if you just got lucky the one time you tried a different 
cable, or whether it really made a statistical difference.

Much more likely is that Tegra itself sometimes takes more time to 
respond, or sometimes receives/sends packets that get corrupted 
somewhere and have to be retried.

BTW, have you tried using Wireshark and usbmon to capture a trace of the 
failures to try and diagnose the issue?

Finally, I'm not objecting to this patch if it does work around the 
problem. All I'm objecting to is blaming the issue on cable length when 
I'm not convinced that's been conclusively proven. A simple message such 
as the following would work:

RCM communication sometimes fails if the USB timeout is short. Increase 
the timeout value to avoid this issue. The exact cause is not yet diagnosed.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
jimmzhang March 9, 2016, 7:56 p.m. UTC | #6
> -----Original Message-----
> From: Stephen Warren [mailto:swarren@wwwdotorg.org]
> Sent: Wednesday, March 09, 2016 9:41 AM
> To: Jimmy Zhang
> Cc: Allen Martin; Stephen Warren; alban.bedel@avionic-design.de; linux-
> tegra@vger.kernel.org
> Subject: Re: [tegrarcm PATCH v1 4/4] Increate USB timeout value
> 
> On 03/08/2016 06:41 PM, Jimmy Zhang wrote:
> >
> >
> >> -----Original Message-----
> >> From: Stephen Warren [mailto:swarren@wwwdotorg.org]
> >> Sent: Monday, March 07, 2016 11:40 AM
> >> To: Jimmy Zhang
> >> Cc: Allen Martin; Stephen Warren; alban.bedel@avionic-design.de;
> >> linux- tegra@vger.kernel.org
> >> Subject: Re: [tegrarcm PATCH v1 4/4] Increate USB timeout value
> >>
> >> On 03/04/2016 04:44 PM, Jimmy Zhang wrote:
> >>> It has been observed that some times USB time out could occure when
> >>> a long (3ft) usb cable is connected to recovery mode usb port
> >>
> >> This explanation makes no sense. 3ft is practically the shortest USB
> >> cable anyone would use. Equally, assuming a compliant correctly
> >> functioning cable, cable length doesn't affect the time it takes to execute
> transactions.
> >>
> >> Is the issue more that if a low quality cable is used, then there are
> >> lots of transfer errors, and the time taken for retries is large? If
> >> so, there's not much guarantee that a larger timeout would help in
> >> general, since there's no guarantee of any upper bound on the time it
> >> takes for a packet not to get corrupted in this case. Still, I
> >> suppose it's fine to increase the timeout to account for when it
> accidentally works.
> >>
> >> In summary: a commit description that more accurately represents the
> >> problem being solved is required.
> >
> > We have seen this problem since T30/T114. Initially we just retried by
> downloading again. It may work after a few tries. Later, we found a simple
> solution is replacing with a shorter cable. Then one day, for testing purpose,
> when the timeout value was replaced with 0, ie, unlimited timeout, all cables
> work fine. So, it may worth to figure out the root cause.
> 
> There's obviously some variability here, since you mention that the
> communication works sometimes and not others.
> 
> When you tested the varying cable lengths, did you test at least 10x more
> times than the percentage of times that pass or fail with the original cable
> you used? Without testing a large number of times, you don't know if you
> just got lucky the one time you tried a different cable, or whether it really
> made a statistical difference.
> 
> Much more likely is that Tegra itself sometimes takes more time to respond,
> or sometimes receives/sends packets that get corrupted somewhere and
> have to be retried.
> 
> BTW, have you tried using Wireshark and usbmon to capture a trace of the
> failures to try and diagnose the issue?
> 
> Finally, I'm not objecting to this patch if it does work around the problem. All
> I'm objecting to is blaming the issue on cable length when I'm not convinced
> that's been conclusively proven. A simple message such as the following
> would work:
> 
Agree. We actually never pay much attention to the tool issue because the focus is always on the task performed at that moment.

As Allen suggested, I will create a new usb_timeout command line parameter to allow user to set timeout value.
  
> RCM communication sometimes fails if the USB timeout is short. Increase the
> timeout value to avoid this issue. The exact cause is not yet diagnosed.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/src/usb.c b/src/usb.c
index 8a367426b29c..4f7d1cae9d3a 100644
--- a/src/usb.c
+++ b/src/usb.c
@@ -35,7 +35,7 @@ 
 #include "debug.h"
 
 // USB xfer timeout in ms
-#define USB_TIMEOUT 1000
+#define USB_TIMEOUT 5000
 
 #define USB_XFER_MAX 4096