Patchwork [U-Boot] usb_storage: fix ehci driver max transfer size

login
register
mail settings
Submitter Stefan Herbrechtsmeier
Date June 18, 2012, 6:22 p.m.
Message ID <1340043748-9261-1-git-send-email-stefan@herbrechtsmeier.net>
Download mbox | patch
Permalink /patch/165576/
State Superseded
Delegated to: Marek Vasut
Headers show

Comments

Stefan Herbrechtsmeier - June 18, 2012, 6:22 p.m.
The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
usb_storage as it wrongly assumes that every transfer can use 4k
per qt_buffer. This is wrong if the start address of the data
is not 4k aligned and leads to 'EHCI timed out on TD' messages
because of 'out of buffer pointers' in ehci_td_buffer function.

Cc: Marek Vasut <marex@denx.de>
Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
---
 common/usb_storage.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
Marek Vasut - June 18, 2012, 6:57 p.m.
Dear Stefan Herbrechtsmeier,

> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
> usb_storage as it wrongly assumes that every transfer can use 4k
> per qt_buffer. This is wrong if the start address of the data
> is not 4k aligned and leads to 'EHCI timed out on TD' messages
> because of 'out of buffer pointers' in ehci_td_buffer function.

Stefan, thanks a lot for finding this! I'll look into it on 21st I hope.

> 
> Cc: Marek Vasut <marex@denx.de>
> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
> ---
>  common/usb_storage.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/common/usb_storage.c b/common/usb_storage.c
> index faad237..612a553 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -1416,10 +1416,10 @@ int usb_stor_get_info(struct usb_device *dev,
> struct us_data *ss, USB_STOR_PRINTF("partype: %d\n", dev_desc->part_type);
> 
>  	/*
> -	 * The U-Boot EHCI driver cannot handle more than 4096 * 5 bytes in a
> -	 * transfer without running itself out of qt_buffers.
> +	 * The U-Boot EHCI driver cannot handle more than 4096 * 4 bytes in a
> +	 * unaligned transfer without running itself out of qt_buffers.
>  	 */
> -	ss->max_xfer_blk = (4096 * 5) / dev_desc->blksz;
> +	ss->max_xfer_blk = (4096 * 4) / dev_desc->blksz;
> 
>  	init_part(dev_desc);

Best regards,
Marek Vasut
Marek Vasut - July 3, 2012, 6:10 p.m.
Dear Stefan Herbrechtsmeier,

> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
> usb_storage as it wrongly assumes that every transfer can use 4k
> per qt_buffer. This is wrong if the start address of the data
> is not 4k aligned and leads to 'EHCI timed out on TD' messages
> because of 'out of buffer pointers' in ehci_td_buffer function.
> 
> Cc: Marek Vasut <marex@denx.de>
> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>

Ok, first I have to admit I broke my promise to look into this ASAP, sorry about 
it :-(

Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1 ?

> ---
>  common/usb_storage.c |    6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/common/usb_storage.c b/common/usb_storage.c
> index faad237..612a553 100644
> --- a/common/usb_storage.c
> +++ b/common/usb_storage.c
> @@ -1416,10 +1416,10 @@ int usb_stor_get_info(struct usb_device *dev,
> struct us_data *ss, USB_STOR_PRINTF("partype: %d\n", dev_desc->part_type);
> 
>  	/*
> -	 * The U-Boot EHCI driver cannot handle more than 4096 * 5 bytes in a
> -	 * transfer without running itself out of qt_buffers.
> +	 * The U-Boot EHCI driver cannot handle more than 4096 * 4 bytes in a
> +	 * unaligned transfer without running itself out of qt_buffers.
>  	 */
> -	ss->max_xfer_blk = (4096 * 5) / dev_desc->blksz;
> +	ss->max_xfer_blk = (4096 * 4) / dev_desc->blksz;
> 
>  	init_part(dev_desc);

Best regards,
Marek Vasut
Stefan Herbrechtsmeier - July 4, 2012, 6:32 a.m.
Am 03.07.2012 20:10, schrieb Marek Vasut:
>> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
>> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
>> usb_storage as it wrongly assumes that every transfer can use 4k
>> per qt_buffer. This is wrong if the start address of the data
>> is not 4k aligned and leads to 'EHCI timed out on TD' messages
>> because of 'out of buffer pointers' in ehci_td_buffer function.
>>
>> Cc: Marek Vasut <marex@denx.de>
>> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
> Ok, first I have to admit I broke my promise to look into this ASAP, sorry about
> it :-(
No problem, as long as we get it into the next release. ;-)
>
> Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1 ?
No, because the first blk need to be aligned with the 4096. In worst 
case the blk is at the end of the 4096 range. If we assume that the blk 
is aligned to blk_sz we can change it to ((4096 * 4) / dev_desc->blk_sz) 
+ 1. I skip the last blk (+ 1) because with 4096 aligned first blk we 
unaligned the next transfer and add extra short packages to each ehci 
transfer.

If we want to maximise the usage we need to calculate the max_xfer_blk 
depending on the start address of the first blk.

>> ---
>>   common/usb_storage.c |    6 +++---
>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/common/usb_storage.c b/common/usb_storage.c
>> index faad237..612a553 100644
>> --- a/common/usb_storage.c
>> +++ b/common/usb_storage.c
>> @@ -1416,10 +1416,10 @@ int usb_stor_get_info(struct usb_device *dev,
>> struct us_data *ss, USB_STOR_PRINTF("partype: %d\n", dev_desc->part_type);
>>
>>   	/*
>> -	 * The U-Boot EHCI driver cannot handle more than 4096 * 5 bytes in a
>> -	 * transfer without running itself out of qt_buffers.
>> +	 * The U-Boot EHCI driver cannot handle more than 4096 * 4 bytes in a
>> +	 * unaligned transfer without running itself out of qt_buffers.
>>   	 */
>> -	ss->max_xfer_blk = (4096 * 5) / dev_desc->blksz;
>> +	ss->max_xfer_blk = (4096 * 4) / dev_desc->blksz;
>>
>>   	init_part(dev_desc);
>
Stefan Herbrechtsmeier - July 4, 2012, 7:59 a.m.
Am 04.07.2012 08:57, schrieb Schneider, Kolja:
>> Am 03.07.2012 20:10, schrieb Marek Vasut:
>>>> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
>>>> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
>>>> usb_storage as it wrongly assumes that every transfer can use 4k
>>>> per qt_buffer. This is wrong if the start address of the data
>>>> is not 4k aligned and leads to 'EHCI timed out on TD' messages
>>>> because of 'out of buffer pointers' in ehci_td_buffer function.
>>>>
>>>> Cc: Marek Vasut <marex@denx.de>
>>>> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
>>> Ok, first I have to admit I broke my promise to look into this ASAP, sorry
>> about
>>> it :-(
>> No problem, as long as we get it into the next release. ;-)
>>> Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1 ?
>> No, because the first blk need to be aligned with the 4096. In worst
>> case the blk is at the end of the 4096 range. If we assume that the blk
>> is aligned to blk_sz we can change it to ((4096 * 4) / dev_desc->blk_sz)
>> + 1. I skip the last blk (+ 1) because with 4096 aligned first blk we
>> unaligned the next transfer and add extra short packages to each ehci
>> transfer.
>>
>> If we want to maximise the usage we need to calculate the max_xfer_blk
>> depending on the start address of the first blk.
>>
> I admit to not totally getting it. However, there are two things that come to my mind:
>   - 	Doesn't the EHCI Specification mention exactly five buffers that can/should/must
>     	be used?
Yes, you can use up to five 4096 byte buffers.
>   - 	I think I once stumbled across some comment that said as much as the blocks
> 	always having to be aligned anyway?
The buffers must be aligned to a 4096 byte page. This means that you 
have to use the first and last buffer to align your data to the next or 
previous 4096 byte page boundary.
Marek Vasut - July 7, 2012, 9:58 p.m.
Dear Stefan Herbrechtsmeier,

> Am 04.07.2012 08:57, schrieb Schneider, Kolja:
> >> Am 03.07.2012 20:10, schrieb Marek Vasut:
> >>>> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
> >>>> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
> >>>> usb_storage as it wrongly assumes that every transfer can use 4k
> >>>> per qt_buffer. This is wrong if the start address of the data
> >>>> is not 4k aligned and leads to 'EHCI timed out on TD' messages
> >>>> because of 'out of buffer pointers' in ehci_td_buffer function.
> >>>> 
> >>>> Cc: Marek Vasut <marex@denx.de>
> >>>> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
> >>> 
> >>> Ok, first I have to admit I broke my promise to look into this ASAP,
> >>> sorry
> >> 
> >> about
> >> 
> >>> it :-(
> >> 
> >> No problem, as long as we get it into the next release. ;-)
> >> 
> >>> Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1 ?
> >> 
> >> No, because the first blk need to be aligned with the 4096. In worst
> >> case the blk is at the end of the 4096 range. If we assume that the blk
> >> is aligned to blk_sz we can change it to ((4096 * 4) / dev_desc->blk_sz)
> >> + 1. I skip the last blk (+ 1) because with 4096 aligned first blk we
> >> unaligned the next transfer and add extra short packages to each ehci
> >> transfer.
> >> 
> >> If we want to maximise the usage we need to calculate the max_xfer_blk
> >> depending on the start address of the first blk.
> > 
> > I admit to not totally getting it. However, there are two things that come 
to my mind:
> >   - 	Doesn't the EHCI Specification mention exactly five buffers that
> >   can/should/must
> >   
> >     	be used?
> 
> Yes, you can use up to five 4096 byte buffers.
> 
> >   - 	I think I once stumbled across some comment that said as much as 
the
> >   blocks
> > 	
> > 	always having to be aligned anyway?
> 
> The buffers must be aligned to a 4096 byte page. This means that you
> have to use the first and last buffer to align your data to the next or
> previous 4096 byte page boundary.

All right, I managed to replicate the issue. This (or similar) doesn't work for 
you, right:

usb read 0x42000004 0x0 0x400

The proper solution would be to introduce a bounce buffer for such unaligned 
transfers. A proper, generic bounce buffer that can be configured to bounce on 
specified boundaries and warns about performance penalties.

Best regards,
Marek Vasut
Stefan Herbrechtsmeier - July 8, 2012, 11:08 a.m.
Am 07.07.2012 23:58, schrieb Marek Vasut:
>> Am 04.07.2012 08:57, schrieb Schneider, Kolja:
>>>> Am 03.07.2012 20:10, schrieb Marek Vasut:
>>>>>> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
>>>>>> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
>>>>>> usb_storage as it wrongly assumes that every transfer can use 4k
>>>>>> per qt_buffer. This is wrong if the start address of the data
>>>>>> is not 4k aligned and leads to 'EHCI timed out on TD' messages
>>>>>> because of 'out of buffer pointers' in ehci_td_buffer function.
>>>>>>
>>>>>> Cc: Marek Vasut <marex@denx.de>
>>>>>> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
>>>>> Ok, first I have to admit I broke my promise to look into this ASAP,
>>>>> sorry
>>>> about
>>>>
>>>>> it :-(
>>>> No problem, as long as we get it into the next release. ;-)
>>>>
>>>>> Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1 ?
>>>> No, because the first blk need to be aligned with the 4096. In worst
>>>> case the blk is at the end of the 4096 range. If we assume that the blk
>>>> is aligned to blk_sz we can change it to ((4096 * 4) / dev_desc->blk_sz)
>>>> + 1. I skip the last blk (+ 1) because with 4096 aligned first blk we
>>>> unaligned the next transfer and add extra short packages to each ehci
>>>> transfer.
>>>>
>>>> If we want to maximise the usage we need to calculate the max_xfer_blk
>>>> depending on the start address of the first blk.
>>> I admit to not totally getting it. However, there are two things that come
> to my mind:
>>>    - 	Doesn't the EHCI Specification mention exactly five buffers that
>>>    can/should/must
>>>    
>>>      	be used?
>> Yes, you can use up to five 4096 byte buffers.
>>
>>>    - 	I think I once stumbled across some comment that said as much as
> the
>>>    blocks
>>> 	
>>> 	always having to be aligned anyway?
>> The buffers must be aligned to a 4096 byte page. This means that you
>> have to use the first and last buffer to align your data to the next or
>> previous 4096 byte page boundary.
> All right, I managed to replicate the issue. This (or similar) doesn't work for
> you, right:
>
> usb read 0x42000004 0x0 0x400
I observe the bug during
fatload usb 0 0x800000 uImage

...
dev=0ffb0440, pipe=c8008283, buffer=00a90000, length=20480, req=(null)
...
dev=0ffb0440, pipe=c8008283, buffer=00a95000, length=20480, req=(null)
...
dev=0ffb0440, pipe=c8008283, buffer=00a9a000, length=18432, req=(null)
...
dev=0ffb0440, pipe=c8008283, buffer=00a9e800, length=20480, req=(null)
out of buffer pointers (2048 bytes left)
...

It looks like the file is fragmented. During load the start address 
becomes unaligned and thereby the code wrongly tries to transfer more 
blocks than possible.

Before the above mentioned patch the max_xfer_blk was much smaller (20), 
so that the problem never appears.
> The proper solution would be to introduce a bounce buffer for such unaligned
> transfers. A proper, generic bounce buffer that can be configured to bounce on
> specified boundaries and warns about performance penalties.
Why not limit the max_xfer_blk to the maximum save count [(4096 * 4) / 
dev_desc->blksz)] or calculate the max_xfer_blk depending on the start 
address of the transfer:

max_xfer_blk = ((4096 * 5) - (start & ~4096) / dev_desc->blksz

Regards,
     Stefan
Marek Vasut - July 8, 2012, 6:58 p.m.
Dear Stefan Herbrechtsmeier,

> Am 07.07.2012 23:58, schrieb Marek Vasut:
> >> Am 04.07.2012 08:57, schrieb Schneider, Kolja:
> >>>> Am 03.07.2012 20:10, schrieb Marek Vasut:
> >>>>>> The commit 5dd95cf93dfffa1d19a1928990852aac9f55b9d9 'usb_storage:
> >>>>>> Fix EHCI "out of buffer pointers" with CD-ROM' introduce a bug in
> >>>>>> usb_storage as it wrongly assumes that every transfer can use 4k
> >>>>>> per qt_buffer. This is wrong if the start address of the data
> >>>>>> is not 4k aligned and leads to 'EHCI timed out on TD' messages
> >>>>>> because of 'out of buffer pointers' in ehci_td_buffer function.
> >>>>>> 
> >>>>>> Cc: Marek Vasut <marex@denx.de>
> >>>>>> Signed-off-by: Stefan Herbrechtsmeier <stefan@herbrechtsmeier.net>
> >>>>> 
> >>>>> Ok, first I have to admit I broke my promise to look into this ASAP,
> >>>>> sorry
> >>>> 
> >>>> about
> >>>> 
> >>>>> it :-(
> >>>> 
> >>>> No problem, as long as we get it into the next release. ;-)
> >>>> 
> >>>>> Just curious, but shouldn't it be ((4096 * 5) / dev_desc->blk_sz) - 1
> >>>>> ?
> >>>> 
> >>>> No, because the first blk need to be aligned with the 4096. In worst
> >>>> case the blk is at the end of the 4096 range. If we assume that the
> >>>> blk is aligned to blk_sz we can change it to ((4096 * 4) /
> >>>> dev_desc->blk_sz) + 1. I skip the last blk (+ 1) because with 4096
> >>>> aligned first blk we unaligned the next transfer and add extra short
> >>>> packages to each ehci transfer.
> >>>> 
> >>>> If we want to maximise the usage we need to calculate the max_xfer_blk
> >>>> depending on the start address of the first blk.
> >>> 
> >>> I admit to not totally getting it. However, there are two things that
> >>> come
> > 
> > to my mind:
> >>>    - 	Doesn't the EHCI Specification mention exactly five buffers that
> >>>    can/should/must
> >>>    
> >>>      	be used?
> >> 
> >> Yes, you can use up to five 4096 byte buffers.
> >> 
> >>>    - 	I think I once stumbled across some comment that said as much as
> > 
> > the
> > 
> >>>    blocks
> >>> 	
> >>> 	always having to be aligned anyway?
> >> 
> >> The buffers must be aligned to a 4096 byte page. This means that you
> >> have to use the first and last buffer to align your data to the next or
> >> previous 4096 byte page boundary.
> > 
> > All right, I managed to replicate the issue. This (or similar) doesn't
> > work for you, right:
> > 
> > usb read 0x42000004 0x0 0x400
> 
> I observe the bug during
> fatload usb 0 0x800000 uImage
> 
> ...
> dev=0ffb0440, pipe=c8008283, buffer=00a90000, length=20480, req=(null)
> ...
> dev=0ffb0440, pipe=c8008283, buffer=00a95000, length=20480, req=(null)
> ...
> dev=0ffb0440, pipe=c8008283, buffer=00a9a000, length=18432, req=(null)
> ...
> dev=0ffb0440, pipe=c8008283, buffer=00a9e800, length=20480, req=(null)
> out of buffer pointers (2048 bytes left)
> ...
> 
> It looks like the file is fragmented. During load the start address
> becomes unaligned and thereby the code wrongly tries to transfer more
> blocks than possible.
> 
> Before the above mentioned patch the max_xfer_blk was much smaller (20),
> so that the problem never appears.
> 
> > The proper solution would be to introduce a bounce buffer for such
> > unaligned transfers. A proper, generic bounce buffer that can be
> > configured to bounce on specified boundaries and warns about performance
> > penalties.
> 
> Why not limit the max_xfer_blk to the maximum save count [(4096 * 4) /
> dev_desc->blksz)] or calculate the max_xfer_blk depending on the start
> address of the transfer:
> 
> max_xfer_blk = ((4096 * 5) - (start & ~4096) / dev_desc->blksz

This looks like a much better solution ;-) Can you redo the patch for that? 
Also, there's some macro for that rounding in include/common.h

> 
> Regards,
>      Stefan

Best regards,
Marek Vasut

Patch

diff --git a/common/usb_storage.c b/common/usb_storage.c
index faad237..612a553 100644
--- a/common/usb_storage.c
+++ b/common/usb_storage.c
@@ -1416,10 +1416,10 @@  int usb_stor_get_info(struct usb_device *dev, struct us_data *ss,
 	USB_STOR_PRINTF("partype: %d\n", dev_desc->part_type);
 
 	/*
-	 * The U-Boot EHCI driver cannot handle more than 4096 * 5 bytes in a
-	 * transfer without running itself out of qt_buffers.
+	 * The U-Boot EHCI driver cannot handle more than 4096 * 4 bytes in a
+	 * unaligned transfer without running itself out of qt_buffers.
 	 */
-	ss->max_xfer_blk = (4096 * 5) / dev_desc->blksz;
+	ss->max_xfer_blk = (4096 * 4) / dev_desc->blksz;
 
 	init_part(dev_desc);