diff mbox

[V2] mtd: bcm47xxsflash: support reading flash out of mapping window

Message ID 20170117105110.26328-1-zajec5@gmail.com
State Superseded
Headers show

Commit Message

Rafał Miłecki Jan. 17, 2017, 10:51 a.m. UTC
From: Rafał Miłecki <rafal@milecki.pl>

For reading flash content we use MMIO but it's possible to read only
first 16 MiB this way. It's simply an arch design/limitation.
To support flash sizes bigger than 16 MiB implement indirect access
using ChipCommon registers.
This has been tested using MX25L25635F.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
---
V2: Simplify line writing to buf
    Add some trivial comment for OPCODE_ST_READ4B
    Both requested by Marek
---
 drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
 drivers/mtd/devices/bcm47xxsflash.h |  3 +++
 2 files changed, 14 insertions(+), 1 deletion(-)

Comments

Marek Vasut Jan. 17, 2017, 11 a.m. UTC | #1
On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
> From: Rafał Miłecki <rafal@milecki.pl>
> 
> For reading flash content we use MMIO but it's possible to read only
> first 16 MiB this way. It's simply an arch design/limitation.
> To support flash sizes bigger than 16 MiB implement indirect access
> using ChipCommon registers.
> This has been tested using MX25L25635F.
> 
> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
> ---
> V2: Simplify line writing to buf
>     Add some trivial comment for OPCODE_ST_READ4B
>     Both requested by Marek
> ---
>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
> index 4decd8c..8d4c1db 100644
> --- a/drivers/mtd/devices/bcm47xxsflash.c
> +++ b/drivers/mtd/devices/bcm47xxsflash.c
> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>  	if ((from + len) > mtd->size)
>  		return -EINVAL;
>  
> -	memcpy_fromio(buf, b47s->window + from, len);
> +	if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
> +		memcpy_fromio(buf, b47s->window + from, len);

One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
Shouldn't that use partly the windowed mode and partly the other mode?

> +	} else {
> +		size_t i;
> +
> +		for (i = 0; i < len; i++) {
> +			b47s->cc_write(b47s, BCMA_CC_FLASHADDR, from++);
> +			bcm47xxsflash_cmd(b47s, OPCODE_ST_READ4B);
> +			*buf++ = b47s->cc_read(b47s, BCMA_CC_FLASHDATA);
> +		}
> +	}
>  	*retlen = len;
>  
>  	return len;
> diff --git a/drivers/mtd/devices/bcm47xxsflash.h b/drivers/mtd/devices/bcm47xxsflash.h
> index 1564b62..6b478af 100644
> --- a/drivers/mtd/devices/bcm47xxsflash.h
> +++ b/drivers/mtd/devices/bcm47xxsflash.h
> @@ -3,6 +3,8 @@
>  
>  #include <linux/mtd/mtd.h>
>  
> +#define BCM47XXSFLASH_WINDOW_SIZE		SZ_16M
> +
>  /* Used for ST flashes only. */
>  #define OPCODE_ST_WREN		0x0006		/* Write Enable */
>  #define OPCODE_ST_WRDIS		0x0004		/* Write Disable */
> @@ -16,6 +18,7 @@
>  #define OPCODE_ST_RES		0x03ab		/* Read Electronic Signature */
>  #define OPCODE_ST_CSA		0x1000		/* Keep chip select asserted */
>  #define OPCODE_ST_SSE		0x0220		/* Sub-sector Erase */
> +#define OPCODE_ST_READ4B	0x6313		/* Read Data Bytes in 4Byte address */
>  
>  /* Used for Atmel flashes only. */
>  #define OPCODE_AT_READ				0x07e8
>
Rafał Miłecki Jan. 17, 2017, 11:08 a.m. UTC | #2
On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>> From: Rafał Miłecki <rafal@milecki.pl>
>>
>> For reading flash content we use MMIO but it's possible to read only
>> first 16 MiB this way. It's simply an arch design/limitation.
>> To support flash sizes bigger than 16 MiB implement indirect access
>> using ChipCommon registers.
>> This has been tested using MX25L25635F.
>>
>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>> ---
>> V2: Simplify line writing to buf
>>     Add some trivial comment for OPCODE_ST_READ4B
>>     Both requested by Marek
>> ---
>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>> index 4decd8c..8d4c1db 100644
>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>       if ((from + len) > mtd->size)
>>               return -EINVAL;
>>
>> -     memcpy_fromio(buf, b47s->window + from, len);
>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>> +             memcpy_fromio(buf, b47s->window + from, len);
>
> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
> Shouldn't that use partly the windowed mode and partly the other mode?

You'll lost 10ns*... or not as splitting it into 2 code paths could
take longer, who knows. Most access are block aligned anyway. I don't
think such corner case with doubtful gain is much worth considering &
optimizing.

* Statistic from my mind
Marek Vasut Jan. 17, 2017, 11:49 a.m. UTC | #3
On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>
>>> For reading flash content we use MMIO but it's possible to read only
>>> first 16 MiB this way. It's simply an arch design/limitation.
>>> To support flash sizes bigger than 16 MiB implement indirect access
>>> using ChipCommon registers.
>>> This has been tested using MX25L25635F.
>>>
>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>> ---
>>> V2: Simplify line writing to buf
>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>     Both requested by Marek
>>> ---
>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>> index 4decd8c..8d4c1db 100644
>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>       if ((from + len) > mtd->size)
>>>               return -EINVAL;
>>>
>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>
>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>> Shouldn't that use partly the windowed mode and partly the other mode?
> 
> You'll lost 10ns*... or not as splitting it into 2 code paths could
> take longer, who knows. Most access are block aligned anyway. I don't
> think such corner case with doubtful gain is much worth considering &
> optimizing.

So you can also read the first 16 MiB using the new method , right ?

> * Statistic from my mind
>
Rafał Miłecki Jan. 17, 2017, 12:04 p.m. UTC | #4
On 17 January 2017 at 12:49, Marek Vasut <marek.vasut@gmail.com> wrote:
> On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
>> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>>
>>>> For reading flash content we use MMIO but it's possible to read only
>>>> first 16 MiB this way. It's simply an arch design/limitation.
>>>> To support flash sizes bigger than 16 MiB implement indirect access
>>>> using ChipCommon registers.
>>>> This has been tested using MX25L25635F.
>>>>
>>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>>> ---
>>>> V2: Simplify line writing to buf
>>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>>     Both requested by Marek
>>>> ---
>>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>>> index 4decd8c..8d4c1db 100644
>>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>>       if ((from + len) > mtd->size)
>>>>               return -EINVAL;
>>>>
>>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>>
>>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>>> Shouldn't that use partly the windowed mode and partly the other mode?
>>
>> You'll lost 10ns*... or not as splitting it into 2 code paths could
>> take longer, who knows. Most access are block aligned anyway. I don't
>> think such corner case with doubtful gain is much worth considering &
>> optimizing.
>
> So you can also read the first 16 MiB using the new method , right ?

I could, but this could be noticeable in performance. Reading 16 MiB
using slower method is different from reading what... a few KiB? Are
you actually sure mtd does unaligned reads at all?

Please let's argue about real problems and not few % performance in
some corner case on some obsolete hardware with poor design.
Rafał Miłecki Jan. 17, 2017, 12:37 p.m. UTC | #5
On 17 January 2017 at 13:04, Rafał Miłecki <zajec5@gmail.com> wrote:
> On 17 January 2017 at 12:49, Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
>>> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>>>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>>>
>>>>> For reading flash content we use MMIO but it's possible to read only
>>>>> first 16 MiB this way. It's simply an arch design/limitation.
>>>>> To support flash sizes bigger than 16 MiB implement indirect access
>>>>> using ChipCommon registers.
>>>>> This has been tested using MX25L25635F.
>>>>>
>>>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>>>> ---
>>>>> V2: Simplify line writing to buf
>>>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>>>     Both requested by Marek
>>>>> ---
>>>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>>>> index 4decd8c..8d4c1db 100644
>>>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>>>       if ((from + len) > mtd->size)
>>>>>               return -EINVAL;
>>>>>
>>>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>>>
>>>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>>>> Shouldn't that use partly the windowed mode and partly the other mode?
>>>
>>> You'll lost 10ns*... or not as splitting it into 2 code paths could
>>> take longer, who knows. Most access are block aligned anyway. I don't
>>> think such corner case with doubtful gain is much worth considering &
>>> optimizing.
>>
>> So you can also read the first 16 MiB using the new method , right ?
>
> I could, but this could be noticeable in performance. Reading 16 MiB
> using slower method is different from reading what... a few KiB? Are
> you actually sure mtd does unaligned reads at all?

I did some quick test:
if ((from & (b47s->blocksize - 1)) + len > b47s->blocksize)
        pr_warn("[%s] Block unaligned read from 0x%llx len:0x%zx\n",
                __func__, from, len);

And it seems unaligned reads can happen:
[  147.338850] [bcm47xxsflash_read] Block unaligned read from 0x4fe1c len:0x200
[  147.663053] [bcm47xxsflash_read] Block unaligned read from 0x5fe1c len:0x200
[  147.983868] [bcm47xxsflash_read] Block unaligned read from 0x6fe1c len:0x200
[  148.304766] [bcm47xxsflash_read] Block unaligned read from 0x7fe1c len:0x200
[  148.625637] [bcm47xxsflash_read] Block unaligned read from 0x8fe1c len:0x200
[  148.955133] [bcm47xxsflash_read] Block unaligned read from 0x9fe1c len:0x200
[  149.275948] [bcm47xxsflash_read] Block unaligned read from 0xafe1c len:0x200
[  149.596790] [bcm47xxsflash_read] Block unaligned read from 0xbfe1c len:0x200
[  149.917604] [bcm47xxsflash_read] Block unaligned read from 0xcfe1c len:0x200
[  150.248641] [bcm47xxsflash_read] Block unaligned read from 0xdfe1c len:0x200
[  150.569484] [bcm47xxsflash_read] Block unaligned read from 0xefe1c len:0x200
[  150.890298] [bcm47xxsflash_read] Block unaligned read from 0xffe1c len:0x200
[  151.211140] [bcm47xxsflash_read] Block unaligned read from 0x10fe1c len:0x200
[  151.541393] [bcm47xxsflash_read] Block unaligned read from 0x11fe1c len:0x200
[  151.862292] [bcm47xxsflash_read] Block unaligned read from 0x12fe1c len:0x200
[  152.183246] [bcm47xxsflash_read] Block unaligned read from 0x13fe1c len:0x200
[  152.504200] [bcm47xxsflash_read] Block unaligned read from 0x14fe1c len:0x200
[  152.834957] [bcm47xxsflash_read] Block unaligned read from 0x15fe1c len:0x200
[  153.155856] [bcm47xxsflash_read] Block unaligned read from 0x16fe1c len:0x200
[  153.476782] [bcm47xxsflash_read] Block unaligned read from 0x17fe1c len:0x200
[  153.797681] [bcm47xxsflash_read] Block unaligned read from 0x18fe1c len:0x200
[  154.126925] [bcm47xxsflash_read] Block unaligned read from 0x19fe1c len:0x200
[  154.447823] [bcm47xxsflash_read] Block unaligned read from 0x1afe1c len:0x200

I got these reads of only 0x200 so that doesn't worry me much. Should
I ever expect bigger reads in my driver callback?
Marek Vasut Jan. 19, 2017, 9:30 p.m. UTC | #6
On 01/17/2017 01:04 PM, Rafał Miłecki wrote:
> On 17 January 2017 at 12:49, Marek Vasut <marek.vasut@gmail.com> wrote:
>> On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
>>> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>>>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>>>
>>>>> For reading flash content we use MMIO but it's possible to read only
>>>>> first 16 MiB this way. It's simply an arch design/limitation.
>>>>> To support flash sizes bigger than 16 MiB implement indirect access
>>>>> using ChipCommon registers.
>>>>> This has been tested using MX25L25635F.
>>>>>
>>>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>>>> ---
>>>>> V2: Simplify line writing to buf
>>>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>>>     Both requested by Marek
>>>>> ---
>>>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>>>> index 4decd8c..8d4c1db 100644
>>>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>>>       if ((from + len) > mtd->size)
>>>>>               return -EINVAL;
>>>>>
>>>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>>>
>>>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>>>> Shouldn't that use partly the windowed mode and partly the other mode?
>>>
>>> You'll lost 10ns*... or not as splitting it into 2 code paths could
>>> take longer, who knows. Most access are block aligned anyway. I don't
>>> think such corner case with doubtful gain is much worth considering &
>>> optimizing.
>>
>> So you can also read the first 16 MiB using the new method , right ?
> 
> I could, but this could be noticeable in performance. Reading 16 MiB
> using slower method is different from reading what... a few KiB? Are
> you actually sure mtd does unaligned reads at all?

No, but I'm quite sure the code above can be pushed to misbehave and I'm
trying to confirm/refute that.

> Please let's argue about real problems and not few % performance in
> some corner case on some obsolete hardware with poor design.

OK, real issue:
$ dd if=/dev/mtdX of=/dev/null bs=32M

How will this driver handle it ?
Marek Vasut Jan. 19, 2017, 9:31 p.m. UTC | #7
On 01/17/2017 01:37 PM, Rafał Miłecki wrote:
> On 17 January 2017 at 13:04, Rafał Miłecki <zajec5@gmail.com> wrote:
>> On 17 January 2017 at 12:49, Marek Vasut <marek.vasut@gmail.com> wrote:
>>> On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
>>>> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>>>>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>>>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>>>>
>>>>>> For reading flash content we use MMIO but it's possible to read only
>>>>>> first 16 MiB this way. It's simply an arch design/limitation.
>>>>>> To support flash sizes bigger than 16 MiB implement indirect access
>>>>>> using ChipCommon registers.
>>>>>> This has been tested using MX25L25635F.
>>>>>>
>>>>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>>>>> ---
>>>>>> V2: Simplify line writing to buf
>>>>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>>>>     Both requested by Marek
>>>>>> ---
>>>>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> index 4decd8c..8d4c1db 100644
>>>>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>>>>       if ((from + len) > mtd->size)
>>>>>>               return -EINVAL;
>>>>>>
>>>>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>>>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>>>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>>>>
>>>>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>>>>> Shouldn't that use partly the windowed mode and partly the other mode?
>>>>
>>>> You'll lost 10ns*... or not as splitting it into 2 code paths could
>>>> take longer, who knows. Most access are block aligned anyway. I don't
>>>> think such corner case with doubtful gain is much worth considering &
>>>> optimizing.
>>>
>>> So you can also read the first 16 MiB using the new method , right ?
>>
>> I could, but this could be noticeable in performance. Reading 16 MiB
>> using slower method is different from reading what... a few KiB? Are
>> you actually sure mtd does unaligned reads at all?
> 
> I did some quick test:
> if ((from & (b47s->blocksize - 1)) + len > b47s->blocksize)
>         pr_warn("[%s] Block unaligned read from 0x%llx len:0x%zx\n",
>                 __func__, from, len);
> 
> And it seems unaligned reads can happen:
> [  147.338850] [bcm47xxsflash_read] Block unaligned read from 0x4fe1c len:0x200
> [  147.663053] [bcm47xxsflash_read] Block unaligned read from 0x5fe1c len:0x200
> [  147.983868] [bcm47xxsflash_read] Block unaligned read from 0x6fe1c len:0x200
> [  148.304766] [bcm47xxsflash_read] Block unaligned read from 0x7fe1c len:0x200
> [  148.625637] [bcm47xxsflash_read] Block unaligned read from 0x8fe1c len:0x200
> [  148.955133] [bcm47xxsflash_read] Block unaligned read from 0x9fe1c len:0x200
> [  149.275948] [bcm47xxsflash_read] Block unaligned read from 0xafe1c len:0x200
> [  149.596790] [bcm47xxsflash_read] Block unaligned read from 0xbfe1c len:0x200
> [  149.917604] [bcm47xxsflash_read] Block unaligned read from 0xcfe1c len:0x200
> [  150.248641] [bcm47xxsflash_read] Block unaligned read from 0xdfe1c len:0x200
> [  150.569484] [bcm47xxsflash_read] Block unaligned read from 0xefe1c len:0x200
> [  150.890298] [bcm47xxsflash_read] Block unaligned read from 0xffe1c len:0x200
> [  151.211140] [bcm47xxsflash_read] Block unaligned read from 0x10fe1c len:0x200
> [  151.541393] [bcm47xxsflash_read] Block unaligned read from 0x11fe1c len:0x200
> [  151.862292] [bcm47xxsflash_read] Block unaligned read from 0x12fe1c len:0x200
> [  152.183246] [bcm47xxsflash_read] Block unaligned read from 0x13fe1c len:0x200
> [  152.504200] [bcm47xxsflash_read] Block unaligned read from 0x14fe1c len:0x200
> [  152.834957] [bcm47xxsflash_read] Block unaligned read from 0x15fe1c len:0x200
> [  153.155856] [bcm47xxsflash_read] Block unaligned read from 0x16fe1c len:0x200
> [  153.476782] [bcm47xxsflash_read] Block unaligned read from 0x17fe1c len:0x200
> [  153.797681] [bcm47xxsflash_read] Block unaligned read from 0x18fe1c len:0x200
> [  154.126925] [bcm47xxsflash_read] Block unaligned read from 0x19fe1c len:0x200
> [  154.447823] [bcm47xxsflash_read] Block unaligned read from 0x1afe1c len:0x200
> 
> I got these reads of only 0x200 so that doesn't worry me much. Should
> I ever expect bigger reads in my driver callback?

Try using /dev/mtdX for read directly (do not use for write), see my
previous email.
Rafał Miłecki Jan. 23, 2017, 8:14 a.m. UTC | #8
On 19 January 2017 at 22:30, Marek Vasut <marek.vasut@gmail.com> wrote:
> On 01/17/2017 01:04 PM, Rafał Miłecki wrote:
>> On 17 January 2017 at 12:49, Marek Vasut <marek.vasut@gmail.com> wrote:
>>> On 01/17/2017 12:08 PM, Rafał Miłecki wrote:
>>>> On 17 January 2017 at 12:00, Marek Vasut <marek.vasut@gmail.com> wrote:
>>>>> On 01/17/2017 11:51 AM, Rafał Miłecki wrote:
>>>>>> From: Rafał Miłecki <rafal@milecki.pl>
>>>>>>
>>>>>> For reading flash content we use MMIO but it's possible to read only
>>>>>> first 16 MiB this way. It's simply an arch design/limitation.
>>>>>> To support flash sizes bigger than 16 MiB implement indirect access
>>>>>> using ChipCommon registers.
>>>>>> This has been tested using MX25L25635F.
>>>>>>
>>>>>> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>>>>>> ---
>>>>>> V2: Simplify line writing to buf
>>>>>>     Add some trivial comment for OPCODE_ST_READ4B
>>>>>>     Both requested by Marek
>>>>>> ---
>>>>>>  drivers/mtd/devices/bcm47xxsflash.c | 12 +++++++++++-
>>>>>>  drivers/mtd/devices/bcm47xxsflash.h |  3 +++
>>>>>>  2 files changed, 14 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> index 4decd8c..8d4c1db 100644
>>>>>> --- a/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> +++ b/drivers/mtd/devices/bcm47xxsflash.c
>>>>>> @@ -110,7 +110,17 @@ static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
>>>>>>       if ((from + len) > mtd->size)
>>>>>>               return -EINVAL;
>>>>>>
>>>>>> -     memcpy_fromio(buf, b47s->window + from, len);
>>>>>> +     if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
>>>>>> +             memcpy_fromio(buf, b47s->window + from, len);
>>>>>
>>>>> One last nit, what if the read starts < 16 MiB and ends > 16 MiB ?
>>>>> Shouldn't that use partly the windowed mode and partly the other mode?
>>>>
>>>> You'll lost 10ns*... or not as splitting it into 2 code paths could
>>>> take longer, who knows. Most access are block aligned anyway. I don't
>>>> think such corner case with doubtful gain is much worth considering &
>>>> optimizing.
>>>
>>> So you can also read the first 16 MiB using the new method , right ?
>>
>> I could, but this could be noticeable in performance. Reading 16 MiB
>> using slower method is different from reading what... a few KiB? Are
>> you actually sure mtd does unaligned reads at all?
>
> No, but I'm quite sure the code above can be pushed to misbehave and I'm
> trying to confirm/refute that.
>
>> Please let's argue about real problems and not few % performance in
>> some corner case on some obsolete hardware with poor design.
>
> OK, real issue:
> $ dd if=/dev/mtdX of=/dev/null bs=32M
>
> How will this driver handle it ?

It results in reading using 0x400000 chunks. Now this is a good
argument to optimize bcm47xxsflash_read for reads crossing flash
window boundary.
[  401.927715] [bcm47xxsflash_read] len:0x400000
[  403.476166] [bcm47xxsflash_read] len:0x400000
(...)
diff mbox

Patch

diff --git a/drivers/mtd/devices/bcm47xxsflash.c b/drivers/mtd/devices/bcm47xxsflash.c
index 4decd8c..8d4c1db 100644
--- a/drivers/mtd/devices/bcm47xxsflash.c
+++ b/drivers/mtd/devices/bcm47xxsflash.c
@@ -110,7 +110,17 @@  static int bcm47xxsflash_read(struct mtd_info *mtd, loff_t from, size_t len,
 	if ((from + len) > mtd->size)
 		return -EINVAL;
 
-	memcpy_fromio(buf, b47s->window + from, len);
+	if (from + len <= BCM47XXSFLASH_WINDOW_SIZE) {
+		memcpy_fromio(buf, b47s->window + from, len);
+	} else {
+		size_t i;
+
+		for (i = 0; i < len; i++) {
+			b47s->cc_write(b47s, BCMA_CC_FLASHADDR, from++);
+			bcm47xxsflash_cmd(b47s, OPCODE_ST_READ4B);
+			*buf++ = b47s->cc_read(b47s, BCMA_CC_FLASHDATA);
+		}
+	}
 	*retlen = len;
 
 	return len;
diff --git a/drivers/mtd/devices/bcm47xxsflash.h b/drivers/mtd/devices/bcm47xxsflash.h
index 1564b62..6b478af 100644
--- a/drivers/mtd/devices/bcm47xxsflash.h
+++ b/drivers/mtd/devices/bcm47xxsflash.h
@@ -3,6 +3,8 @@ 
 
 #include <linux/mtd/mtd.h>
 
+#define BCM47XXSFLASH_WINDOW_SIZE		SZ_16M
+
 /* Used for ST flashes only. */
 #define OPCODE_ST_WREN		0x0006		/* Write Enable */
 #define OPCODE_ST_WRDIS		0x0004		/* Write Disable */
@@ -16,6 +18,7 @@ 
 #define OPCODE_ST_RES		0x03ab		/* Read Electronic Signature */
 #define OPCODE_ST_CSA		0x1000		/* Keep chip select asserted */
 #define OPCODE_ST_SSE		0x0220		/* Sub-sector Erase */
+#define OPCODE_ST_READ4B	0x6313		/* Read Data Bytes in 4Byte address */
 
 /* Used for Atmel flashes only. */
 #define OPCODE_AT_READ				0x07e8