Patchwork [09/11] fsmc/nand:FIX: replace change_bit routine

login
register
mail settings
Submitter Vipin Kumar
Date Oct. 9, 2012, 10:44 a.m.
Message ID <ec169eb041e87e66c06108e85ea4a67e5797cecc.1349778821.git.vipin.kumar@st.com>
Download mbox | patch
Permalink /patch/190274/
State New
Headers show

Comments

Vipin Kumar - Oct. 9, 2012, 10:44 a.m.
change_bit routine accepts only ulong pointers as buffer, so an unaligned char
pointer passed to change_bit may lead to a crash.

Fix this bug by accessing the buffer as char pointer.

Signed-off-by: Vipin Kumar <vipin.kumar@st.com>
---
 drivers/mtd/nand/fsmc_nand.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Linus Walleij - Oct. 10, 2012, 5:22 p.m.
On Tue, Oct 9, 2012 at 12:44 PM, Vipin Kumar <vipin.kumar@st.com> wrote:

> change_bit routine accepts only ulong pointers as buffer, so an unaligned char
> pointer passed to change_bit may lead to a crash.
>
> Fix this bug by accessing the buffer as char pointer.

Why not see if we can fix change_bit() instead?
Since I suspect this is on ARM I bet Russell and Nico
want to hear about this if there is a problem.

Can the ARM change_bit() not be fixed, so that
long arguments are the only option?

>                 if (err_idx[i] < chip->ecc.size * 8) {
> -                       change_bit(err_idx[i], (unsigned long *)dat);
> +                       uint8_t *p = dat + err_idx[i] / 8;
> +                       *p = *p ^ (1 << (err_idx[i] % 8));

I'm one of these people who would write >>3 and
&7 rather than /8 or %8 but I guess we are all
different. Atleast consider it if you stick with this...

Yours,
Linus Walleij
Nicolas Pitre - Oct. 10, 2012, 8:21 p.m.
On Wed, 10 Oct 2012, Linus Walleij wrote:

> On Tue, Oct 9, 2012 at 12:44 PM, Vipin Kumar <vipin.kumar@st.com> wrote:
> 
> > change_bit routine accepts only ulong pointers as buffer, so an unaligned char
> > pointer passed to change_bit may lead to a crash.
> >
> > Fix this bug by accessing the buffer as char pointer.
> 
> Why not see if we can fix change_bit() instead?
> Since I suspect this is on ARM I bet Russell and Nico
> want to hear about this if there is a problem.
> 
> Can the ARM change_bit() not be fixed, so that
> long arguments are the only option?

No.  It is this code which is totally broken.

The change_bit() is defined to operate on long values, in the _native_ 
endian.  Now imagine what this is going to do to your data buffer if 
instead you are running on a big endian device.

And I doubt there is anything requiring atomic bit manipulation here 
either.

> >                 if (err_idx[i] < chip->ecc.size * 8) {
> > -                       change_bit(err_idx[i], (unsigned long *)dat);
> > +                       uint8_t *p = dat + err_idx[i] / 8;
> > +                       *p = *p ^ (1 << (err_idx[i] % 8));
> 
> I'm one of these people who would write >>3 and
> &7 rather than /8 or %8 but I guess we are all
> different. Atleast consider it if you stick with this...

Better yet:

		dat[err_idx[i] / 8] ^= (1 << (err_idx[i] % 8));

The /8 and %8 will be changed into >>3 and &7 by the compiler anyway, 
while the /8 and %8 form is possibly a bit less obscure.

Of course, this needs to be done over _all_ this driver, not only a few 
cases.


Nicolas
Russell King - ARM Linux - Oct. 10, 2012, 8:45 p.m.
On Wed, Oct 10, 2012 at 07:22:04PM +0200, Linus Walleij wrote:
> On Tue, Oct 9, 2012 at 12:44 PM, Vipin Kumar <vipin.kumar@st.com> wrote:
> 
> > change_bit routine accepts only ulong pointers as buffer, so an unaligned char
> > pointer passed to change_bit may lead to a crash.
> >
> > Fix this bug by accessing the buffer as char pointer.
> 
> Why not see if we can fix change_bit() instead?
> Since I suspect this is on ARM I bet Russell and Nico
> want to hear about this if there is a problem.

Not particularly.  There's a reason the argument to change_bit() is typed
'unsigned long' and that's not because it can take a void, char, or a
short.  It's because it _expects_ the buffer to be aligned to an
"unsigned long" quantity.

That's because on many architectures, misaligned loads and stores are
not atomic operations - and in this case, load/store exclusive will
fail when they're misalighed.

So...

-                       change_bit(err_idx[i], (unsigned long *)dat);

is highly invalid code.

> Can the ARM change_bit() not be fixed, so that
> long arguments are the only option?

Spot the intentional cast:

> > -                       change_bit(err_idx[i], (unsigned long *)dat);

which tries to work around this.  Remember my attitude towards casts:
if you're having to use a cast, you are _probably_ doing something
wrong.  In this case, it's hiding a warning which was saying that
the code is doing something wrong, and then the result blows up.
By adding that cast, the wrong wire was cut... you get to keep the
remains. ;)
Vipin Kumar - Oct. 11, 2012, 4:17 a.m.
On 10/11/2012 1:51 AM, Nicolas Pitre wrote:
> On Wed, 10 Oct 2012, Linus Walleij wrote:
>
>> On Tue, Oct 9, 2012 at 12:44 PM, Vipin Kumar<vipin.kumar@st.com>  wrote:
>>
>>> change_bit routine accepts only ulong pointers as buffer, so an unaligned char
>>> pointer passed to change_bit may lead to a crash.
>>>
>>> Fix this bug by accessing the buffer as char pointer.
>>
>> Why not see if we can fix change_bit() instead?
>> Since I suspect this is on ARM I bet Russell and Nico
>> want to hear about this if there is a problem.
>>
>> Can the ARM change_bit() not be fixed, so that
>> long arguments are the only option?
>

Hello Nicolas

> No.  It is this code which is totally broken.
>

Yes, I understand and accept the probelm. That's why the fix

> The change_bit() is defined to operate on long values, in the _native_
> endian.  Now imagine what this is going to do to your data buffer if
> instead you are running on a big endian device.
>
> And I doubt there is anything requiring atomic bit manipulation here
> either.
>

No, there is no requirement for an atomic change_bit

>>>                  if (err_idx[i]<  chip->ecc.size * 8) {
>>> -                       change_bit(err_idx[i], (unsigned long *)dat);
>>> +                       uint8_t *p = dat + err_idx[i] / 8;
>>> +                       *p = *p ^ (1<<  (err_idx[i] % 8));
>>
>> I'm one of these people who would write>>3 and
>> &7 rather than /8 or %8 but I guess we are all
>> different. Atleast consider it if you stick with this...
>
> Better yet:
>
> 		dat[err_idx[i] / 8] ^= (1<<  (err_idx[i] % 8));
>
> The /8 and %8 will be changed into>>3 and&7 by the compiler anyway,
> while the /8 and %8 form is possibly a bit less obscure.
>

Yes, that's exactly why I kept /8 and %8. I would change the code as 
suggested by you

> Of course, this needs to be done over _all_ this driver, not only a few
> cases.
>

There is only one place which needs a change_bit. I would send a v2 with 
the suggested change

Vipin

>
> Nicolas
> .
>
Vipin Kumar - Oct. 11, 2012, 4:20 a.m.
On 10/11/2012 2:15 AM, Russell King - ARM Linux wrote:
> On Wed, Oct 10, 2012 at 07:22:04PM +0200, Linus Walleij wrote:
>> On Tue, Oct 9, 2012 at 12:44 PM, Vipin Kumar<vipin.kumar@st.com>  wrote:
>>
>>> change_bit routine accepts only ulong pointers as buffer, so an unaligned char
>>> pointer passed to change_bit may lead to a crash.
>>>
>>> Fix this bug by accessing the buffer as char pointer.
>>
>> Why not see if we can fix change_bit() instead?
>> Since I suspect this is on ARM I bet Russell and Nico
>> want to hear about this if there is a problem.
>
> Not particularly.  There's a reason the argument to change_bit() is typed
> 'unsigned long' and that's not because it can take a void, char, or a
> short.  It's because it _expects_ the buffer to be aligned to an
> "unsigned long" quantity.
>
> That's because on many architectures, misaligned loads and stores are
> not atomic operations - and in this case, load/store exclusive will
> fail when they're misalighed.
>
> So...
>
> -                       change_bit(err_idx[i], (unsigned long *)dat);
>
> is highly invalid code.
>

Thanks. Got your point

>> Can the ARM change_bit() not be fixed, so that
>> long arguments are the only option?
>
> Spot the intentional cast:
>
>>> -                       change_bit(err_idx[i], (unsigned long *)dat);
>
> which tries to work around this.  Remember my attitude towards casts:
> if you're having to use a cast, you are _probably_ doing something
> wrong.  In this case, it's hiding a warning which was saying that
> the code is doing something wrong, and then the result blows up.
> By adding that cast, the wrong wire was cut... you get to keep the
> remains. ;)
> .

I agree with you. Although I was a bit relaxed this time, I would really 
think before adding a cast explicitly just to avoid a warning

-Vipin

Patch

diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index f48ee60..762cf83 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -859,7 +859,9 @@  static int fsmc_bch8_correct_data(struct mtd_info *mtd, uint8_t *dat,
 		change_bit(1, (unsigned long *)&err_idx[i]);
 
 		if (err_idx[i] < chip->ecc.size * 8) {
-			change_bit(err_idx[i], (unsigned long *)dat);
+			uint8_t *p = dat + err_idx[i] / 8;
+			*p = *p ^ (1 << (err_idx[i] % 8));
+
 			i++;
 		}
 	}