Message ID | 20170421105107.GA7259@amd |
---|---|
State | Accepted |
Delegated to: | Boris Brezillon |
Headers | show |
On 21/04/2017 12:51, Pavel Machek wrote: > If we see ~0UL in flash, there's no need for hweight, and no need to > check number of bitflips. So this should be net win. > > Signed-off-by: Pavel Machek <pavel@denx.de> > > diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c > index b0524f8..96c27ec 100644 > --- a/drivers/mtd/nand/nand_base.c > +++ b/drivers/mtd/nand/nand_base.c > @@ -1357,7 +1357,10 @@ static int nand_check_erased_buf(void *buf, int len, int bitflips_threshold) > > for (; len >= sizeof(long); > len -= sizeof(long), bitmap += sizeof(long)) { > - weight = hweight_long(*((unsigned long *)bitmap)); I hadn't noticed this earlier. There is, obviously, an implicit requirement that 'buf' must be 4-byte aligned on 32-bit platforms, and 8-byte aligned on 64-bit platforms. This is not true for my platform, as the ecc pointer is chip->oob_poi + 10 I suppose it's not a problem if the platform can handle unaligned loads, otherwise it spells trouble. > + unsigned long d = *((unsigned long *)bitmap); > + if (d == ~0UL) > + continue; > + weight = hweight_long(d); > bitflips += BITS_PER_LONG - weight; > if (unlikely(bitflips > bitflips_threshold)) > return -EBADMSG; > The optimization makes sense in itself, but given that it's on an error path (?) I'm not sure it will bring any tangible benefits? Regards.
On 17/05/2017 13:27, Mason wrote: > On 21/04/2017 12:51, Pavel Machek wrote: > >> If we see ~0UL in flash, there's no need for hweight, and no need to >> check number of bitflips. So this should be net win. >> >> Signed-off-by: Pavel Machek <pavel@denx.de> >> >> diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c >> index b0524f8..96c27ec 100644 >> --- a/drivers/mtd/nand/nand_base.c >> +++ b/drivers/mtd/nand/nand_base.c >> @@ -1357,7 +1357,10 @@ static int nand_check_erased_buf(void *buf, int len, int bitflips_threshold) >> >> for (; len >= sizeof(long); >> len -= sizeof(long), bitmap += sizeof(long)) { >> - weight = hweight_long(*((unsigned long *)bitmap)); > > I hadn't noticed this earlier. There is, obviously, an implicit > requirement that 'buf' must be 4-byte aligned on 32-bit platforms, > and 8-byte aligned on 64-bit platforms. > > This is not true for my platform, as the ecc pointer is > chip->oob_poi + 10 Doh! As Boris points out, the prologue/epilogue handle all alignment & size issues. Regards.
> This is not true for my platform, as the ecc pointer is > chip->oob_poi + 10 > > I suppose it's not a problem if the platform can handle > unaligned loads, otherwise it spells trouble. > > > > + unsigned long d = *((unsigned long *)bitmap); > > + if (d == ~0UL) > > + continue; > > + weight = hweight_long(d); > > bitflips += BITS_PER_LONG - weight; > > if (unlikely(bitflips > bitflips_threshold)) > > return -EBADMSG; > > > > The optimization makes sense in itself, but given that it's on an > error path (?) I'm not sure it will bring any tangible benefits? Well, if it is critical enough to do magic it does, it is also important enough not to do expensive bit-counting needlessly...
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c index b0524f8..96c27ec 100644 --- a/drivers/mtd/nand/nand_base.c +++ b/drivers/mtd/nand/nand_base.c @@ -1357,7 +1357,10 @@ static int nand_check_erased_buf(void *buf, int len, int bitflips_threshold) for (; len >= sizeof(long); len -= sizeof(long), bitmap += sizeof(long)) { - weight = hweight_long(*((unsigned long *)bitmap)); + unsigned long d = *((unsigned long *)bitmap); + if (d == ~0UL) + continue; + weight = hweight_long(d); bitflips += BITS_PER_LONG - weight; if (unlikely(bitflips > bitflips_threshold)) return -EBADMSG;
If we see ~0UL in flash, there's no need for hweight, and no need to check number of bitflips. So this should be net win. Signed-off-by: Pavel Machek <pavel@denx.de>