Patchwork [1/3,v3] powerpc/32: Always order writes to halves of 64-bit PTEs

login
register
mail settings
Submitter Paul Mackerras
Date Aug. 17, 2009, 11 p.m.
Message ID <19081.57584.173693.798535@cargo.ozlabs.ibm.com>
Download mbox | patch
Permalink /patch/31540/
State Accepted, archived
Commit 1660e9d3d04b6c636b7171bf6c08ac7b82a7de79
Headers show

Comments

Paul Mackerras - Aug. 17, 2009, 11 p.m.
On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
32-bit halves.  On SMP we write the higher-order half and then the
lower-order half, with a write barrier between the two halves, but on
UP there was no particular ordering of the writes to the two halves.

This extends the ordering that we already do on SMP to the UP case as
well.  The reason is that with the perf_counter subsystem potentially
accessing user memory at interrupt time to get stack traces, we have
to be careful not to create an incorrect but apparently valid PTE even
on UP.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/include/asm/pgtable.h |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)
Kumar Gala - Aug. 18, 2009, midnight
On Aug 17, 2009, at 6:00 PM, Paul Mackerras wrote:

> On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> 32-bit halves.  On SMP we write the higher-order half and then the
> lower-order half, with a write barrier between the two halves, but on
> UP there was no particular ordering of the writes to the two halves.
>
> This extends the ordering that we already do on SMP to the UP case as
> well.  The reason is that with the perf_counter subsystem potentially
> accessing user memory at interrupt time to get stack traces, we have
> to be careful not to create an incorrect but apparently valid PTE even
> on UP.
>
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> arch/powerpc/include/asm/pgtable.h |    6 +++---
> 1 files changed, 3 insertions(+), 3 deletions(-)

Just out of interest did you end up hitting this in testing?

- k
Paul Mackerras - Aug. 18, 2009, 12:14 a.m.
Kumar Gala writes:

> On Aug 17, 2009, at 6:00 PM, Paul Mackerras wrote:
> 
> > On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> > 32-bit halves.  On SMP we write the higher-order half and then the
> > lower-order half, with a write barrier between the two halves, but on
> > UP there was no particular ordering of the writes to the two halves.
> >
> > This extends the ordering that we already do on SMP to the UP case as
> > well.  The reason is that with the perf_counter subsystem potentially
> > accessing user memory at interrupt time to get stack traces, we have
> > to be careful not to create an incorrect but apparently valid PTE even
> > on UP.
> >
> > Signed-off-by: Paul Mackerras <paulus@samba.org>
> > ---
> > arch/powerpc/include/asm/pgtable.h |    6 +++---
> > 1 files changed, 3 insertions(+), 3 deletions(-)
> 
> Just out of interest did you end up hitting this in testing?

No.  Ben told me he wanted this change, so I did what I was told. :)

Paul.
Benjamin Herrenschmidt - Aug. 18, 2009, 4:24 a.m.
On Tue, 2009-08-18 at 09:00 +1000, Paul Mackerras wrote:
> On 32-bit systems with 64-bit PTEs, the PTEs have to be written in two
> 32-bit halves.  On SMP we write the higher-order half and then the
> lower-order half, with a write barrier between the two halves, but on
> UP there was no particular ordering of the writes to the two halves.
> 
> This extends the ordering that we already do on SMP to the UP case as
> well.  The reason is that with the perf_counter subsystem potentially
> accessing user memory at interrupt time to get stack traces, we have
> to be careful not to create an incorrect but apparently valid PTE even
> on UP.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
> ---
>  arch/powerpc/include/asm/pgtable.h |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
> index eb17da7..2a5da06 100644
> --- a/arch/powerpc/include/asm/pgtable.h
> +++ b/arch/powerpc/include/asm/pgtable.h
> @@ -104,8 +104,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
>  	else
>  		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
>  
> -#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT) && defined(CONFIG_SMP)
> -	/* Second case is 32-bit with 64-bit PTE in SMP mode. In this case, we
> +#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
> +	/* Second case is 32-bit with 64-bit PTE.  In this case, we
>  	 * can just store as long as we do the two halves in the right order
>  	 * with a barrier in between. This is possible because we take care,
>  	 * in the hash code, to pre-invalidate if the PTE was already hashed,
> @@ -140,7 +140,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
>  
>  #else
>  	/* Anything else just stores the PTE normally. That covers all 64-bit
> -	 * cases, and 32-bit non-hash with 64-bit PTEs in UP mode
> +	 * cases, and 32-bit non-hash with 32-bit PTEs.
>  	 */
>  	*ptep = pte;
>  #endif
fkan@amcc.com - Aug. 19, 2009, 11:16 p.m.
Hi All:

It seems that the ECC correction is broken on the Linux with the 4xx 
NDFC driver.
It uses the SMC order when reading the ECC code. 2-1-3

static int ndfc_calculate_ecc(struct mtd_info *mtd,
                               const u_char *dat, u_char *ecc_code)
{
         struct ndfc_controller *ndfc = &ndfc_ctrl;
         uint32_t ecc;
         uint8_t *p = (uint8_t *)&ecc;

         wmb();
         ecc = in_be32(ndfc->ndfcbase + NDFC_ECC);
         /* The NDFC uses Smart Media (SMC) bytes order */
         ecc_code[0] = p[2];
         ecc_code[1] = p[1];
         ecc_code[2] = p[3];

         return 0;
}

However, when in the correction function, the byte address order is 
again reverses
causing incorrect byte location.

                  * performace it does not make any difference
                  */
                 if (eccsize_mult == 1)
                         byte_addr = (addressbits[b0] << 4) + 
addressbits[b1];
 >>>> The above really should be byte_addr = (addressbits[b1] << 4) + 
addressbits[b0];

                 else
                         byte_addr = (addressbits[b2 & 0x3] << 8) +
                                     (addressbits[b1] << 4) + 
addressbits[b0];
                 bit_addr = addressbits[b2 >> 2];
                 /* flip the bit */
                 buf[byte_addr] ^= (1 << bit_addr);
                 printk(KERN_INFO "Corrected b[0] 0x%x b[1]0x%x\n", b0, b1);
                 printk(KERN_INFO "cal ecc b[0] 0x%x b[1]0x%x\n", 
calc_ecc[0] , calc_ecc[1]);
                 printk(KERN_INFO "read ecc b[0] 0x%x b[1]0x%x\n", 
read_ecc[0] , read_ecc[1]);
                 return 1;

I see other boards using SMC as well, can someone comment on the change 
I am proposing.
Should I change the correction algorithm or the calculate function? If 
the later is preferred
it would mean the change must be pushed in both U-Boot and Linux.

Feng Kan
AMCC Software
Sean MacLennan - Aug. 20, 2009, 4:38 a.m.
On Wed, 19 Aug 2009 16:16:54 -0700
Feng Kan <fkan@amcc.com> wrote:

> I see other boards using SMC as well, can someone comment on the
> change I am proposing.
> Should I change the correction algorithm or the calculate function?
> If the later is preferred
> it would mean the change must be pushed in both U-Boot and Linux.

Odds are the calculate function is wrong. The correction algo is used
by many nand drivers, I *assume* it is correct. The calculate function
was set to agree with u-boot (1.3.0).

Cheers,
   Sean

P.S. Yes, I know the u-boot is an ancient version :(
Stefan Roese - Aug. 20, 2009, 5:01 a.m.
On Thursday 20 August 2009 06:38:51 Sean MacLennan wrote:
> > I see other boards using SMC as well, can someone comment on the
> > change I am proposing.
> > Should I change the correction algorithm or the calculate function?
> > If the later is preferred
> > it would mean the change must be pushed in both U-Boot and Linux.
>
> Odds are the calculate function is wrong. The correction algo is used
> by many nand drivers, I *assume* it is correct. The calculate function
> was set to agree with u-boot (1.3.0).

Yes, it seems that you changed the order in the calculation function while 
reworking the NDFC driver for arch/powerpc. So we should probably change this 
order back to the original version. And change it in U-Boot as well.

BTW: I didn't see any problems with ECC so far with the current code. Feng, 
how did you spot this problem?

Cheers,
Stefan

--
DENX Software Engineering GmbH,      MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich,  Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-0 Fax: (+49)-8142-66989-80 Email: office@denx.de
fkan@amcc.com - Aug. 20, 2009, 11:42 p.m.
Hi Stefan:

We had a board with high number of correctable ECC errors. Which crashed 
the jffs when it
was miss correcting the wrong byte location.

Do you want me to submit a patch for this, or do you prefer to do it. I 
am submitting a patch
for linux right now.

Feng Kan
AMCC Software

On 08/19/2009 10:01 PM, Stefan Roese wrote:
> On Thursday 20 August 2009 06:38:51 Sean MacLennan wrote:
>    
>>> I see other boards using SMC as well, can someone comment on the
>>> change I am proposing.
>>> Should I change the correction algorithm or the calculate function?
>>> If the later is preferred
>>> it would mean the change must be pushed in both U-Boot and Linux.
>>>        
>> Odds are the calculate function is wrong. The correction algo is used
>> by many nand drivers, I *assume* it is correct. The calculate function
>> was set to agree with u-boot (1.3.0).
>>      
> Yes, it seems that you changed the order in the calculation function while
> reworking the NDFC driver for arch/powerpc. So we should probably change this
> order back to the original version. And change it in U-Boot as well.
>
> BTW: I didn't see any problems with ECC so far with the current code. Feng,
> how did you spot this problem?
>
> Cheers,
> Stefan
>
> --
> DENX Software Engineering GmbH,      MD: Wolfgang Denk&  Detlev Zundel
> HRB 165235 Munich,  Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-0 Fax: (+49)-8142-66989-80 Email: office@denx.de
>
Stefan Roese - Aug. 21, 2009, 7:59 a.m.
Hi Feng,

On Friday 21 August 2009 01:42:42 Feng Kan wrote:
> We had a board with high number of correctable ECC errors. Which crashed
> the jffs when it
> was miss correcting the wrong byte location.

OK, thanks.

> Do you want me to submit a patch for this, or do you prefer to do it.

Sure, please go ahead and send a patch to fix this in U-Boot as well.

Thanks.

Cheers,
Stefan

--
DENX Software Engineering GmbH,      MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich,  Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-0 Fax: (+49)-8142-66989-80 Email: office@denx.de

Patch

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index eb17da7..2a5da06 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -104,8 +104,8 @@  static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 	else
 		pte_update(ptep, ~_PAGE_HASHPTE, pte_val(pte));
 
-#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT) && defined(CONFIG_SMP)
-	/* Second case is 32-bit with 64-bit PTE in SMP mode. In this case, we
+#elif defined(CONFIG_PPC32) && defined(CONFIG_PTE_64BIT)
+	/* Second case is 32-bit with 64-bit PTE.  In this case, we
 	 * can just store as long as we do the two halves in the right order
 	 * with a barrier in between. This is possible because we take care,
 	 * in the hash code, to pre-invalidate if the PTE was already hashed,
@@ -140,7 +140,7 @@  static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 
 #else
 	/* Anything else just stores the PTE normally. That covers all 64-bit
-	 * cases, and 32-bit non-hash with 64-bit PTEs in UP mode
+	 * cases, and 32-bit non-hash with 32-bit PTEs.
 	 */
 	*ptep = pte;
 #endif