Patchwork powerpc/8xx: fix regression introduced by cache coherency rewrite

login
register
mail settings
Submitter Joakim Tjernlund
Date Oct. 3, 2009, 8:05 a.m.
Message ID <OF4175DD8E.796930EB-ONC1257644.002A7135-C1257644.002C7929@transmode.se>
Download mbox | patch
Permalink /patch/34921/
State Superseded
Headers show

Comments

Joakim Tjernlund - Oct. 3, 2009, 8:05 a.m.
Scott Wood <scottwood@freescale.com> wrote on 02/10/2009 23:49:49:
>
> On Thu, Oct 01, 2009 at 08:35:59AM +1000, Benjamin Herrenschmidt wrote:
> > >From what I can see, the TLB miss code will check _PAGE_PRESENT, and
> > when not set, it will -still- insert something into the TLB (unlike
> > all other CPU types that go straight to data access faults from there).
> >
> > So we end up populating with those unpopulated entries that will then
> > cause us to take a DSI (or ISI) instead of a TLB miss the next time
> > around ... and so we need to remove them once we do that, no ? IE. Once
> > we have actually put a valid PTE in.
> >
> > At least that's my understanding and why I believe we should always have
> > tlbil_va() in set_pte() and ptep_set_access_flags(), which boils down
> > in putting it into the 2 "filter" functions in the new code.
> >
> > Well.. actually, the ptep_set_access_flags() will already do a
> > flush_tlb_page_nohash() when the PTE is changed. So I suppose all we
> > really need here would be in set_pte_filter(), to do the same if we are
> > writing a PTE that has _PAGE_PRESENT, at least on 8xx.
> >
> > But just unconditionally doing a tlbil_va() in both the filter functions
> > shouldn't harm and also fix the problem, though Rex seems to indicate
> > that is not the case.
>
> Adding a tlbil_va to do_page_fault makes the problem go away for me (on
> top of your "merge" branch) -- none of the other changes in this thread
> do (assuming I didn't miss any).  FWIW, when it gets stuck on a fault,
> DSISR is 0xc0000000, and handle_mm_fault returns zero.

OK, that is a no translation error for a load (assuming trap is 0x400)
Do you know what insn this is? I am adding a patch last in this mail
for catching dcbX insn in do_page_fault()

You need the patch I posted yesterday too:

From c5f1a70561730b8a443f7081fbd9c5b023147043 Mon Sep 17 00:00:00 2001
From: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
Date: Fri, 2 Oct 2009 14:59:21 +0200
Subject: [PATCH] powerpc, 8xx: DTLB Error must check for more errors.

DataTLBError currently does:
 if ((err & 0x02000000) == 0)
    DSI();
This won't handle a store with no valid translation.
Change this to
 if ((err & 0x48000000) != 0)
    DSI();
that is, branch to DSI if either !permission or
!translation.
---
 arch/powerpc/kernel/head_8xx.S |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
Benjamin Herrenschmidt - Oct. 3, 2009, 8:31 a.m.
On Sat, 2009-10-03 at 10:05 +0200, Joakim Tjernlund wrote:
> Cannot shake the feeling that it this snip of code that causes it
>         lwz     r11, 0(r10)     /* Get the level 1 entry */
>         rlwinm. r10, r11,0,0,19 /* Extract page descriptor page
> address */
>         beq     2f              /* If zero, don't try to find a pte */
> Perhaps we can do something better? I still feel that we need to
> force a TLB Error as the TLBMiss does not set DSISR so we have no way
> of
> knowing if it is an load or store.

Can't we manufacture a DSISR and branch to the right function ?

Ben.
Joakim Tjernlund - Oct. 3, 2009, 9:24 a.m.
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 03/10/2009 10:31:18:
>
> On Sat, 2009-10-03 at 10:05 +0200, Joakim Tjernlund wrote:
> > Cannot shake the feeling that it this snip of code that causes it
> >         lwz     r11, 0(r10)     /* Get the level 1 entry */
> >         rlwinm. r10, r11,0,0,19 /* Extract page descriptor page
> > address */
> >         beq     2f              /* If zero, don't try to find a pte */
> > Perhaps we can do something better? I still feel that we need to
> > force a TLB Error as the TLBMiss does not set DSISR so we have no way
> > of
> > knowing if it is an load or store.
>
> Can't we manufacture a DSISR and branch to the right function ?

Not if we want know if it is a load or store. There is no info to manufacture
a DSISR from. The best we can do here is try getting the RPN physical page
number correct. Perhaps something like this will do:
	/* Copy 20 msb from MD_EPN to r20 to get the correct page
	 * number. Do not rely on DAR since the dcxx instructions fails
	 * to update DAR when they cause a DTLB Miss
	 */
	mfspr	r21, MD_EPN
	li	r20, 0x0
	rlwimi	r20, r21, 0, 0, 19
Then go back and set the RPN accordingly.

The 8xx is different as as it will force a TLB error every time
it needs to deal with a page fault.
I suspect adding
if (!ret)
   _tlbil_va(address);
in do_page_fault() will do the trick too.

So yes, there is a missing _tlbil_va() missing for 8xx somewhere
but there is something more too.
Maybe your new filter functions and my
 powerpc, 8xx: DTLB Error must check for more errors.
will do the trick?

 Jocke
Benjamin Herrenschmidt - Oct. 3, 2009, 10:57 a.m.
On Sat, 2009-10-03 at 11:24 +0200, Joakim Tjernlund wrote:
> 
> So yes, there is a missing _tlbil_va() missing for 8xx somewhere
> but there is something more too.
> Maybe your new filter functions and my
>  powerpc, 8xx: DTLB Error must check for more errors.
> will do the trick?

Well, if we can't tell between a load and a store on a TLB miss, then
we should probably let it create an unpopulated entry in all cases,
so that we do take a proper DSI/ISI the second time around, which
would then tell us where we come from...

Ben.
Joakim Tjernlund - Oct. 3, 2009, 11:47 a.m.
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 03/10/2009 12:57:28:

> On Sat, 2009-10-03 at 11:24 +0200, Joakim Tjernlund wrote:
> >
> > So yes, there is a missing _tlbil_va() missing for 8xx somewhere
> > but there is something more too.
> > Maybe your new filter functions and my
> >  powerpc, 8xx: DTLB Error must check for more errors.
> > will do the trick?
>
> Well, if we can't tell between a load and a store on a TLB miss, then
> we should probably let it create an unpopulated entry in all cases,
> so that we do take a proper DSI/ISI the second time around, which
> would then tell us where we come from...

Not quite sure what you mean here?
Always branch to DSI at TLB Miss and create a unpopulated
entry? That does not feel right.

The current method is better. The only odd thing is
the null pmd entry and what to load into RPN. I am not convinced
that it is a problem but I would like to see a generic impl. of
a null pmd and use that instead. A 8xx impl. of that
looks like this(tested on 2.4, works fine)

	lwz	r11, 0(r10)	/* Get the level 1 entry */
	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
	bne+	4f		/* If zero, use a null pmd */
	lis	r11, null_pmd_dir@h
	ori	r11, r11, null_pmd_dir@l
4:	tophys(r11,r11)

...
	.data
	.globl	null_pmd_dir
null_pmd_dir:
	.space	4096

If the pmd_none() and friends are updated to test for a null_pmd_dir
we can loose the test above and save 4 insn :)

Anyhow did you post a patch(can't find one) about your suggested
filter functions for 8xx? I sure would like to see one :)
Joakim Tjernlund - Oct. 4, 2009, 8:35 a.m.
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 03/10/2009 12:57:28:
>
> On Sat, 2009-10-03 at 11:24 +0200, Joakim Tjernlund wrote:
> >
> > So yes, there is a missing _tlbil_va() missing for 8xx somewhere
> > but there is something more too.
> > Maybe your new filter functions and my
> >  powerpc, 8xx: DTLB Error must check for more errors.
> > will do the trick?
>
> Well, if we can't tell between a load and a store on a TLB miss, then
> we should probably let it create an unpopulated entry in all cases,
> so that we do take a proper DSI/ISI the second time around, which
> would then tell us where we come from...

While looking closer on 8xx TLB handling I noticed we were taking
an extra TLB Error when making a page dirty. Tracked it down to:
static inline pte_t pte_mkdirty(pte_t pte) {
	pte_val(pte) |= _PAGE_DIRTY; return pte; }
pte_mkdirty does not set HWWRITE thus forcing a new
TLB error to clear it. Adding HWWRITE to pte_mkdirty fixes the
problem.

Looking at (especially pte_mkclean):
static inline pte_t pte_wrprotect(pte_t pte) {
	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE); return pte; }
static inline pte_t pte_mkclean(pte_t pte) {
	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }

it looks like a mistake not to include HWWRITE in pte_mkdirty(),
what do you think?

 Jocke
Benjamin Herrenschmidt - Oct. 4, 2009, 8:26 p.m.
On Sun, 2009-10-04 at 10:35 +0200, Joakim Tjernlund wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 03/10/2009 12:57:28:
> >
> > On Sat, 2009-10-03 at 11:24 +0200, Joakim Tjernlund wrote:
> > >
> > > So yes, there is a missing _tlbil_va() missing for 8xx somewhere
> > > but there is something more too.
> > > Maybe your new filter functions and my
> > >  powerpc, 8xx: DTLB Error must check for more errors.
> > > will do the trick?
> >
> > Well, if we can't tell between a load and a store on a TLB miss, then
> > we should probably let it create an unpopulated entry in all cases,
> > so that we do take a proper DSI/ISI the second time around, which
> > would then tell us where we come from...
> 
> While looking closer on 8xx TLB handling I noticed we were taking
> an extra TLB Error when making a page dirty. Tracked it down to:
> static inline pte_t pte_mkdirty(pte_t pte) {
> 	pte_val(pte) |= _PAGE_DIRTY; return pte; }
> pte_mkdirty does not set HWWRITE thus forcing a new
> TLB error to clear it. Adding HWWRITE to pte_mkdirty fixes the
> problem.

We should just get rid of HWWRITE like I did for 44x and BookE.

> Looking at (especially pte_mkclean):
> static inline pte_t pte_wrprotect(pte_t pte) {
> 	pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE); return pte; }
> static inline pte_t pte_mkclean(pte_t pte) {
> 	pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
> 
> it looks like a mistake not to include HWWRITE in pte_mkdirty(),
> what do you think?

Maybe yes. No big deal right now, we have more important problems
to fix no ? :-)

Ben.
Joakim Tjernlund - Oct. 4, 2009, 8:38 p.m.
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 04/10/2009 22:26:42:
> On Sun, 2009-10-04 at 10:35 +0200, Joakim Tjernlund wrote:
> > Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 03/10/2009 12:57:28:
> > >
> > > On Sat, 2009-10-03 at 11:24 +0200, Joakim Tjernlund wrote:
> > > >
> > > > So yes, there is a missing _tlbil_va() missing for 8xx somewhere
> > > > but there is something more too.
> > > > Maybe your new filter functions and my
> > > >  powerpc, 8xx: DTLB Error must check for more errors.
> > > > will do the trick?
> > >
> > > Well, if we can't tell between a load and a store on a TLB miss, then
> > > we should probably let it create an unpopulated entry in all cases,
> > > so that we do take a proper DSI/ISI the second time around, which
> > > would then tell us where we come from...
> >
> > While looking closer on 8xx TLB handling I noticed we were taking
> > an extra TLB Error when making a page dirty. Tracked it down to:
> > static inline pte_t pte_mkdirty(pte_t pte) {
> >    pte_val(pte) |= _PAGE_DIRTY; return pte; }
> > pte_mkdirty does not set HWWRITE thus forcing a new
> > TLB error to clear it. Adding HWWRITE to pte_mkdirty fixes the
> > problem.
>
> We should just get rid of HWWRITE like I did for 44x and BookE.

I am trying :)

>
> > Looking at (especially pte_mkclean):
> > static inline pte_t pte_wrprotect(pte_t pte) {
> >    pte_val(pte) &= ~(_PAGE_RW | _PAGE_HWWRITE); return pte; }
> > static inline pte_t pte_mkclean(pte_t pte) {
> >    pte_val(pte) &= ~(_PAGE_DIRTY | _PAGE_HWWRITE); return pte; }
> >
> > it looks like a mistake not to include HWWRITE in pte_mkdirty(),
> > what do you think?
>
> Maybe yes. No big deal right now, we have more important problems
> to fix no ? :-)

The missing invalidate you guys need to work out. I have no clue
where to put it. If we are really lucky, getting rid of HWWRITE
might help :)
Scott Wood - Oct. 5, 2009, 6:24 p.m.
On Sat, Oct 03, 2009 at 10:05:46AM +0200, Joakim Tjernlund wrote:
> Scott Wood <scottwood@freescale.com> wrote on 02/10/2009 23:49:49:
> > Adding a tlbil_va to do_page_fault makes the problem go away for me (on
> > top of your "merge" branch) -- none of the other changes in this thread
> > do (assuming I didn't miss any).  FWIW, when it gets stuck on a fault,
> > DSISR is 0xc0000000, and handle_mm_fault returns zero.
> 
> OK, that is a no translation error for a load (assuming trap is 0x400)

Yes, 0x400.

> Do you know what insn this is? 

Various lbz and lwz.

-Scott
Joakim Tjernlund - Oct. 5, 2009, 6:50 p.m.
Scott Wood <scottwood@freescale.com> wrote on 05/10/2009 20:24:29:
>
> On Sat, Oct 03, 2009 at 10:05:46AM +0200, Joakim Tjernlund wrote:
> > Scott Wood <scottwood@freescale.com> wrote on 02/10/2009 23:49:49:
> > > Adding a tlbil_va to do_page_fault makes the problem go away for me (on
> > > top of your "merge" branch) -- none of the other changes in this thread
> > > do (assuming I didn't miss any).  FWIW, when it gets stuck on a fault,
> > > DSISR is 0xc0000000, and handle_mm_fault returns zero.
> >
> > OK, that is a no translation error for a load (assuming trap is 0x400)
>
> Yes, 0x400.
>
> > Do you know what insn this is?
>
> Various lbz and lwz.

OK, this rules out any dcbX problem. Perhaps you can try any of Bens
ideas? Not sure what they were but hopefully you and Ben do :)

Preferably after you have tested my new patches :)

 Jocke

Patch

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 52ff8c5..118bb05 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -472,8 +472,8 @@  DataTLBError:
 	/* First, make sure this was a store operation.
 	*/
 	mfspr	r10, SPRN_DSISR
-	andis.	r11, r10, 0x0200	/* If set, indicates store op */
-	beq	2f
+	andis.	r11, r10, 0x4800 /* no translation, no permission. */
+	bne	2f	/* branch if either is set */

 	/* The EA of a data TLB miss is automatically stored in the MD_EPN
 	 * register.  The EA of a data TLB error is automatically stored in
--
1.6.4.4


Cannot shake the feeling that it this snip of code that causes it
	lwz	r11, 0(r10)	/* Get the level 1 entry */
	rlwinm.	r10, r11,0,0,19	/* Extract page descriptor page address */
	beq	2f		/* If zero, don't try to find a pte */
Perhaps we can do something better? I still feel that we need to
force a TLB Error as the TLBMiss does not set DSISR so we have no way of
knowing if it is an load or store.

 Jocke

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 7699394..c33c6de 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -139,6 +139,88 @@  int __kprobes do_page_fault(struct pt_regs *regs, unsigned long address,
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
+#if 1 /* defined(CONFIG_8xx)*/
+#define DEBUG_DCBX
+/*
+ Work around DTLB Miss/Error, as these do not update
+ DAR for dcbf, dcbi, dcbst, dcbz and icbi instructions
+ This relies on every exception tagging DAR with 0xf0
+ before returning (rfi)
+ DAR is passed as 'address' to this function.
+ */
+	{
+		unsigned long ra, rb, dar, insn;
+#ifdef DEBUG_DCBX
+		const char *istr = NULL;
+
+		insn = *((unsigned long *)regs->nip);
+		if (((insn >> (31-5)) & 0x3f) == 31) {
+			if (((insn >> 1) & 0x3ff) == 1014) /* dcbz ? 0x3f6 */
+				istr = "dcbz";
+			if (((insn >> 1) & 0x3ff) == 86) /* dcbf ? 0x56 */
+				istr = "dcbf";
+			if (((insn >> 1) & 0x3ff) == 470) /* dcbi ? 0x1d6 */
+				istr = "dcbi";
+			if (((insn >> 1) & 0x3ff) == 54) /* dcbst ? 0x36 */
+				istr = "dcbst";
+			if (((insn >> 1) & 0x3ff) == 982) /* icbi ? 0x3d6 */
+				istr = "icbi";
+			if (istr) {
+				ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
+				rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
+				dar = regs->gpr[rb];
+				if (ra)
+					dar += regs->gpr[ra];
+				if (dar != address && address != 0x00f0 && trap == 0x300)
+					printk(KERN_CRIT "%s: address:%lx, dar:%lx!\n", istr, address, dar);
+				if (!strcmp(istr, "dcbst") && is_write) {
+					printk(KERN_CRIT "dcbst R%ld,R%ld = %lx as a store, fixing!\n",
+					       ra, rb, dar);
+					is_write = 0;
+				}
+
+				if (trap == 0x300 && address != dar) {
+					__asm__ ("mtdar %0" : : "r" (dar));
+					return 0;
+				}
+			}
+		}
+#endif
+		if (address == 0x00f0 && trap == 0x300) {
+			pte_t *ptep;
+
+			/* This is from a dcbX or icbi insn gone bad, these
+			 * insn do not set DAR so we have to do it here instead */
+			insn = *((unsigned long *)regs->nip);
+
+			ra = (insn >> (31-15)) & 0x1f; /* Reg RA */
+			rb = (insn >> (31-20)) & 0x1f; /* Reg RB */
+			dar = regs->gpr[rb];
+			if (ra)
+				dar += regs->gpr[ra];
+			/* Set DAR to correct address for the DTLB Miss/Error handler
+			 * to redo the TLB exception. This time with correct address */
+			__asm__ ("mtdar %0" : : "r" (dar));
+#ifdef DEBUG_DCBX
+			printk(KERN_CRIT "trap:%x address:%lx, dar:%lx,err:%lx %s\n",
+			       trap, address, dar, error_code, istr);
+#endif
+			address = dar;
+#if 1
+			if (is_write && get_pteptr(mm, dar, &ptep, NULL)) {
+				pte_t my_pte = *ptep;
+
+				if (pte_present(my_pte) && pte_write(my_pte)) {
+					pte_val(my_pte) |= _PAGE_DIRTY|_PAGE_ACCESSED|_PAGE_HWWRITE;
+					set_pte_at(mm, dar, ptep, my_pte);
+				}
+			}
+#else
+			return 0;
+#endif
+		}
+	}
+#endif

 	if (notify_page_fault(regs))
 		return 0;