unaligned accesses in SLAB etc.

Message ID 20141016.165017.1151349565275102498.davem@davemloft.net
State Accepted
Delegated to: David Miller

Commit Message

David Miller Oct. 16, 2014, 8:50 p.m. UTC
From: David Miller <davem@redhat.com>
Date: Thu, 16 Oct 2014 16:20:01 -0400 (EDT)

> So I'm going to audit all the code paths to make sure we don't put garbage
> into the fault_code value.

There are two code paths where we can put garbage into the fault_code
value.  For the dtlb_prot.S case, the value we put in there is
TLB_TAG_ACCESS, which is 0x30.  That includes bit 0x20, which is the
FAULT_CODE_BAD_RA indication that is erroneously triggering.
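
The bit overlap described above can be checked with a minimal C sketch.
TLB_TAG_ACCESS (0x30) and FAULT_CODE_BAD_RA (0x20) are the values given in
this message; the values used for FAULT_CODE_WRITE and FAULT_CODE_DTLB are
assumptions for illustration, and has_bad_ra() is a hypothetical helper,
not a kernel function.

```c
#include <assert.h>

/* Values stated in the message above. */
#define TLB_TAG_ACCESS     0x30u
#define FAULT_CODE_BAD_RA  0x20u

/* Assumed values, not given in this thread. */
#define FAULT_CODE_WRITE   0x01u
#define FAULT_CODE_DTLB    0x02u

/* Hypothetical helper: nonzero when the fault code carries the BAD_RA bit. */
int has_bad_ra(unsigned int fault_code)
{
	return (fault_code & FAULT_CODE_BAD_RA) != 0;
}

/* Compares the stale register value with the intended fault code. */
void check_fault_codes(void)
{
	/* Garbage case: the register still holds TLB_TAG_ACCESS. */
	assert(has_bad_ra(TLB_TAG_ACCESS));

	/* Intended case: only the DTLB and WRITE bits are set. */
	assert(!has_bad_ra(FAULT_CODE_DTLB | FAULT_CODE_WRITE));
}
```

Under these assumptions, a fault handler that sees 0x30 instead of the
intended code would take the BAD_RA path even though no bad real address
was involved.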

The other path is via hugepage TLB misses, for the situation where
we haven't allocated the huge TSB for the thread yet.  That might
explain some other longer-term problems we've had.

I'm about to test the following fix:

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Meelis Roos Oct. 17, 2014, 11:12 a.m. UTC | #1
> From: David Miller <davem@redhat.com>
> Date: Thu, 16 Oct 2014 16:20:01 -0400 (EDT)
> 
> > So I'm going to audit all the code paths to make sure we don't put garbage
> > into the fault_code value.
> 
> There are two code paths where we can put garbage into the fault_code
> value.  For the dtlb_prot.S case, the value we put in there is
> TLB_TAG_ACCESS, which is 0x30.  That includes bit 0x20, which is the
> FAULT_CODE_BAD_RA indication that is erroneously triggering.
> 
> The other path is via hugepage TLB misses, for the situation where
> we haven't allocated the huge TSB for the thread yet.  That might
> explain some other longer-term problems we've had.
> 
> I'm about to test the following fix:

Thank you - it seems to work fine for me on E3500 on top of 
3.17.0-07551-g052db7e + slab alignment fix.

However, on top of mainline HEAD 3.17.0-09670-g0429fbc it explodes with 
scheduler BUG - just reported to LKML + sched maintainers.

David Miller Oct. 18, 2014, 5:59 p.m. UTC | #2
From: Meelis Roos <mroos@linux.ee>
Date: Fri, 17 Oct 2014 14:12:09 +0300 (EEST)

> However, on top of mainline HEAD 3.17.0-09670-g0429fbc it explodes with 
> scheduler BUG - just reported to LKML + sched maintainers.

task_stack_end_corrupted() cannot work properly on sparc64.

It stores the magic value at "task_thread_info(p) + 1", but on
sparc64 that's where we store the nested array of FPU register
saves.

In fact this facility could be corrupting FPU register state in
certain circumstances.

The current sparc64 design is intentional: the CPU stack grows down
toward the thread_info, and the FPU stack saving area grows up from
the end of thread_info.
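
A minimal C sketch of why the two collide, assuming a heavily simplified
layout.  struct thread_info_sketch and the helper names are hypothetical
stand-ins, not the real sparc64 definitions, and the magic constant is an
assumed value used only for illustration.

```c
#include <assert.h>

/* Assumed value of the stack-end marker; illustration only. */
#define STACK_END_MAGIC_SKETCH 0x57AC6E9DUL

/* Hypothetical, simplified stand-in for sparc64's thread_info. */
struct thread_info_sketch {
	unsigned long flags;     /* placeholder for the real fields */
	unsigned long fpregs[];  /* FPU save area, grows up from the end */
};

/* Where the stack-end check keeps its magic value: thread_info + 1. */
unsigned long *stack_end_magic_slot(struct thread_info_sketch *ti)
{
	return (unsigned long *)(ti + 1);
}

/* First word of the FPU register save area. */
unsigned long *first_fpreg_slot(struct thread_info_sketch *ti)
{
	return &ti->fpregs[0];
}

/* Shows that both slots are the same word in this layout. */
void check_overlap(void)
{
	unsigned long buf[8] = { 0 };
	struct thread_info_sketch *ti = (struct thread_info_sketch *)buf;

	assert(stack_end_magic_slot(ti) == first_fpreg_slot(ti));

	/* Writing the magic value clobbers the first saved FPU word. */
	*stack_end_magic_slot(ti) = STACK_END_MAGIC_SKETCH;
	assert(ti->fpregs[0] == STACK_END_MAGIC_SKETCH);
}
```

In this sketch, any FPU save into fpregs[0] also overwrites the magic
value, so the corruption check can fire without a real stack overflow.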

I don't want to define the array size of the fpregs save area
explicitly and thereby place an artificial limit there.

Patch

diff --git a/arch/sparc/kernel/dtlb_prot.S b/arch/sparc/kernel/dtlb_prot.S
index b2c2c5b..d668ca14 100644
--- a/arch/sparc/kernel/dtlb_prot.S
+++ b/arch/sparc/kernel/dtlb_prot.S
@@ -24,11 +24,11 @@ 
 	mov		TLB_TAG_ACCESS, %g4		! For reload of vaddr
 
 /* PROT ** ICACHE line 2: More real fault processing */
+	ldxa		[%g4] ASI_DMMU, %g5		! Put tagaccess in %g5
 	bgu,pn		%xcc, winfix_trampoline		! Yes, perform winfixup
-	 ldxa		[%g4] ASI_DMMU, %g5		! Put tagaccess in %g5
-	ba,pt		%xcc, sparc64_realfault_common	! Nope, normal fault
 	 mov		FAULT_CODE_DTLB | FAULT_CODE_WRITE, %g4
-	nop
+	ba,pt		%xcc, sparc64_realfault_common	! Nope, normal fault
+	 nop
 	nop
 	nop
 	nop
diff --git a/arch/sparc/kernel/tsb.S b/arch/sparc/kernel/tsb.S
index 14158d4..be98685 100644
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -162,10 +162,10 @@  tsb_miss_page_table_walk_sun4v_fastpath:
 	nop
 	.previous
 
-	rdpr	%tl, %g3
-	cmp	%g3, 1
+	rdpr	%tl, %g7
+	cmp	%g7, 1
 	bne,pn	%xcc, winfix_trampoline
-	 nop
+	 mov	%g3, %g4
 	ba,pt	%xcc, etrap
 	 rd	%pc, %g7
 	call	hugetlb_setup