diff mbox

[056/270] ARM: vfp: fix saving d16-d31 vfp registers on v6+ kernels

Message ID 1353949160-26803-57-git-send-email-herton.krzesinski@canonical.com
State New
Headers show

Commit Message

Herton Ronaldo Krzesinski Nov. 26, 2012, 4:55 p.m. UTC
3.5.7u1 -stable review patch.  If anyone has any objections, please let me know.

------------------

From: Russell King <rmk+kernel@arm.linux.org.uk>

commit 846a136881b8f73c1f74250bf6acfaa309cab1f2 upstream.

Michael Olbrich reported that his test program fails when built with
-O2 -mcpu=cortex-a8 -mfpu=neon, and a kernel which supports v6 and v7
CPUs:

volatile int x = 2;
volatile int64_t y = 2;

int main() {
	volatile int a = 0;
	volatile int64_t b = 0;
	while (1) {
		a = (a + x) % (1 << 30);
		b = (b + y) % (1 << 30);
		assert(a == b);
	}
}

and two instances are run.  When built for just v7 CPUs, this program
works fine.  It uses the "vadd.i64 d19, d18, d16" VFP instruction.

It appears that we do not save the high-16 double VFP registers across
context switches when the kernel is built for v6 CPUs.  Fix that.

Tested-By: Michael Olbrich <m.olbrich@pengutronix.de>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com>
---
 arch/arm/include/asm/vfpmacros.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff mbox

Patch

diff --git a/arch/arm/include/asm/vfpmacros.h b/arch/arm/include/asm/vfpmacros.h
index 3d5fc41..bf53047 100644
--- a/arch/arm/include/asm/vfpmacros.h
+++ b/arch/arm/include/asm/vfpmacros.h
@@ -28,7 +28,7 @@ 
 	ldr	\tmp, =elf_hwcap		    @ may not have MVFR regs
 	ldr	\tmp, [\tmp, #0]
 	tst	\tmp, #HWCAP_VFPv3D16
-	ldceq	p11, cr0, [\base],#32*4		    @ FLDMIAD \base!, {d16-d31}
+	ldceql	p11, cr0, [\base],#32*4		    @ FLDMIAD \base!, {d16-d31}
 	addne	\base, \base, #32*4		    @ step over unused register space
 #else
 	VFPFMRX	\tmp, MVFR0			    @ Media and VFP Feature Register 0
@@ -52,7 +52,7 @@ 
 	ldr	\tmp, =elf_hwcap		    @ may not have MVFR regs
 	ldr	\tmp, [\tmp, #0]
 	tst	\tmp, #HWCAP_VFPv3D16
-	stceq	p11, cr0, [\base],#32*4		    @ FSTMIAD \base!, {d16-d31}
+	stceql	p11, cr0, [\base],#32*4		    @ FSTMIAD \base!, {d16-d31}
 	addne	\base, \base, #32*4		    @ step over unused register space
 #else
 	VFPFMRX	\tmp, MVFR0			    @ Media and VFP Feature Register 0