diff mbox

sparc64: Fix in-place byteswap calls

Message ID 20141116.162801.336662281870381504.davem@davemloft.net
State Accepted
Delegated to: David Miller
Headers show

Commit Message

David Miller Nov. 16, 2014, 9:28 p.m. UTC
From: "Martin K. Petersen" <martin.petersen@oracle.com>
Date: Fri, 14 Nov 2014 13:30:30 -0500

> A different approach would be to use the "w" constraint. However, this
> was not introduced until gcc 4.7.0. swab.h is also used by userland so
> we don't have access to things like GCC_VERSION and would either have to
> pull that macro in or open code the version checks.  Furthermore, the
> output generated by gcc is the same with either approach. And
> consequently inserting the barrier seems like the simplest solution.

Just add a memory input constraint, but don't actually use it in the
assembler string.  There is no requirement that you reference every
operand mentioned in the constraints, yet their effects are always
taken into consideration by the compiler.

	__asm__ __volatile__ ("lduha [%2] %3, %0"
			      : "=r" (ret)
			      : "m" (*addr), "r" (addr), "i" (ASI_PL));

This solves the problem, because it shows the compiler the data flow
through the memory operand, yet still forces it to put 'addr' into a
register for the instruction.

This is the patch I'm going to use, thanks for the report.
diff mbox

Patch

====================
sparc64: Fix constraints on swab helpers.

We are reading the memory location, so we have to have a memory
constraint in there purely for the sake of showing the data flow
to the compiler.

Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 arch/sparc/include/uapi/asm/swab.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/sparc/include/uapi/asm/swab.h b/arch/sparc/include/uapi/asm/swab.h
index a34ad07..4c7c12d 100644
--- a/arch/sparc/include/uapi/asm/swab.h
+++ b/arch/sparc/include/uapi/asm/swab.h
@@ -9,9 +9,9 @@  static inline __u16 __arch_swab16p(const __u16 *addr)
 {
 	__u16 ret;
 
-	__asm__ __volatile__ ("lduha [%1] %2, %0"
+	__asm__ __volatile__ ("lduha [%2] %3, %0"
 			      : "=r" (ret)
-			      : "r" (addr), "i" (ASI_PL));
+			      : "m" (*addr), "r" (addr), "i" (ASI_PL));
 	return ret;
 }
 #define __arch_swab16p __arch_swab16p
@@ -20,9 +20,9 @@  static inline __u32 __arch_swab32p(const __u32 *addr)
 {
 	__u32 ret;
 
-	__asm__ __volatile__ ("lduwa [%1] %2, %0"
+	__asm__ __volatile__ ("lduwa [%2] %3, %0"
 			      : "=r" (ret)
-			      : "r" (addr), "i" (ASI_PL));
+			      : "m" (*addr), "r" (addr), "i" (ASI_PL));
 	return ret;
 }
 #define __arch_swab32p __arch_swab32p
@@ -31,9 +31,9 @@  static inline __u64 __arch_swab64p(const __u64 *addr)
 {
 	__u64 ret;
 
-	__asm__ __volatile__ ("ldxa [%1] %2, %0"
+	__asm__ __volatile__ ("ldxa [%2] %3, %0"
 			      : "=r" (ret)
-			      : "r" (addr), "i" (ASI_PL));
+			      : "m" (*addr), "r" (addr), "i" (ASI_PL));
 	return ret;
 }
 #define __arch_swab64p __arch_swab64p