diff mbox

[1/2] sparc: remove ceil, floor, trunc sparc specific implementations

Message ID 20160802.210132.1761940999777091681.davem@davemloft.net
State New
Headers show

Commit Message

David Miller Aug. 3, 2016, 4:01 a.m. UTC
From: David Miller <davem@davemloft.net>
Date: Mon, 01 Aug 2016 18:10:33 -0700 (PDT)

> From: Aurelien Jarno <aurelien@aurel32.net>
> Date: Tue, 2 Aug 2016 01:59:24 +0200
> 
>> On 2016-08-01 14:57, David Miller wrote:
>>> >> Aurelien, it looks like we have the same exact issue with nearbyint on
>>> >> sparc, right?
>>> > 
>>> > I don't see the issue on nearbyint here. What is the issue exactly?
>>> 
>>> Maybe only the vis3 variant shows the problem:
>>> 
>>> Failure: nearbyint (sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint (-sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_downward (sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_downward (-sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_towardzero (sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_towardzero (-sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_upward (sNaN): Exception "Invalid operation" not set
>>> Failure: nearbyint_upward (-sNaN): Exception "Invalid operation" not set
 ...
>> Now about the fix itself, we have to move the check before the fsr is
>> saved and after the value has been moved to the floating point register,
>> which is not something easy to do without breaking the whole code. One
>> option would be to do it after loading the fsr at the end, the other one
>> would be to use the generic version.
> 
> I'll look into this.

Aurelien I just tested the following and committed it to master:

Comments

Aurelien Jarno Aug. 3, 2016, 10:30 a.m. UTC | #1
On 2016-08-02 21:01, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 01 Aug 2016 18:10:33 -0700 (PDT)
> 
> > From: Aurelien Jarno <aurelien@aurel32.net>
> > Date: Tue, 2 Aug 2016 01:59:24 +0200
> > 
> >> On 2016-08-01 14:57, David Miller wrote:
> >>> >> Aurelien, it looks like we have the same exact issue with nearbyint on
> >>> >> sparc, right?
> >>> > 
> >>> > I don't see the issue on nearbyint here. What is the issue exactly?
> >>> 
> >>> Maybe only the vis3 variant shows the problem:
> >>> 
> >>> Failure: nearbyint (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_downward (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_downward (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_towardzero (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_towardzero (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_upward (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_upward (-sNaN): Exception "Invalid operation" not set
>  ...
> >> Now about the fix itself, we have to move the check before the fsr is
> >> saved and after the value has been moved to the floating point register,
> >> which is not something easy to do without breaking the whole code. One
> >> option would be to do it after loading the fsr at the end, the other one
> >> would be to use the generic version.
> > 
> > I'll look into this.
> 
> Aurelien I just tested the following and committed it to master:

Thanks!

I have finally been able to look at the fdim issue. I have tried to
compile the generic s_fdim.c without the missing test to generate
ERANGE. It appears that GCC generate very similar code on sparc32. On
sparc64 it does the same provided the file is compiled with -mvis. It
appears we compile sparc64 (and sysdeps/sparc/sparc32/sparcv9) assuming
a CPU with VIS instructions but we don't ask GCC to use them.

I therefore propose the following set of patches:
- Remove the current fdim/fdimf assembler implementation. This way this
  patch can be easily backported.
- Pass -mvis when targetting sparc64
- Pass -mvis for files in sysdeps/sparc/sparc32/sparcv9. This will also
  improve some more functions.
- Compile the generic fdim/fdimf functions with -mvis and -mvis3 in the
  multiarch directory on sparc32.

I have already started working on that, I hope to get them by tomorrow.

In general we should probably review the existing optimized assembly
code. It looks like recent versions of GCC are good at generating
optimized code, provided they are fed with the correct -mvis or -mvis3
options.

Aurelien
diff mbox

Patch

====================
[PATCH] Fix sNaN handling in nearbyint on 32-bit sparc.

	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
	(__nearbyint_vis3): Don't check for sNaN before float register is
	loaded with the incoming argument.
	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
	(__nearbyintf_vis3): Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
	Likewise.
	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
	Likewise.
---
 ChangeLog                                                      | 10 ++++++++++
 sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S |  6 +++---
 .../sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S    |  2 +-
 sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S                |  8 ++++----
 sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S               |  4 ++--
 5 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8caa301..f84b981 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@ 
 2016-08-02  David S. Miller  <davem@davemloft.net>
 
+	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
+	(__nearbyint_vis3): Don't check for sNaN before float register is
+	loaded with the incoming argument.
+	* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
+	(__nearbyintf_vis3): Likewise.
+	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
+	Likewise.
+	* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
+	Likewise.
+
 	* string/test-strncmp.c (do_test_limit): Make sure the test data
 	stream is aligned as required for the type "CHAR".
 	(do_test): Likewise.
diff --git a/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S b/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
index d9ff0cc..ff81b0d 100644
--- a/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
+++ b/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
@@ -36,15 +36,15 @@ 
 #define SIGN_BIT	%f12			/* -0.0 */
 
 ENTRY (__nearbyint_vis3)
+	sllx	%o0, 32, %o0
+	or	%o0, %o1, %o0
+	movxtod	%o0, %f0
 	fcmpd	%fcc3, %f0, %f0			/* Check for sNaN */
 	st	%fsr, [%sp + 88]
 	sethi	%hi(TWO_FIFTYTWO), %o2
 	sethi	%hi(0xf8003e0), %o5
 	ld	[%sp + 88], %o4
-	sllx	%o0, 32, %o0
 	or	%o5, %lo(0xf8003e0), %o5
-	or	%o0, %o1, %o0
-	movxtod	%o0, %f0
 	andn	%o4, %o5, %o4
 	fzero	ZERO
 	st	%o4, [%sp + 80]
diff --git a/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S b/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
index 5cd1eb0..833a0df 100644
--- a/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
+++ b/sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
@@ -35,9 +35,9 @@ 
 #define SIGN_BIT	%f12			/* -0.0 */
 
 ENTRY (__nearbyintf_vis3)
+	movwtos	%o0, %f1
 	fcmps	%fcc3, %f1, %f1			/* Check for sNaN */
 	st	%fsr, [%sp + 88]
-	movwtos	%o0, %f1
 	sethi	%hi(TWO_TWENTYTHREE), %o2
 	sethi	%hi(0xf8003e0), %o5
 	ld	[%sp + 88], %o4
diff --git a/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S b/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S
index 84a1097..198440a 100644
--- a/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S
+++ b/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S
@@ -36,21 +36,21 @@ 
 #define SIGN_BIT	%f12			/* -0.0 */
 
 ENTRY (__nearbyint)
+	sllx	%o0, 32, %o0
+	or	%o0, %o1, %o0
+	stx	%o0, [%sp + 72]
+	ldd	[%sp + 72], %f0
 	fcmpd	%fcc3, %f0, %f0			/* Check for sNaN */
 	st	%fsr, [%sp + 88]
 	sethi	%hi(TWO_FIFTYTWO), %o2
 	sethi	%hi(0xf8003e0), %o5
 	ld	[%sp + 88], %o4
-	sllx	%o0, 32, %o0
 	or	%o5, %lo(0xf8003e0), %o5
-	or	%o0, %o1, %o0
 	andn	%o4, %o5, %o4
 	fzero	ZERO
 	st	%o4, [%sp + 80]
-	stx	%o0, [%sp + 72]
 	sllx	%o2, 32, %o2
 	fnegd	ZERO, SIGN_BIT
-	ldd	[%sp + 72], %f0
 	ld	[%sp + 80], %fsr
 	stx	%o2, [%sp + 72]
 	fabsd	%f0, %f14
diff --git a/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S b/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S
index d5cf5ce..9be41f6 100644
--- a/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S
+++ b/sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S
@@ -35,9 +35,10 @@ 
 #define SIGN_BIT	%f12			/* -0.0 */
 
 ENTRY (__nearbyintf)
+	st	%o0, [%sp + 68]
+	ld	[%sp + 68], %f1
 	fcmps	%fcc3, %f1, %f1			/* Check for sNaN */
 	st	%fsr, [%sp + 88]
-	st	%o0, [%sp + 68]
 	sethi	%hi(TWO_TWENTYTHREE), %o2
 	sethi	%hi(0xf8003e0), %o5
 	ld	[%sp + 88], %o4
@@ -46,7 +47,6 @@  ENTRY (__nearbyintf)
 	fnegs	ZERO, SIGN_BIT
 	andn	%o4, %o5, %o4
 	st	%o4, [%sp + 80]
-	ld	[%sp + 68], %f1
 	ld	[%sp + 80], %fsr
 	st	%o2, [%sp + 68]
 	fabss	%f1, %f14