Patchwork [v8] Vectorized _cpp_clean_line

Submitter Rainer Orth
Date Aug. 24, 2010, 4:20 p.m.
Message ID <yddaaocq8x2.fsf@manam.CeBiTec.Uni-Bielefeld.DE>
Permalink /patch/62609/
State New

Comments

Rainer Orth - Aug. 24, 2010, 4:20 p.m.
Richard Henderson <rth@redhat.com> writes:

> In the short term, could you simply add the appropriate && !defined(__sun__)
> or whatever to the #if protecting the sse code?

Here's the patch I've tested.  The i386-pc-solaris2.1[01] bootstraps with
Sun as completed successfully, i386-pc-solaris2.9 is running the testsuite,
and i386-pc-solaris2.8 is still running:

2010-08-24  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

	* lex.c [__sun__ && __svr4__]: Disable init_vectorized_lexer
	etc. on Solaris 2/x86.


Ok for mainline?

To properly fix this, we need several changes:

* Check if the assembler used supports SSE2/SSE4.2 insns.

* Perform a runtime check if SSE insns can be executed;
  cf. gcc/testsuite/gcc.target/i386/sse-os-support.h.

* On Solaris 10+, link executables using libcpp.a with a linker mapfile
  like gcc/testsuite/gcc.target/i386/clearcap.map.

I'm not sure if the performance gain from this code is worth the effort,
though.

	Rainer
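
As an illustration of the second item in the list above (a runtime check that SSE insns can actually be executed), here is a minimal sketch. It is not taken from sse-os-support.h or from the patch. The helper name sse2_usable_p and the probe via a SIGILL handler are assumptions made for this example. It relies only on __get_cpuid and bit_SSE2 from GCC's <cpuid.h> plus POSIX sigaction/sigsetjmp.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>	/* __get_cpuid, bit_SSE2 */
#endif

/* Jump target used to recover when the probe insn raises SIGILL.  */
static sigjmp_buf sse_probe_env;

static void
sse_probe_sigill (int sig)
{
  (void) sig;
  siglongjmp (sse_probe_env, 1);
}

/* Hypothetical helper (not part of libcpp): return 1 if CPUID reports
   SSE2 and an SSE insn executes without SIGILL, i.e. the OS has enabled
   SSE state handling; return 0 otherwise.  */
int
sse2_usable_p (void)
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax, ebx, ecx, edx;
  struct sigaction sa, old_sa;
  volatile int ok = 0;

  /* Hardware check: CPUID leaf 1, SSE2 bit in EDX.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx) || !(edx & bit_SSE2))
    return 0;

  sa.sa_handler = sse_probe_sigill;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction (SIGILL, &sa, &old_sa) != 0)
    return 0;

  /* OS check: if the kernel has not enabled SSE, this insn traps and
     the handler jumps back here with a nonzero return value.  */
  if (sigsetjmp (sse_probe_env, 1) == 0)
    {
      __asm__ __volatile__ ("movaps %xmm0, %xmm0");
      ok = 1;
    }

  sigaction (SIGILL, &old_sa, NULL);
  return ok;
#else
  /* Not an x86 target: no SSE code path to enable.  */
  return 0;
#endif
}
```

A caller such as init_vectorized_lexer could consult a check like this before selecting an SSE implementation. Whether libcpp should carry such a probe is exactly the cost/benefit question raised above.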
Richard Henderson - Aug. 24, 2010, 4:31 p.m.
On 08/24/2010 09:20 AM, Rainer Orth wrote:
> Ok for mainline?

Ok, thanks.

> * Check if the assembler used supports SSE2/SSE4.2 insns.

Easy.

> * Perform a runtime check if SSE insns can be executed;
>   cf. gcc/testsuite/gcc.target/i386/sse-os-support.h.

Easy, though irritating.

> * On Solaris 10+, link executables using libcpp.a with a linker mapfile
>   like gcc/testsuite/gcc.target/i386/clearcap.map.

Difficult and irritating.  There's no assembler flag or
directive that can force the flag the way we want?

> I'm not sure if the performance gain from this code is worth the effort,
> though.

Certainly not for SSE4.2.

Is Solaris 10 a 64-bit OS?  We could leave the SSE2 code
path enabled for 64-bit, where we know SSE2 must be present.
That's still quite a bit faster than the integer code path.


r~
Rainer Orth - Aug. 24, 2010, 5:44 p.m.
Richard Henderson <rth@redhat.com> writes:

>> * Perform a runtime check if SSE insns can be executed;
>>   cf. gcc/testsuite/gcc.target/i386/sse-os-support.h.
>
> Easy, though irritating.

Indeed.  I thought about integrating it into cpuid.h, but it probably
doesn't really belong there.

>> * On Solaris 10+, link executables using libcpp.a with a linker mapfile
>>   like gcc/testsuite/gcc.target/i386/clearcap.map.
>
> Difficult and irritating.  There's no assembler flag or
> directive that can force the flag the way we want?

Unfortunately not.  I'll file an RFE for that.

>> I'm not sure if the performance gain from this code is worth the effort,
>> though.
>
> Certainly not for SSE4.2.
>
> Is Solaris 10 a 64-bit OS?  We could leave the SSE2 code
> path enabled for 64-bit, where we know SSE2 must be present.
> That's still quite a bit faster than the integer code path.

Solaris 10+ boots in 64-bit mode if the hardware supports that, but still
allows booting on 32-bit hardware.  I'm not completely sure what the
minimal hardware requirements are, though.  On the other hand, on S10
one could use getisax(2) to determine if some ISA extension is supported
by both the hardware and the OS:

	http://docs.sun.com/app/docs/doc/816-5167/getisax-2?l=all&a=view

Unfortunately, this is not available in Solaris 8/9, otherwise the
contortions in sse-os-support.h wouldn't be necessary.

	Rainer
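
For the getisax(2) route mentioned above, a minimal sketch might look like the following. The helper name solaris_sse2_supported and the HAVE_GETISAX_SKETCH macro are hypothetical, and the preprocessor guard is an assumption. Since getisax is missing on Solaris 8/9, a real implementation would need a configure test rather than a bare platform check.

```c
#include <stdint.h>

#if defined(__sun__) && defined(__svr4__) \
    && (defined(__i386__) || defined(__x86_64__))
#include <sys/types.h>
#include <sys/auxv.h>	/* getisax, AV_386_SSE2 (Solaris 10+) */
/* Hypothetical macro; assumes a Solaris release that provides getisax.  */
#define HAVE_GETISAX_SKETCH 1
#endif

/* Hypothetical helper: return 1 if getisax reports SSE2 as supported by
   both hardware and OS, 0 if it does not, and -1 if getisax is not
   available on this platform.  */
int
solaris_sse2_supported (void)
{
#ifdef HAVE_GETISAX_SKETCH
  uint32_t isa = 0;

  /* Fetch the first word of the ISA extension bit vector.  */
  (void) getisax (&isa, 1);
  return (isa & AV_386_SSE2) != 0;
#else
  return -1;
#endif
}
```

On systems without getisax the function reports -1, so a caller would have to fall back to some other probe or to the integer code path.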

Patch

diff -r 0cf77b5772bf libcpp/lex.c
--- a/libcpp/lex.c	Mon Aug 23 13:25:29 2010 +0200
+++ b/libcpp/lex.c	Tue Aug 24 17:33:23 2010 +0200
@@ -264,7 +264,17 @@ 
     }
 }
 
-#if (GCC_VERSION >= 4005) && (defined(__i386__) || defined(__x86_64__))
+/* Disable on Solaris 2/x86 until the following problems can be properly
+   autoconfed:
+
+   The Solaris 8 assembler cannot assemble SSE2/SSE4.2 insns.
+   The Solaris 9 assembler cannot assemble SSE4.2 insns.
+   Before Solaris 9 Update 6, SSE insns cannot be executed.
+   The Solaris 10+ assembler tags objects with the instruction set
+   extensions used, so SSE4.2 executables cannot run on machines that
+   don't support that extension.  */
+
+#if (GCC_VERSION >= 4005) && (defined(__i386__) || defined(__x86_64__)) && !(defined(__sun__) && defined(__svr4__))
 
 /* Replicated character data to be shared between implementations.
    Recall that outside of a context with vector support we can't