Update list of i686-class processors in sysdeps/x86/cpu-features.h

Message ID alpine.DEB.2.21.1809172305170.13111@digraph.polyomino.org.uk
State New
Headers show
Series
  • Update list of i686-class processors in sysdeps/x86/cpu-features.h
Related show

Commit Message

Joseph Myers Sept. 17, 2018, 11:07 p.m.
I noticed that sysdeps/x86/cpu-features.h had conditionals on whether
to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of
preprocessor macros for i686-and-later processors which however was
out of date.  This patch adds more such macros based on the list in
current GCC.  It seems HAS_I586 and HAS_I686 are unused so the only
effect of these macros being missing is that 32-bit glibc built for
one of these processors would end up doing runtime detection of CPUID
availability.

Tested for x86.

Question: would it be better to implement this conditional in a
negative sense (!defined __i486__ && !defined __i586__ && !defined
__geode__, based on what macros are in the fixed conditional, but
maybe using __k6__ instead of __geode__ based on GCC's understanding
of the options in question) to reduce the chances of it needing
updating in future?  (__i586__ is actually handled above.)

Question: is the inclusion of __k6__ in the present conditional
logically incorrect, and the omission of __geode__ logically
incorrect, as regards whether to define HAVE_I686?  The way GCC
defines -march=k6 and -march=geode, it seems to think that the former
excludes CMOV and the latter includes it.  (-march=geode is
specifically "AMD Geode embedded processor with MMX and 3DNow!@:
instruction set support.".)

2018-09-17  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/x86/cpu-features.h [__goldmont__ || __goldmont_plus__ ||
	__tremont__ || __knm__ || __skylake__ || __skylake_avx512__ ||
	__cannonlake__ || __icelake_client__ || __icelake_server__ ||
	__znver1__]: Also count as i686 processors.

Comments

Florian Weimer Sept. 19, 2018, 6:33 a.m. | #1
* Joseph Myers:

> Question: would it be better to implement this conditional in a
> negative sense (!defined __i486__ && !defined __i586__ && !defined
> __geode__, based on what macros are in the fixed conditional, but
> maybe using __k6__ instead of __geode__ based on GCC's understanding
> of the options in question) to reduce the chances of it needing
> updating in future?  (__i586__ is actually handled above.)

I think the GCC developers expect that we use the individual feature
test macros (__SSE2__ etc.).  The model-based macros are useless
because CPU features are increasingly non-monotonic, and it does not
make sense to replicate GCC's view of the Intel and AMD roadmaps in
glibc.

> Question: is the inclusion of __k6__ in the present conditional
> logically incorrect, and the omission of __geode__ logically
> incorrect, as regards whether to define HAVE_I686?  The way GCC
> defines -march=k6 and -march=geode, it seems to think that the former
> excludes CMOV and the latter includes it.  (-march=geode is
> specifically "AMD Geode embedded processor with MMX and 3DNow!@:
> instruction set support.".)

Geode lacks support for long NOPs, but our i686 port requires them.
Joseph Myers Sept. 19, 2018, 12:39 p.m. | #2
On Wed, 19 Sep 2018, Florian Weimer wrote:

> * Joseph Myers:
> 
> > Question: would it be better to implement this conditional in a
> > negative sense (!defined __i486__ && !defined __i586__ && !defined
> > __geode__, based on what macros are in the fixed conditional, but
> > maybe using __k6__ instead of __geode__ based on GCC's understanding
> > of the options in question) to reduce the chances of it needing
> > updating in future?  (__i586__ is actually handled above.)
> 
> I think the GCC developers expect that we use the individual feature
> test macros (__SSE2__ etc.).  The model-based macros are useless

Which don't exist for CMOV, for example.

We have a general principle that it's best for compile-time selection of 
subarchitecture variants of functions to be based on how the compiler 
behaves rather than configuring for a variant host triplet or using 
--with-cpu.  We could e.g. select SSE functions by default based on 
__SSE2__ (even if we don't do so at present).  But testing whether the 
compiler produces a CMOV instruction for particular code seems more 
fragile than testing #if conditions for processors without CMOV.

As I suggested in the discussion in bug 23630, I think it would make sense 
to eliminate, or at least minimize, the i586 / i686 / i786 sysdeps 
directories, to reduce the number of function variants to be maintained; 
maybe there are a few places where having conditionals in files on whether 
CMOV is supported would be awkward, but in many places either an 
i686-tuned version can be used as generic, or only small local 
conditionals for CMOV support are needed.  E.g. 
sysdeps/i386/i686/pthread_spin_trylock.S defines HAVE_CMOV; that ought to 
be something available in a generic header, based on some way on what the 
compiler predefines.

> > Question: is the inclusion of __k6__ in the present conditional
> > logically incorrect, and the omission of __geode__ logically
> > incorrect, as regards whether to define HAVE_I686?  The way GCC
> > defines -march=k6 and -march=geode, it seems to think that the former
> > excludes CMOV and the latter includes it.  (-march=geode is
> > specifically "AMD Geode embedded processor with MMX and 3DNow!@:
> > instruction set support.".)
> 
> Geode lacks support for long NOPs, but our i686 port requires them.

Requires them where?  Through the use of -Wa,-mtune=i686 in 
sysdeps/i386/i686/Makefile, or also through explicit uses?
H.J. Lu Sept. 19, 2018, 12:47 p.m. | #3
On Wed, Sep 19, 2018 at 5:39 AM, Joseph Myers <joseph@codesourcery.com> wrote:

>>
>> Geode lacks support for long NOPs, but our i686 port requires them.
>
> Requires them where?  Through the use of -Wa,-mtune=i686 in
> sysdeps/i386/i686/Makefile, or also through explicit uses?

CET is allowed for i686, which uses long NOPs.
Joseph Myers Sept. 19, 2018, 12:52 p.m. | #4
On Wed, 19 Sep 2018, H.J. Lu wrote:

> On Wed, Sep 19, 2018 at 5:39 AM, Joseph Myers <joseph@codesourcery.com> wrote:
> 
> >>
> >> Geode lacks support for long NOPs, but our i686 port requires them.
> >
> > Requires them where?  Through the use of -Wa,-mtune=i686 in
> > sysdeps/i386/i686/Makefile, or also through explicit uses?
> 
> CET is allowed for i686, which uses long NOPs.

Isn't that just a matter of what's enabled by an explicit configure 
option, and so not relevant to conditionals that might be needed when 
merging / moving some code from i686 sysdeps directories to the generic 
i386 versions?
Florian Weimer Sept. 19, 2018, 8:05 p.m. | #5
Joseph Myers <joseph@codesourcery.com> writes:

> On Wed, 19 Sep 2018, Florian Weimer wrote:
>
>> * Joseph Myers:
>> 
>> > Question: would it be better to implement this conditional in a
>> > negative sense (!defined __i486__ && !defined __i586__ && !defined
>> > __geode__, based on what macros are in the fixed conditional, but
>> > maybe using __k6__ instead of __geode__ based on GCC's understanding
>> > of the options in question) to reduce the chances of it needing
>> > updating in future?  (__i586__ is actually handled above.)
>> 
>> I think the GCC developers expect that we use the individual feature
>> test macros (__SSE2__ etc.).  The model-based macros are useless
>
> Which don't exist for CMOV, for example.

That's really unfortunate.

> We have a general principle that it's best for compile-time selection of 
> subarchitecture variants of functions to be based on how the compiler 
> behaves rather than configuring for a variant host triplet or using 
> --with-cpu.  We could e.g. select SSE functions by default based on 
> __SSE2__ (even if we don't do so at present).  But testing whether the 
> compiler produces a CMOV instruction for particular code seems more 
> fragile than testing #if conditions for processors without CMOV.

I agree about CMOV because it is extremely unlikely that we will see
another useful x86 CPU that doesn't have CMOV.

> As I suggested in the discussion in bug 23630, I think it would make sense 
> to eliminate, or at least minimize, the i586 / i686 / i786 sysdeps 
> directories, to reduce the number of function variants to be maintained; 
> maybe there are a few places where having conditionals in files on whether 
> CMOV is supported would be awkward, but in many places either an 
> i686-tuned version can be used as generic, or only small local 
> conditionals for CMOV support are needed.  E.g. 
> sysdeps/i386/i686/pthread_spin_trylock.S defines HAVE_CMOV; that ought to 
> be something available in a generic header, based on some way on what the 
> compiler predefines.

That particular function should probably be written in C, so that the
compiler can produce a better branch-free sequence, without using CMOV.
(I would be slightly surprised if the availability of CMOV would matter
to assembler code tuned for generic x86.)

>> > Question: is the inclusion of __k6__ in the present conditional
>> > logically incorrect, and the omission of __geode__ logically
>> > incorrect, as regards whether to define HAVE_I686?  The way GCC
>> > defines -march=k6 and -march=geode, it seems to think that the former
>> > excludes CMOV and the latter includes it.  (-march=geode is
>> > specifically "AMD Geode embedded processor with MMX and 3DNow!@:
>> > instruction set support.".)
>> 
>> Geode lacks support for long NOPs, but our i686 port requires them.
>
> Requires them where?  Through the use of -Wa,-mtune=i686 in 
> sysdeps/i386/i686/Makefile, or also through explicit uses?

Hmm.  I may have misremembered.  The current port should be okay as long
as CET isn't enabled.

For a while, GAS generated long NOPs with -Wa,-mtune=i686, but I found
this commit in Fedora:

commit 55a30785d463ecd9cc1fa0e4b384b9f8b93a370a
Author: Jeff Law <law@redhat.com>
Date:   Mon Sep 24 09:25:31 2012 -0600

      - Remove most of fedora-nscd patch as we no longer use the
        old init files, but systemd instead.
      - Remove path-to-vi patch.  With the usr-move changes that
        patch is totally unnecessary.
      - Remove i686-nopl patch.  Gas was changed back in 2011 to
        avoid nopl.
      - Move gai-rfc1918 patch to submitted upstream status

And that re-instantiated the -Wa,-mtune=i686 options which were patched
out before.

We also didn't receive any complaints that Fedora 27 did not run on AMD
Geode, so glibc 2.26 at least was still okay.

Thanks,
Florian
Joseph Myers Sept. 19, 2018, 9:42 p.m. | #6
On Wed, 19 Sep 2018, Florian Weimer wrote:

> > We have a general principle that it's best for compile-time selection of 
> > subarchitecture variants of functions to be based on how the compiler 
> > behaves rather than configuring for a variant host triplet or using 
> > --with-cpu.  We could e.g. select SSE functions by default based on 
> > __SSE2__ (even if we don't do so at present).  But testing whether the 
> > compiler produces a CMOV instruction for particular code seems more 
> > fragile than testing #if conditions for processors without CMOV.
> 
> I agree about CMOV because it is extremely unlikely that we will see
> another useful x86 CPU that doesn't have CMOV.

So how about this patch that makes sysdeps/x86/cpu-features.h use short 
lists of processors without certain features instead of long lists of 
those with them?


Invert sense of list of i686-class processors in sysdeps/x86/cpu-features.h.

I noticed that sysdeps/x86/cpu-features.h had conditionals on whether
to define HAS_CPUID, HAS_I586 and HAS_I686 with a long list of
preprocessor macros for i686-and-later processors which however was
out of date.  This patch avoids the problem of the list getting out of
date by instead having conditionals on all the (few, old) pre-i686
processors for which GCC has preprocessor macros, rather than the
(many, expanding list) i686-and-later processors.  It seems HAS_I586
and HAS_I686 are unused so the only effect of these macros being
missing is that 32-bit glibc built for one of these processors would
end up doing runtime detection of CPUID availability.

i386 builds are prevented by a configure test so there is no need to
allow for them here.  __geode__ (no long nops?) and __k6__ (no CMOV,
at least according to GCC) are conservatively handled as i586, not
i686, here (as noted above, this is a theoretical distinction at
present in that only HAS_CPUID appears to be used).

Tested for x86.

2018-09-19  Joseph Myers  <joseph@codesourcery.com>

	* sysdeps/x86/cpu-features.h [__geode__ || __k6__]: Handle like
	[__i586__ || __pentium__].
	[__i486__]: Handle explicitly.
	(HAS_CPUID): Define to 1 if above macros are undefined.
	(HAS_I586): Likewise.
	(HAS_I686): Likewise.

diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index d342664c64..fb22d7b9d6 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -257,30 +257,19 @@ extern const struct cpu_features *__get_cpu_features (void)
 
 #ifdef __x86_64__
 # define HAS_CPUID 1
-#elif defined __i586__ || defined __pentium__
+#elif (defined __i586__ || defined __pentium__	\
+       || defined __geode__ || defined __k6__)
 # define HAS_CPUID 1
 # define HAS_I586 1
 # define HAS_I686 HAS_ARCH_FEATURE (I686)
-#elif (defined __i686__ || defined __pentiumpro__		\
-       || defined __pentium4__ || defined __nocona__		\
-       || defined __atom__ || defined __core2__			\
-       || defined __corei7__ || defined __corei7_avx__		\
-       || defined __core_avx2__	|| defined __nehalem__		\
-       || defined __sandybridge__ || defined __haswell__	\
-       || defined __knl__ || defined __bonnell__		\
-       || defined __silvermont__				\
-       || defined __k6__ || defined __k8__			\
-       || defined __athlon__ || defined __amdfam10__		\
-       || defined __bdver1__ || defined __bdver2__		\
-       || defined __bdver3__ || defined __bdver4__		\
-       || defined __btver1__ || defined __btver2__)
-# define HAS_CPUID 1
-# define HAS_I586 1
-# define HAS_I686 1
-#else
+#elif defined __i486__
 # define HAS_CPUID 0
 # define HAS_I586 HAS_ARCH_FEATURE (I586)
 # define HAS_I686 HAS_ARCH_FEATURE (I686)
+#else
+# define HAS_CPUID 1
+# define HAS_I586 1
+# define HAS_I686 1
 #endif
 
 #endif  /* cpu_features_h */
Florian Weimer Sept. 20, 2018, 8:45 a.m. | #7
* Joseph Myers:

> 2018-09-19  Joseph Myers  <joseph@codesourcery.com>
>
> 	* sysdeps/x86/cpu-features.h [__geode__ || __k6__]: Handle like
> 	[__i586__ || __pentium__].
> 	[__i486__]: Handle explicitly.
> 	(HAS_CPUID): Define to 1 if above macros are undefined.
> 	(HAS_I586): Likewise.
> 	(HAS_I686): Likewise.

Looks okay to me, thanks.

Florian

Patch

diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index d342664c64..e086b5f09e 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -268,12 +268,18 @@  extern const struct cpu_features *__get_cpu_features (void)
        || defined __core_avx2__	|| defined __nehalem__		\
        || defined __sandybridge__ || defined __haswell__	\
        || defined __knl__ || defined __bonnell__		\
-       || defined __silvermont__				\
+       || defined __silvermont__ || defined __goldmont__	\
+       || defined __goldmont_plus__ || defined __tremont__	\
+       || defined __knm__ || defined __skylake__		\
+       || defined __skylake_avx512__ || defined __cannonlake__	\
+       || defined __icelake_client__				\
+       || defined __icelake_server__				\
        || defined __k6__ || defined __k8__			\
        || defined __athlon__ || defined __amdfam10__		\
        || defined __bdver1__ || defined __bdver2__		\
        || defined __bdver3__ || defined __bdver4__		\
-       || defined __btver1__ || defined __btver2__)
+       || defined __btver1__ || defined __btver2__		\
+       || defined __znver1__)
 # define HAS_CPUID 1
 # define HAS_I586 1
 # define HAS_I686 1