[PATCHv4,2/2] powerpc64le: ifunc (almost) all *f128 routines in multiarch mode

Message ID f22cb700cb51c7e3562b22e459677047b0f6b03a.1592254374.git.murphyp@linux.vnet.ibm.com
State New
Series [PATCHv4,1/2] powerpc64le: refactor e_sqrtf128.c

Commit Message

Paul E. Murphy June 15, 2020, 8:59 p.m. UTC
See the Makefile changes for high level design/commentary.

V4 changes -
  * Drop patch to add libm_alias_exclusive_ldouble.  After
    recent refactoring of fmaf128, it showed some unfixable
    flaws.  Instead, use macro renaming for nextafterf128 to
    generate the needed symbols, and rework.

V3 changes -
  * Cleanup comments.
  * Rebase against fmaf128 cleanup
  * Use Makeconfig trick to set var in le/power9 sysdep dir to
    determine if ifunc support is necessary.  This works with
    the upcoming CPU detection patch.
  * fmaf128 patch is no longer needed.

V2 changes -
  * move duplicate redirect macros into float128-ifunc-redirect-macros.h
  * replace subshell usage with command sequencing
  * Add more instructive documentation in Makefile about how all
    these ugly pieces work together
  * Minor comment cleanup throughout
  * Improve inline documentation/commentary throughout

---8<---

Programmatically generate simple wrappers for most libm *f128
objects and a set of ifunc objects to unify them.

A second set of implementation files are generated which simply
include the first implementation encountered along the search
path.  This usually works, excepting when a wrapper is overridden
and makefile search order slightly diverges from include order.

A set of additional headers are included which primarily rely
on asm redirects to rename, and less frequently macro renames
where an asm redirect is not possible.  These intercept several
common headers to install redirect and disable macros at specific
times.  This works surprisingly well.  Notably, some ugliness
occurs when header inclusion must be coerced at certain times
before turning off aliasing and plt bypass wrappers.
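
As a rough illustration of the asm-redirect technique described above, here is a minimal sketch, assuming GCC or Clang on an ELF target. The names demo_twice, demo_twice_power8 and call_generic_name are hypothetical and do not appear in the patch; only the shape of the redirect declaration follows the F128_REDIR_PFX_R macro.

```c
#include <assert.h>

/* Generic prototype, as a shared header would declare it.  */
double demo_twice (double x);

/* ASM redirect: from here on, references to demo_twice in this translation
   unit are emitted against the assembler symbol demo_twice_power8.  */
extern __typeof (demo_twice) demo_twice __asm ("demo_twice_power8");

/* The suffixed implementation that provides that symbol.  */
double
demo_twice_power8 (double x)
{
  return 2.0 * x;
}

/* Code written against the generic name; it binds to the _power8 symbol.
   No plain demo_twice is ever defined, so a missed redirect would fail at
   link time.  */
double
call_generic_name (double x)
{
  return demo_twice (x);
}

/* Small self-check for the sketch.  */
void
redirect_demo_check (void)
{
  assert (call_generic_name (21.0) == 42.0);
}
```

Because the rename happens at the symbol level rather than in the preprocessor, the calling code is left unchanged, which is presumably why the patch prefers it over macro renames where possible.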

Notably, the only special case is s_significandf128.c.  It is
doubly special as it exists to support ldouble redirects, and
exposes a subtle difference between makefile rules and search path
orders.  Commentary is inlined.

Admittedly, this makes shared maintenance a tiny bit more
difficult, but lays groundwork for supporting more optimized
float128 routines which very overtly assume a soft-fp runtime.
Changes to internal float128 API should fail at compile time,
thus build-many-glibcs.py should readily catch any divergence.

Finally, don't build this support if requested CPU is newer
than power8.

fixup f128 ifunc

drop the patch to introduce the new macro to assist simplification of
s_nextafter.c.  It wasn't thought out well enough.  Instead just add
the ugly macro redirections needed to generate the appropriate
nexttoward symbols.
---
 .../powerpc64/le/fpu/multiarch/Makefile       | 210 ++++++++++++++++-
 .../le/fpu/multiarch/float128-ifunc-macros.h  |  68 ++++++
 .../float128-ifunc-redirect-macros.h          |  52 +++++
 .../multiarch/float128-ifunc-redirects-mp.h   |  64 ++++++
 .../fpu/multiarch/float128-ifunc-redirects.h  |  40 ++++
 .../le/fpu/multiarch/float128-ifunc.c         |  66 ++++++
 .../le/fpu/multiarch/float128-ifunc.h         | 217 ++++++++++++++++++
 .../le/fpu/multiarch/float128_private.h       | 143 ++++++++++++
 .../fpu/multiarch/math-type-macros-float128.h | 136 +++++++++++
 .../powerpc64/le/fpu/multiarch/math_private.h |  15 ++
 .../le/fpu/multiarch/s_fmaf128-power9.c       |  28 ---
 .../le/fpu/multiarch/s_fmaf128-ppc64.c        |  26 ---
 .../powerpc64/le/fpu/multiarch/s_fmaf128.c    |  36 ---
 .../le/fpu/multiarch/w_sqrtf128-power9.c      |  35 ---
 .../le/fpu/multiarch/w_sqrtf128-ppc64le.c     |  35 ---
 .../powerpc64/le/fpu/multiarch/w_sqrtf128.c   |  31 ---
 .../powerpc/powerpc64/le/power9/Makeconfig    |   3 +
 17 files changed, 1008 insertions(+), 197 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
 create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
 delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
 create mode 100644 sysdeps/powerpc/powerpc64/le/power9/Makeconfig

Comments

Tulio Magno Quites Machado Filho June 19, 2020, 10:36 p.m. UTC | #1
"Paul E. Murphy via Libc-alpha" <libc-alpha@sourceware.org> writes:

> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> index 8747b02127..3974345d24 100644
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> @@ -1,10 +1,208 @@
>  ifeq ($(subdir),math)
> -libm-sysdep_routines += s_fmaf128-ppc64 s_fmaf128-power9 \
> -			w_sqrtf128-power9 w_sqrtf128-ppc64le
>  
> -CFLAGS-s_fmaf128-ppc64.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
> -CFLAGS-s_fmaf128-power9.c += $(type-float128-CFLAGS) -mcpu=power9 $(no-gnu-attribute-CFLAGS)
> +#
> +# Only enable ifunc _Float128 support if the baseline cpu support
> +# is older than power9.
> +ifneq (yes,$(libc-submachine-power9))
> +do_f128_multiarch = yes
> +endif

I expected that:

--disable-multi-arch	-> do_f128_multiarch = no,	OK
--with-cpu=power8	-> do_f128_multiarch = yes,	OK
--with-cpu=power9	-> do_f128_multiarch = no,	OK
--with-cpu=power >= 9	-> do_f128_multiarch = no,	OK

The last line is true because Makeconfig is inherited based on Implies.
So, looks good.

> +#   * float128_private.h is currently used to rename the ldouble == ieee128
> +#                      object files today.  This takes it a step further and
> +#                      redirects symbols to _power9 or _power8 variants of the
> +#                      functions.  This supports nearly all files in
> +#                      sysdeps/ieee754/float128, but not all _Float128 objects.
> +#                      However, there are three distinct build configurations
> +#                      used to compile _Float128 support.  Two other headers
> +#                      below complete the ABI redirection.
> +#   * math-type-macros-float128.h supports renames for the common object files
> +#                      which are built from templates in math/.
> +#   * math_private.h provides rename support for the common files built in math/
> +#                      which are neither template generated nor ldbl-128 specific.
> +#                      It should be noted that float128_private.h and math_private.h
> +#                      overlap in their declarations, and are used orthogonally.

There are a couple of long lines here.

> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
> +   F128_SFX_APPEND(sym)
> +    Append the the multiarch cpu specific suffix to the sym. sym is not
> +    expanded.  This is sym ## cpu, where cpu is eiter power8 or power9

s/eiter/either/
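
For readers unfamiliar with why "sym is not expanded" matters here, the following is a small sketch of the token-pasting behaviour, assuming a C99-or-later preprocessor. The demo_sin symbol, the DEMO_* helpers and the two functions are hypothetical; only the F128_SFX_APPEND shape is taken from the patch, and the _power9 branch is assumed to mirror the _power8 one.

```c
#include <assert.h>
#include <string.h>

/* Same shape as the patch: the suffix follows the compiler's target CPU.  */
#ifndef _ARCH_PWR9
# define F128_SFX_APPEND(x) x ## _power8
# define DEMO_EXPECTED "demo_sin_power8"
#else
# define F128_SFX_APPEND(x) x ## _power9
# define DEMO_EXPECTED "demo_sin_power9"
#endif

/* A macro with the same name as the symbol, to show it is not expanded
   when used as an operand of ##.  */
#define demo_sin demo_should_not_appear

/* Two-level stringification so the pasted result is what gets quoted.  */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_ (x)

const char *
suffixed_name (void)
{
  return DEMO_STR (F128_SFX_APPEND (demo_sin));
}

/* Small self-check for the sketch.  */
void
suffix_demo_check (void)
{
  assert (strcmp (suffixed_name (), DEMO_EXPECTED) == 0);
}
```

The argument is pasted as written, so an unrelated macro that happens to share the symbol's name cannot leak into the suffixed identifier.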

> +/* _F128_IFUNC2(func, from, r)
> +      Generate an ifunc symbol func ## r from the symbols
> +	from ## {power8, power9} ## r
> +
> +      We use the PPC hwcap bit HAS_IEEE128 to select between the two with
> +      the assumption all P9 features are available on such targets.  */
> +#define _F128_IFUNC2(func, from, r) \
> +	libc_ifunc (func ## r, (hwcap2 & PPC_FEATURE2_HAS_IEEE128) \
> +                                ? from ## _power9 ## r : from ## _power8 ## r)

OK.
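
The selection logic in _F128_IFUNC2 can be sketched in portable C as below. This is only an illustration: a real ifunc resolver runs at relocation time and glibc supplies hwcap2 itself, whereas this sketch takes it as a parameter. DEMO_HAS_IEEE128 is a locally defined placeholder for PPC_FEATURE2_HAS_IEEE128, and all demo_* names are hypothetical.

```c
#include <assert.h>

/* Placeholder for PPC_FEATURE2_HAS_IEEE128; on a real powerpc64le system
   the value comes from the system headers, not from this sketch.  */
#define DEMO_HAS_IEEE128 0x00400000UL

typedef int (*demo_impl_fn) (void);

/* Stand-ins for the two suffixed implementations of one entry point.  */
static int demo_sqrt_power8 (void) { return 8; }
static int demo_sqrt_power9 (void) { return 9; }

/* Pick the power9 variant only when the IEEE128 capability bit is set,
   otherwise fall back to the power8 (soft-float128) baseline.  */
demo_impl_fn
demo_select (unsigned long hwcap2)
{
  return (hwcap2 & DEMO_HAS_IEEE128) ? demo_sqrt_power9 : demo_sqrt_power8;
}

/* Small self-check for the sketch.  */
void
select_demo_check (void)
{
  assert (demo_select (0UL) () == 8);
  assert (demo_select (DEMO_HAS_IEEE128) () == 9);
}
```

Keying everything off a single capability bit matches the stated assumption in the quoted comment that all P9 features are available whenever that bit is present.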

> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
> new file mode 100644
> index 0000000000..bc210b17cf
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
> +/* This is hack.  The build directory is favored over the sysdep directorys.

s/directorys/directories/

LGTM.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Adhemerval Zanella June 22, 2020, 4:57 p.m. UTC | #2
On 15/06/2020 17:59, Paul E. Murphy via Libc-alpha wrote:
> See the Makefile changes for high level design/commentary.
> 
> V4 changes -
>   * Drop patch to add libm_alias_exclusive_ldouble.  After
>     recent refactoring of fmaf128, it showed some unfixable
>     flaws.  Instead, use macro renaming for nextafterf128 to
>     generate the needed symbols, and rework.
> 
> V3 changes -
>   * Cleanup comments.
>   * Rebase against fmaf128 cleanup
>   * Use Makeconfig trick to set var in le/power9 sysdep dir to
>     determine if ifunc support is necessary.  This works with
>     the upcoming CPU detection patch.
>   * fmaf128 patch is no longer needed.
> 
> V2 changes -
>   * move duplicate redirect macros into float128-ifunc-redirect-macros.h
>   * replace subshell usage with command sequencing
>   * Add more instructive documentation in Makefile about how all
>     these ugly pieces work together
>   * Minor comment cleanup throughout
>   * Improve inline documentation/commentary throughout
> 
> ---8<---
> 
> Programmatically generate simple wrappers for most libm *f128
> objects and a set of ifunc objects to unify them.
> 
> A second set of implementation files are generated which simply
> include the first implementation encountered along the search
> path.  This usually works, excepting when a wrapper is overridden
> and makefile search order slightly diverges from include order.
> 
> A set of additional headers are included which primarily rely
> on asm redirects to rename, and less frequently macro renames
> where an asm redirect is not possible.  These intercept several
> common headers to install redirect and disable macros at specific
> times.  This works surprisingly well.  Notably, some ugliness
> occurs when header inclusion must be coerced at certain times
> before turning off aliasing and plt bypass wrappers.
> 
> Notably, the only special case is s_significandf128.c.  It is
> doubly special as it exists to support ldouble redirects, and
> exposes a subtle difference between makefile rules and search path
> orders.  Commentary is inlined.
> 
> Admittedly, this makes shared maintenance a tiny bit more
> difficult, but lays groundwork for supporting more optimized
> float128 routines which very overtly assume a soft-fp runtime.
> Changes to internal float128 API should fail at compile time,
> thus build-many-glibcs.py should readily catch any divergence.
> 
> Finally, don't build this support if requested CPU is newer
> than power8.
> 
> fixup f128 ifunc
> 
> drop the patch to introduce the new macro to assist simplification of
> s_nextafter.c.  It wasn't thought out well enough.  Instead just add
> the ugly macro redirections needed to generate the appropriate
> nexttoward symbols.

I am trying to digest the requirements that justify adding such
complexity to the powerpc64le build rules, especially the internal
Makefile hackery required.

So, if I understood correctly, let's say we have these targets:

  1. powerpc64le-linux-gnu
  2. powerpc64le-linux-gnu with --with-cpu=power9

The ifunc mechanism to build optimized versions for power9 will be
built only for 1, while for 2 only versions that use hardware
instructions for __float128 (the -mfloat128-hardware gcc option)
will be used.

So all the redirection machinery in the float128-ifunc-* headers is
there to list and redirect internal libm symbols to their float128
counterparts.  One initial issue is that this tends to be fragile: it
requires changing arch-specific code when generic code is changed (for
instance by changing the internal symbol name or the caller
implementation).

Another issue is the rule exceptions (such as s_totalorderf128) that
require additional care to check whether they result in correct code.

Another possible maintenance issue is having to keep updating the
exported symbol list in float128-ifunc.c, float128-ifunc.h, and
float128_private.h for each new symbol in future versions.
It again means correcting/changing arch-specific code for generic
changes.

It also increases code size considerably, with the potential to keep
increasing with the addition of new libm functions.

Finally, the question is how useful this change would be in real-world
cases to justify this huge build and permutation complexity.

What I would expect in real-world cases is that, if the workload really
uses float128 extensively, it would be built with -mcpu=power9 and/or
-mfloat128/-mfloat128-hardware.  That should cover most of the required
hotspots, and glibc can focus on providing ifunc variants only in cases
where a specialized one does make sense (as for x86_64 in
sysdeps/x86_64/fpu/multiarch/mp*, for instance).

Also, if an optimized float128 glibc build is paramount, a much
simpler solution would be to just provide one built with -mcpu=power9.


> ---
>  .../powerpc64/le/fpu/multiarch/Makefile       | 210 ++++++++++++++++-
>  .../le/fpu/multiarch/float128-ifunc-macros.h  |  68 ++++++
>  .../float128-ifunc-redirect-macros.h          |  52 +++++
>  .../multiarch/float128-ifunc-redirects-mp.h   |  64 ++++++
>  .../fpu/multiarch/float128-ifunc-redirects.h  |  40 ++++
>  .../le/fpu/multiarch/float128-ifunc.c         |  66 ++++++
>  .../le/fpu/multiarch/float128-ifunc.h         | 217 ++++++++++++++++++
>  .../le/fpu/multiarch/float128_private.h       | 143 ++++++++++++
>  .../fpu/multiarch/math-type-macros-float128.h | 136 +++++++++++
>  .../powerpc64/le/fpu/multiarch/math_private.h |  15 ++
>  .../le/fpu/multiarch/s_fmaf128-power9.c       |  28 ---
>  .../le/fpu/multiarch/s_fmaf128-ppc64.c        |  26 ---
>  .../powerpc64/le/fpu/multiarch/s_fmaf128.c    |  36 ---
>  .../le/fpu/multiarch/w_sqrtf128-power9.c      |  35 ---
>  .../le/fpu/multiarch/w_sqrtf128-ppc64le.c     |  35 ---
>  .../powerpc64/le/fpu/multiarch/w_sqrtf128.c   |  31 ---
>  .../powerpc/powerpc64/le/power9/Makeconfig    |   3 +
>  17 files changed, 1008 insertions(+), 197 deletions(-)
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
>  create mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
>  delete mode 100644 sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
>  create mode 100644 sysdeps/powerpc/powerpc64/le/power9/Makeconfig
> 
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> index 8747b02127..3974345d24 100644
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
> @@ -1,10 +1,208 @@
>  ifeq ($(subdir),math)
> -libm-sysdep_routines += s_fmaf128-ppc64 s_fmaf128-power9 \
> -			w_sqrtf128-power9 w_sqrtf128-ppc64le
>  
> -CFLAGS-s_fmaf128-ppc64.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
> -CFLAGS-s_fmaf128-power9.c += $(type-float128-CFLAGS) -mcpu=power9 $(no-gnu-attribute-CFLAGS)
> +#
> +# Only enable ifunc _Float128 support if the baseline cpu support
> +# is older than power9.
> +ifneq (yes,$(libc-submachine-power9))
> +do_f128_multiarch = yes
> +endif
> +
> +#
> +# This is an ugly, but contained, mechanism to provide hardware optimized
> +# _Float128 and ldouble == ieee128 optimized routines for P9 and beyond
> +# hardware.  At a very high level, we rely on ASM renames, and rarely
> +# macro renames to build two sets of _Float128 ABI, one with _power8 (the
> +# baseline powerpc64le cpu) and power9 (the first powerpc64le cpu to introduce
> +# hardware support for _Float128).
> +#
> +# At a high level, we compile 3 files for each object file.
> +#   1.  The baseline soft-float128, unsuffixed objects $(object).$(sfx).
> +#       This ABI is suffixed with _power8.
> +#   2.  The hard-float128, power9, suffixed objects $(object)-power9.$(sfx)
> +#   3.  The IFUNC wrapper object to export ABI, $(object)-ifunc.$(sfx)
> +#
> +# 2 & 3 are automatically generated by Makefile rule.  Placing the exported
> +# ABI into a separate file allows reuse of existing aliasing macros
> +# with minimal hassle.  Likewise, a backdoor is provided to unilaterally
> +# disable this support per object.
> +#
> +# Changes to APIs will require minor updates to one (or two) places:
> +#
> +#   * Internal float128 API: the float128_private.h interposer.
> +#   * math_private.h API: float128-ifunc-redirects-mp.h
> +#   * templated math API: the math-type-macros-float128.h interposer.
> +#
> +# Some redirects are duplicated between both float128_private.h and
> +# math-type-macros-float128.h as they are not usually included together
> +# when building libm.  The hope is this provides minimal burden on
> +# maintainers, and is readily caught by build-many-glibcs.py.
> +#
> +# The above is supported by several carefully crafted header files as
> +# described below:
> +#
> +#   * float128-ifunc.h provides support for generating the IFUNC objects
> +#                      in part 3 above.  It also enables case-by-case
> +#                      overriding as some objects do not expose a uniform
> +#                      ABI.
> +#   * float128-ifunc.c provides compatibility ABI using the IFUNC objects.
> +#                      These should rarely change and don't cause trouble
> +#                      when grouped into a single object file as they are
> +#                      only needed for the shared library.
> +#   * float128-ifunc-macros.h disables all first-order aliasing macros
> +#                      used in libm/_Float128, but not the backing
> +#                      implementations provided by libc-symbols.h as some
> +#                      objects generate strong aliases which make this
> +#                      work easier.
> +#   * float128-ifunc-redirect-macros.h provides macros to support ASM
> +#                      redirect of _Float128 ABI.
> +#   * float128-ifunc-redirects.h provides ASM redirects for functions
> +#                      which are nominally redirected in the private
> +#                      copy of math.h.
> +#   * float128-ifunc-redirects-mp.h provides ASM redirects which are used
> +#                      by math_private.h (the -mp suffix) and the interposer
> +#                      float128_private.h discussed late.
> +#
> +# The headers above should only be included via the interposed headers
> +# discussed below.  Several commonly used headers are interposed to rename all
> +# via ASM redirects.  This requires careful orchestration of header inclusion
> +# to ensure headers are redirected to exclusively _power8 or _power9 suffixed
> +# ABI.  This also has the desirable side-effect of bypassing the PLT locally
> +# and generating compile time errors if a function is missed or changed.
> +#
> +#   * float128_private.h is currently used to rename the ldouble == ieee128
> +#                      object files today.  This takes it a step further and
> +#                      redirects symbols to _power9 or _power8 variants of the
> +#                      functions.  This supports nearly all files in
> +#                      sysdeps/ieee754/float128, but not all _Float128 objects.
> +#                      However, there are three distinct build configurations
> +#                      used to compile _Float128 support.  Two other headers
> +#                      below complete the ABI redirection.
> +#   * math-type-macros-float128.h supports renames for the common object files
> +#                      which are built from templates in math/.
> +#   * math_private.h provides rename support for the common files built in math/
> +#                      which are neither template generated nor ldbl-128 specific.
> +#                      It should be noted that float128_private.h and math_private.h
> +#                      overlap in their declarations, and are used orthogonally.
> +#
> +#
> +# The above usually works out very well, but there are sometimes special cases
> +# so special you need throw your hands up and give up.  For that, support
> +# is provided to disable the above entirely at an object level.  Today this
> +# includes objects which only provide tables, or have macros so unspeakably
> +# heinous that no reasonable fixup can be provided.  Such objects are declared
> +# in gen-libm-f128-no-ifunc-calls.
> +#
> +# Secondly, this enforces a slightly different mechanism for machine specific
> +# overrides.  That is, all optimizations for all targets must all be reachable
> +# from the same file as the above relies on rebuilding the same file with
> +# different compiler settings.  Most arch specific overrides should be trivial
> +# implementations (e.g sqrt or fma), thus it should present no obstacle.
> +# Likewise, this also enforces them to use the same language (C or ASM today).
> +#
> +# Finally, some designer notes/rambling.  One could naively use target cloning,
> +# but that generates an ifunc per function, not per entry point.  The above
> +# gives us two copies of _Float128 ABI which are entirely isolated, and
> +# need no internal ifunc usage to disambiguate.  ASM renames are preferable
> +# to macro renames.  The latter causes many macro expansion bugs which require
> +# many ugly fixups (that was my first attempt).  Secondly, one may note libgcc
> +# provides ifunc routines for soft-fp functions, why this?  Such callouts
> +# inhibit most compiler optimization and result in not so great code.  Next,
> +# why not libc too?  Inspecting libc, the reachable _Float128 code only makes
> +# a single digit number of soft-fp calls.  The benefit of the above is limited.
> +#
> +ifeq ($(do_f128_multiarch),yes)
> +
> +gen-libm-all-f128-ifunc-calls = \
> +	$(strip $(subst F,$(type-float128-suffix),$(libm-calls)) \
> +		$(foreach f,$(libm-narrow-fns),$(subst F,$(f),$(libm-narrow-types-float128-yes))) \
> +		$(type-float128-routines))
> +
> +# Some functions are not trivial to ifunc today without some extensive refactoring.
> +# totalorder{,mag} have no benefit to native IEEE support and have complex versioning requirements.
> +# Likewise, tables require no special treatment.
> +gen-libm-f128-no-ifunc-calls := s_totalorderf128 s_totalordermagf128 t_sincosf128
> +gen-libm-f128-ifunc-calls = $(filter-out $(gen-libm-f128-no-ifunc-calls),$(gen-libm-all-f128-ifunc-calls))
> +
> +f128-march-routines-p9 = $(addsuffix -power9,$(gen-libm-f128-ifunc-calls))
> +f128-march-routines-ifunc = $(addsuffix -ifunc,$(gen-libm-f128-ifunc-calls))
> +f128-march-routines = $(f128-march-routines-p9) $(f128-march-routines-ifunc)
> +f128-march-cpus = power9
> +
> +libm-routines += $(f128-march-routines) float128-ifunc
> +generated += $(f128-march-routines)
> +
> +# When multiarch support must be explicitly disabled for an object
> +# file, we must also supply a macro hint when building it.  Only
> +# objects which contain executable code should require this.
> +CPPFLAGS-s_totalorderf128.c += -D_F128_DISABLE_IFUNC
> +CPPFLAGS-s_totalordermagf128.c += -D_F128_DISABLE_IFUNC
> +CPPFLAGS-float128-ifunc.c += -D_F128_DISABLE_IFUNC
> +
> +CFLAGS-float128-ifunc.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
> +
> +# Copy special CFLAGS for some functions
> +CFLAGS-m_modff128-power9.c += -fsignaling-nans
> +
> +# Generate wrapper objects for each machine,
> +# and a separate ifunc wrapper.  Likewise substitute
> +# m_%.c files should include s_%.c to match common libm rules
> +# for files built in both libm and libc.
> +$(objpfx)gen-float128-ifuncs.stmp: Makefile
> +	$(make-target-directory)
> +	for gcall in $(gen-libm-f128-ifunc-calls); do    \
> +	  ifile="$${gcall}";                             \
> +	  if [ $${gcall##m_} != $${gcall} ]; then        \
> +	    ifile="s_$${gcall##m_}";                     \
> +	  fi;                                            \
> +	  for cpu in $(f128-march-cpus); do              \
> +	    file=$(objpfx)$${gcall}-$${cpu}.c;           \
> +	    {                                            \
> +	      echo "#include <$${ifile}.c>";             \
> +	    } > $${file};                                \
> +	  done;                                          \
> +	  name="$${gcall##?_}";                          \
> +	  pfx="$${gcall%%_*}";                           \
> +	  R="";                                          \
> +	  r="";                                          \
> +	  if [ $${gcall##m_} != $${gcall} ]; then        \
> +	    pfx="s";                                     \
> +	  fi;                                            \
> +	  if [ $${#pfx} != 1 ]; then                     \
> +	    pfx="";                                      \
> +	  else                                           \
> +	    pfx="_$${pfx}";                              \
> +	  fi;                                            \
> +	  if [ $${name%%_r} != $${name} ]; then          \
> +	    R="_R";                                      \
> +	    r="_r";                                      \
> +	    name="$${name%%_r}";                         \
> +	  fi;                                            \
> +	  name="$${name%%f128}";                         \
> +	  decl="DECL_ALIAS$${pfx}_$${name}$${r}";        \
> +	  declc="DECL_ALIAS$${R}$${pfx}";                \
> +	  {                                              \
> +	    echo "#include <float128-ifunc.h>";          \
> +	    echo "#ifndef $${decl}";                     \
> +	    echo "# define $${decl}(f) $${declc} (f)";   \
> +	    echo "#endif";                               \
> +	    echo "$${decl} ($${name});";                 \
> +	  } > $(objpfx)$${gcall}-ifunc.c;                \
> +	done;                                            \
> +	echo > $(@)
> +
> +$(foreach f,$(f128-march-routines),$(objpfx)$(f).c): $(objpfx)gen-float128-ifuncs.stmp
> +
> +include $(o-iterator)
> +define o-iterator-doit
> +$(foreach f,$(f128-march-routines-p9),$(objpfx)$(f)$(o)): sysdep-CFLAGS += -mcpu=power9 $$(type-float128-CFLAGS) $$(no-gnu-attributes-CFLAGS)
> +endef
> +object-suffixes-left := $(all-object-suffixes)
> +include $(o-iterator)
> +
> +else
> +
> +# Minimum CPU is POWER9 or newer, this support is not needed.
> +math-CPPFLAGS += -D_F128_DISABLE_IFUNC
>  
> -CFLAGS-w_sqrtf128-ppc64le.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
> -CFLAGS-w_sqrtf128-power9.c += $(type-float128-CFLAGS) -mcpu=power9 $(no-gnu-attribute-CFLAGS)
> +endif # do_f128_multiarch
>  endif
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
> new file mode 100644
> index 0000000000..f66d255478
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
> @@ -0,0 +1,68 @@
> +/* _Float128 aliasing macro support for ifunc generation on PPC.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_IFUNC_MACROS_PPC64LE
> +#define _FLOAT128_IFUNC_MACROS_PPC64LE 1
> +
> +/* Bring in the various alias providing headers, and disable
> +   those used for _Float128.  This prevents exporting any ABI
> +   from _Float128 implementation objects, or confusing errors
> +   when a renamed symbol fails to compile.  */
> +#include <libm-alias-float128.h>
> +#include <math-narrow.h>
> +#include <libm-alias-finite.h>
> +
> +#undef libm_alias_float32_float128
> +#undef libm_alias_float64_float128
> +#undef libm_alias_float64x_float128
> +#undef libm_alias_float128_r
> +#undef libm_alias_finite
> +#undef libm_alias_exclusive_ldouble
> +#undef libm_alias_float128_other_r_ldbl
> +#undef declare_mgen_finite_alias
> +#undef declare_mgen_alias
> +#undef declare_mgen_alias_r
> +
> +#define libm_alias_finite(from, to)
> +#define libm_alias_float128_r(from, to, r)
> +#define libm_alias_float32_float128(func)
> +#define libm_alias_float64_float128(func)
> +#define libm_alias_float64x_float128(func)
> +#define libm_alias_exclusive_ldouble(from, to)
> +#define libm_alias_float128_other_r_ldbl(from, to, r)
> +#define declare_mgen_finite_alias(from, to)
> +#define declare_mgen_alias(from, to)
> +#define declare_mgen_alias_r(from, to)
> +
> +/*  Likewise, disable hidden symbol support.  This is not needed
> +    for the implementation objects as the redirects already give
> +    us this support.  This also means any non-_Float128 headers
> +    which provide hidden_def's should be included prior to this
> +    header (only fenv.h during initial support).  */
> +#undef mathx_hidden_def
> +#define mathx_hidden_def(func)
> +#undef libm_hidden_def
> +#define libm_hidden_def(func)
> +#undef libm_hidden_proto
> +#define libm_hidden_proto(f)
> +#undef hidden_proto
> +#define hidden_proto(f)
> +
> +#include <float128-ifunc-redirect-macros.h>
> +
> +#endif /* _FLOAT128_IFUNC_MACROS_PPC64LE */
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
> new file mode 100644
> index 0000000000..a9369d1fae
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
> @@ -0,0 +1,52 @@
> +/* _Float128 aliasing macro support for ifunc generation on PPC.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE
> +#define _FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE 1
> +
> +/* Define the redirection macros used throughout most of the IFUNC headers.
> +
> +   F128_REDIR_PFX_R(function, destination_prefix, reentrant_suffix)
> +    Redirect function, optionally suffixed by reentrant_suffix, to a function
> +    named destination_prefix ## function ## cpu ## reentrant_suffix where cpu
> +    is either _power8 or _power9 as inferred by compiler options.
> +
> +   F128_SFX_APPEND(sym)
> +    Append the multiarch cpu specific suffix to sym.  sym is not
> +    expanded.  This is sym ## cpu, where cpu is either _power8 or _power9
> +    as inferred by compiler options.
> +
> +   F128_REDIR_R(func, reentrant_suffix)
> +    Redirect func ## reentrant_suffix to func ## cpu ## reentrant_suffix
> +    where cpu is either _power8 or _power9 as inferred by compiler options.
> +
> +   F128_REDIR(function)
> +    Redirect function to a function named function ## cpu where cpu is
> +    either _power8 or _power9 as inferred by compiler options.
> +*/
> +#ifndef _ARCH_PWR9
> +#define F128_REDIR_PFX_R(func, pfx, r) extern __typeof(func ## r) func ## r __asm( #pfx #func "_power8" #r );
> +#define F128_SFX_APPEND(x) x ## _power8
> +#else
> +#define F128_REDIR_PFX_R(func, pfx, r) extern __typeof(func ## r) func ## r __asm( #pfx #func "_power9" #r );
> +#define F128_SFX_APPEND(x) x ## _power9
> +#endif
> +#define F128_REDIR_R(func, r) F128_REDIR_PFX_R (func, , r)
> +#define F128_REDIR(func) F128_REDIR_R (func, )
> +
> +#endif /*_FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE */
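
For readers less familiar with the asm-redirect idiom these macros rely on, here is a minimal sketch, assuming GCC or Clang on an ELF target. The demo_* names and the fixed "_power8" suffix are hypothetical and are not part of the patch.

```c
/* Hypothetical stand-in for F128_REDIR: bind the C name FUNC to the
   assembler symbol FUNC_power8.  */
#define DEMO_REDIR(func) \
  extern __typeof (func) func __asm (#func "_power8");

/* Ordinary prototype, as a public header would provide it.  */
int demo_add (int a, int b);

/* After this line, calls to demo_add reference demo_add_power8.  */
DEMO_REDIR (demo_add)

/* The CPU-specific object supplies the suffixed symbol.  */
int
demo_add_power8 (int a, int b)
{
  return a + b;
}

/* Caller written against the unsuffixed name.  */
int
demo_caller (void)
{
  return demo_add (2, 3);
}
```

The caller's source never mentions the suffix; only the emitted symbol reference changes. As the float128_private.h comment in this patch notes, only one asm rename is allowed per symbol, which is why the hidden_proto machinery is disabled in these objects.
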
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
> new file mode 100644
> index 0000000000..3c8b6f1291
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
> @@ -0,0 +1,64 @@
> +/* _Float128 multiarch redirects shared with math_private.h
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_IFUNC_REDIRECTS_MP_H
> +#define _FLOAT128_IFUNC_REDIRECTS_MP_H 1
> +
> +#include <float128-ifunc-redirect-macros.h>
> +
> +F128_REDIR (__ieee754_acosf128)
> +F128_REDIR (__ieee754_acoshf128)
> +F128_REDIR (__ieee754_asinf128)
> +F128_REDIR (__ieee754_atan2f128)
> +F128_REDIR (__ieee754_atanhf128)
> +F128_REDIR (__ieee754_coshf128)
> +F128_REDIR (__ieee754_expf128)
> +F128_REDIR (__ieee754_exp10f128)
> +F128_REDIR (__ieee754_exp2f128)
> +F128_REDIR (__ieee754_fmodf128)
> +F128_REDIR (__ieee754_gammaf128)
> +F128_REDIR_R (__ieee754_gammaf128, _r)
> +F128_REDIR (__ieee754_hypotf128)
> +F128_REDIR (__ieee754_j0f128)
> +F128_REDIR (__ieee754_j1f128)
> +F128_REDIR (__ieee754_jnf128)
> +F128_REDIR (__ieee754_lgammaf128)
> +F128_REDIR_R (__ieee754_lgammaf128, _r)
> +F128_REDIR (__ieee754_logf128)
> +F128_REDIR (__ieee754_log10f128)
> +F128_REDIR (__ieee754_log2f128)
> +F128_REDIR (__ieee754_powf128)
> +F128_REDIR (__ieee754_remainderf128)
> +F128_REDIR (__ieee754_sinhf128)
> +F128_REDIR (__ieee754_sqrtf128)
> +F128_REDIR (__ieee754_y0f128)
> +F128_REDIR (__ieee754_y1f128)
> +F128_REDIR (__ieee754_ynf128)
> +F128_REDIR (__ieee754_scalbf128)
> +F128_REDIR (__ieee754_ilogbf128)
> +F128_REDIR (__ieee754_rem_pio2f128)
> +F128_REDIR (__kernel_sinf128)
> +F128_REDIR (__kernel_cosf128)
> +F128_REDIR (__kernel_tanf128)
> +F128_REDIR (__kernel_sincosf128)
> +F128_REDIR (__kernel_rem_pio2f128)
> +F128_REDIR (__x2y2m1f128)
> +F128_REDIR (__gamma_productf128)
> +F128_REDIR (__lgamma_negf128)
> +
> +#endif /*_FLOAT128_IFUNC_REDIRECTS_MP_H */
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
> new file mode 100644
> index 0000000000..88b71558b0
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
> @@ -0,0 +1,40 @@
> +/* _Float128 redirects for ppc64le multiarch env.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_IFUNC_REDIRECTS
> +#define _FLOAT128_IFUNC_REDIRECTS 1
> +
> +#include <float128-ifunc-macros.h>
> +
> +F128_REDIR_PFX_R (sqrtf128, __,);
> +F128_REDIR_PFX_R (rintf128, __,);
> +F128_REDIR_PFX_R (ceilf128, __,);
> +F128_REDIR_PFX_R (floorf128, __,);
> +F128_REDIR_PFX_R (truncf128, __,);
> +F128_REDIR_PFX_R (roundf128, __,);
> +F128_REDIR_PFX_R (fabsf128, __,);
> +F128_REDIR (__issignalingf128)
> +
> +extern __typeof (ldexpf128) F128_SFX_APPEND (__ldexpf128);
> +
> +#define __isinff128 F128_SFX_APPEND (__isinff128)
> +#define __isnanf128 F128_SFX_APPEND (__isnanf128)
> +#define __finitef128 F128_SFX_APPEND (__finitef128)
> +#define __ldexpf128 F128_SFX_APPEND (__ldexpf128)
> +
> +#endif /* _FLOAT128_IFUNC_REDIRECTS */
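
Where an asm redirect is not possible, the header falls back to macro renaming via F128_SFX_APPEND. A minimal sketch of that mechanism follows, assuming GCC or Clang for __builtin_isinf. The demo_* names are hypothetical and only illustrate the token pasting.

```c
/* Hypothetical stand-in for F128_SFX_APPEND.  The argument is an operand
   of ##, so it is pasted without being macro-expanded first.  */
#define DEMO_SFX_APPEND(x) x ## _power8

/* Every later use of demo_isinf is spelled demo_isinf_power8.  */
#define demo_isinf DEMO_SFX_APPEND (demo_isinf)

/* This defines demo_isinf_power8, not demo_isinf.  */
int
demo_isinf (double x)
{
  return __builtin_isinf (x) != 0;
}

/* The suffixed name is what other code actually links against.  */
int
demo_check (void)
{
  return demo_isinf_power8 (1.0);
}
```

Unlike the asm redirect, this rewrites the C identifier itself, so it also affects definitions and any strong_alias input that mentions the name.
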
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
> new file mode 100644
> index 0000000000..cefaa6d889
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
> @@ -0,0 +1,66 @@
> +/* _Float128 ifunc definitions for compat symbols.
> +   Copyright (C) 2017-2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <float128-ifunc.h>
> +#include <libm-alias-finite.h>
> +
> +#if SHLIB_COMPAT (libm, GLIBC_2_15, GLIBC_2_31)
> +
> +/* __gammaf128_r is a special case.  This prototype keeps the compat macro simple. */
> +extern _Float128 gammaf128_r (_Float128 x, int *signamp);
> +
> +/* Generate compatibility alias macros for finite math functions.  IFUNC is
> +   used to avoid complicating the macros in float128-ifunc.h, and avoids the
> +   need to use special macros while constructing the baseline objects.  */
> +#define MAKE_IFUNC_COMPAT_R(func, r) \
> +	extern __typeof(func ## r) __ieee754_ ## func ## _power8 ## r; \
> +	extern __typeof(func ## r) __ieee754_ ## func ## _power9 ## r; \
> +	extern __typeof(func ## r) __ieee754_ ## func ## r; \
> +	_F128_IFUNC(__ieee754_ ## func, r); \
> +	libm_alias_finite (__ieee754_ ## func ## r, __ ## func ## r)
> +
> +#define MAKE_IFUNC_COMPAT(func) MAKE_IFUNC_COMPAT_R (func,)
> +
> +MAKE_IFUNC_COMPAT (acosf128)
> +MAKE_IFUNC_COMPAT (acoshf128)
> +MAKE_IFUNC_COMPAT (asinf128)
> +MAKE_IFUNC_COMPAT (atan2f128)
> +MAKE_IFUNC_COMPAT (atanhf128)
> +MAKE_IFUNC_COMPAT (coshf128)
> +MAKE_IFUNC_COMPAT (exp10f128)
> +MAKE_IFUNC_COMPAT (exp2f128)
> +MAKE_IFUNC_COMPAT (expf128)
> +MAKE_IFUNC_COMPAT (fmodf128)
> +MAKE_IFUNC_COMPAT_R (gammaf128, _r)
> +MAKE_IFUNC_COMPAT (hypotf128)
> +MAKE_IFUNC_COMPAT (j0f128)
> +MAKE_IFUNC_COMPAT (j1f128)
> +MAKE_IFUNC_COMPAT (jnf128)
> +MAKE_IFUNC_COMPAT_R (lgammaf128, _r)
> +MAKE_IFUNC_COMPAT (log10f128)
> +MAKE_IFUNC_COMPAT (log2f128)
> +MAKE_IFUNC_COMPAT (logf128)
> +MAKE_IFUNC_COMPAT (powf128)
> +MAKE_IFUNC_COMPAT (remainderf128)
> +MAKE_IFUNC_COMPAT (sinhf128)
> +MAKE_IFUNC_COMPAT (sqrtf128)
> +MAKE_IFUNC_COMPAT (y0f128)
> +MAKE_IFUNC_COMPAT (y1f128)
> +MAKE_IFUNC_COMPAT (ynf128)
> +
> +#endif
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
> new file mode 100644
> index 0000000000..3e5b573091
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
> @@ -0,0 +1,217 @@
> +/* _Float128 ifunc symboling macros.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_IFUNC_H
> +#define _FLOAT128_IFUNC_H 1
> +
> +/* These cause conflicts when aliasing.  Hide their definitions. */
> +#define f32addf64x __hide_f32addf64x
> +#define f32subf64x __hide_f32subf64x
> +#define f32mulf64x __hide_f32mulf64x
> +#define f32divf64x __hide_f32divf64x
> +#define f32xaddf64x __hide_f32xaddf64x
> +#define f32xsubf64x __hide_f32xsubf64x
> +#define f32xmulf64x __hide_f32xmulf64x
> +#define f32xdivf64x __hide_f32xdivf64x
> +#define f32xaddf128 __hide_f32xaddf128
> +#define f32xsubf128 __hide_f32xsubf128
> +#define f32xmulf128 __hide_f32xmulf128
> +#define f32xdivf128 __hide_f32xdivf128
> +#define f32addf64 __hide_f32addf64
> +#define f32subf64 __hide_f32subf64
> +#define f32mulf64 __hide_f32mulf64
> +#define f32divf64 __hide_f32divf64
> +#define f64addf64x __hide_f64addf64x
> +#define f64subf64x __hide_f64subf64x
> +#define f64mulf64x __hide_f64mulf64x
> +#define f64divf64x __hide_f64divf64x
> +
> +/* We want the real prototypes. */
> +#include <math/math.h>
> +#include <math/complex.h>
> +#include <first-versions.h>
> +#include <shlib-compat.h>
> +#include "init-arch.h"
> +
> +#undef f32addf64x
> +#undef f32subf64x
> +#undef f32mulf64x
> +#undef f32divf64x
> +#undef f32xaddf64x
> +#undef f32xsubf64x
> +#undef f32xmulf64x
> +#undef f32xdivf64x
> +#undef f32xaddf128
> +#undef f32xsubf128
> +#undef f32xmulf128
> +#undef f32xdivf128
> +#undef f32addf64
> +#undef f32subf64
> +#undef f32mulf64
> +#undef f32divf64
> +#undef f64addf64x
> +#undef f64subf64x
> +#undef f64mulf64x
> +#undef f64divf64x
> +
> +#include <libm-alias-float128.h>
> +#include <math-narrow.h>
> +
> +/* _F128_IFUNC2(func, from, r)
> +      Generate an ifunc symbol func ## r from the symbols
> +	from ## {power8, power9} ## r
> +
> +      We use the PPC hwcap bit HAS_IEEE128 to select between the two with
> +      the assumption all P9 features are available on such targets.  */
> +#define _F128_IFUNC2(func, from, r) \
> +	libc_ifunc (func ## r, (hwcap2 & PPC_FEATURE2_HAS_IEEE128) \
> +                                ? from ## _power9 ## r : from ## _power8 ## r)
> +
> +/* _F128_IFUNC(func, r)
> +      Similar to above, except the exported symbol name trivially remaps from
> +      func ## {cpu} ## r to func ## r.  */
> +#define _F128_IFUNC(func, r) _F128_IFUNC2(func, func, r)
> +
> +/* MAKE_IMPL_IFUNC2(func, pfx1, pfx2, r)
> +     Declare external symbols of type pfx1 ## func ## f128 ## r with the name
> +                                      pfx2 ## func ## f128 ## _{cpu} ## r
> +     which are exported as implementation specific symbols (i.e. backing support
> +     for type classification macros).  */
> +#define MAKE_IMPL_IFUNC2(func, pfx1, pfx2, r) \
> +	extern __typeof (pfx1 ## func ## f128 ## r) pfx2 ## func ## f128_power8 ## r; \
> +	extern __typeof (pfx1 ## func ## f128 ## r) pfx2 ## func ## f128_power9 ## r; \
> +        _F128_IFUNC2 (__ ## func ## f128, pfx2 ## func ## f128, r);
> +
> +/* MAKE_IMPL_IFUNC(func, pfx1, r)
> +     Same as MAKE_IMPL_IFUNC2, but pfx2 is assumed to be '__'.  */
> +#define MAKE_IMPL_IFUNC(func, pfx1, r) MAKE_IMPL_IFUNC2(func,pfx1,__,r)
> +
> +/* _libm_alias_narrow(func, size)
> +     Export a narrowing function func of type _Float{size}.  This is
> +     written to reuse the existing aliasing macros provided by glibc.  */
> +#define _libm_alias_narrow(func, size) \
> +	extern __typeof (f ## size ## func ## f128) __f ## size ## func ## f128; \
> +	MAKE_IMPL_IFUNC (f ## size ## func,,) \
> +	libm_alias_float ## size ## _float128 (func)
> +
> +/* Helper macros to use the above.  Prefixed only to avoid namespace
> +   clashes with the existing glibc macros.  */
> +#define _libm_alias_float32_float128(func) _libm_alias_narrow (func, 32)
> +#define _libm_alias_float64_float128(func) _libm_alias_narrow (func, 64)
> +#define _libm_alias_float64x_float128(func) _libm_alias_narrow (func, 64x)
> +
> +/* MAKE_IFUNCP_WRAP_R(w, func, r)
> +      Export a function which the implementation wraps with prefix w
> +      to func ## r. */
> +#define MAKE_IFUNCP_WRAP_R(w, func, r) \
> +	extern __typeof (func ## f128 ## r) __ ## func ## f128 ## r; \
> +	MAKE_IMPL_IFUNC2 (func,__,__ ## w, r) \
> +	weak_alias (__ ## func ## f128 ## r, func ## f128 ## r); \
> +	libm_alias_float128_other_r (__ ## func, func, r);
> +
> +/* MAKE_IFUNCP_R(func, r)
> +    The default IFUNC generator for all libm _Float128 ABI except
> +    when specifically overridden.  This is a convenience wrapper
> +    around MAKE_IFUNCP_WRAP_R where w is not used.  */
> +#define MAKE_IFUNCP_R(func,r) MAKE_IFUNCP_WRAP_R (,func,r)
> +
> +
> +/* Generic aliasing functions.  */
> +#define DECL_ALIAS(f) MAKE_IFUNCP_R (f,)
> +#define DECL_ALIAS_s(f) MAKE_IFUNCP_R (f,)
> +#define DECL_ALIAS_w(f) MAKE_IFUNCP_R (f,)
> +#define DECL_ALIAS_e(f)
> +#define DECL_ALIAS_k(f)
> +#define DECL_ALIAS_R_w(f) MAKE_IFUNCP_R (f, _r)
> +#define DECL_ALIAS_R_e(f)
> +
> +/* Handle expanding/narrowing functions specially.  */
> +#define DECL_ALIAS_s_f32add(x) _libm_alias_float32_float128 (add)
> +#define DECL_ALIAS_s_f64add(x) _libm_alias_float64_float128 (add)
> +#define DECL_ALIAS_s_f64xadd(x) _libm_alias_float64x_float128 (add)
> +#define DECL_ALIAS_s_f32sub(x) _libm_alias_float32_float128 (sub)
> +#define DECL_ALIAS_s_f64sub(x) _libm_alias_float64_float128 (sub)
> +#define DECL_ALIAS_s_f64xsub(x) _libm_alias_float64x_float128 (sub)
> +#define DECL_ALIAS_s_f32mul(x) _libm_alias_float32_float128 (mul)
> +#define DECL_ALIAS_s_f64mul(x) _libm_alias_float64_float128 (mul)
> +#define DECL_ALIAS_s_f64xmul(x) _libm_alias_float64x_float128 (mul)
> +#define DECL_ALIAS_s_f32div(x) _libm_alias_float32_float128 (div)
> +#define DECL_ALIAS_s_f64div(x) _libm_alias_float64_float128 (div)
> +#define DECL_ALIAS_s_f64xdiv(x) _libm_alias_float64x_float128 (div)
> +
> +/* These are fallback support for classification functions.  */
> +#define DECL_ALIAS_s_isinf(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_isnan(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_issignaling(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_iseqsig(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_signbit(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_finite(x) MAKE_IMPL_IFUNC (x, __,)
> +#define DECL_ALIAS_s_fpclassify(x) MAKE_IMPL_IFUNC (x, __,)
> +
> +/* This doesn't have a public strong implementation alias.  */
> +extern __typeof (canonicalizef128) __canonicalizef128;
> +
> +/* No symbols are defined in these helper/wrapper objects. */
> +#define DECL_ALIAS_lgamma_neg(x)
> +#define DECL_ALIAS_lgamma_product(x)
> +#define DECL_ALIAS_gamma_product(x)
> +#define DECL_ALIAS_x2y2m1(x)
> +#define DECL_ALIAS_s_log1p(x)
> +#define DECL_ALIAS_s_scalbln(x)
> +#define DECL_ALIAS_s_scalbn(x)
> +
> +/* Ensure the wrapper functions get exposed via IFUNC, not the
> +   wrappee (e.g. __w_log1pf128_power8 instead of __log1pf128_power8). */
> +#define DECL_ALIAS_w_log1p(x) MAKE_IFUNCP_WRAP_R(w_,x,)
> +#define DECL_ALIAS_w_scalbln(x) MAKE_IFUNCP_WRAP_R(w_,x,)
> +
> +/* Expose ldouble only redirected symbols.  */
> +#define DECL_LDOUBLE_ALIAS(func, RTYPE, ARGS) \
> +	extern RTYPE func ARGS; \
> +	extern __typeof (func) func ## _power8; \
> +	extern __typeof (func) func ## _power9; \
> +	_F128_IFUNC ( func,)
> +
> +/* These are declared in their respective jX objects.  */
> +#define DECL_ALIAS_w_j0(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (y0,)
> +#define DECL_ALIAS_w_j1(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (y1,)
> +#define DECL_ALIAS_w_jn(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (yn,)
> +
> +#define DECL_ALIAS_s_erf(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (erfc,)
> +
> +/* scalbnf128 is an alias of ldexpf128.  */
> +#define DECL_ALIAS_s_ldexp(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_WRAP_R (wrap_, scalbn,)
> +
> +/* Handle the special case functions which exist only to support
> +   ldouble == ieee128.  */
> +#define DECL_ALIAS_s_nexttoward(x) \
> +	DECL_LDOUBLE_ALIAS (__nexttowardf_to_ieee128, float, (float, _Float128)) \
> +	DECL_LDOUBLE_ALIAS (__nexttoward_to_ieee128, double, (double, _Float128))
> +
> +#define DECL_ALIAS_w_scalb(x) \
> +	DECL_LDOUBLE_ALIAS (__scalbf128,_Float128, (_Float128, _Float128)) \
> +	libm_alias_float128_other_r_ldbl (__scalb, scalb,)
> +
> +#define DECL_ALIAS_s_significand(x) \
> +	DECL_LDOUBLE_ALIAS (__significandieee128, _Float128, (_Float128))
> +
> +#define DECL_ALIAS_s_nextafter(f) \
> +	MAKE_IFUNCP_R (f,) \
> +	libm_alias_float128_other_r_ldbl (__nextafter, nexttoward,)
> +
> +#endif /* ifndef _FLOAT128_IFUNC_H  */
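
The selection that _F128_IFUNC2 expresses can be approximated in plain C as below. This is only a sketch: it uses an ordinary per-call dispatch rather than a real ifunc resolver, and DEMO_HAS_IEEE128 is a hypothetical bit standing in for PPC_FEATURE2_HAS_IEEE128. The demo_* names are likewise not part of the patch.

```c
#include <stdint.h>

/* Hypothetical capability bit standing in for PPC_FEATURE2_HAS_IEEE128.  */
#define DEMO_HAS_IEEE128 (UINT64_C (1) << 0)

/* Mirrors the shape of _F128_IFUNC2: choose between the two suffixed
   symbols derived from FROM.  */
#define DEMO_SELECT(from, hwcap2) \
  (((hwcap2) & DEMO_HAS_IEEE128) ? from ## _power9 : from ## _power8)

static int demo_impl_power8 (void) { return 8; }
static int demo_impl_power9 (void) { return 9; }

/* A real ifunc resolver runs once at relocation time; here the choice
   is made on every call for simplicity.  */
int
demo_dispatch (uint64_t hwcap2)
{
  return DEMO_SELECT (demo_impl, hwcap2) ();
}
```

With the bit clear, demo_dispatch calls the _power8 variant; with it set, the _power9 variant. The patch makes the same two-way choice, but binds it once through libc_ifunc.
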
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
> new file mode 100644
> index 0000000000..3c52735ba7
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
> @@ -0,0 +1,143 @@
> +/* _Float128 overrides for float128 in ppc64le multiarch env.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _FLOAT128_PRIVATE_PPC64LE
> +#define _FLOAT128_PRIVATE_PPC64LE 1
> +
> +#if IS_IN(libc) || defined(_F128_DISABLE_IFUNC)
> +/* multiarch is not supported.  Do nothing and pass through. */
> +#include_next <float128_private.h>
> +#else
> +
> +/* Include fenv.h now before turning off PLT bypass tricks.  At
> +   minimum feraiseexcept is used today. */
> +#include <fenv.h>
> +
> +/* Likewise, the PLT bypass trick uses the same trick to rename
> +   as we do.  Only one asm-rename is allowed.  Only fenv.h
> +   functions require this today, so we include them above.  */
> +#undef libm_hidden_proto
> +#define libm_hidden_proto(f)
> +#undef hidden_proto
> +#define hidden_proto(f)
> +
> +/* Always disable redirects.  We supply these uniquely later on. */
> +#undef NO_MATH_REDIRECT
> +#define NO_MATH_REDIRECT
> +#include <math.h>
> +#undef NO_MATH_REDIRECT
> +
> +#include_next <float128_private.h>
> +
> +#include <float128-ifunc-macros.h>
> +
> +/* Declare these now, as they otherwise are not. */
> +extern __typeof (cosf128) __ieee754_cosf128;
> +extern __typeof (asinhf128) __ieee754_asinhf128;
> +
> +F128_REDIR (__ieee754_asinhf128)
> +F128_REDIR (__ieee754_cosf128)
> +F128_REDIR (__asinhf128)
> +F128_REDIR (__atanf128)
> +F128_REDIR (__cbrtf128)
> +F128_REDIR (__ceilf128)
> +F128_REDIR (__copysignf128)
> +F128_REDIR (__cosf128)
> +F128_REDIR (__erfcf128)
> +F128_REDIR (__erff128)
> +F128_REDIR (__expf128)
> +F128_REDIR (__expm1f128)
> +F128_REDIR (__fabsf128)
> +F128_REDIR (__fdimf128)
> +F128_REDIR (__finitef128)
> +F128_REDIR (__floorf128)
> +F128_REDIR (__fmaf128)
> +F128_REDIR (__fmaxf128)
> +F128_REDIR (__fminf128)
> +F128_REDIR (__fpclassifyf128)
> +F128_REDIR (__frexpf128)
> +F128_REDIR (__getpayloadf128)
> +F128_REDIR (__isinff128)
> +F128_REDIR (__isnanf128)
> +F128_REDIR (__ldexpf128)
> +F128_REDIR (__llrintf128)
> +F128_REDIR (__llroundf128)
> +F128_REDIR (__log1pf128)
> +F128_REDIR (__logbf128)
> +F128_REDIR (__logf128)
> +F128_REDIR (__lrintf128)
> +F128_REDIR (__lroundf128)
> +F128_REDIR (__modff128)
> +F128_REDIR (__nearbyintf128)
> +F128_REDIR (__nextdownf128)
> +F128_REDIR (__nextupf128)
> +F128_REDIR (__remquof128)
> +F128_REDIR (__rintf128)
> +F128_REDIR (__roundevenf128)
> +F128_REDIR (__roundf128)
> +F128_REDIR (__scalblnf128)
> +F128_REDIR (__scalbnf128)
> +F128_REDIR (__signbitf128)
> +F128_REDIR (__sincosf128)
> +F128_REDIR (__sinf128)
> +F128_REDIR (__sqrtf128)
> +F128_REDIR (__tanhf128)
> +F128_REDIR (__tanf128)
> +F128_REDIR (__truncf128)
> +F128_REDIR (__lgamma_productf128)
> +F128_REDIR (__mpn_extract_float128)
> +F128_REDIR (__fromfpxf128);
> +F128_REDIR (__ufromfpxf128);
> +F128_REDIR (__fromfpf128);
> +F128_REDIR (__ufromfpf128);
> +
> +#include <float128-ifunc-redirects-mp.h>
> +
> +/* Macro-rename these as it is simpler than making F128_REDIR work.  */
> +#define __nexttoward_to_ieee128 F128_SFX_APPEND (__nexttoward_to_ieee128)
> +#define __nexttowardf_to_ieee128 F128_SFX_APPEND (__nexttowardf_to_ieee128)
> +#define __f32divf128 F128_SFX_APPEND (__f32divf128)
> +#define __f32mulf128 F128_SFX_APPEND (__f32mulf128)
> +#define __f32addf128 F128_SFX_APPEND (__f32addf128)
> +#define __f32subf128 F128_SFX_APPEND (__f32subf128)
> +#define __f64divf128 F128_SFX_APPEND (__f64divf128)
> +#define __f64mulf128 F128_SFX_APPEND (__f64mulf128)
> +#define __f64addf128 F128_SFX_APPEND (__f64addf128)
> +#define __f64subf128 F128_SFX_APPEND (__f64subf128)
> +#define __f64xdivf128 F128_SFX_APPEND (__f64xdivf128)
> +#define __f64xmulf128 F128_SFX_APPEND (__f64xmulf128)
> +#define __f64xaddf128 F128_SFX_APPEND (__f64xaddf128)
> +#define __f64xsubf128 F128_SFX_APPEND (__f64xsubf128)
> +#define __setpayloadf128 F128_SFX_APPEND (__setpayloadf128)
> +#define __setpayloadsigf128 F128_SFX_APPEND (__setpayloadsigf128)
> +
> +/* Special case fixup for s_nextafterf128.c.  It creates an alias
> +   which is used for long double, but not _Float128.  Notably,
> +   we don't generate the __nextafterieee128 alias since those
> +   macros are disabled.  We rename the input to strong_alias to
> +   get it to generate __nextafterieee128<SFX>.  */
> +#define __nextafterf128 F128_SFX_APPEND (__nextafterf128)
> +#define __nextafterieee128 F128_SFX_APPEND (__nextafterf128)
> +#define __nexttowardieee128 F128_SFX_APPEND (__nexttowardieee128)
> +#define __nexttowardf128_do_not_use F128_SFX_APPEND (__nexttowardf128_dnu)
> +
> +#include <float128-ifunc-redirects.h>
> +
> +#endif /* !(IS_IN(libc) || defined(_F128_DISABLE_IFUNC)) */
> +
> +#endif /* _FLOAT128_PRIVATE_PPC64LE */
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
> new file mode 100644
> index 0000000000..bc210b17cf
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
> @@ -0,0 +1,136 @@
> +/* _Float128 overrides for float128 in ppc64le multiarch env.
> +   Copyright (C) 2020 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI
> +#define _MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI 1
> +
> +#include_next <math-type-macros-float128.h>
> +
> +#if !IS_IN(libc) && !defined(_F128_DISABLE_IFUNC)
> +
> +/* Include fenv.h now before turning off PLT bypass.  At
> +   minimum feraiseexcept is used today. */
> +#include <fenv.h>
> +
> +#include <float128-ifunc-macros.h>
> +
> +/* Ensure local redirects are always disabled by including
> +   math.h in the following manner.  */
> +#undef NO_MATH_REDIRECT
> +#define NO_MATH_REDIRECT
> +#include <math.h>
> +#undef NO_MATH_REDIRECT
> +
> +/* Include forward definitions to redirect complex functions
> +   below.  */
> +#include <complex.h>
> +
> +/* Declare redirects for an implementation function f which
> +   has a complex analogue.  f is assumed to be prefixed
> +   with '__' and is thus passed through to F128_REDIR.  */
> +#define F128_C_REDIR(f) F128_REDIR (__c ## f ## f128); \
> +			F128_REDIR (__ ## f ## f128);
> +
> +/* Similar to F128_C_REDIR, declare the set of implementation
> +   redirects for the trigonometric family f for {a,}f{,h}
> +   and {a,}cf{,h} complex variants where f is sin/cos/tan.  */
> +#define F128_TRIG_REDIR(f) F128_C_REDIR (a ## f); \
> +			   F128_C_REDIR (a ## f ## h); \
> +			   F128_C_REDIR (f); \
> +			   F128_C_REDIR (f ## h);
> +
> +F128_TRIG_REDIR (cos)
> +F128_TRIG_REDIR (sin)
> +F128_TRIG_REDIR (tan)
> +
> +F128_C_REDIR (log);
> +F128_C_REDIR (log10);
> +F128_C_REDIR (exp);
> +F128_C_REDIR (sqrt);
> +F128_C_REDIR (pow);
> +
> +F128_REDIR (__atan2f128)
> +F128_REDIR (__kernel_casinhf128);
> +F128_REDIR (__rintf128);
> +F128_REDIR (__floorf128);
> +F128_REDIR (__fabsf128);
> +F128_REDIR (__hypotf128);
> +F128_REDIR (__scalbnf128);
> +F128_REDIR (__scalblnf128);
> +F128_REDIR (__sincosf128);
> +F128_REDIR (__log1pf128);
> +F128_REDIR (__ilogbf128);
> +F128_REDIR (__ldexpf128);
> +F128_REDIR (__cargf128);
> +F128_REDIR (__cimagf128);
> +F128_REDIR (__crealf128);
> +F128_REDIR (__conjf128);
> +F128_REDIR (__cprojf128);
> +F128_REDIR (__cabsf128);
> +F128_REDIR (__fdimf128);
> +F128_REDIR (__fminf128);
> +F128_REDIR (__fmaxf128);
> +F128_REDIR (__fmodf128);
> +F128_REDIR (__fmaxmagf128);
> +F128_REDIR (__fminmagf128);
> +F128_REDIR (__nanf128);
> +F128_REDIR (__nextupf128);
> +F128_REDIR (__nextdownf128);
> +F128_REDIR (__llogbf128);
> +F128_REDIR (__log2f128);
> +F128_REDIR (__exp10f128);
> +F128_REDIR (__exp2f128);
> +F128_REDIR (__j0f128);
> +F128_REDIR (__j1f128);
> +F128_REDIR (__jnf128);
> +F128_REDIR (__y0f128);
> +F128_REDIR (__y1f128);
> +F128_REDIR (__ynf128);
> +F128_REDIR (__lgammaf128);
> +F128_REDIR_R (__lgammaf128, _r);
> +F128_REDIR (__tgammaf128);
> +F128_REDIR (__remainderf128);
> +F128_REDIR (__iseqsigf128);
> +
> +/* Assist implementations which declare additional symbols
> +   which require forward declarations to redirect.  */
> +extern _Float128 __wrap_scalbnf128 (_Float128, int);
> +extern _Float128 __w_scalblnf128 (_Float128, long int);
> +extern _Float128 __w_log1pf128 (_Float128);
> +extern __typeof (canonicalizef128) __canonicalizef128;
> +extern _Float128 __significandieee128 (_Float128);
> +extern _Float128 __scalbf128 (_Float128, _Float128);
> +F128_REDIR (__scalbf128);
> +F128_REDIR (__wrap_scalbnf128);
> +F128_REDIR (__w_scalblnf128);
> +F128_REDIR (__w_log1pf128);
> +F128_REDIR (__canonicalizef128);
> +F128_REDIR (__significandieee128);
> +
> +/* This is a hack.  The build directory is favored over the sysdep directories.
> +   This causes the generated generic version of s_significandf128.c to build.
> +   The only effective difference is the C symbol name.  Work around this special
> +   case by redirecting the symbol name emitted from the template.  */
> +extern _Float128 __significandf128 (_Float128) asm ("__significandieee128_power9");
> +
> +/* Include the redirects shared with math_private.h users.  */
> +#include <float128-ifunc-redirects.h>
> +
> +#endif /* !IS_IN(libc) && !defined(_F128_DISABLE_IFUNC) */
> +
> +#endif /*_MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI */
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
> new file mode 100644
> index 0000000000..30212b5d09
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
> @@ -0,0 +1,15 @@
> +#ifndef MATH_PRIVATE_PPC64LE_MA
> +#define MATH_PRIVATE_PPC64LE_MA 1
> +
> +#include_next <math_private.h>
> +
> +#if !defined (_F128_DISABLE_IFUNC)
> +
> +/* math_private.h redeclares many float128_private.h renamed functions, but
> +   we can't include float128_private.h as this header is used beyond
> +   private float128 files.  */
> +#include <float128-ifunc-redirects-mp.h>
> +
> +#endif
> +
> +#endif /* MATH_PRIVATE_PPC64LE_MA */
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
> deleted file mode 100644
> index 49aeb3a8f4..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
> +++ /dev/null
> @@ -1,28 +0,0 @@
> -/* __fmaf128() PowerPC64LE POWER9 version.
> -   Copyright (C) 2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <libm-alias-float128.h>
> -
> -#undef libm_alias_float128
> -#define libm_alias_float128(a, b)
> -#undef strong_alias
> -#define strong_alias(a, b)
> -
> -#define __fmaf128 __fmaf128_power9
> -
> -#include <sysdeps/ieee754/float128/s_fmaf128.c>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
> deleted file mode 100644
> index ab0c4d03a8..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
> +++ /dev/null
> @@ -1,26 +0,0 @@
> -/* __fmaf128() PowerPC64LE version.
> -   Copyright (C) 2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#undef weak_alias
> -#define weak_alias(a, b)
> -#undef strong_alias
> -#define strong_alias(a, b)
> -
> -#define __fmaf128 __fmaf128_ppc64
> -
> -#include <sysdeps/ieee754/float128/s_fmaf128.c>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
> deleted file mode 100644
> index 3a370950f9..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
> +++ /dev/null
> @@ -1,36 +0,0 @@
> -/* Multiple versions of fmaf128.
> -   Copyright (C) 2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <http://www.gnu.org/licenses/>.  */
> -
> -#include <libm-alias-float128.h>
> -
> -#define fmaf128 __redirect_fmaf128
> -#include <math.h>
> -#undef fmaf128
> -
> -#include <math_ldbl_opt.h>
> -#include "init-arch.h"
> -
> -extern __typeof (__redirect_fmaf128) __fmaf128_ppc64 attribute_hidden;
> -extern __typeof (__redirect_fmaf128) __fmaf128_power9 attribute_hidden;
> -
> -libc_ifunc_redirected (__redirect_fmaf128, __fmaf128,
> -		       (hwcap2 & PPC_FEATURE2_HAS_IEEE128)
> -		       ? __fmaf128_power9
> -		       : __fmaf128_ppc64);
> -
> -libm_alias_float128 (__fma, fma)
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
> deleted file mode 100644
> index e7414f4a59..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
> +++ /dev/null
> @@ -1,35 +0,0 @@
> -/* POWER9 sqrt for _Float128
> -   Copyright (C) 2018-2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   In addition to the permissions in the GNU Lesser General Public
> -   License, the Free Software Foundation gives you unlimited
> -   permission to link the compiled version of this file into
> -   combinations with other programs, and to distribute those
> -   combinations without any restriction coming from the use of this
> -   file.  (The Lesser General Public License restrictions do apply in
> -   other respects; for example, they cover modification of the file,
> -   and distribution when not linked into a combine executable.)
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#include <math-type-macros-float128.h>
> -
> -#define __sqrtf128 __sqrtf128_power9
> -
> -#undef declare_mgen_alias
> -#define declare_mgen_alias(a, b)
> -
> -#include <w_sqrt_template.c>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
> deleted file mode 100644
> index e03ecb193f..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
> +++ /dev/null
> @@ -1,35 +0,0 @@
> -/* PPC64LE sqrt for _Float128
> -   Copyright (C) 2018-2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   In addition to the permissions in the GNU Lesser General Public
> -   License, the Free Software Foundation gives you unlimited
> -   permission to link the compiled version of this file into
> -   combinations with other programs, and to distribute those
> -   combinations without any restriction coming from the use of this
> -   file.  (The Lesser General Public License restrictions do apply in
> -   other respects; for example, they cover modification of the file,
> -   and distribution when not linked into a combine executable.)
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#include <math-type-macros-float128.h>
> -
> -#define __sqrtf128 __sqrtf128_ppc64le
> -
> -#undef declare_mgen_alias
> -#define declare_mgen_alias(a, b)
> -
> -#include <w_sqrt_template.c>
> diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
> deleted file mode 100644
> index e2db0a2864..0000000000
> --- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
> +++ /dev/null
> @@ -1,31 +0,0 @@
> -/* Multiple versions of __sqrtf128.
> -   Copyright (C) 2018-2020 Free Software Foundation, Inc.
> -   This file is part of the GNU C Library.
> -
> -   The GNU C Library is free software; you can redistribute it and/or
> -   modify it under the terms of the GNU Lesser General Public
> -   License as published by the Free Software Foundation; either
> -   version 2.1 of the License, or (at your option) any later version.
> -
> -   The GNU C Library is distributed in the hope that it will be useful,
> -   but WITHOUT ANY WARRANTY; without even the implied warranty of
> -   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> -   Lesser General Public License for more details.
> -
> -   You should have received a copy of the GNU Lesser General Public
> -   License along with the GNU C Library; if not, see
> -   <https://www.gnu.org/licenses/>.  */
> -
> -#define NO_MATH_REDIRECT
> -#include <math.h>
> -#include "init-arch.h"
> -#include <math-type-macros-float128.h>
> -
> -extern __typeof (__sqrtf128) __sqrtf128_ppc64le attribute_hidden;
> -extern __typeof (__sqrtf128) __sqrtf128_power9 attribute_hidden;
> -
> -libc_ifunc (__sqrtf128,
> -	    (hwcap2 & PPC_FEATURE2_ARCH_3_00)
> -	    ? __sqrtf128_power9
> -	    : __sqrtf128_ppc64le);
> -declare_mgen_alias (__sqrt, sqrt)
> diff --git a/sysdeps/powerpc/powerpc64/le/power9/Makeconfig b/sysdeps/powerpc/powerpc64/le/power9/Makeconfig
> new file mode 100644
> index 0000000000..a9190c7b15
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/le/power9/Makeconfig
> @@ -0,0 +1,3 @@
> +# Hint to multiarch (if used) we support power9
> +# on powerpc64le.
> +libc-submachine-power9 = yes
>
Paul E Murphy June 22, 2020, 11:04 p.m. UTC | #3
On 6/22/20 11:57 AM, Adhemerval Zanella via Libc-alpha wrote:
> 
> 
> On 15/06/2020 17:59, Paul E. Murphy via Libc-alpha wrote:


>> fixup f128 ifunc
>>
>> drop the patch to introduce the new macro to assist simplification of
>> s_nextafter.c.  It wasn't thought out well enough.  Instead just add
>> the ugly macro redirections needed to generate the appropriate
>> nexttoward symbols.

This is refactoring noise; while not wrong, it is not meant to be
in the final commit message.

> 
> I am trying to digest the requirements to add such complexity on the
> powerpc64le build rules, specially the internally Makefile hackery
> required.

This is addressed in the notes.  To put it mildly, soft-fp code
generation on P8 is quite limited.  This is easy to see in any
non-trivial binary128 function, e.g. expf128 is almost 1/3 the
size on P9.  Likewise, many complex functions are almost 1/2 the size.
Anything soft-fp touches massively increases code size and impedes
instruction scheduling.

I can get some more concrete numbers, but my hope is this enables us
to make even more meaningful improvements to common code when hardware
support is available.

> 
> So if I understood correctly, let's say we have these targets:
> 
>    1. powerpc64le-linux-gnu
>    2. powerpc64le-linux-gnu with --with-cpu=power9
> 
> The ifunc mechanism to build optimized versions for power9 will be
> built only for 1, while for 2 only versions that use hardware
> instructions for __float128 (-mfloat128-hardware gcc option)
> will be used.

In case 2 (and with any newer cpu), this patch is a no-op.
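
To make case 1 concrete, here is a minimal sketch of the kind of
run-time selection the ifunc objects perform.  It is not glibc code:
it uses plain function pointers and double rather than a real ifunc
resolver and _Float128, and DEMO_HWCAP2_HAS_IEEE128 and the demo_*
names are made up for illustration (the removed s_fmaf128.c tested
PPC_FEATURE2_HAS_IEEE128 in hwcap2).

```c
/* Illustrative sketch only; none of these names exist in glibc.  */
#include <stdint.h>

/* Hypothetical stand-in for a hwcap2 feature bit.  The value is
   invented for the demo and is not the real kernel constant.  */
#define DEMO_HWCAP2_HAS_IEEE128 (UINT64_C (1) << 0)

typedef double (*demo_fma_fn) (double, double, double);

/* Stand-in for the baseline (soft-fp) build of a routine.  */
static double
demo_fma_softfp (double a, double b, double c)
{
  return a * b + c;
}

/* Stand-in for the same source built for POWER9.  */
static double
demo_fma_power9 (double a, double b, double c)
{
  return a * b + c;
}

/* Pick one variant from the feature bits, as a resolver would.  */
static demo_fma_fn
demo_select_fma (uint64_t hwcap2)
{
  return (hwcap2 & DEMO_HWCAP2_HAS_IEEE128)
	 ? demo_fma_power9
	 : demo_fma_softfp;
}
```

Calling demo_select_fma (0) yields the soft-fp stand-in, and passing
the demo bit yields the power9 stand-in.  A real ifunc makes this
choice once per symbol rather than on every call.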

> 
> So all the redirection machinery done in the float128-ifunc-* is to
> list and redirect internal libm symbols to their float128 counterparts.
> One initial issue is that this tends to be fragile: it requires changing
> arch-specific code when generic code is changed (for instance by
> changing the internal symbol name or the caller implementation).

The interesting symbol names are likely to see less change, and those
that do should mostly be hidden via local calls.  This is the price
the ppc64le maintainers pay to support multiarch for a large swath
of libm.  This greatly simplifies the most mundane and error prone
pieces.

> 
> Another issue is the rule exceptions (such as s_totalorderf128), which
> require additional care to check that they result in correct code.

Such is already tested via the existing test suite.

> 
> Another possible maintenance issue is keeping the exported
> symbol list in float128-ifunc.c, float128-ifunc.h, and
> float128_private.h updated for each new symbol in future versions.
> It again means correcting/changing arch-specific code for generic
> changes.

Note that float128-ifunc.c only defines compat symbols for the old
finite entry points. That set should never grow.

> 
> It also increases code size considerably, with the potential to keep
> increasing with the addition of new libm functions.

Stripping debug info, the code size increase of libm is about 220kb
added to a 1210kb library.  Not trivial, but not overwhelming.

> 
> Finally the question is how useful would be this change on real
> world cases to justify this huge build and permutation complexity.

Code size is an interesting metric to measure.  The P9 variants
are substantially smaller where soft-fp is involved. expf128 is almost
1/3 the size.

> 
> What I would expect in real-world cases is that a workload which really
> uses float128 extensively would be built with -mcpu=power9 and/or
> -mfloat128/-mfloat128-hardware.  That should cover most of the required
> hotspots, and glibc can focus on providing only the cases where adding
> a specialized ifunc variant does make sense (as for the x86_64
> sysdeps/x86_64/fpu/multiarch/mp*, for instance).
> 
> Also, if an optimized float128 glibc build is paramount, a much
> simpler solution would be to just provide a -mcpu=power9 built one.

That kicks the can to the distros.  I think few ship such libraries. 
The whole value of multiarch is to expose these benefits without having 
to make the end user jump through such hurdles.  I don't think the x86 
comparison holds.  Adding a couple of helpful instructions is tame 
compared to going from soft to hard fp.
Paul E Murphy June 23, 2020, 4:19 p.m. UTC | #4
On 6/22/20 6:04 PM, Paul E Murphy via Libc-alpha wrote:
> 
> 
> On 6/22/20 11:57 AM, Adhemerval Zanella via Libc-alpha wrote:
>>
>>
>> On 15/06/2020 17:59, Paul E. Murphy via Libc-alpha wrote:

> 
> This is refactoring noise, and while not wrong is not meant to be
> in the final commit message.
> 
>>
>> I am trying to digest the requirements to add such complexity on the
>> powerpc64le build rules, specially the internally Makefile hackery
>> required.
> 
> This is addressed in the notes. Mildly speaking, soft-fp code
> generation on P8 is quite limited.  This is pretty easy to identify in 
> any non-trivial binary128 function.  e.g expf128 is almost 1/3 the
> size on P9. Likewise many complex functions are almost 1/2 the size. 
> Anything soft-fp touches massively increases code size and impedes 
> instruction scheduling.
> 
> I can get some more concrete numbers, but my hope is this enables us
> to make even more meaningful improvements to common code when hardware
> support is available.

I did a quick test for expf128.  It's around a 2.5x speedup on the fast
path (building a table of 1M small values).  The soft-fp variant pays
for an expensive PLT call on every FP operation and cannot use FMA.
That hurts.  Quite a bit of libm centers around series approximations
like expf128.
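
A rough way to picture the cost: with soft-fp, every multiply and add
in a polynomial evaluation becomes a function call.  The sketch below
counts those calls for a Horner evaluation.  It assumes double in place
of _Float128, and demo_soft_mul/demo_soft_add are hypothetical
stand-ins for the soft-fp helper calls, not the real routines.

```c
/* Illustrative sketch only; the demo_* names are invented.  */
#include <assert.h>
#include <stddef.h>

/* Number of simulated soft-fp helper calls made so far.  */
static unsigned long demo_libcalls;

/* Stand-in for a soft-fp multiply helper reached through the PLT.  */
static double
demo_soft_mul (double a, double b)
{
  demo_libcalls++;
  return a * b;
}

/* Stand-in for a soft-fp add helper reached through the PLT.  */
static double
demo_soft_add (double a, double b)
{
  demo_libcalls++;
  return a + b;
}

/* Horner evaluation of c[0] + c[1]*x + ... + c[n-1]*x^(n-1).
   Each step costs one multiply call and one add call.  */
static double
demo_horner_soft (const double *c, size_t n, double x)
{
  assert (n > 0);
  double acc = c[n - 1];
  for (size_t i = n - 1; i-- > 0;)
    acc = demo_soft_add (demo_soft_mul (acc, x), c[i]);
  return acc;
}
```

A degree-3 polynomial costs 6 helper calls here, whereas a build with
hardware support could in principle do the same work with 3 fused
multiply-adds and no calls at all.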
Adhemerval Zanella June 24, 2020, 8:41 p.m. UTC | #5
On 22/06/2020 20:04, Paul E Murphy wrote:
> 
> 
> On 6/22/20 11:57 AM, Adhemerval Zanella via Libc-alpha wrote:
>>
>>
>> On 15/06/2020 17:59, Paul E. Murphy via Libc-alpha wrote:
> 
> 
>>> fixup f128 ifunc
>>>
>>> drop the patch to introduce the new macro to assist simplification of
>>> s_nextafter.c.  It wasn't thought out well enough.  Instead just add
>>> the ugly macro redirections needed to generate the appropriate
>>> nexttoward symbols.
> 
> This is refactoring noise, and while not wrong is not meant to be
> in the final commit message.
> 
>>
>> I am trying to digest the requirements to add such complexity on the
>> powerpc64le build rules, specially the internally Makefile hackery
>> required.
> 
> This is addressed in the notes. Mildly speaking, soft-fp code
> generation on P8 is quite limited.  This is pretty easy to identify in any non-trivial binary128 function.  e.g expf128 is almost 1/3 the
> size on P9. Likewise many complex functions are almost 1/2 the size. Anything soft-fp touches massively increases code size and impedes instruction scheduling.
> 
> I can get some more concrete numbers, but my hope is this enables us
> to make even more meaningful improvements to common code when hardware
> support is available.

Indeed, soft-fp is most likely bloated and incurs a lot of libcalls
for most operations.

> 
>>
>> So if I understood correctly, let say we have these targets:
>>
>>    1. powerpc64le-linux-gnu
>>    2. powerpc64le-linux-gnu with --with-cpu=power9
>>
>> The ifunc mechanism to build optimized versions for power9 will be
>> built only for 1, while for 2. only versions that uses hardware
>> instruction for __float128 (-mfloat128-hardware gcc option)
>> will be used.
> 
> In case 2 (and with any newer cpu), this patch is a no-op.

Ack, this was my understanding.

> 
>>
>> So all the rediretion machinery done in the float128-ifunc-* are to
>> list and redirect internal libm symbols to its float128 counterparts.
>> One initial issue is this tend to be fragile: it requires to change
>> arch-specific code when generic code is changed (for instance by
>> changing the internal symbol name or the caller implementation)
> 
> The interesting symbol names are likely to see less change, and those
> that do should mostly be hidden via local calls.  This is the price
> the ppc64le maintainers pay to support multiarch for a large swath
> of libm.  This greatly simplifies the most mundane and error prone
> pieces.
> 
>>
>> Another issue the rules exceptions (such as s_totalorderf128) that
>> require additional care to check if they result in correct code.
> 
> Such is already tested via the existing test suite.

The issue is not really a lack of testing, but the added complexity in
the Makefile to handle such specific cases.

> 
>>
>> Another possible mantainance issue is to keep updating the exported
>> symbol list at float128-ifunc.c, float128-ifunc.h, and
>> float128_private.hfor each new possible symbol in future version.
>> It against means to correct/change arch-specific code for generic
>> changes.
> 
> Note that float128-ifunc.c only defines compat symbols for the old
> finite entry points. That set should never grow.
> 
>>
>> It also increases code size considerable with the potential to keep
>> increasing with the addition on new libm functions.
> 
> Stripping debug info, the code size increase of libm is about 220kb
> added 1210kb library.  Not trivial, but not overwhelming.
> 
>>
>> Finally the question is how useful would be this change on real
>> world cases to justify this huge build and permutation complexity.
> 
> Code size is an interesting metric to measure.  The P9 variants
> are substantially smaller where soft-fp is involved. expf128 is almost
> 1/3 the size.
> 
>>
>> What I would expect in realword cases is if the workload really
>> uses float128 extensivelly to be built with -mcpu=power9 and/or
>> -mfloat128/-mfloat128-hardware. It should cover most the required
>> hotspots and glibc can focus on providing only cases where adding
>> an specialized ifunc variant does make sense (as for the x86_64
>> sysdeps/x86_64/fpu/multiarch/mp*) for instance.
>>
>> Also, if an optimized float128 glibc build is paramount, a much
>> simpler solution would be to just provide a -mcpu=power9 built one.
> 
> That kicks the can to the distros.  I think few ship such libraries. The whole value of multiarch is to expose these benefits without having to make the end user jump through such hurdles.  I don't think the x86 comparison holds.  Adding a couple of helpful instructions is tame compared to going from soft to hard fp.

My main issue with this approach is twofold: it basically tries to
provide soft and hard fp variants of libm in the same library
(adding build complexity, code bloat, and extra maintenance burden),
and it relies heavily on ifunc (which has its own issues that bite
us now and then).

The x86 comparison is sound because we could do something similar
and start to provide libm variants for AVX, AVX256, etc. in the same
manner.  Instead, the approach used was to profile and provide specific
ifunc variants for hotspots.

I grant you that this ISA change is somewhat more intrusive than a vector
extension, but other ABI examples (armhf with its multiple fp variants)
usually rely on the toolchain target to provide such optimizations.

I am not sure whether the best option would be to provide an easier way
to configure and build just libm, or to add an option to build libm for
multiple configurations.  And I understand that distros want to minimize
the libc.so variants (that's why ifunc was pushed initially, afaik).

That's why I suggested providing hardware float128 optimized variants
when real-world use cases give us feedback that this might be a gain.
Besides the currently limited float128 usage, I also expect that in most
scenarios symbols that the compiler implements as builtins (such as sqrt)
won't be called at all.  Even for more complex math functions, most likely
only a subset will be used extensively, and these are the ones that
I think we should focus on instead of just pushing for the bigger hammer
and optimizing everything (which would be simpler to do by just providing
a specific libm anyway).
Paul E Murphy June 24, 2020, 10:42 p.m. UTC | #6
On 6/24/20 3:41 PM, Adhemerval Zanella wrote:
> On 22/06/2020 20:04, Paul E Murphy wrote:
>> On 6/22/20 11:57 AM, Adhemerval Zanella via Libc-alpha wrote:

>>> What I would expect in realword cases is if the workload really
>>> uses float128 extensivelly to be built with -mcpu=power9 and/or
>>> -mfloat128/-mfloat128-hardware. It should cover most the required
>>> hotspots and glibc can focus on providing only cases where adding
>>> an specialized ifunc variant does make sense (as for the x86_64
>>> sysdeps/x86_64/fpu/multiarch/mp*) for instance.
>>>
>>> Also, if an optimized float128 glibc build is paramount, a much
>>> simpler solution would be to just provide a -mcpu=power9 built one.
>>
>> That kicks the can to the distros.  I think few ship such libraries. The whole value of multiarch is to expose these benefits without having to make the end user jump through such hurdles.  I don't think the x86 comparison holds.  Adding a couple of helpful instructions is tame compared to going from soft to hard fp.
> 
> My main issue with this approach is twofold: it basically tries to
> provide soft and hard fp variants of libm in the same library
> (adding build complexity, code bloat, and extra maintenance burden),
> and it relies heavily on ifunc (which has its own issues that bite
> us now and then).

The design intentionally keeps all of the complexity hidden in one
place, without changes to common code.  Doing each individually is a
fool's errand for even a small set of functions.  ifunc is the de facto
standard for multiarch.  Renaming a few redirects is a trivial amount
of work solved via grep.  Likewise, adding 200kb to one library is
better than shipping a second 1+MB library.

> 
> The x86 comparison is sound because we could do something similar
> and start to provide libm variants for AVX, AVX256, etc. in the same
> manner.  Instead, the approach used was to profile and provide specific
> ifunc variants for hotspots.

Again, those are incremental changes to an existing scalar isa.


> That's why I suggested providing hardware float128 optimized variants
> when real-world use cases give us feedback that this might be a gain.
> Besides the currently limited float128 usage, I also expect that in most
> scenarios symbols that the compiler implements as builtins (such as sqrt)
> won't be called at all.  Even for more complex math functions, most likely
> only a subset will be used extensively, and these are the ones that
> I think we should focus on instead of just pushing for the bigger hammer
> and optimizing everything (which would be simpler to do by just providing
> a specific libm anyway).
> 

I disagree.  There is an obvious massive performance gap for all
transcendental functions.  It's our responsibility to proactively
solve this transparently for our users who are far more restricted
in which glibc they get to use.  Doubly so as ppc64le starts
transitioning to the new long double abi.

How about slightly changing the makefile to an opt-in model whereby
only the transcendental ABI (and other single-instruction functions like
sqrt) is run through the auto-float128-ifunc machinery?
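
For context, the machinery keys off the object name alone.  Below is a rough
C model of the name mangling that the gen-float128-ifuncs.stmp rule in the
Makefile does in shell; derive_alias and strip_suffix are hypothetical
helpers written for this note, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Remove SUFFIX from the end of S in place.  Returns 1 if it was present.  */
static int
strip_suffix (char *s, const char *suffix)
{
  size_t n = strlen (s), m = strlen (suffix);
  if (n >= m && strcmp (s + n - m, suffix) == 0)
    {
      s[n - m] = '\0';
      return 1;
    }
  return 0;
}

/* Hypothetical helper: from an object name such as "e_lgammaf128_r",
   compute the per-function macro name (DECL) and the generic macro it
   defaults to (DECLC).  Both buffers are LEN bytes.  Returns 0 on
   success, -1 if GCALL does not fit the internal buffer.  */
int
derive_alias (const char *gcall, char *decl, char *declc, size_t len)
{
  char name[64];
  char pfx[3] = "";
  const char *R = "", *r = "";
  const char *us, *start;
  size_t pfx_len;

  assert (gcall != NULL && decl != NULL && declc != NULL);

  /* name="${gcall##?_}": drop a one-character prefix such as "s_".  */
  start = (gcall[0] != '\0' && gcall[1] == '_') ? gcall + 2 : gcall;
  if (strlen (start) >= sizeof name)
    return -1;
  strcpy (name, start);

  /* pfx="${gcall%%_*}": text before the first underscore.  Only
     one-character prefixes are kept, and m_ is treated as s_.  */
  us = strchr (gcall, '_');
  pfx_len = us != NULL ? (size_t) (us - gcall) : strlen (gcall);
  if (pfx_len == 1)
    {
      pfx[0] = '_';
      pfx[1] = gcall[0] == 'm' ? 's' : gcall[0];
    }

  /* A trailing _r selects the reentrant macro family.  */
  if (strip_suffix (name, "_r"))
    {
      R = "_R";
      r = "_r";
    }
  strip_suffix (name, "f128");

  snprintf (decl, len, "DECL_ALIAS%s_%s%s", pfx, name, r);
  snprintf (declc, len, "DECL_ALIAS%s%s", R, pfx);
  return 0;
}
```

With this model, m_modff128 maps to DECL_ALIAS_s_modf with the default
DECL_ALIAS_s, which is why float128-ifunc.h can override a single function
by defining the longer macro name first.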
Joseph Myers June 25, 2020, midnight UTC | #7
On Wed, 24 Jun 2020, Paul E Murphy via Libc-alpha wrote:

> The design intentionally keeps all of the complexity in one place
> hidden, without changes to common code.  Doing each individually is

It's not really hidden, when it means that many new libm functions need 
architecture-specific changes.  The normal expectation is that any new 
function (whether using type-generic templates or type-specific 
implementations, whether real or complex or narrowing-real, and whatever 
internal functions might be used in the implementation) does not need 
architecture-specific changes beyond updating the ABI baselines.
Paul E Murphy June 25, 2020, 4:29 p.m. UTC | #8
On 6/24/20 7:00 PM, Joseph Myers wrote:
> On Wed, 24 Jun 2020, Paul E Murphy via Libc-alpha wrote:
> 
>> The design intentionally keeps all of the complexity in one place
>> hidden, without changes to common code.  Doing each individually is
> 
> It's not really hidden, when it means that many new libm functions need
> architecture-specific changes.  The normal expectation is that any new
> function (whether using type-generic templates or type-specific
> implementations, whether real or complex or narrowing-real, and whatever
> internal functions might be used in the implementation) does not need
> architecture-specific changes beyond updating the ABI baselines.
> 

Can I interpret that as saying an opt-in approach to generating ifunc'ed
math functions would make a patch like this acceptable?
Joseph Myers June 25, 2020, 6:35 p.m. UTC | #9
On Thu, 25 Jun 2020, Paul E Murphy via Libc-alpha wrote:

> On 6/24/20 7:00 PM, Joseph Myers wrote:
> > On Wed, 24 Jun 2020, Paul E Murphy via Libc-alpha wrote:
> > 
> > > The design intentionally keeps all of the complexity in one place
> > > hidden, without changes to common code.  Doing each individually is
> > 
> > It's not really hidden, when it means that many new libm functions need
> > architecture-specific changes.  The normal expectation is that any new
> > function (whether using type-generic templates or type-specific
> > implementations, whether real or complex or narrowing-real, and whatever
> > internal functions might be used in the implementation) does not need
> > architecture-specific changes beyond updating the ABI baselines.
> > 
> 
> Can I interpret that as saying an opt-in approach to generating ifunc'ed
> math functions would make a patch like this acceptable?

I don't know what that opt-in approach might look like, or whether it 
would address concerns other people have expressed in this discussion.

Patch

diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
index 8747b02127..3974345d24 100644
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/Makefile
@@ -1,10 +1,208 @@ 
 ifeq ($(subdir),math)
-libm-sysdep_routines += s_fmaf128-ppc64 s_fmaf128-power9 \
-			w_sqrtf128-power9 w_sqrtf128-ppc64le
 
-CFLAGS-s_fmaf128-ppc64.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
-CFLAGS-s_fmaf128-power9.c += $(type-float128-CFLAGS) -mcpu=power9 $(no-gnu-attribute-CFLAGS)
+#
+# Only enable ifunc _Float128 support if the baseline cpu support
+# is older than power9.
+ifneq (yes,$(libc-submachine-power9))
+do_f128_multiarch = yes
+endif
+
+#
+# This is an ugly, but contained, mechanism to provide hardware optimized
+# _Float128 and ldouble == ieee128 routines for P9 and beyond
+# hardware.  At a very high level, we rely on ASM renames, and rarely
+# macro renames, to build two sets of _Float128 ABI, one suffixed _power8 (the
+# baseline powerpc64le cpu) and one _power9 (the first powerpc64le cpu to
+# introduce hardware support for _Float128).
+#
+# At a high level, we compile 3 files for each object file.
+#   1.  The baseline soft-float128, unsuffixed objects $(object).$(sfx).
+#       This ABI is suffixed with _power8.
+#   2.  The hard-float128, power9, suffixed objects $(object)-power9.$(sfx)
+#   3.  The IFUNC wrapper object to export ABI, $(object)-ifunc.$(sfx)
+#
+# 2 & 3 are automatically generated by Makefile rule.  Placing the exported
+# ABI into a separate file allows reuse of existing aliasing macros
+# with minimal hassle.  Likewise, a backdoor is provided to unilaterally
+# disable this support per object.
+#
+# Changes to APIs will require minor updates to one (or two) of these places:
+#
+#   * Internal float128 API: the float128_private.h interposer.
+#   * math_private.h API: float128-ifunc-redirects-mp.h
+#   * templated math API: the math-type-macros-float128.h interposer.
+#
+# Some redirects are duplicated between both float128_private.h and
+# math-type-macros-float128.h as they are not usually included together
+# when building libm.  The hope is this provides minimal burden on
+# maintainers, and is readily caught by build-many-glibcs.py.
+#
+# The above is supported by several carefully crafted header files as
+# described below:
+#
+#   * float128-ifunc.h provides support for generating the IFUNC objects
+#                      in part 3 above.  It also enables case-by-case
+#                      overriding as some objects do not expose a uniform
+#                      ABI.
+#   * float128-ifunc.c provides compatibility ABI using the IFUNC objects.
+#                      These should rarely change and don't cause trouble
+#                      when grouped into a single object file as they are
+#                      only needed for the shared library.
+#   * float128-ifunc-macros.h disables all first-order aliasing macros
+#                      used in libm/_Float128, but not the backing
+#                      implementations provided by libc-symbols.h as some
+#                      objects generate strong aliases which make this
+#                      work easier.
+#   * float128-ifunc-redirect-macros.h provides macros to support ASM
+#                      redirect of _Float128 ABI.
+#   * float128-ifunc-redirects.h provides ASM redirects for functions
+#                      which are nominally redirected in the private
+#                      copy of math.h.
+#   * float128-ifunc-redirects-mp.h provides ASM redirects which are used
+#                      by math_private.h (the -mp suffix) and the interposer
+#                      float128_private.h discussed later.
+#
+# The headers above should only be included via the interposed headers
+# discussed below.  Several commonly used headers are interposed to rename all
+# via ASM redirects.  This requires careful orchestration of header inclusion
+# to ensure headers are redirected to exclusively _power8 or _power9 suffixed
+# ABI.  This also has the desirable side-effect of bypassing the PLT locally
+# and generating compile time errors if a function is missed or changed.
+#
+#   * float128_private.h is currently used to rename the ldouble == ieee128
+#                      object files today.  This takes it a step further and
+#                      redirects symbols to _power9 or _power8 variants of the
+#                      functions.  This supports nearly all files in
+#                      sysdeps/ieee754/float128, but not all _Float128 objects.
+#                      However, there are three distinct build configurations
+#                      used to compile _Float128 support.  Two other headers
+#                      below complete the ABI redirection.
+#   * math-type-macros-float128.h supports renames for the common object files
+#                      which are built from templates in math/.
+#   * math_private.h provides rename support for the common files built in math/
+#                      which are neither template generated nor ldbl-128 specific.
+#                      It should be noted that float128_private.h and math_private.h
+#                      overlap in their declarations, and are used orthogonally.
+#
+#
+# The above usually works out very well, but there are sometimes special cases
+# so special you need to throw your hands up and give up.  For that, support
+# is provided to disable the above entirely at an object level.  Today this
+# includes objects which only provide tables, or have macros so unspeakably
+# heinous that no reasonable fixup can be provided.  Such objects are declared
+# in gen-libm-f128-no-ifunc-calls.
+#
+# Secondly, this enforces a slightly different mechanism for machine specific
+# overrides.  That is, all optimizations for all targets must all be reachable
+# from the same file as the above relies on rebuilding the same file with
+# different compiler settings.  Most arch specific overrides should be trivial
+# implementations (e.g. sqrt or fma), thus it should present no obstacle.
+# Likewise, this also enforces them to use the same language (C or ASM today).
+#
+# Finally, some designer notes/rambling.  One could naively use target cloning,
+# but that generates an ifunc per function, not per entry point.  The above
+# gives us two copies of _Float128 ABI which are entirely isolated, and
+# need no internal ifunc usage to disambiguate.  ASM renames are preferable
+# to macro renames.  The latter causes many macro expansion bugs which require
+# many ugly fixups (that was my first attempt).  Secondly, one may note libgcc
+# provides ifunc routines for soft-fp functions, why this?  Such callouts
+# inhibit most compiler optimization and result in not so great code.  Next,
+# why not libc too?  Inspecting libc, the reachable _Float128 code only makes
+# a single digit number of soft-fp calls.  The benefit of the above is limited.
+#
+ifeq ($(do_f128_multiarch),yes)
+
+gen-libm-all-f128-ifunc-calls = \
+	$(strip $(subst F,$(type-float128-suffix),$(libm-calls)) \
+		$(foreach f,$(libm-narrow-fns),$(subst F,$(f),$(libm-narrow-types-float128-yes))) \
+		$(type-float128-routines))
+
+# Some functions are not trivial to ifunc today without some extensive refactoring.
+# totalorder{,mag} have no benefit to native IEEE support and have complex versioning requirements.
+# Likewise, tables require no special treatment.
+gen-libm-f128-no-ifunc-calls := s_totalorderf128 s_totalordermagf128 t_sincosf128
+gen-libm-f128-ifunc-calls = $(filter-out $(gen-libm-f128-no-ifunc-calls),$(gen-libm-all-f128-ifunc-calls))
+
+f128-march-routines-p9 = $(addsuffix -power9,$(gen-libm-f128-ifunc-calls))
+f128-march-routines-ifunc = $(addsuffix -ifunc,$(gen-libm-f128-ifunc-calls))
+f128-march-routines = $(f128-march-routines-p9) $(f128-march-routines-ifunc)
+f128-march-cpus = power9
+
+libm-routines += $(f128-march-routines) float128-ifunc
+generated += $(f128-march-routines)
+
+# When multiarch support must be explicitly disabled for an object
+# file, we must also supply a macro hint when building it.  Only
+# objects which contain executable code should require this.
+CPPFLAGS-s_totalorderf128.c += -D_F128_DISABLE_IFUNC
+CPPFLAGS-s_totalordermagf128.c += -D_F128_DISABLE_IFUNC
+CPPFLAGS-float128-ifunc.c += -D_F128_DISABLE_IFUNC
+
+CFLAGS-float128-ifunc.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
+
+# Copy special CFLAGS for some functions
+CFLAGS-m_modff128-power9.c += -fsignaling-nans
+
+# Generate wrapper objects for each machine,
+# and a separate ifunc wrapper.  Likewise substitute
+# m_%.c files should include s_%.c to match common libm rules
+# for files built in both libm and libc.
+$(objpfx)gen-float128-ifuncs.stmp: Makefile
+	$(make-target-directory)
+	for gcall in $(gen-libm-f128-ifunc-calls); do    \
+	  ifile="$${gcall}";                             \
+	  if [ $${gcall##m_} != $${gcall} ]; then        \
+	    ifile="s_$${gcall##m_}";                     \
+	  fi;                                            \
+	  for cpu in $(f128-march-cpus); do              \
+	    file=$(objpfx)$${gcall}-$${cpu}.c;           \
+	    {                                            \
+	      echo "#include <$${ifile}.c>";             \
+	    } > $${file};                                \
+	  done;                                          \
+	  name="$${gcall##?_}";                          \
+	  pfx="$${gcall%%_*}";                           \
+	  R="";                                          \
+	  r="";                                          \
+	  if [ $${gcall##m_} != $${gcall} ]; then        \
+	    pfx="s";                                     \
+	  fi;                                            \
+	  if [ $${#pfx} != 1 ]; then                     \
+	    pfx="";                                      \
+	  else                                           \
+	    pfx="_$${pfx}";                              \
+	  fi;                                            \
+	  if [ $${name%%_r} != $${name} ]; then          \
+	    R="_R";                                      \
+	    r="_r";                                      \
+	    name="$${name%%_r}";                         \
+	  fi;                                            \
+	  name="$${name%%f128}";                         \
+	  decl="DECL_ALIAS$${pfx}_$${name}$${r}";        \
+	  declc="DECL_ALIAS$${R}$${pfx}";                \
+	  {                                              \
+	    echo "#include <float128-ifunc.h>";          \
+	    echo "#ifndef $${decl}";                     \
+	    echo "# define $${decl}(f) $${declc} (f)";   \
+	    echo "#endif";                               \
+	    echo "$${decl} ($${name});";                 \
+	  } > $(objpfx)$${gcall}-ifunc.c;                \
+	done;                                            \
+	echo > $(@)
+
+$(foreach f,$(f128-march-routines),$(objpfx)$(f).c): $(objpfx)gen-float128-ifuncs.stmp
+
+include $(o-iterator)
+define o-iterator-doit
+$(foreach f,$(f128-march-routines-p9),$(objpfx)$(f)$(o)): sysdep-CFLAGS += -mcpu=power9 $$(type-float128-CFLAGS) $$(no-gnu-attribute-CFLAGS)
+endef
+object-suffixes-left := $(all-object-suffixes)
+include $(o-iterator)
+
+else
+
+# Minimum CPU is POWER9 or newer, this support is not needed.
+math-CPPFLAGS += -D_F128_DISABLE_IFUNC
 
-CFLAGS-w_sqrtf128-ppc64le.c += $(type-float128-CFLAGS) $(no-gnu-attribute-CFLAGS)
-CFLAGS-w_sqrtf128-power9.c += $(type-float128-CFLAGS) -mcpu=power9 $(no-gnu-attribute-CFLAGS)
+endif # do_f128_multiarch
 endif
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
new file mode 100644
index 0000000000..f66d255478
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-macros.h
@@ -0,0 +1,68 @@ 
+/* _Float128 aliasing macro support for ifunc generation on PPC.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_IFUNC_MACROS_PPC64LE
+#define _FLOAT128_IFUNC_MACROS_PPC64LE 1
+
+/* Bring in the various alias providing headers, and disable
+   those used for _Float128.  This prevents exporting any ABI
+   from _Float128 implementation objects, or confusing errors
+   when a renamed symbol fails to compile.  */
+#include <libm-alias-float128.h>
+#include <math-narrow.h>
+#include <libm-alias-finite.h>
+
+#undef libm_alias_float32_float128
+#undef libm_alias_float64_float128
+#undef libm_alias_float64x_float128
+#undef libm_alias_float128_r
+#undef libm_alias_finite
+#undef libm_alias_exclusive_ldouble
+#undef libm_alias_float128_other_r_ldbl
+#undef declare_mgen_finite_alias
+#undef declare_mgen_alias
+#undef declare_mgen_alias_r
+
+#define libm_alias_finite(from, to)
+#define libm_alias_float128_r(from, to, r)
+#define libm_alias_float32_float128(func)
+#define libm_alias_float64_float128(func)
+#define libm_alias_float64x_float128(func)
+#define libm_alias_exclusive_ldouble(from, to)
+#define libm_alias_float128_other_r_ldbl(from, to, r)
+#define declare_mgen_finite_alias(from, to)
+#define declare_mgen_alias(from, to)
+#define declare_mgen_alias_r(from, to)
+
+/*  Likewise, disable hidden symbol support.  This is not needed
+    for the implementation objects as the redirects already give
+    us this support.  This also means any non-_Float128 headers
+    which provide hidden_def's should be included prior to this
+    header (only fenv.h during initial support).  */
+#undef mathx_hidden_def
+#define mathx_hidden_def(func)
+#undef libm_hidden_def
+#define libm_hidden_def(func)
+#undef libm_hidden_proto
+#define libm_hidden_proto(f)
+#undef hidden_proto
+#define hidden_proto(f)
+
+#include <float128-ifunc-redirect-macros.h>
+
+#endif /* _FLOAT128_IFUNC_MACROS_PPC64LE */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
new file mode 100644
index 0000000000..a9369d1fae
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirect-macros.h
@@ -0,0 +1,52 @@ 
+/* _Float128 aliasing macro support for ifunc generation on PPC.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE
+#define _FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE 1
+
+/* Define the redirection macros used throughout most of the IFUNC headers.
+
+   F128_REDIR_PFX_R(function, destination_prefix, reentrant_suffix)
+    Redirect function, optionally suffixed by reentrant_suffix, to a function
+    named destination_prefix ## function ## cpu ## reentrant_suffix where cpu
+    is either _power8 or _power9 as inferred by compiler options.
+
+   F128_SFX_APPEND(sym)
+    Append the multiarch cpu specific suffix to sym.  sym is not
+    expanded.  This is sym ## cpu, where cpu is either _power8 or _power9
+    as inferred by compiler options.
+
+   F128_REDIR_R(func, reentrant_suffix)
+    Redirect func to a function named func ## cpu ## reentrant_suffix
+    where cpu is either _power8 or _power9 as inferred by compiler options.
+
+   F128_REDIR(function)
+    Redirect function, to a function named function ## cpu where cpu is
+    either _power8 or _power9 as inferred by compiler options.
+*/
+#ifndef _ARCH_PWR9
+#define F128_REDIR_PFX_R(func, pfx, r) extern __typeof(func ## r) func ## r __asm( #pfx #func "_power8" #r );
+#define F128_SFX_APPEND(x) x ## _power8
+#else
+#define F128_REDIR_PFX_R(func, pfx, r) extern __typeof(func ## r) func ## r __asm( #pfx #func "_power9" #r );
+#define F128_SFX_APPEND(x) x ## _power9
+#endif
+#define F128_REDIR_R(func, r) F128_REDIR_PFX_R (func, , r)
+#define F128_REDIR(func) F128_REDIR_R (func, )
+
+#endif /*_FLOAT128_IFUNC_REDIRECT_MACROS_PPC64LE */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
new file mode 100644
index 0000000000..3c8b6f1291
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects-mp.h
@@ -0,0 +1,64 @@ 
+/* _Float128 multiarch redirects shared with math_private.h
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_IFUNC_REDIRECTS_MP_H
+#define _FLOAT128_IFUNC_REDIRECTS_MP_H 1
+
+#include <float128-ifunc-redirect-macros.h>
+
+F128_REDIR (__ieee754_acosf128)
+F128_REDIR (__ieee754_acoshf128)
+F128_REDIR (__ieee754_asinf128)
+F128_REDIR (__ieee754_atan2f128)
+F128_REDIR (__ieee754_atanhf128)
+F128_REDIR (__ieee754_coshf128)
+F128_REDIR (__ieee754_expf128)
+F128_REDIR (__ieee754_exp10f128)
+F128_REDIR (__ieee754_exp2f128)
+F128_REDIR (__ieee754_fmodf128)
+F128_REDIR (__ieee754_gammaf128)
+F128_REDIR_R (__ieee754_gammaf128, _r)
+F128_REDIR (__ieee754_hypotf128)
+F128_REDIR (__ieee754_j0f128)
+F128_REDIR (__ieee754_j1f128)
+F128_REDIR (__ieee754_jnf128)
+F128_REDIR (__ieee754_lgammaf128)
+F128_REDIR_R (__ieee754_lgammaf128, _r)
+F128_REDIR (__ieee754_logf128)
+F128_REDIR (__ieee754_log10f128)
+F128_REDIR (__ieee754_log2f128)
+F128_REDIR (__ieee754_powf128)
+F128_REDIR (__ieee754_remainderf128)
+F128_REDIR (__ieee754_sinhf128)
+F128_REDIR (__ieee754_sqrtf128)
+F128_REDIR (__ieee754_y0f128)
+F128_REDIR (__ieee754_y1f128)
+F128_REDIR (__ieee754_ynf128)
+F128_REDIR (__ieee754_scalbf128)
+F128_REDIR (__ieee754_ilogbf128)
+F128_REDIR (__ieee754_rem_pio2f128)
+F128_REDIR (__kernel_sinf128)
+F128_REDIR (__kernel_cosf128)
+F128_REDIR (__kernel_tanf128)
+F128_REDIR (__kernel_sincosf128)
+F128_REDIR (__kernel_rem_pio2f128)
+F128_REDIR (__x2y2m1f128)
+F128_REDIR (__gamma_productf128)
+F128_REDIR (__lgamma_negf128)
+
+#endif /*_FLOAT128_IFUNC_REDIRECTS_MP_H */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
new file mode 100644
index 0000000000..88b71558b0
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc-redirects.h
@@ -0,0 +1,40 @@ 
+/* _Float128 redirects for ppc64le multiarch env.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_IFUNC_REDIRECTS
+#define _FLOAT128_IFUNC_REDIRECTS 1
+
+#include <float128-ifunc-macros.h>
+
+F128_REDIR_PFX_R (sqrtf128, __,);
+F128_REDIR_PFX_R (rintf128, __,);
+F128_REDIR_PFX_R (ceilf128, __,);
+F128_REDIR_PFX_R (floorf128, __,);
+F128_REDIR_PFX_R (truncf128, __,);
+F128_REDIR_PFX_R (roundf128, __,);
+F128_REDIR_PFX_R (fabsf128, __,);
+F128_REDIR (__issignalingf128)
+
+extern __typeof (ldexpf128) F128_SFX_APPEND (__ldexpf128);
+
+#define __isinff128 F128_SFX_APPEND (__isinff128)
+#define __isnanf128 F128_SFX_APPEND (__isnanf128)
+#define __finitef128 F128_SFX_APPEND (__finitef128)
+#define __ldexpf128 F128_SFX_APPEND (__ldexpf128)
+
+#endif /* _FLOAT128_IFUNC_REDIRECTS */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
new file mode 100644
index 0000000000..cefaa6d889
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.c
@@ -0,0 +1,66 @@ 
+/* _Float128 ifunc definitions for compat symbols.
+   Copyright (C) 2017-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <float128-ifunc.h>
+#include <libm-alias-finite.h>
+
+#if SHLIB_COMPAT (libm, GLIBC_2_15, GLIBC_2_31)
+
+/* __gammaf128_r is a special case.  This prototype keeps compat macro simple. */
+extern _Float128 gammaf128_r (_Float128 x, int *signamp);
+
+/* Generate compatibility alias macros for finite math functions.  IFUNC is
+   used to avoid complicating the macros in float128-ifunc.h, and avoids the
+   need to use special macros while constructing the baseline objects.  */
+#define MAKE_IFUNC_COMPAT_R(func, r) \
+	extern __typeof(func ## r) __ieee754_ ## func ## _power8 ## r; \
+	extern __typeof(func ## r) __ieee754_ ## func ## _power9 ## r; \
+	extern __typeof(func ## r) __ieee754_ ## func ## r; \
+	_F128_IFUNC(__ieee754_ ## func, r); \
+	libm_alias_finite (__ieee754_ ## func ## r, __ ## func ## r)
+
+#define MAKE_IFUNC_COMPAT(func) MAKE_IFUNC_COMPAT_R (func,)
+
+MAKE_IFUNC_COMPAT (acosf128)
+MAKE_IFUNC_COMPAT (acoshf128)
+MAKE_IFUNC_COMPAT (asinf128)
+MAKE_IFUNC_COMPAT (atan2f128)
+MAKE_IFUNC_COMPAT (atanhf128)
+MAKE_IFUNC_COMPAT (coshf128)
+MAKE_IFUNC_COMPAT (exp10f128)
+MAKE_IFUNC_COMPAT (exp2f128)
+MAKE_IFUNC_COMPAT (expf128)
+MAKE_IFUNC_COMPAT (fmodf128)
+MAKE_IFUNC_COMPAT_R (gammaf128, _r)
+MAKE_IFUNC_COMPAT (hypotf128)
+MAKE_IFUNC_COMPAT (j0f128)
+MAKE_IFUNC_COMPAT (j1f128)
+MAKE_IFUNC_COMPAT (jnf128)
+MAKE_IFUNC_COMPAT_R (lgammaf128, _r)
+MAKE_IFUNC_COMPAT (log10f128)
+MAKE_IFUNC_COMPAT (log2f128)
+MAKE_IFUNC_COMPAT (logf128)
+MAKE_IFUNC_COMPAT (powf128)
+MAKE_IFUNC_COMPAT (remainderf128)
+MAKE_IFUNC_COMPAT (sinhf128)
+MAKE_IFUNC_COMPAT (sqrtf128)
+MAKE_IFUNC_COMPAT (y0f128)
+MAKE_IFUNC_COMPAT (y1f128)
+MAKE_IFUNC_COMPAT (ynf128)
+
+#endif
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
new file mode 100644
index 0000000000..3e5b573091
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128-ifunc.h
@@ -0,0 +1,217 @@ 
+/* _Float128 ifunc symboling macros.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_IFUNC_H
+#define _FLOAT128_IFUNC_H 1
+
+/* These cause conflicts when aliasing.  Hide their definitions. */
+#define f32addf64x __hide_f32addf64x
+#define f32subf64x __hide_f32subf64x
+#define f32mulf64x __hide_f32mulf64x
+#define f32divf64x __hide_f32divf64x
+#define f32xaddf64x __hide_f32xaddf64x
+#define f32xsubf64x __hide_f32xsubf64x
+#define f32xmulf64x __hide_f32xmulf64x
+#define f32xdivf64x __hide_f32xdivf64x
+#define f32xaddf128 __hide_f32xaddf128
+#define f32xsubf128 __hide_f32xsubf128
+#define f32xmulf128 __hide_f32xmulf128
+#define f32xdivf128 __hide_f32xdivf128
+#define f32addf64 __hide_f32addf64
+#define f32subf64 __hide_f32subf64
+#define f32mulf64 __hide_f32mulf64
+#define f32divf64 __hide_f32divf64
+#define f64addf64x __hide_f64addf64x
+#define f64subf64x __hide_f64subf64x
+#define f64mulf64x __hide_f64mulf64x
+#define f64divf64x __hide_f64divf64x
+
+/* We want the real prototypes. */
+#include <math/math.h>
+#include <math/complex.h>
+#include <first-versions.h>
+#include <shlib-compat.h>
+#include "init-arch.h"
+
+#undef f32addf64x
+#undef f32subf64x
+#undef f32mulf64x
+#undef f32divf64x
+#undef f32xaddf64x
+#undef f32xsubf64x
+#undef f32xmulf64x
+#undef f32xdivf64x
+#undef f32xaddf128
+#undef f32xsubf128
+#undef f32xmulf128
+#undef f32xdivf128
+#undef f32addf64
+#undef f32subf64
+#undef f32mulf64
+#undef f32divf64
+#undef f64addf64x
+#undef f64subf64x
+#undef f64mulf64x
+#undef f64divf64x
+
+#include <libm-alias-float128.h>
+#include <math-narrow.h>
+
+/* _F128_IFUNC2(func, from, r)
+      Generate an ifunc symbol func ## r from the symbols
+	from ## {power8, power9} ## r
+
+      We use the PPC hwcap bit HAS_IEEE128 to select between the two with
+      the assumption all P9 features are available on such targets.  */
+#define _F128_IFUNC2(func, from, r) \
+	libc_ifunc (func ## r, (hwcap2 & PPC_FEATURE2_HAS_IEEE128) \
+                                ? from ## _power9 ## r : from ## _power8 ## r)
+
+/* _F128_IFUNC(func, r)
+      Similar to above, except the exported symbol name trivially remaps from
+      func ## {cpu} ## r to func ## r.  */
+#define _F128_IFUNC(func, r) _F128_IFUNC2(func, func, r)
+
+/* MAKE_IMPL_IFUNC2(func, pfx1, pfx2, r)
+     Declare external symbols of type pfx1 ## func ## f128 ## r with the name
+                                      pfx2 ## func ## f128 ## _{cpu} ## r
+     which are exported as implementation specific symbols (i.e. backing support
+     for type classification macros).  */
+#define MAKE_IMPL_IFUNC2(func, pfx1, pfx2, r) \
+	extern __typeof (pfx1 ## func ## f128 ## r) pfx2 ## func ## f128_power8 ## r; \
+	extern __typeof (pfx1 ## func ## f128 ## r) pfx2 ## func ## f128_power9 ## r; \
+        _F128_IFUNC2 (__ ## func ## f128, pfx2 ## func ## f128, r);
+
+/* MAKE_IMPL_IFUNC(func, pfx1, r)
+     Same as MAKE_IMPL_IFUNC2, but pfx2 is assumed to be '__'.  */
+#define MAKE_IMPL_IFUNC(func, pfx1, r) MAKE_IMPL_IFUNC2(func,pfx1,__,r)
+
+/* _libm_alias_narrow(func, size)
+     Export a narrowing function func of type _Float{size}.  This is
+     arranged to reuse the existing aliasing macros provided by glibc.  */
+#define _libm_alias_narrow(func, size) \
+	extern __typeof (f ## size ## func ## f128) __f ## size ## func ## f128; \
+	MAKE_IMPL_IFUNC (f ## size ## func,,) \
+	libm_alias_float ## size ## _float128 (func)
+
+/* Helper macros to use the above.  Prefixed only to avoid namespace
+   clashes with the existing glibc macros.  */
+#define _libm_alias_float32_float128(func) _libm_alias_narrow (func, 32)
+#define _libm_alias_float64_float128(func) _libm_alias_narrow (func, 64)
+#define _libm_alias_float64x_float128(func) _libm_alias_narrow (func, 64x)
+
+/* MAKE_IFUNCP_WRAP_R(w, func, r)
+      Export a function whose implementation carries the wrapper prefix w
+      as func ## r.  */
+#define MAKE_IFUNCP_WRAP_R(w, func, r) \
+	extern __typeof (func ## f128 ## r) __ ## func ## f128 ## r; \
+	MAKE_IMPL_IFUNC2 (func,__,__ ## w, r) \
+	weak_alias (__ ## func ## f128 ## r, func ## f128 ## r); \
+	libm_alias_float128_other_r (__ ## func, func, r);
+
+/* MAKE_IFUNCP_R(func, r)
+    The default IFUNC generator for all libm _Float128 ABI except
+    when specifically overridden.  This is a convenience wrapper
+    around MAKE_IFUNCP_WRAP_R where w is not used.  */
+#define MAKE_IFUNCP_R(func,r) MAKE_IFUNCP_WRAP_R (,func,r)
+
+
+/* Generic aliasing functions.  */
+#define DECL_ALIAS(f) MAKE_IFUNCP_R (f,)
+#define DECL_ALIAS_s(f) MAKE_IFUNCP_R (f,)
+#define DECL_ALIAS_w(f) MAKE_IFUNCP_R (f,)
+#define DECL_ALIAS_e(f)
+#define DECL_ALIAS_k(f)
+#define DECL_ALIAS_R_w(f) MAKE_IFUNCP_R (f, _r)
+#define DECL_ALIAS_R_e(f)
+
+/* Handle expanding/narrowing functions specially.  */
+#define DECL_ALIAS_s_f32add(x) _libm_alias_float32_float128 (add)
+#define DECL_ALIAS_s_f64add(x) _libm_alias_float64_float128 (add)
+#define DECL_ALIAS_s_f64xadd(x) _libm_alias_float64x_float128 (add)
+#define DECL_ALIAS_s_f32sub(x) _libm_alias_float32_float128 (sub)
+#define DECL_ALIAS_s_f64sub(x) _libm_alias_float64_float128 (sub)
+#define DECL_ALIAS_s_f64xsub(x) _libm_alias_float64x_float128 (sub)
+#define DECL_ALIAS_s_f32mul(x) _libm_alias_float32_float128 (mul)
+#define DECL_ALIAS_s_f64mul(x) _libm_alias_float64_float128 (mul)
+#define DECL_ALIAS_s_f64xmul(x) _libm_alias_float64x_float128 (mul)
+#define DECL_ALIAS_s_f32div(x) _libm_alias_float32_float128 (div)
+#define DECL_ALIAS_s_f64div(x) _libm_alias_float64_float128 (div)
+#define DECL_ALIAS_s_f64xdiv(x) _libm_alias_float64x_float128 (div)
+
+/* These are fallback support for classification functions.  */
+#define DECL_ALIAS_s_isinf(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_isnan(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_issignaling(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_iseqsig(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_signbit(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_finite(x) MAKE_IMPL_IFUNC (x, __,)
+#define DECL_ALIAS_s_fpclassify(x) MAKE_IMPL_IFUNC (x, __,)
+
+/* This doesn't have a public strong implementation alias.  */
+extern __typeof (canonicalizef128) __canonicalizef128;
+
+/* No symbols are defined in these helper/wrapper objects. */
+#define DECL_ALIAS_lgamma_neg(x)
+#define DECL_ALIAS_lgamma_product(x)
+#define DECL_ALIAS_gamma_product(x)
+#define DECL_ALIAS_x2y2m1(x)
+#define DECL_ALIAS_s_log1p(x)
+#define DECL_ALIAS_s_scalbln(x)
+#define DECL_ALIAS_s_scalbn(x)
+
+/* Ensure the wrapper functions get exposed via IFUNC, not the
+   wrappee (e.g. __w_log1pf128_power8 instead of __log1pf128_power8).  */
+#define DECL_ALIAS_w_log1p(x) MAKE_IFUNCP_WRAP_R(w_,x,)
+#define DECL_ALIAS_w_scalbln(x) MAKE_IFUNCP_WRAP_R(w_,x,)
+
+/* Expose ldouble-only redirected symbols.  */
+#define DECL_LDOUBLE_ALIAS(func, RTYPE, ARGS) \
+	extern RTYPE func ARGS; \
+	extern __typeof (func) func ## _power8; \
+	extern __typeof (func) func ## _power9; \
+	_F128_IFUNC ( func,)
+
+/* These are declared in their respective jX objects.  */
+#define DECL_ALIAS_w_j0(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (y0,)
+#define DECL_ALIAS_w_j1(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (y1,)
+#define DECL_ALIAS_w_jn(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (yn,)
+
+#define DECL_ALIAS_s_erf(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_R (erfc,)
+
+/* scalbnf128 is an alias of ldexpf128.  */
+#define DECL_ALIAS_s_ldexp(f) MAKE_IFUNCP_R (f,) MAKE_IFUNCP_WRAP_R (wrap_, scalbn,)
+
+/* Handle the special case functions which exist only to support
+   ldouble == ieee128.  */
+#define DECL_ALIAS_s_nexttoward(x) \
+	DECL_LDOUBLE_ALIAS (__nexttowardf_to_ieee128, float, (float, _Float128)) \
+	DECL_LDOUBLE_ALIAS (__nexttoward_to_ieee128, double, (double, _Float128))
+
+#define DECL_ALIAS_w_scalb(x) \
+	DECL_LDOUBLE_ALIAS (__scalbf128,_Float128, (_Float128, _Float128)) \
+	libm_alias_float128_other_r_ldbl (__scalb, scalb,)
+
+#define DECL_ALIAS_s_significand(x) \
+	DECL_LDOUBLE_ALIAS (__significandieee128, _Float128, (_Float128))
+
+#define DECL_ALIAS_s_nextafter(f) \
+	MAKE_IFUNCP_R (f,) \
+	libm_alias_float128_other_r_ldbl (__nextafter, nexttoward,)
+
+#endif /* ifndef _FLOAT128_IFUNC_H  */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
new file mode 100644
index 0000000000..3c52735ba7
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/float128_private.h
@@ -0,0 +1,143 @@ 
+/* _Float128 overrides for float128 in ppc64le multiarch env.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _FLOAT128_PRIVATE_PPC64LE
+#define _FLOAT128_PRIVATE_PPC64LE 1
+
+#if IS_IN(libc) || defined(_F128_DISABLE_IFUNC)
+/* multiarch is not supported.  Do nothing and pass through. */
+#include_next <float128_private.h>
+#else
+
+/* Include fenv.h now before turning off PLT bypass tricks.  At
+   minimum feraiseexcept is used today. */
+#include <fenv.h>
+
+/* The PLT bypass trick uses the same asm-rename mechanism as we
+   do, and only one asm-rename is allowed.  Only fenv.h
+   functions require this today, so we include them above.  */
+#undef libm_hidden_proto
+#define libm_hidden_proto(f)
+#undef hidden_proto
+#define hidden_proto(f)
+
+/* Always disable redirects.  We supply these uniquely later on. */
+#undef NO_MATH_REDIRECT
+#define NO_MATH_REDIRECT
+#include <math.h>
+#undef NO_MATH_REDIRECT
+
+#include_next <float128_private.h>
+
+#include <float128-ifunc-macros.h>
+
+/* Declare these now, as they otherwise are not. */
+extern __typeof (cosf128) __ieee754_cosf128;
+extern __typeof (asinhf128) __ieee754_asinhf128;
+
+F128_REDIR (__ieee754_asinhf128)
+F128_REDIR (__ieee754_cosf128)
+F128_REDIR (__asinhf128)
+F128_REDIR (__atanf128)
+F128_REDIR (__cbrtf128)
+F128_REDIR (__ceilf128)
+F128_REDIR (__copysignf128)
+F128_REDIR (__cosf128)
+F128_REDIR (__erfcf128)
+F128_REDIR (__erff128)
+F128_REDIR (__expf128)
+F128_REDIR (__expm1f128)
+F128_REDIR (__fabsf128)
+F128_REDIR (__fdimf128)
+F128_REDIR (__finitef128)
+F128_REDIR (__floorf128)
+F128_REDIR (__fmaf128)
+F128_REDIR (__fmaxf128)
+F128_REDIR (__fminf128)
+F128_REDIR (__fpclassifyf128)
+F128_REDIR (__frexpf128)
+F128_REDIR (__getpayloadf128)
+F128_REDIR (__isinff128)
+F128_REDIR (__isnanf128)
+F128_REDIR (__ldexpf128)
+F128_REDIR (__llrintf128)
+F128_REDIR (__llroundf128)
+F128_REDIR (__log1pf128)
+F128_REDIR (__logbf128)
+F128_REDIR (__logf128)
+F128_REDIR (__lrintf128)
+F128_REDIR (__lroundf128)
+F128_REDIR (__modff128)
+F128_REDIR (__nearbyintf128)
+F128_REDIR (__nextdownf128)
+F128_REDIR (__nextupf128)
+F128_REDIR (__remquof128)
+F128_REDIR (__rintf128)
+F128_REDIR (__roundevenf128)
+F128_REDIR (__roundf128)
+F128_REDIR (__scalblnf128)
+F128_REDIR (__scalbnf128)
+F128_REDIR (__signbitf128)
+F128_REDIR (__sincosf128)
+F128_REDIR (__sinf128)
+F128_REDIR (__sqrtf128)
+F128_REDIR (__tanhf128)
+F128_REDIR (__tanf128)
+F128_REDIR (__truncf128)
+F128_REDIR (__lgamma_productf128)
+F128_REDIR (__mpn_extract_float128)
+F128_REDIR (__fromfpxf128);
+F128_REDIR (__ufromfpxf128);
+F128_REDIR (__fromfpf128);
+F128_REDIR (__ufromfpf128);
+
+#include <float128-ifunc-redirects-mp.h>
+
+/* Macro-rename these as it is simpler than making F128_REDIR work.  */
+#define __nexttoward_to_ieee128 F128_SFX_APPEND (__nexttoward_to_ieee128)
+#define __nexttowardf_to_ieee128 F128_SFX_APPEND (__nexttowardf_to_ieee128)
+#define __f32divf128 F128_SFX_APPEND (__f32divf128)
+#define __f32mulf128 F128_SFX_APPEND (__f32mulf128)
+#define __f32addf128 F128_SFX_APPEND (__f32addf128)
+#define __f32subf128 F128_SFX_APPEND (__f32subf128)
+#define __f64divf128 F128_SFX_APPEND (__f64divf128)
+#define __f64mulf128 F128_SFX_APPEND (__f64mulf128)
+#define __f64addf128 F128_SFX_APPEND (__f64addf128)
+#define __f64subf128 F128_SFX_APPEND (__f64subf128)
+#define __f64xdivf128 F128_SFX_APPEND (__f64xdivf128)
+#define __f64xmulf128 F128_SFX_APPEND (__f64xmulf128)
+#define __f64xaddf128 F128_SFX_APPEND (__f64xaddf128)
+#define __f64xsubf128 F128_SFX_APPEND (__f64xsubf128)
+#define __setpayloadf128 F128_SFX_APPEND (__setpayloadf128)
+#define __setpayloadsigf128 F128_SFX_APPEND (__setpayloadsigf128)
+
+/* Special case fixup for s_nextafterf128.c: it creates an alias
+   which is used for long double, but not _Float128.  Notably,
+   we don't generate the __nextafterieee128 alias since those
+   macros are disabled.  We rename the input to strong_alias to
+   get it to generate __nextafterieee128<SFX>.  */
+#define __nextafterf128 F128_SFX_APPEND (__nextafterf128)
+#define __nextafterieee128 F128_SFX_APPEND (__nextafterf128)
+#define __nexttowardieee128 F128_SFX_APPEND (__nexttowardieee128)
+#define __nexttowardf128_do_not_use F128_SFX_APPEND (__nexttowardf128_dnu)
+
+#include <float128-ifunc-redirects.h>
+
+#endif /* !(IS_IN(libc) || defined(_F128_DISABLE_IFUNC)) */
+
+#endif /* _FLOAT128_PRIVATE_PPC64LE */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
new file mode 100644
index 0000000000..bc210b17cf
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math-type-macros-float128.h
@@ -0,0 +1,136 @@ 
+/* _Float128 overrides for float128 in ppc64le multiarch env.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI
+#define _MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI 1
+
+#include_next <math-type-macros-float128.h>
+
+#if !IS_IN(libc) && !defined(_F128_DISABLE_IFUNC)
+
+/* Include fenv.h now before turning off PLT bypass.  At
+   minimum feraiseexcept is used today. */
+#include <fenv.h>
+
+#include <float128-ifunc-macros.h>
+
+/* Ensure local redirects are always disabled by including
+   math.h in the following manner.  */
+#undef NO_MATH_REDIRECT
+#define NO_MATH_REDIRECT
+#include <math.h>
+#undef NO_MATH_REDIRECT
+
+/* Include forward declarations to redirect complex functions
+   below.  */
+#include <complex.h>
+
+/* Declare redirects for an implementation function f which
+   has a complex analogue.  f is assumed to be prefixed
+   with '__' and is thus passed through to F128_REDIR.  */
+#define F128_C_REDIR(f) F128_REDIR (__c ## f ## f128); \
+			F128_REDIR (__ ## f ## f128);
+
+/* Similar to F128_C_REDIR, declare the set of implementation
+   redirects for the trigonometric family f for {a,}f{,h}
+   and {a,}cf{,h} complex variants where f is sin/cos/tan.  */
+#define F128_TRIG_REDIR(f) F128_C_REDIR (a ## f); \
+			   F128_C_REDIR (a ## f ## h); \
+			   F128_C_REDIR (f); \
+			   F128_C_REDIR (f ## h);
+
+F128_TRIG_REDIR (cos)
+F128_TRIG_REDIR (sin)
+F128_TRIG_REDIR (tan)
+
+F128_C_REDIR (log);
+F128_C_REDIR (log10);
+F128_C_REDIR (exp);
+F128_C_REDIR (sqrt);
+F128_C_REDIR (pow);
+
+F128_REDIR (__atan2f128)
+F128_REDIR (__kernel_casinhf128);
+F128_REDIR (__rintf128);
+F128_REDIR (__floorf128);
+F128_REDIR (__fabsf128);
+F128_REDIR (__hypotf128);
+F128_REDIR (__scalbnf128);
+F128_REDIR (__scalblnf128);
+F128_REDIR (__sincosf128);
+F128_REDIR (__log1pf128);
+F128_REDIR (__ilogbf128);
+F128_REDIR (__ldexpf128);
+F128_REDIR (__cargf128);
+F128_REDIR (__cimagf128);
+F128_REDIR (__crealf128);
+F128_REDIR (__conjf128);
+F128_REDIR (__cprojf128);
+F128_REDIR (__cabsf128);
+F128_REDIR (__fdimf128);
+F128_REDIR (__fminf128);
+F128_REDIR (__fmaxf128);
+F128_REDIR (__fmodf128);
+F128_REDIR (__fmaxmagf128);
+F128_REDIR (__fminmagf128);
+F128_REDIR (__nanf128);
+F128_REDIR (__nextupf128);
+F128_REDIR (__nextdownf128);
+F128_REDIR (__llogbf128);
+F128_REDIR (__log2f128);
+F128_REDIR (__exp10f128);
+F128_REDIR (__exp2f128);
+F128_REDIR (__j0f128);
+F128_REDIR (__j1f128);
+F128_REDIR (__jnf128);
+F128_REDIR (__y0f128);
+F128_REDIR (__y1f128);
+F128_REDIR (__ynf128);
+F128_REDIR (__lgammaf128);
+F128_REDIR_R (__lgammaf128, _r);
+F128_REDIR (__tgammaf128);
+F128_REDIR (__remainderf128);
+F128_REDIR (__iseqsigf128);
+
+/* Assist implementations which declare additional symbols
+   which require forward declarations to redirect.  */
+extern _Float128 __wrap_scalbnf128 (_Float128, int);
+extern _Float128 __w_scalblnf128 (_Float128, long int);
+extern _Float128 __w_log1pf128 (_Float128);
+extern __typeof (canonicalizef128) __canonicalizef128;
+extern _Float128 __significandieee128 (_Float128);
+extern _Float128 __scalbf128 (_Float128, _Float128);
+F128_REDIR (__scalbf128);
+F128_REDIR (__wrap_scalbnf128);
+F128_REDIR (__w_scalblnf128);
+F128_REDIR (__w_log1pf128);
+F128_REDIR (__canonicalizef128);
+F128_REDIR (__significandieee128);
+
+/* This is a hack.  The build directory is favored over the sysdep directories.
+   This causes the generated generic version of s_significandf128.c to build.
+   The only effective difference is the C symbol name.  Work around this special
+   case by redirecting the symbol name emitted from the template.  */
+extern _Float128 __significandf128 (_Float128) asm ("__significandieee128_power9");
+
+/* Include the redirects shared with math_private.h users.  */
+#include <float128-ifunc-redirects.h>
+
+#endif /* !IS_IN(libc) && !defined(_F128_DISABLE_IFUNC) */
+
+#endif /* _MATH_TYPE_MACROS_FLOAT128_PPC64_MULTI */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
new file mode 100644
index 0000000000..30212b5d09
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/math_private.h
@@ -0,0 +1,15 @@ 
+#ifndef MATH_PRIVATE_PPC64LE_MA
+#define MATH_PRIVATE_PPC64LE_MA 1
+
+#include_next <math_private.h>
+
+#if !defined (_F128_DISABLE_IFUNC)
+
+/* math_private.h redeclares many float128_private.h renamed functions, but
+   we can't include float128_private.h as this header is used beyond
+   private float128 files.  */
+#include <float128-ifunc-redirects-mp.h>
+
+#endif
+
+#endif /* MATH_PRIVATE_PPC64LE_MA */
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
deleted file mode 100644
index 49aeb3a8f4..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-power9.c
+++ /dev/null
@@ -1,28 +0,0 @@ 
-/* __fmaf128() PowerPC64LE POWER9 version.
-   Copyright (C) 2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <libm-alias-float128.h>
-
-#undef libm_alias_float128
-#define libm_alias_float128(a, b)
-#undef strong_alias
-#define strong_alias(a, b)
-
-#define __fmaf128 __fmaf128_power9
-
-#include <sysdeps/ieee754/float128/s_fmaf128.c>
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
deleted file mode 100644
index ab0c4d03a8..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128-ppc64.c
+++ /dev/null
@@ -1,26 +0,0 @@ 
-/* __fmaf128() PowerPC64LE version.
-   Copyright (C) 2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#undef weak_alias
-#define weak_alias(a, b)
-#undef strong_alias
-#define strong_alias(a, b)
-
-#define __fmaf128 __fmaf128_ppc64
-
-#include <sysdeps/ieee754/float128/s_fmaf128.c>
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
deleted file mode 100644
index 3a370950f9..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/s_fmaf128.c
+++ /dev/null
@@ -1,36 +0,0 @@ 
-/* Multiple versions of fmaf128.
-   Copyright (C) 2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <libm-alias-float128.h>
-
-#define fmaf128 __redirect_fmaf128
-#include <math.h>
-#undef fmaf128
-
-#include <math_ldbl_opt.h>
-#include "init-arch.h"
-
-extern __typeof (__redirect_fmaf128) __fmaf128_ppc64 attribute_hidden;
-extern __typeof (__redirect_fmaf128) __fmaf128_power9 attribute_hidden;
-
-libc_ifunc_redirected (__redirect_fmaf128, __fmaf128,
-		       (hwcap2 & PPC_FEATURE2_HAS_IEEE128)
-		       ? __fmaf128_power9
-		       : __fmaf128_ppc64);
-
-libm_alias_float128 (__fma, fma)
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
deleted file mode 100644
index e7414f4a59..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-power9.c
+++ /dev/null
@@ -1,35 +0,0 @@ 
-/* POWER9 sqrt for _Float128
-   Copyright (C) 2018-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   In addition to the permissions in the GNU Lesser General Public
-   License, the Free Software Foundation gives you unlimited
-   permission to link the compiled version of this file into
-   combinations with other programs, and to distribute those
-   combinations without any restriction coming from the use of this
-   file.  (The Lesser General Public License restrictions do apply in
-   other respects; for example, they cover modification of the file,
-   and distribution when not linked into a combine executable.)
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <math-type-macros-float128.h>
-
-#define __sqrtf128 __sqrtf128_power9
-
-#undef declare_mgen_alias
-#define declare_mgen_alias(a, b)
-
-#include <w_sqrt_template.c>
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
deleted file mode 100644
index e03ecb193f..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128-ppc64le.c
+++ /dev/null
@@ -1,35 +0,0 @@ 
-/* PPC64LE sqrt for _Float128
-   Copyright (C) 2018-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   In addition to the permissions in the GNU Lesser General Public
-   License, the Free Software Foundation gives you unlimited
-   permission to link the compiled version of this file into
-   combinations with other programs, and to distribute those
-   combinations without any restriction coming from the use of this
-   file.  (The Lesser General Public License restrictions do apply in
-   other respects; for example, they cover modification of the file,
-   and distribution when not linked into a combine executable.)
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <math-type-macros-float128.h>
-
-#define __sqrtf128 __sqrtf128_ppc64le
-
-#undef declare_mgen_alias
-#define declare_mgen_alias(a, b)
-
-#include <w_sqrt_template.c>
diff --git a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c b/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
deleted file mode 100644
index e2db0a2864..0000000000
--- a/sysdeps/powerpc/powerpc64/le/fpu/multiarch/w_sqrtf128.c
+++ /dev/null
@@ -1,31 +0,0 @@ 
-/* Multiple versions of __sqrtf128.
-   Copyright (C) 2018-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#define NO_MATH_REDIRECT
-#include <math.h>
-#include "init-arch.h"
-#include <math-type-macros-float128.h>
-
-extern __typeof (__sqrtf128) __sqrtf128_ppc64le attribute_hidden;
-extern __typeof (__sqrtf128) __sqrtf128_power9 attribute_hidden;
-
-libc_ifunc (__sqrtf128,
-	    (hwcap2 & PPC_FEATURE2_ARCH_3_00)
-	    ? __sqrtf128_power9
-	    : __sqrtf128_ppc64le);
-declare_mgen_alias (__sqrt, sqrt)
diff --git a/sysdeps/powerpc/powerpc64/le/power9/Makeconfig b/sysdeps/powerpc/powerpc64/le/power9/Makeconfig
new file mode 100644
index 0000000000..a9190c7b15
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/le/power9/Makeconfig
@@ -0,0 +1,3 @@ 
+# Hint to multiarch (if used) that we support power9
+# on powerpc64le.
+libc-submachine-power9 = yes