From patchwork Tue Feb 20 16:58:05 2024
X-Patchwork-Submitter: Sunil Pandey
X-Patchwork-Id: 1901581
From: Sunil K Pandey
To: libc-alpha@sourceware.org
Cc: hjl.tools@gmail.com
Subject: [PATCH] x86_64: Exclude SSE, AVX and FMA4 variants in libm multiarch
Date: Tue, 20 Feb 2024 08:58:05 -0800
Message-ID: <20240220165805.3629140-1-skpgkp2@gmail.com>
X-Mailer: git-send-email 2.43.0

When glibc is built with FMA and AVX2 enabled by default, the resulting glibc binaries won't run on SSE or FMA4 processors. Exclude SSE, AVX and FMA4 variants in libm multiarch when both FMA and AVX2 are enabled by default.

Disallow glibc build with only AVX2 or FMA enabled as all AVX2 processors, including VMs, should also support FMA and vice versa.

When glibc is built with SSE4.1 enabled by default, only keep SSE4.1 variant.

Fixes BZ 31335.
---
 config.h.in | 5 + sysdeps/x86/configure | 77 +++++++++ sysdeps/x86/configure.ac | 44 ++++++ sysdeps/x86_64/fpu/multiarch/Makefile | 147 +++++++++--------- sysdeps/x86_64/fpu/multiarch/e_asin.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_atan2.c | 10 +- sysdeps/x86_64/fpu/multiarch/e_exp.c | 12 +- sysdeps/x86_64/fpu/multiarch/e_exp2f.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_expf.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_log.c | 12 +- sysdeps/x86_64/fpu/multiarch/e_log2.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_log2f.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_logf.c | 18 ++- sysdeps/x86_64/fpu/multiarch/e_pow.c | 12 +- sysdeps/x86_64/fpu/multiarch/e_powf.c | 26 ++-- sysdeps/x86_64/fpu/multiarch/s_atan.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S | 28 ++++ sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_ceil.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S | 28 ++++ sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_ceilf.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_cosf.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_expm1.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_floor-avx.S | 28 ++++ 
sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_floor.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S | 28 ++++ .../x86_64/fpu/multiarch/s_floorf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_floorf.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_log1p.c | 10 +- .../x86_64/fpu/multiarch/s_nearbyint-avx.S | 28 ++++ .../x86_64/fpu/multiarch/s_nearbyint-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_nearbyint.c | 18 ++- .../x86_64/fpu/multiarch/s_nearbyintf-avx.S | 28 ++++ .../fpu/multiarch/s_nearbyintf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c | 18 ++- sysdeps/x86_64/fpu/multiarch/s_rint-avx.S | 28 ++++ sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_rint.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S | 28 ++++ sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_rintf.c | 20 +-- .../x86_64/fpu/multiarch/s_roundeven-avx.S | 28 ++++ .../x86_64/fpu/multiarch/s_roundeven-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_roundeven.c | 18 ++- .../x86_64/fpu/multiarch/s_roundevenf-avx.S | 28 ++++ .../fpu/multiarch/s_roundevenf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_roundevenf.c | 18 ++- sysdeps/x86_64/fpu/multiarch/s_sin.c | 18 ++- sysdeps/x86_64/fpu/multiarch/s_sincos.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_sincosf.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_sinf.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_tan.c | 10 +- sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S | 28 ++++ sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_trunc.c | 20 +-- sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S | 28 ++++ .../x86_64/fpu/multiarch/s_truncf-sse4_1.S | 11 ++ sysdeps/x86_64/fpu/multiarch/s_truncf.c | 20 +-- sysdeps/x86_64/fpu/multiarch/w_exp.c | 6 +- sysdeps/x86_64/fpu/multiarch/w_log.c | 6 +- sysdeps/x86_64/fpu/multiarch/w_pow.c | 6 +- 63 files changed, 974 insertions(+), 295 deletions(-) create mode 100644 
sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floor-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rint-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S create mode 100644 sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S diff --git a/config.h.in b/config.h.in index 2f0669e19b..0a9626cbe8 100644 --- a/config.h.in +++ b/config.h.in @@ -292,4 +292,9 @@ /* Define if -mmovbe is enabled by default on x86. */ #undef HAVE_X86_MOVBE +/* Define if -msse4.1 is enabled by default on x86. */ +#undef HAVE_X86_SSE4_1 + +/* Define if -mavx2 and -mfma are enabled by default on x86. */ +#undef HAVE_X86_AVX2_FMA #endif diff --git a/sysdeps/x86/configure b/sysdeps/x86/configure index 1f4c2d67fd..1c0e0d0640 100644 --- a/sysdeps/x86/configure +++ b/sysdeps/x86/configure @@ -128,3 +128,80 @@ enable-x86-isa-level = $libc_cv_include_x86_isa_level" printf "%s\n" "#define SUPPORT_STATIC_PIE 1" >>confdefs.h +# Check if AVX2 and FMA are available. +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for AVX2 and FMA instruction support" >&5 +printf %s "checking for AVX2 and FMA instruction support... " >&6; } +if test ${libc_cv_have_x86_avx2_fma+y} +then : + printf %s "(cached) " >&6 +else $as_nop + cat > conftest.c <&5 + (eval $ac_try) 2>&5 + ac_status=$? + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 + test $ac_status = 0; }; }; then + libc_cv_have_x86_avx2_fma=yes + else + if { ac_try='grep -q "Only one of AVX2 and FMA is enabled" conftest.err' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; }; then + as_fn_error $? "Only one of AVX2 and FMA is enabled." "$LINENO" 5 + fi + libc_cv_have_x86_avx2_fma=no + fi + rm -rf conftest* +fi +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_have_x86_avx2_fma" >&5 +printf "%s\n" "$libc_cv_have_x86_avx2_fma" >&6; } +if test $libc_cv_have_x86_avx2_fma = yes; then + printf "%s\n" "#define HAVE_X86_AVX2_FMA 1" >>confdefs.h + +fi +config_vars="$config_vars +enable-avx2-fma = $libc_cv_have_x86_avx2_fma" + +# Check if SSE4.1 is available. +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for SSE4.1 instruction support" >&5 +printf %s "checking for SSE4.1 instruction support... " >&6; } +if test ${libc_cv_have_x86_sse4_1+y} +then : + printf %s "(cached) " >&6 +else $as_nop + cat > conftest.c <&5 + (eval $ac_try) 2>&5 + ac_status=$? + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; }; then + libc_cv_have_x86_sse4_1=yes + else + libc_cv_have_x86_sse4_1=no + fi + rm -rf conftest* +fi +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_cv_have_x86_sse4_1" >&5 +printf "%s\n" "$libc_cv_have_x86_sse4_1" >&6; } +if test $libc_cv_have_x86_sse4_1 = yes; then + printf "%s\n" "#define HAVE_X86_SSE4_1 1" >>confdefs.h + +fi +config_vars="$config_vars +enable-sse4-1 = $libc_cv_have_x86_sse4_1" + diff --git a/sysdeps/x86/configure.ac b/sysdeps/x86/configure.ac index 437a50623b..df3db3fdc2 100644 --- a/sysdeps/x86/configure.ac +++ b/sysdeps/x86/configure.ac @@ -87,3 +87,47 @@ LIBC_CONFIG_VAR([enable-x86-isa-level], [$libc_cv_include_x86_isa_level]) dnl Static PIE is supported. 
AC_DEFINE(SUPPORT_STATIC_PIE) + +# Check if AVX2 and FMA are available. +AC_CACHE_CHECK([for AVX2 and FMA instruction support], + libc_cv_have_x86_avx2_fma, [dnl +cat > conftest.c <&conftest.err); then + libc_cv_have_x86_avx2_fma=yes + else + if AC_TRY_COMMAND(grep -q "Only one of AVX2 and FMA is enabled" conftest.err); then + AC_MSG_ERROR([Only one of AVX2 and FMA is enabled.]) + fi + libc_cv_have_x86_avx2_fma=no + fi + rm -rf conftest*]) +if test $libc_cv_have_x86_avx2_fma = yes; then + AC_DEFINE(HAVE_X86_AVX2_FMA) +fi +LIBC_CONFIG_VAR([enable-avx2-fma], [$libc_cv_have_x86_avx2_fma]) + +# Check if SSE4.1 is available. +AC_CACHE_CHECK([for SSE4.1 instruction support], + libc_cv_have_x86_sse4_1, [dnl +cat > conftest.c <&AS_MESSAGE_LOG_FD); then + libc_cv_have_x86_sse4_1=yes + else + libc_cv_have_x86_sse4_1=no + fi + rm -rf conftest*]) +if test $libc_cv_have_x86_sse4_1 = yes; then + AC_DEFINE(HAVE_X86_SSE4_1) +fi +LIBC_CONFIG_VAR([enable-sse4-1], [$libc_cv_have_x86_sse4_1]) diff --git a/sysdeps/x86_64/fpu/multiarch/Makefile b/sysdeps/x86_64/fpu/multiarch/Makefile index e1a490dd98..5eeb106b79 100644 --- a/sysdeps/x86_64/fpu/multiarch/Makefile +++ b/sysdeps/x86_64/fpu/multiarch/Makefile @@ -1,49 +1,4 @@ ifeq ($(subdir),math) -libm-sysdep_routines += \ - s_ceil-c \ - s_ceilf-c \ - s_floor-c \ - s_floorf-c \ - s_nearbyint-c \ - s_nearbyintf-c \ - s_rint-c \ - s_rintf-c \ - s_roundeven-c \ - s_roundevenf-c \ - s_trunc-c \ - s_truncf-c \ -# libm-sysdep_routines - -libm-sysdep_routines += \ - s_ceil-sse4_1 \ - s_ceilf-sse4_1 \ - s_floor-sse4_1 \ - s_floorf-sse4_1 \ - s_nearbyint-sse4_1 \ - s_nearbyintf-sse4_1 \ - s_rint-sse4_1 \ - s_rintf-sse4_1 \ - s_roundeven-sse4_1 \ - s_roundevenf-sse4_1 \ - s_trunc-sse4_1 \ - s_truncf-sse4_1 \ -# libm-sysdep_routines - -libm-sysdep_routines += \ - e_asin-fma \ - e_atan2-fma \ - e_exp-fma \ - e_log-fma \ - e_log2-fma \ - e_pow-fma \ - s_atan-fma \ - s_expm1-fma \ - s_log1p-fma \ - s_sin-fma \ - s_sincos-fma \ - s_tan-fma \ -# 
libm-sysdep_routines - CFLAGS-e_asin-fma.c = -mfma -mavx2 CFLAGS-e_atan2-fma.c = -mfma -mavx2 CFLAGS-e_exp-fma.c = -mfma -mavx2 @@ -57,23 +12,6 @@ CFLAGS-s_sin-fma.c = -mfma -mavx2 CFLAGS-s_tan-fma.c = -mfma -mavx2 CFLAGS-s_sincos-fma.c = -mfma -mavx2 -libm-sysdep_routines += \ - s_cosf-sse2 \ - s_sincosf-sse2 \ - s_sinf-sse2 \ -# libm-sysdep_routines - -libm-sysdep_routines += \ - e_exp2f-fma \ - e_expf-fma \ - e_log2f-fma \ - e_logf-fma \ - e_powf-fma \ - s_cosf-fma \ - s_sincosf-fma \ - s_sinf-fma \ -# libm-sysdep_routines - CFLAGS-e_exp2f-fma.c = -mfma -mavx2 CFLAGS-e_expf-fma.c = -mfma -mavx2 CFLAGS-e_log2f-fma.c = -mfma -mavx2 @@ -83,17 +21,92 @@ CFLAGS-s_sinf-fma.c = -mfma -mavx2 CFLAGS-s_cosf-fma.c = -mfma -mavx2 CFLAGS-s_sincosf-fma.c = -mfma -mavx2 +ifeq ($(enable-avx2-fma),yes) libm-sysdep_routines += \ + s_ceil-avx \ + s_ceilf-avx \ + s_floor-avx \ + s_floorf-avx \ + s_nearbyint-avx \ + s_nearbyintf-avx \ + s_rint-avx \ + s_rintf-avx \ + s_roundeven-avx \ + s_roundevenf-avx \ + s_trunc-avx \ + s_truncf-avx \ +# libm-sysdep_routines +else +libm-sysdep_routines += \ + e_asin-fma \ e_asin-fma4 \ + e_atan2-avx \ + e_atan2-fma \ e_atan2-fma4 \ + e_exp-avx \ + e_exp-fma \ e_exp-fma4 \ + e_exp2f-fma \ + e_expf-fma \ + e_log-avx \ + e_log-fma \ e_log-fma4 \ + e_log2-fma \ + e_log2f-fma \ + e_logf-fma \ + e_pow-fma \ e_pow-fma4 \ + e_powf-fma \ + s_atan-avx \ + s_atan-fma \ s_atan-fma4 \ + s_ceil-sse4_1 \ + s_ceilf-sse4_1 \ + s_cosf-fma \ + s_cosf-sse2 \ + s_expm1-fma \ + s_floor-sse4_1 \ + s_floorf-sse4_1 \ + s_log1p-fma \ + s_nearbyint-sse4_1 \ + s_nearbyintf-sse4_1 \ + s_rint-sse4_1 \ + s_rintf-sse4_1 \ + s_roundeven-sse4_1 \ + s_roundevenf-sse4_1 \ + s_sin-avx \ + s_sin-fma \ s_sin-fma4 \ + s_sincos-avx \ + s_sincos-fma \ s_sincos-fma4 \ + s_sincosf-fma \ + s_sincosf-sse2 \ + s_sinf-fma \ + s_sinf-sse2 \ + s_tan-avx \ + s_tan-fma \ s_tan-fma4 \ + s_trunc-sse4_1 \ + s_truncf-sse4_1 \ # libm-sysdep_routines +ifeq ($(enable-sse4-1),no) +libm-sysdep_routines += 
\ + s_ceil-c \ + s_ceilf-c \ + s_floor-c \ + s_floorf-c \ + s_nearbyint-c \ + s_nearbyintf-c \ + s_rint-c \ + s_rintf-c \ + s_roundeven-c \ + s_roundevenf-c \ + s_trunc-c \ + s_truncf-c \ +# libm-sysdep_routines +endif +endif CFLAGS-e_asin-fma4.c = -mfma4 CFLAGS-e_atan2-fma4.c = -mfma4 @@ -105,16 +118,6 @@ CFLAGS-s_sin-fma4.c = -mfma4 CFLAGS-s_tan-fma4.c = -mfma4 CFLAGS-s_sincos-fma4.c = -mfma4 -libm-sysdep_routines += \ - e_atan2-avx \ - e_exp-avx \ - e_log-avx \ - s_atan-avx \ - s_sin-avx \ - s_sincos-avx \ - s_tan-avx \ -# libm-sysdep_routines - CFLAGS-e_atan2-avx.c = -msse2avx -DSSE2AVX CFLAGS-e_exp-avx.c = -msse2avx -DSSE2AVX CFLAGS-e_log-avx.c = -msse2avx -DSSE2AVX diff --git a/sysdeps/x86_64/fpu/multiarch/e_asin.c b/sysdeps/x86_64/fpu/multiarch/e_asin.c index 2eaa6c2c04..3c1654ba3e 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_asin.c +++ b/sysdeps/x86_64/fpu/multiarch/e_asin.c @@ -16,26 +16,28 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_ieee754_asin (double); extern double __redirect_ieee754_acos (double); -#define SYMBOL_NAME ieee754_asin -#include "ifunc-fma4.h" +# define SYMBOL_NAME ieee754_asin +# include "ifunc-fma4.h" libc_ifunc_redirected (__redirect_ieee754_asin, __ieee754_asin, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_asin, __asin) -#undef SYMBOL_NAME -#define SYMBOL_NAME ieee754_acos -#include "ifunc-fma4.h" +# undef SYMBOL_NAME +# define SYMBOL_NAME ieee754_acos +# include "ifunc-fma4.h" libc_ifunc_redirected (__redirect_ieee754_acos, __ieee754_acos, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_acos, __acos) -#define __ieee754_acos __ieee754_acos_sse2 -#define __ieee754_asin __ieee754_asin_sse2 +# define __ieee754_acos __ieee754_acos_sse2 +# define __ieee754_asin __ieee754_asin_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_atan2.c b/sysdeps/x86_64/fpu/multiarch/e_atan2.c index 17ee4f3c36..f48ab8762a 100644 --- 
a/sysdeps/x86_64/fpu/multiarch/e_atan2.c +++ b/sysdeps/x86_64/fpu/multiarch/e_atan2.c @@ -16,16 +16,18 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_ieee754_atan2 (double, double); -#define SYMBOL_NAME ieee754_atan2 -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME ieee754_atan2 +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_ieee754_atan2, __ieee754_atan2, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_atan2, __atan2) -#define __ieee754_atan2 __ieee754_atan2_sse2 +# define __ieee754_atan2 __ieee754_atan2_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_exp.c b/sysdeps/x86_64/fpu/multiarch/e_exp.c index 406b7ebd44..034f5b894f 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_exp.c +++ b/sysdeps/x86_64/fpu/multiarch/e_exp.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern double __redirect_ieee754_exp (double); -#define SYMBOL_NAME ieee754_exp -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME ieee754_exp +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_ieee754_exp, __ieee754_exp, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_exp, __exp) -#define __exp __ieee754_exp_sse2 +# define __exp __ieee754_exp_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_exp2f.c b/sysdeps/x86_64/fpu/multiarch/e_exp2f.c index 804fd6be85..74f92bfa0c 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_exp2f.c +++ b/sysdeps/x86_64/fpu/multiarch/e_exp2f.c @@ -16,25 +16,27 @@ License along with the GNU C Library; if not, see . 
*/ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern float __redirect_exp2f (float); -#define SYMBOL_NAME exp2f -#include "ifunc-fma.h" +# define SYMBOL_NAME exp2f +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_exp2f, __exp2f, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED versioned_symbol (libm, __ieee754_exp2f, exp2f, GLIBC_2_27); libm_alias_float_other (__exp2, exp2) -#else +# else libm_alias_float (__exp2, exp2) -#endif +# endif strong_alias (__exp2f, __ieee754_exp2f) libm_alias_finite (__exp2f, __exp2f) -#define __exp2f __exp2f_sse2 +# define __exp2f __exp2f_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_expf.c b/sysdeps/x86_64/fpu/multiarch/e_expf.c index 4a7e2a5bce..e8d6f393ff 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_expf.c +++ b/sysdeps/x86_64/fpu/multiarch/e_expf.c @@ -16,28 +16,30 @@ License along with the GNU C Library; if not, see . */ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern float __redirect_expf (float); -#define SYMBOL_NAME expf -#include "ifunc-fma.h" +# define SYMBOL_NAME expf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_expf, __expf, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED __hidden_ver1 (__expf, __GI___expf, __redirect_expf) __attribute__ ((visibility ("hidden"))); versioned_symbol (libm, __ieee754_expf, expf, GLIBC_2_27); libm_alias_float_other (__exp, exp) -#else +# else libm_alias_float (__exp, exp) -#endif +# endif strong_alias (__expf, __ieee754_expf) libm_alias_finite (__expf, __expf) -#define __expf __expf_sse2 +# define __expf __expf_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_log.c b/sysdeps/x86_64/fpu/multiarch/e_log.c index 067fbf58c3..3a678235d9 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_log.c +++ b/sysdeps/x86_64/fpu/multiarch/e_log.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern double __redirect_ieee754_log (double); -#define SYMBOL_NAME ieee754_log -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME ieee754_log +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_ieee754_log, __ieee754_log, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_log, __log) -#define __log __ieee754_log_sse2 +# define __log __ieee754_log_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_log2.c b/sysdeps/x86_64/fpu/multiarch/e_log2.c index 9c57a2f6cc..c032758b4e 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_log2.c +++ b/sysdeps/x86_64/fpu/multiarch/e_log2.c @@ -16,28 +16,30 @@ License along with the GNU C Library; if not, see . */ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern double __redirect_log2 (double); -#define SYMBOL_NAME log2 -#include "ifunc-fma.h" +# define SYMBOL_NAME log2 +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_log2, __log2, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED __hidden_ver1 (__log2, __GI___log2, __redirect_log2) __attribute__ ((visibility ("hidden"))); versioned_symbol (libm, __ieee754_log2, log2, GLIBC_2_29); libm_alias_double_other (__log2, log2) -#else +# else libm_alias_double (__log2, log2) -#endif +# endif strong_alias (__log2, __ieee754_log2) libm_alias_finite (__log2, __log2) -#define __log2 __log2_sse2 +# define __log2 __log2_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_log2f.c b/sysdeps/x86_64/fpu/multiarch/e_log2f.c index 2b45c87f38..0f8d1f0abc 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_log2f.c +++ b/sysdeps/x86_64/fpu/multiarch/e_log2f.c @@ -16,28 +16,30 @@ License along with the GNU C Library; if not, see . 
*/ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern float __redirect_log2f (float); -#define SYMBOL_NAME log2f -#include "ifunc-fma.h" +# define SYMBOL_NAME log2f +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_log2f, __log2f, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED __hidden_ver1 (__log2f, __GI___log2f, __redirect_log2f) __attribute__ ((visibility ("hidden"))); versioned_symbol (libm, __ieee754_log2f, log2f, GLIBC_2_27); libm_alias_float_other (__log2, log2) -#else +# else libm_alias_float (__log2, log2) -#endif +# endif strong_alias (__log2f, __ieee754_log2f) libm_alias_finite (__log2f, __log2f) -#define __log2f __log2f_sse2 +# define __log2f __log2f_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_logf.c b/sysdeps/x86_64/fpu/multiarch/e_logf.c index 97e23c8fea..9d94dd614f 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_logf.c +++ b/sysdeps/x86_64/fpu/multiarch/e_logf.c @@ -16,28 +16,30 @@ License along with the GNU C Library; if not, see . 
*/ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern float __redirect_logf (float); -#define SYMBOL_NAME logf -#include "ifunc-fma.h" +# define SYMBOL_NAME logf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_logf, __logf, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED __hidden_ver1 (__logf, __GI___logf, __redirect_logf) __attribute__ ((visibility ("hidden"))); versioned_symbol (libm, __ieee754_logf, logf, GLIBC_2_27); libm_alias_float_other (__log, log) -#else +# else libm_alias_float (__log, log) -#endif +# endif strong_alias (__logf, __ieee754_logf) libm_alias_finite (__logf, __logf) -#define __logf __logf_sse2 +# define __logf __logf_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_pow.c b/sysdeps/x86_64/fpu/multiarch/e_pow.c index 42618e7112..07436d420c 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_pow.c +++ b/sysdeps/x86_64/fpu/multiarch/e_pow.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include extern double __redirect_ieee754_pow (double, double); -#define SYMBOL_NAME ieee754_pow -#include "ifunc-fma4.h" +# define SYMBOL_NAME ieee754_pow +# include "ifunc-fma4.h" libc_ifunc_redirected (__redirect_ieee754_pow, __ieee754_pow, IFUNC_SELECTOR ()); libm_alias_finite (__ieee754_pow, __pow) -#define __pow __ieee754_pow_sse2 +# define __pow __ieee754_pow_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/e_powf.c b/sysdeps/x86_64/fpu/multiarch/e_powf.c index 8e6ce13cc1..c64c8a4302 100644 --- a/sysdeps/x86_64/fpu/multiarch/e_powf.c +++ b/sysdeps/x86_64/fpu/multiarch/e_powf.c @@ -16,31 +16,33 @@ License along with the GNU C Library; if not, see . 
*/ -#include -#include +#ifndef HAVE_X86_AVX2_FMA +# include +# include -#define powf __redirect_powf -#define __DECL_SIMD___redirect_powf -#include -#undef powf +# define powf __redirect_powf +# define __DECL_SIMD___redirect_powf +# include +# undef powf -#define SYMBOL_NAME powf -#include "ifunc-fma.h" +# define SYMBOL_NAME powf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_powf, __powf, IFUNC_SELECTOR ()); -#ifdef SHARED +# ifdef SHARED __hidden_ver1 (__powf, __GI___powf, __redirect_powf) __attribute__ ((visibility ("hidden"))); versioned_symbol (libm, __ieee754_powf, powf, GLIBC_2_27); libm_alias_float_other (__pow, pow) -#else +# else libm_alias_float (__pow, pow) -#endif +# endif strong_alias (__powf, __ieee754_powf) libm_alias_finite (__powf, __powf) -#define __powf __powf_sse2 +# define __powf __powf_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_atan.c b/sysdeps/x86_64/fpu/multiarch/s_atan.c index 71bad096a9..f9ec4e7b37 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_atan.c +++ b/sysdeps/x86_64/fpu/multiarch/s_atan.c @@ -16,15 +16,17 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_atan (double); -#define SYMBOL_NAME atan -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME atan +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_atan, __atan, IFUNC_SELECTOR ()); libm_alias_double (__atan, atan) -#define __atan __atan_sse2 +# define __atan __atan_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S b/sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S new file mode 100644 index 0000000000..e6c1106753 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_ceil-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of ceil function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__ceil) + vroundsd $10, %xmm0, %xmm0, %xmm0 + ret +END(__ceil) + +libm_alias_double (__ceil, ceil) diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S index 64119011ad..4be069b8da 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_ceil-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __ceil_sse41 __ceil + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__ceil_sse41) roundsd $10, %xmm0, %xmm0 ret END(__ceil_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__ceil, ceil) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceil.c b/sysdeps/x86_64/fpu/multiarch/s_ceil.c index cc028addee..0199863c8f 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_ceil.c +++ b/sysdeps/x86_64/fpu/multiarch/s_ceil.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define ceil __redirect_ceil -#define __ceil __redirect___ceil -#include -#undef ceil -#undef __ceil +# define ceil __redirect_ceil +# define __ceil __redirect___ceil +# include +# undef ceil +# undef __ceil -#define SYMBOL_NAME ceil -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME ceil +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_ceil, __ceil, IFUNC_SELECTOR ()); libm_alias_double (__ceil, ceil) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S new file mode 100644 index 0000000000..b4d8ac0455 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_ceilf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of ceilf function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__ceilf) + vroundss $10, %xmm0, %xmm0, %xmm0 + ret +END(__ceilf) + +libm_alias_float (__ceil, ceil) diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S index dd9a9f6b71..1a85e9c925 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_ceilf-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __ceilf_sse41 __ceilf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__ceilf_sse41) roundss $10, %xmm0, %xmm0 ret END(__ceilf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__ceil, ceil) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_ceilf.c b/sysdeps/x86_64/fpu/multiarch/s_ceilf.c index 97a0ca7d19..dfce9225dd 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_ceilf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_ceilf.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define ceilf __redirect_ceilf -#define __ceilf __redirect___ceilf -#include -#undef ceilf -#undef __ceilf +# define ceilf __redirect_ceilf +# define __ceilf __redirect___ceilf +# include +# undef ceilf +# undef __ceilf -#define SYMBOL_NAME ceilf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME ceilf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_ceilf, __ceilf, IFUNC_SELECTOR ()); libm_alias_float (__ceil, ceil) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_cosf.c b/sysdeps/x86_64/fpu/multiarch/s_cosf.c index 2703c576df..9be9327b80 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_cosf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_cosf.c @@ -16,13 +16,17 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern float __redirect_cosf (float); -#define SYMBOL_NAME cosf -#include "ifunc-fma.h" +# define SYMBOL_NAME cosf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_cosf, __cosf, IFUNC_SELECTOR ()); libm_alias_float (__cos, cos) +#else +# include +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_expm1.c b/sysdeps/x86_64/fpu/multiarch/s_expm1.c index 8a2d69f9b2..1ed45245cb 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_expm1.c +++ b/sysdeps/x86_64/fpu/multiarch/s_expm1.c @@ -16,21 +16,23 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_expm1 (double); -#define SYMBOL_NAME expm1 -#include "ifunc-fma.h" +# define SYMBOL_NAME expm1 +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_expm1, __expm1, IFUNC_SELECTOR ()); libm_alias_double (__expm1, expm1) -#define __expm1 __expm1_sse2 +# define __expm1 __expm1_sse2 /* NB: __expm1 may be expanded to __expm1_sse2 in the following prototypes. */ extern long double __expm1l (long double); extern long double __expm1f128 (long double); +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_floor-avx.S b/sysdeps/x86_64/fpu/multiarch/s_floor-avx.S new file mode 100644 index 0000000000..ff74b5a8bf --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_floor-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of floor function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__floor) + vroundsd $9, %xmm0, %xmm0, %xmm0 + ret +END(__floor) + +libm_alias_double (__floor, floor) diff --git a/sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S index 2f7521f39f..957d018177 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_floor-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __floor_sse41 __floor + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__floor_sse41) roundsd $9, %xmm0, %xmm0 ret END(__floor_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__floor, floor) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_floor.c b/sysdeps/x86_64/fpu/multiarch/s_floor.c index 8cebd48e10..a30c88671e 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_floor.c +++ b/sysdeps/x86_64/fpu/multiarch/s_floor.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define floor __redirect_floor -#define __floor __redirect___floor -#include -#undef floor -#undef __floor +# define floor __redirect_floor +# define __floor __redirect___floor +# include +# undef floor +# undef __floor -#define SYMBOL_NAME floor -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME floor +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_floor, __floor, IFUNC_SELECTOR ()); libm_alias_double (__floor, floor) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S new file mode 100644 index 0000000000..c378baae8e --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_floorf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of floorf function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__floorf) + vroundss $9, %xmm0, %xmm0, %xmm0 + ret +END(__floorf) + +libm_alias_float (__floor, floor) diff --git a/sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S index 5f6020d27d..eacabe167c 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_floorf-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __floorf_sse41 __floorf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__floorf_sse41) roundss $9, %xmm0, %xmm0 ret END(__floorf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__floor, floor) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_floorf.c b/sysdeps/x86_64/fpu/multiarch/s_floorf.c index a14e18b03c..6531b78443 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_floorf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_floorf.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define floorf __redirect_floorf -#define __floorf __redirect___floorf -#include -#undef floorf -#undef __floorf +# define floorf __redirect_floorf +# define __floorf __redirect___floorf +# include +# undef floorf +# undef __floorf -#define SYMBOL_NAME floorf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME floorf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_floorf, __floorf, IFUNC_SELECTOR ()); libm_alias_float (__floor, floor) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_log1p.c b/sysdeps/x86_64/fpu/multiarch/s_log1p.c index a8e1a3f21b..76e1672e2d 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_log1p.c +++ b/sysdeps/x86_64/fpu/multiarch/s_log1p.c @@ -16,14 +16,16 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_log1p (double); -#define SYMBOL_NAME log1p -#include "ifunc-fma.h" +# define SYMBOL_NAME log1p +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_log1p, __log1p, IFUNC_SELECTOR ()); -#define __log1p __log1p_sse2 +# define __log1p __log1p_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S b/sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S new file mode 100644 index 0000000000..5bfdf73c28 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyint-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of nearbyint function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__nearbyint) + vroundsd $0xc, %xmm0, %xmm0, %xmm0 + ret +END(__nearbyint) + +libm_alias_double (__nearbyint, nearbyint) diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S index 674f7eb40a..ee0b17e470 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyint-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __nearbyint_sse41 __nearbyint + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__nearbyint_sse41) roundsd $0xc, %xmm0, %xmm0 ret END(__nearbyint_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__nearbyint, nearbyint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyint.c b/sysdeps/x86_64/fpu/multiarch/s_nearbyint.c index 693e42dd4e..649a9df869 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_nearbyint.c +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyint.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# include -#define nearbyint __redirect_nearbyint -#define __nearbyint __redirect___nearbyint -#include -#undef nearbyint -#undef __nearbyint +# define nearbyint __redirect_nearbyint +# define __nearbyint __redirect___nearbyint +# include +# undef nearbyint +# undef __nearbyint -#define SYMBOL_NAME nearbyint -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME nearbyint +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_nearbyint, __nearbyint, IFUNC_SELECTOR ()); libm_alias_double (__nearbyint, nearbyint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S new file mode 100644 index 0000000000..1dbaed0324 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of nearbyintf function. + Copyright (C) 2024 Free Software Foundation, Inc.
+ This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__nearbyintf) + vroundss $0xc, %xmm0, %xmm0, %xmm0 + ret +END(__nearbyintf) + +libm_alias_float (__nearbyint, nearbyint) diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S index 5892bd7563..8b3e307b78 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __nearbyintf_sse41 __nearbyintf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__nearbyintf_sse41) roundss $0xc, %xmm0, %xmm0 ret END(__nearbyintf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__nearbyint, nearbyint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c index a0ac009f4b..7762467ad9 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_nearbyintf.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# include -#define nearbyintf __redirect_nearbyintf -#define __nearbyintf __redirect___nearbyintf -#include -#undef nearbyintf -#undef __nearbyintf +# define nearbyintf __redirect_nearbyintf +# define __nearbyintf __redirect___nearbyintf +# include +# undef nearbyintf +# undef __nearbyintf -#define SYMBOL_NAME nearbyintf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME nearbyintf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_nearbyintf, __nearbyintf, IFUNC_SELECTOR ()); libm_alias_float (__nearbyint, nearbyint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_rint-avx.S b/sysdeps/x86_64/fpu/multiarch/s_rint-avx.S new file mode 100644 index 0000000000..2b403b331f --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_rint-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of rint function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__rint) + vroundsd $4, %xmm0, %xmm0, %xmm0 + ret +END(__rint) + +libm_alias_double (__rint, rint) diff --git a/sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S index 405372991b..4c7c1c37de 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_rint-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __rint_sse41 __rint + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__rint_sse41) roundsd $4, %xmm0, %xmm0 ret END(__rint_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__rint, rint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_rint.c b/sysdeps/x86_64/fpu/multiarch/s_rint.c index 754c87e004..49693c9728 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_rint.c +++ b/sysdeps/x86_64/fpu/multiarch/s_rint.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define rint __redirect_rint -#define __rint __redirect___rint -#include -#undef rint -#undef __rint +# define rint __redirect_rint +# define __rint __redirect___rint +# include +# undef rint +# undef __rint -#define SYMBOL_NAME rint -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME rint +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_rint, __rint, IFUNC_SELECTOR ()); libm_alias_double (__rint, rint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S new file mode 100644 index 0000000000..171c2867f4 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_rintf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of rintf function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__rintf) + vroundss $4, %xmm0, %xmm0, %xmm0 + ret +END(__rintf) + +libm_alias_float (__rint, rint) diff --git a/sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S index 8ac67ce767..55443d7238 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_rintf-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __rintf_sse41 __rintf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__rintf_sse41) roundss $4, %xmm0, %xmm0 ret END(__rintf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__rint, rint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_rintf.c b/sysdeps/x86_64/fpu/multiarch/s_rintf.c index e9d6b7a5f2..c7cf09701d 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_rintf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_rintf.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define rintf __redirect_rintf -#define __rintf __redirect___rintf -#include -#undef rintf -#undef __rintf +# define rintf __redirect_rintf +# define __rintf __redirect___rintf +# include +# undef rintf +# undef __rintf -#define SYMBOL_NAME rintf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME rintf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_rintf, __rintf, IFUNC_SELECTOR ()); libm_alias_float (__rint, rint) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S b/sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S new file mode 100644 index 0000000000..576790355c --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of roundeven function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__roundeven) + vroundsd $8, %xmm0, %xmm0, %xmm0 + ret +END(__roundeven) + +libm_alias_double (__roundeven, roundeven) diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S index 5ef102336b..f0644cce81 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __roundeven_sse41 __roundeven + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__roundeven_sse41) roundsd $8, %xmm0, %xmm0 ret END(__roundeven_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__roundeven, roundeven) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundeven.c b/sysdeps/x86_64/fpu/multiarch/s_roundeven.c index 8737b32e26..a250297918 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_roundeven.c +++ b/sysdeps/x86_64/fpu/multiarch/s_roundeven.c @@ -16,16 +16,18 @@ License along with the GNU C Library; if not, see . */ -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# include -#define roundeven __redirect_roundeven -#define __roundeven __redirect___roundeven -#include -#undef roundeven -#undef __roundeven +# define roundeven __redirect_roundeven +# define __roundeven __redirect___roundeven +# include +# undef roundeven +# undef __roundeven -#define SYMBOL_NAME roundeven -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME roundeven +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_roundeven, __roundeven, IFUNC_SELECTOR ()); libm_alias_double (__roundeven, roundeven) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S new file mode 100644 index 0000000000..42c359f4cd --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of roundevenf function. + Copyright (C) 2024 Free Software Foundation, Inc. 
+ This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__roundevenf) + vroundss $8, %xmm0, %xmm0, %xmm0 + ret +END(__roundevenf) + +libm_alias_float (__roundeven, roundeven) diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S index 792c90ba07..d1dd6b0e8b 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf-sse4_1.S @@ -17,8 +17,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __roundevenf_sse41 __roundevenf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__roundevenf_sse41) roundss $8, %xmm0, %xmm0 ret END(__roundevenf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__roundeven, roundeven) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c b/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c index e96016a4d5..534941e67f 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_roundevenf.c @@ -16,16 +16,18 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# include -#define roundevenf __redirect_roundevenf -#define __roundevenf __redirect___roundevenf -#include -#undef roundevenf -#undef __roundevenf +# define roundevenf __redirect_roundevenf +# define __roundevenf __redirect___roundevenf +# include +# undef roundevenf +# undef __roundevenf -#define SYMBOL_NAME roundevenf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME roundevenf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_roundevenf, __roundevenf, IFUNC_SELECTOR ()); libm_alias_float (__roundeven, roundeven) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_sin.c b/sysdeps/x86_64/fpu/multiarch/s_sin.c index 355cc0092e..21eaa5e984 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_sin.c +++ b/sysdeps/x86_64/fpu/multiarch/s_sin.c @@ -16,24 +16,26 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_sin (double); extern double __redirect_cos (double); -#define SYMBOL_NAME sin -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME sin +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_sin, __sin, IFUNC_SELECTOR ()); libm_alias_double (__sin, sin) -#undef SYMBOL_NAME -#define SYMBOL_NAME cos -#include "ifunc-avx-fma4.h" +# undef SYMBOL_NAME +# define SYMBOL_NAME cos +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_cos, __cos, IFUNC_SELECTOR ()); libm_alias_double (__cos, cos) -#define __cos __cos_sse2 -#define __sin __sin_sse2 +# define __cos __cos_sse2 +# define __sin __sin_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincos.c b/sysdeps/x86_64/fpu/multiarch/s_sincos.c index 70107e999c..729163cdde 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_sincos.c +++ b/sysdeps/x86_64/fpu/multiarch/s_sincos.c @@ -16,15 +16,17 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern void __redirect_sincos (double, double *, double *); -#define SYMBOL_NAME sincos -#include "ifunc-fma4.h" +# define SYMBOL_NAME sincos +# include "ifunc-fma4.h" libc_ifunc_redirected (__redirect_sincos, __sincos, IFUNC_SELECTOR ()); libm_alias_double (__sincos, sincos) -#define __sincos __sincos_sse2 +# define __sincos __sincos_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_sincosf.c b/sysdeps/x86_64/fpu/multiarch/s_sincosf.c index 80bc028451..136dd62c81 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_sincosf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_sincosf.c @@ -16,13 +16,17 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern void __redirect_sincosf (float, float *, float *); -#define SYMBOL_NAME sincosf -#include "ifunc-fma.h" +# define SYMBOL_NAME sincosf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_sincosf, __sincosf, IFUNC_SELECTOR ()); libm_alias_float (__sincos, sincos) +#else +# include +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_sinf.c b/sysdeps/x86_64/fpu/multiarch/s_sinf.c index a32b9e9550..fabbf55604 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_sinf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_sinf.c @@ -16,13 +16,17 @@ License along with the GNU C Library; if not, see . */ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern float __redirect_sinf (float); -#define SYMBOL_NAME sinf -#include "ifunc-fma.h" +# define SYMBOL_NAME sinf +# include "ifunc-fma.h" libc_ifunc_redirected (__redirect_sinf, __sinf, IFUNC_SELECTOR ()); libm_alias_float (__sin, sin) +#else +# include +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_tan.c b/sysdeps/x86_64/fpu/multiarch/s_tan.c index f9a2474a13..c85e327ff8 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_tan.c +++ b/sysdeps/x86_64/fpu/multiarch/s_tan.c @@ -16,15 +16,17 @@ License along with the GNU C Library; if not, see . 
*/ -#include +#ifndef HAVE_X86_AVX2_FMA +# include extern double __redirect_tan (double); -#define SYMBOL_NAME tan -#include "ifunc-avx-fma4.h" +# define SYMBOL_NAME tan +# include "ifunc-avx-fma4.h" libc_ifunc_redirected (__redirect_tan, __tan, IFUNC_SELECTOR ()); libm_alias_double (__tan, tan) -#define __tan __tan_sse2 +# define __tan __tan_sse2 +#endif #include diff --git a/sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S b/sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S new file mode 100644 index 0000000000..b3e87e9606 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_trunc-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of trunc function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include + + .text +ENTRY(__trunc) + vroundsd $11, %xmm0, %xmm0, %xmm0 + ret +END(__trunc) + +libm_alias_double (__trunc, trunc) diff --git a/sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S index b496a6ef49..062cd1fb36 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S @@ -18,8 +18,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __trunc_sse41 __trunc + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__trunc_sse41) roundsd $11, %xmm0, %xmm0 ret END(__trunc_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_double (__trunc, trunc) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_trunc.c b/sysdeps/x86_64/fpu/multiarch/s_trunc.c index 9bc9df8744..568e818826 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_trunc.c +++ b/sysdeps/x86_64/fpu/multiarch/s_trunc.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . */ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define trunc __redirect_trunc -#define __trunc __redirect___trunc -#include -#undef trunc -#undef __trunc +# define trunc __redirect_trunc +# define __trunc __redirect___trunc +# include +# undef trunc +# undef __trunc -#define SYMBOL_NAME trunc -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME trunc +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_trunc, __trunc, IFUNC_SELECTOR ()); libm_alias_double (__trunc, trunc) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S b/sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S new file mode 100644 index 0000000000..f31ac7d7f7 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/s_truncf-avx.S @@ -0,0 +1,28 @@ +/* AVX implementation of truncf function. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include + + .text +ENTRY(__truncf) + vroundss $11, %xmm0, %xmm0, %xmm0 + ret +END(__truncf) + +libm_alias_float (__trunc, trunc) diff --git a/sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S b/sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S index 22e9a83307..ecd0ae5c05 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S +++ b/sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S @@ -18,8 +18,19 @@ #include +#ifdef HAVE_X86_SSE4_1 +# include +# define __truncf_sse41 __truncf + .text +#else .section .text.sse4.1,"ax",@progbits +#endif + ENTRY(__truncf_sse41) roundss $11, %xmm0, %xmm0 ret END(__truncf_sse41) + +#ifdef HAVE_X86_SSE4_1 +libm_alias_float (__trunc, trunc) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/s_truncf.c b/sysdeps/x86_64/fpu/multiarch/s_truncf.c index dae01d166a..57783c805a 100644 --- a/sysdeps/x86_64/fpu/multiarch/s_truncf.c +++ b/sysdeps/x86_64/fpu/multiarch/s_truncf.c @@ -16,17 +16,19 @@ License along with the GNU C Library; if not, see . 
*/ -#define NO_MATH_REDIRECT -#include +#if !defined HAVE_X86_SSE4_1 && !defined HAVE_X86_AVX2_FMA +# define NO_MATH_REDIRECT +# include -#define truncf __redirect_truncf -#define __truncf __redirect___truncf -#include -#undef truncf -#undef __truncf +# define truncf __redirect_truncf +# define __truncf __redirect___truncf +# include +# undef truncf +# undef __truncf -#define SYMBOL_NAME truncf -#include "ifunc-sse4_1.h" +# define SYMBOL_NAME truncf +# include "ifunc-sse4_1.h" libc_ifunc_redirected (__redirect_truncf, __truncf, IFUNC_SELECTOR ()); libm_alias_float (__trunc, trunc) +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/w_exp.c b/sysdeps/x86_64/fpu/multiarch/w_exp.c index 27eee98a0a..fb2045e6cf 100644 --- a/sysdeps/x86_64/fpu/multiarch/w_exp.c +++ b/sysdeps/x86_64/fpu/multiarch/w_exp.c @@ -1 +1,5 @@ -#include +#ifdef HAVE_X86_AVX2_FMA +# include +#else +# include +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/w_log.c b/sysdeps/x86_64/fpu/multiarch/w_log.c index 9b2b018711..b85be8221e 100644 --- a/sysdeps/x86_64/fpu/multiarch/w_log.c +++ b/sysdeps/x86_64/fpu/multiarch/w_log.c @@ -1 +1,5 @@ -#include +#ifdef HAVE_X86_AVX2_FMA +# include +#else +# include +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/w_pow.c b/sysdeps/x86_64/fpu/multiarch/w_pow.c index b50c1988de..849f4f97ff 100644 --- a/sysdeps/x86_64/fpu/multiarch/w_pow.c +++ b/sysdeps/x86_64/fpu/multiarch/w_pow.c @@ -1 +1,5 @@ -#include +#ifdef HAVE_X86_AVX2_FMA +# include +#else +# include +#endif