From patchwork Wed Mar 27 18:31:14 2024
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 1916972
Message-ID: <04fee3c0-b67c-4148-8cb9-cb0bd6170759@arm.com>
Date: Wed, 27 Mar 2024 18:31:14 +0000
Subject: [PATCHv2 2/2] aarch64: Add support for _BitInt
From: "Andre Vieira (lists)"
To: gcc-patches@gcc.gnu.org
Cc: Richard.Sandiford@arm.com, Jakub@redhat.com, kyrylo.tkachov@arm.com
References: <20240125174501.32634-1-andre.simoesdiasvieira@arm.com>
List-Id: Gcc-patches mailing list

This patch adds support for C23's
_BitInt for the AArch64 port when compiling for little-endian targets.
Big-endian support requires further target-agnostic work, so it is disabled
for now.

The tests expose some suboptimal codegen for which I'll create PRs for
optimizations after this goes in.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (TARGET_C_BITINT_TYPE_INFO): Declare MACRO.
	(aarch64_bitint_type_info): New function.
	(aarch64_return_in_memory_1): Return large _BitInts in memory.
	(aarch64_function_arg_alignment): Adapt to correctly return the ABI
	mandated alignment of _BitInt(N) where N > 128 as the alignment of
	TImode.
	(aarch64_composite_type_p): Return true for _BitInt(N), where N > 128.

libgcc/ChangeLog:

	* config/aarch64/t-softfp (softfp_extras): Add floatbitinthf,
	floatbitintbf, floatbitinttf and fixtfbitint.
	* config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Add __floatbitinthf,
	__floatbitintbf, __floatbitinttf and __fixtfbitint.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/bitint-alignments.c: New test.
	* gcc.target/aarch64/bitint-args.c: New test.
	* gcc.target/aarch64/bitint-sizes.c: New test.
	* gcc.target/aarch64/bitfield-bitint-abi.h: New header.
	* gcc.target/aarch64/bitfield-bitint-abi-align16.c: New test.
	* gcc.target/aarch64/bitfield-bitint-abi-align8.c: New test.

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index b68cf3e7cb9a6fa89b4e5826a39ffa11f64ca20a..5fe55c6e980bc1ea66df0e4357932123cd049366 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -6583,6 +6583,7 @@ aarch64_return_in_memory_1 (const_tree type)
   int count;
 
   if (!AGGREGATE_TYPE_P (type)
+      && TREE_CODE (type) != BITINT_TYPE
       && TREE_CODE (type) != COMPLEX_TYPE
       && TREE_CODE (type) != VECTOR_TYPE)
     /* Simple scalar types always returned in registers.
  */
@@ -21991,6 +21992,11 @@ aarch64_composite_type_p (const_tree type,
   if (type && (AGGREGATE_TYPE_P (type) || TREE_CODE (type) == COMPLEX_TYPE))
     return true;
 
+  if (type
+      && TREE_CODE (type) == BITINT_TYPE
+      && int_size_in_bytes (type) > 16)
+    return true;
+
   if (mode == BLKmode
       || GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
       || GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
@@ -28472,6 +28478,42 @@ aarch64_excess_precision (enum excess_precision_type type)
   return FLT_EVAL_METHOD_UNPREDICTABLE;
 }
 
+/* Implement TARGET_C_BITINT_TYPE_INFO.
+   Return true if _BitInt(N) is supported and fill its details into *INFO.  */
+bool
+aarch64_bitint_type_info (int n, struct bitint_info *info)
+{
+  if (TARGET_BIG_END)
+    return false;
+
+  if (n <= 8)
+    info->limb_mode = QImode;
+  else if (n <= 16)
+    info->limb_mode = HImode;
+  else if (n <= 32)
+    info->limb_mode = SImode;
+  else if (n <= 64)
+    info->limb_mode = DImode;
+  else if (n <= 128)
+    info->limb_mode = TImode;
+  else
+    /* The AAPCS for AArch64 defines _BitInt(N > 128) as an array with
+       type {signed,unsigned} __int128[M] where M*128 >= N.  However, to be
+       able to use libgcc's implementation to support large _BitInts we need
+       to use a LIMB_MODE that is no larger than 'long long'.  This is why we
+       use DImode for our internal LIMB_MODE and we define the ABI_LIMB_MODE
+       to be TImode to ensure we are ABI compliant.  */
+    info->limb_mode = DImode;
+
+  if (n > 128)
+    info->abi_limb_mode = TImode;
+  else
+    info->abi_limb_mode = info->limb_mode;
+  info->big_endian = TARGET_BIG_END;
+  info->extended = false;
+  return true;
+}
+
 /* Implement TARGET_SCHED_CAN_SPECULATE_INSN.
    Return true if INSN can be scheduled for speculative execution.  Reject
    the long-running division and square-root instructions.
*/ @@ -30596,6 +30638,9 @@ aarch64_run_selftests (void) #undef TARGET_C_EXCESS_PRECISION #define TARGET_C_EXCESS_PRECISION aarch64_excess_precision +#undef TARGET_C_BITINT_TYPE_INFO +#define TARGET_C_BITINT_TYPE_INFO aarch64_bitint_type_info + #undef TARGET_EXPAND_BUILTIN #define TARGET_EXPAND_BUILTIN aarch64_expand_builtin diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c new file mode 100644 index 0000000000000000000000000000000000000000..048d04e4c1bf90215892aa0173f22226246a097d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align16.c @@ -0,0 +1,378 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-protector -save-temps -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#define ALIGN 16 +#include "bitfield-bitint-abi.h" + +// f1-f16 are all the same + +/* +** f1: +** and x0, x2, 1 +** ret +*/ +/* +** f8: +** and x0, x2, 1 +** ret +*/ +/* +** f16: +** and x0, x2, 1 +** ret +*/ + +/* fp seems to be unable to optimize away stack-usage, TODO: to fix. */ + +/* +** fp: +**... +** and x0, x1, 1 +**... 
+** ret +*/ + +// all other f1p-f8p generate the same code, for f16p the value comes from x2 +/* +** f1p: +** and x0, x1, 1 +** ret +*/ +/* +** f8p: +** and x0, x1, 1 +** ret +*/ +/* +** f16p: +** and x0, x2, 1 +** ret +*/ + +// g1-g16 are all the same +/* +** g1: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x4, \1, 9223372036854775807 +** and x2, \1, 1 +** mov x3, 0 +** b f1 +*/ + +/* +** g8: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x4, \1, 9223372036854775807 +** and x2, \1, 1 +** mov x3, 0 +** b f8 +*/ +/* +** g16: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x4, \1, 9223372036854775807 +** and x2, \1, 1 +** mov x3, 0 +** b f16 +*/ + +// again gp different from the rest + +/* +** gp: +** sub sp, sp, #16 +** mov (x[0-9]+), x0 +** mov w0, w1 +** sbfx x([0-9]+), \1, 0, 63 +** mov (w[0-9]+), 0 +** bfi \3, w\2, 0, 1 +** and x3, x\2, 9223372036854775807 +** mov x2, 0 +** str xzr, \[sp\] +** strb \3, \[sp\] +** ldr x1, \[sp\] +** add sp, sp, 16 +** b fp +*/ + +// g1p-g8p are all the same, g16p uses x2 to pass parameter to f16p + +/* +** g1p: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f1p +*/ +/* +** g8p: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f8p +*/ +/* +** g16p: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x4, \1, 9223372036854775807 +** and x2, \1, 1 +** mov x3, 0 +** b f16p +*/ + +// f*_stack are all the same +/* +** f1_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ +/* +** f8_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ +/* +** f16_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ + +// fp{,1,8}_stack are all the same but fp16_stack loads from sp+16 +/* +** fp_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f1p_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f8p_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 
1 +** ret +*/ + +/* +** f16p_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ + +/* +** gp_stack: +**... +** mov x([0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (w[0-9]+), 0 +** bfi \4, w\1, 0, 1 +** strb wzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** str xzr, \[sp, 48\] +** strb \4, \[sp, 48\] +** ldr (x[0-9]+), \[sp, 48\] +** stp \3, \5, \[sp\] +** mov x1, \3 +** bl fp_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** g1_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp \4, xzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str x7, \[sp\] +** bl f1_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + + +/* +** g8_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp \4, xzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str x7, \[sp\] +** bl f8_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** g16_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp \4, xzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str x7, \[sp\] +** bl f16_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** f1_stdarg: +**... +** and x0, x2, 1 +**... +** ret +*/ +/* +** f16_stdarg: +**... +** and x0, x2, 1 +**... +** ret +*/ + +/* +** fp_stdarg: +**... +** and x0, x1, 1 +**... +** ret +*/ + +/* +** f1p_stdarg: +**... +** and x0, x1, 1 +**... 
+** ret +*/ +/* +** f8p_stdarg: +**... +** and x0, x1, 1 +**... +** ret +*/ +/* +** f16p_stdarg: +**... +** and x0, x2, 1 +**... +** ret +*/ + +/* +** g1_stdarg: +** and x2, x0, 1 +** mov x3, 0 +** mov w0, w1 +** b f1_stdarg +*/ + +/* +** g16_stdarg: +** and x2, x0, 1 +** mov x3, 0 +** mov w0, w1 +** b f16_stdarg +*/ + +/* +** gp_stdarg: +**... +** mov x([0-9]+), x0 +** mov w0, w1 +** mov (w[0-9]+), 0 +** bfi \2, w\1, 0, 1 +** mov x2, 0 +** str xzr, \[sp\] +** strb \2, \[sp\] +** ldr x1, \[sp\] +**... +** b fp_stdarg +*/ + +/* +** g1p_stdarg: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x1, \1, 1 +** mov x2, 0 +** b f1p_stdarg +*/ + +/* +** g8p_stdarg: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x1, \1, 1 +** mov x2, 0 +** b f8p_stdarg +*/ + +/* +** g16p_stdarg: +** and x2, x0, 1 +** mov x3, 0 +** mov w0, w1 +** b f16p_stdarg +*/ diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align8.c b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align8.c new file mode 100644 index 0000000000000000000000000000000000000000..11f0580fd60c3d619126c5b41d646e22374c3593 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi-align8.c @@ -0,0 +1,380 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-stack-protector -save-temps -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#define ALIGN 8 +#include "bitfield-bitint-abi.h" + +// f1-f16 are all the same + +/* +** f1: +** and x0, x1, 1 +** ret +*/ +/* +** f8: +** and x0, x1, 1 +** ret +*/ +/* +** f16: +** and x0, x2, 1 +** ret +*/ + +/* fp seems to be unable to optimize away stack-usage, TODO: to fix. */ + +/* +** fp: +**... +** and x0, x1, 1 +**... 
+** ret +*/ + +// all other f1p-f8p generate the same code, for f16p the value comes from x2 +/* +** f1p: +** and x0, x1, 1 +** ret +*/ +/* +** f8p: +** and x0, x1, 1 +** ret +*/ +/* +** f16p: +** and x0, x2, 1 +** ret +*/ + +// g1-g16 are all the same +/* +** g1: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f1 +*/ + +/* +** g8: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f8 +*/ +/* +** g16: +** mov x2, x0 +** mov w0, w1 +** and x4, x2, 9223372036854775807 +** and x2, x2, 1 +** mov x3, 0 +** b f16 +*/ + +// again gp different from the rest + +/* +** gp: +** sub sp, sp, #16 +** mov (x[0-9]+), x0 +** mov w0, w1 +** sbfx x([0-9]+), \1, 0, 63 +** mov w1, 0 +** bfi w1, w\2, 0, 1 +** and x3, x\2, 9223372036854775807 +** mov x2, 0 +** str xzr, \[sp\] +** strb w1, \[sp\] +** ldr x1, \[sp\] +** add sp, sp, 16 +** b fp +*/ + +// g1p-g8p are all the same, g16p uses x2 to pass parameter to f16p + +/* +** g1p: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f1p +*/ +/* +** g8p: +** mov (w[0-9]+), w1 +** and x3, x0, 9223372036854775807 +** and x1, x0, 1 +** mov x2, 0 +** mov w0, \1 +** b f8p +*/ +/* +** g16p: +** mov (x[0-9]+), x0 +** mov w0, w1 +** and x4, \1, 9223372036854775807 +** and x2, \1, 1 +** mov x3, 0 +** b f16p +*/ + +// f*_stack are all the same +/* +** f1_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f8_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f16_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ + +// fp{,1,8}_stack are all the same but fp16_stack loads from sp+16 +/* +** fp_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f1p_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ +/* +** f8p_stack: +** ldr (x[0-9]+), \[sp, 8\] +** and x0, \1, 1 +** ret +*/ + 
+/* +** f16p_stack: +** ldr (x[0-9]+), \[sp, 16\] +** and x0, \1, 1 +** ret +*/ + +/* +** gp_stack: +**... +** mov x([0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (w[0-9]+), 0 +** bfi \4, w\1, 0, 1 +** strb wzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** str xzr, \[sp, 48\] +** strb \4, \[sp, 48\] +** ldr (x[0-9]+), \[sp, 48\] +** stp \3, \5, \[sp\] +** mov x1, \3 +** bl fp_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* g1 and g8 are the same. */ + +/* +** g1_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp x7, x1, \[sp\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str xzr, \[sp, 16\] +** bl f1_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** g8_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp x7, x1, \[sp\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str xzr, \[sp, 16\] +** bl f8_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** g16_stack: +**... +** mov (x[0-9]+), x0 +** sxtw (x[0-9]+), w1 +** mov x0, \2 +** and (x[0-9]+), \2, 9223372036854775807 +** mov (x[0-9]+), 0 +** sbfx (x[0-9]+), \1, 0, 63 +** bfi \4, \5, 0, 1 +** stp \4, xzr, \[sp, 16\] +** mov x6, \3 +** mov x5, \3 +** mov x4, \3 +** mov x3, \3 +** mov x2, \3 +** mov x1, \3 +** str x7, \[sp\] +** bl f16_stack +** sbfx x0, x0, 0, 63 +**... +** ret +*/ + +/* +** f1_stdarg: +**... +** and x0, x1, 1 +**... +** ret +*/ +/* +** f16_stdarg: +**... +** and x0, x2, 1 +**... +** ret +*/ + +/* +** fp_stdarg: +**... +** and x0, x1, 1 +**... +** ret +*/ + +/* +** f1p_stdarg: +**... 
+** and x0, x1, 1
+**...
+** ret
+*/
+/*
+** f8p_stdarg:
+**...
+** and x0, x1, 1
+**...
+** ret
+*/
+/*
+** f16p_stdarg:
+**...
+** and x0, x2, 1
+**...
+** ret
+*/
+
+/*
+** g1_stdarg:
+** mov (x[0-9]+), x0
+** mov w0, w1
+** and x1, \1, 1
+** mov x2, 0
+** b f1_stdarg
+*/
+
+/*
+** g16_stdarg:
+** and x2, x0, 1
+** mov x3, 0
+** mov w0, w1
+** b f16_stdarg
+*/
+
+/*
+** gp_stdarg:
+**...
+** mov x([0-9]+), x0
+** mov w0, w1
+** mov (w[0-9]+), 0
+** bfi \2, w\1, 0, 1
+** mov x2, 0
+** str xzr, \[sp\]
+** strb \2, \[sp\]
+** ldr x1, \[sp\]
+**...
+** b fp_stdarg
+*/
+
+/*
+** g1p_stdarg:
+** mov (x[0-9]+), x0
+** mov w0, w1
+** and x1, \1, 1
+** mov x2, 0
+** b f1p_stdarg
+*/
+
+/*
+** g8p_stdarg:
+** mov (x[0-9]+), x0
+** mov w0, w1
+** and x1, \1, 1
+** mov x2, 0
+** b f8p_stdarg
+*/
+
+/*
+** g16p_stdarg:
+** and x2, x0, 1
+** mov x3, 0
+** mov w0, w1
+** b f16p_stdarg
+*/
diff --git a/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi.h b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi.h
new file mode 100644
index 0000000000000000000000000000000000000000..b02182f4b3b15c574b9d53dfd6788492909207e9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bitfield-bitint-abi.h
@@ -0,0 +1,101 @@
+#include <stdarg.h>
+
+typedef unsigned _BitInt(63) BI __attribute__((aligned(ALIGN)));
+
+#ifndef EXTRA
+#define EXTRA unsigned long long x;
+#endif
+
+struct S1 { __attribute__((aligned(1))) BI i : 1; EXTRA };
+struct S8 { __attribute__((aligned(8))) BI i : 1; EXTRA };
+struct S16 { __attribute__((aligned(16))) BI i : 1; EXTRA };
+
+struct Sp { BI i : 1; EXTRA }__attribute__((packed));
+struct S1p { __attribute__((packed, aligned(1))) BI i : 1; EXTRA };
+struct S8p { __attribute__((packed, aligned(8))) BI i : 1; EXTRA };
+struct S16p { __attribute__((packed, aligned(16))) BI i : 1; EXTRA };
+
+/* Bitfield in registers. */
+#define PARAMS(xx) int a0, struct S##xx s, BI a1
+/* Bitfield passed by the stack.
 */
+#define PARAMS_STACK(xx) int a0, BI a1, BI a2, BI a3, BI a4, BI a5, BI a6, BI a7, BI a8, struct S##xx t
+/* Bitfield passed via stdarg. */
+#define PARAMS_STDARG(xx) int a0, ...
+
+#define CODE(xx) \
+  return s.i;
+
+#define CODE_STACK(xx) \
+  return t.i;
+
+#define CODE_STDARG(xx) \
+  va_list ap; \
+  struct S##xx arg; \
+  __builtin_va_start(ap,a0); \
+  arg = __builtin_va_arg(ap, struct S##xx); \
+  return arg.i;
+
+#define ARGS(xx) y, (struct S##xx) { x }, x
+#define ARGS_STACK(xx) y, y, y, y, y, y, y, y, y, (struct S##xx) { x }
+#define ARGS_STDARG(xx) y, (struct S##xx) { x }
+
+/* Bitfield in registers. */
+_BitInt(63) __attribute__ ((noipa)) f1 (PARAMS(1)) { CODE(1) }
+_BitInt(63) __attribute__ ((noipa)) f8 (PARAMS(8)) { CODE(8) }
+_BitInt(63) __attribute__ ((noipa)) f16(PARAMS(16)) { CODE(16) }
+
+_BitInt(63) __attribute__ ((noipa)) fp (PARAMS(p)) { CODE(p) }
+_BitInt(63) __attribute__ ((noipa)) f1p (PARAMS(1p)) { CODE(1p) }
+_BitInt(63) __attribute__ ((noipa)) f8p (PARAMS(8p)) { CODE(8p) }
+_BitInt(63) __attribute__ ((noipa)) f16p(PARAMS(16p)) { CODE(16p) }
+
+_BitInt(63) g1 (_BitInt(63) x, int y) { return f1 (ARGS(1)); }
+_BitInt(63) g8 (_BitInt(63) x, int y) { return f8 (ARGS(8)); }
+_BitInt(63) g16(_BitInt(63) x, int y) { return f16 (ARGS(16)); }
+
+_BitInt(63) gp (_BitInt(63) x, int y) { return fp (ARGS(p)); }
+_BitInt(63) g1p (_BitInt(63) x, int y) { return f1p (ARGS(1p)); }
+_BitInt(63) g8p (_BitInt(63) x, int y) { return f8p (ARGS(8p)); }
+_BitInt(63) g16p(_BitInt(63) x, int y) { return f16p (ARGS(16p)); }
+
+/* Bitfield in the stack.
 */
+_BitInt(63) __attribute__ ((noipa)) f1_stack (PARAMS_STACK(1)) { CODE_STACK(1) }
+_BitInt(63) __attribute__ ((noipa)) f8_stack (PARAMS_STACK(8)) { CODE_STACK(8) }
+_BitInt(63) __attribute__ ((noipa)) f16_stack(PARAMS_STACK(16)) { CODE_STACK(16) }
+
+_BitInt(63) __attribute__ ((noipa)) fp_stack (PARAMS_STACK(p)) { CODE_STACK(p) }
+_BitInt(63) __attribute__ ((noipa)) f1p_stack (PARAMS_STACK(1p)) { CODE_STACK(1p) }
+_BitInt(63) __attribute__ ((noipa)) f8p_stack (PARAMS_STACK(8p)) { CODE_STACK(8p) }
+_BitInt(63) __attribute__ ((noipa)) f16p_stack(PARAMS_STACK(16p)) { CODE_STACK(16p) }
+
+
+_BitInt(63) g1_stack (_BitInt(63) x, int y) { return f1_stack (ARGS_STACK(1)); }
+_BitInt(63) g8_stack (_BitInt(63) x, int y) { return f8_stack (ARGS_STACK(8)); }
+_BitInt(63) g16_stack(_BitInt(63) x, int y) { return f16_stack (ARGS_STACK(16)); }
+
+_BitInt(63) gp_stack (_BitInt(63) x, int y) { return fp_stack (ARGS_STACK(p)); }
+_BitInt(63) g1p_stack (_BitInt(63) x, int y) { return f1p_stack (ARGS_STACK(1p)); }
+_BitInt(63) g8p_stack (_BitInt(63) x, int y) { return f8p_stack (ARGS_STACK(8p)); }
+_BitInt(63) g16p_stack(_BitInt(63) x, int y) { return f16p_stack (ARGS_STACK(16p)); }
+
+
+/* Bitfield via stdarg.
 */
+_BitInt(63) __attribute__ ((noipa)) f1_stdarg (PARAMS_STDARG(1)) { CODE_STDARG(1) }
+_BitInt(63) __attribute__ ((noipa)) f8_stdarg (PARAMS_STDARG(8)) { CODE_STDARG(8) }
+_BitInt(63) __attribute__ ((noipa)) f16_stdarg(PARAMS_STDARG(16)) { CODE_STDARG(16) }
+
+_BitInt(63) __attribute__ ((noipa)) fp_stdarg (PARAMS_STDARG(p)) { CODE_STDARG(p) }
+_BitInt(63) __attribute__ ((noipa)) f1p_stdarg (PARAMS_STDARG(1p)) { CODE_STDARG(1p) }
+_BitInt(63) __attribute__ ((noipa)) f8p_stdarg (PARAMS_STDARG(8p)) { CODE_STDARG(8p) }
+_BitInt(63) __attribute__ ((noipa)) f16p_stdarg(PARAMS_STDARG(16p)) { CODE_STDARG(16p) }
+
+_BitInt(63) g1_stdarg (_BitInt(63) x, int y) { return f1_stdarg (ARGS_STDARG(1)); }
+_BitInt(63) g8_stdarg (_BitInt(63) x, int y) { return f8_stdarg (ARGS_STDARG(8)); }
+_BitInt(63) g16_stdarg(_BitInt(63) x, int y) { return f16_stdarg (ARGS_STDARG(16)); }
+
+_BitInt(63) gp_stdarg (_BitInt(63) x, int y) { return fp_stdarg (ARGS_STDARG(p)); }
+_BitInt(63) g1p_stdarg (_BitInt(63) x, int y) { return f1p_stdarg (ARGS_STDARG(1p)); }
+_BitInt(63) g8p_stdarg (_BitInt(63) x, int y) { return f8p_stdarg (ARGS_STDARG(8p)); }
+_BitInt(63) g16p_stdarg(_BitInt(63) x, int y) { return f16p_stdarg (ARGS_STDARG(16p)); }
+
+
diff --git a/gcc/testsuite/gcc.target/aarch64/bitint-alignments.c b/gcc/testsuite/gcc.target/aarch64/bitint-alignments.c
new file mode 100644
index 0000000000000000000000000000000000000000..4de31fe7ebd933247911c48ace01ab520fe194a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bitint-alignments.c
@@ -0,0 +1,58 @@
+/* { dg-do run } */
+/* { dg-options "-std=c23" } */
+
+static long unsigned int
+calc_size (int n)
+{
+  if (n > 64)
+    return alignof(__int128_t);
+  if (n > 32)
+    return alignof(long long);
+  if (n > 16)
+    return alignof(int);
+  if (n > 8)
+    return alignof(short);
+  else
+    return alignof(char);
+}
+
+#define CHECK_ALIGNMENT(N) \
+  if (alignof(_BitInt(N)) != calc_size(N)) \
+    __builtin_abort ();
+
+int main (void)
+{
+  CHECK_ALIGNMENT(2);
+
CHECK_ALIGNMENT(3); + CHECK_ALIGNMENT(7); + CHECK_ALIGNMENT(8); + CHECK_ALIGNMENT(9); + CHECK_ALIGNMENT(13); + CHECK_ALIGNMENT(15); + CHECK_ALIGNMENT(16); + CHECK_ALIGNMENT(17); + CHECK_ALIGNMENT(24); + CHECK_ALIGNMENT(31); + CHECK_ALIGNMENT(32); + CHECK_ALIGNMENT(33); + CHECK_ALIGNMENT(42); + CHECK_ALIGNMENT(53); + CHECK_ALIGNMENT(63); + CHECK_ALIGNMENT(64); + CHECK_ALIGNMENT(65); + CHECK_ALIGNMENT(79); + CHECK_ALIGNMENT(96); + CHECK_ALIGNMENT(113); + CHECK_ALIGNMENT(127); + CHECK_ALIGNMENT(128); + CHECK_ALIGNMENT(129); + CHECK_ALIGNMENT(153); + CHECK_ALIGNMENT(255); + CHECK_ALIGNMENT(256); + CHECK_ALIGNMENT(257); + CHECK_ALIGNMENT(353); + CHECK_ALIGNMENT(512); + CHECK_ALIGNMENT(620); + CHECK_ALIGNMENT(1024); + CHECK_ALIGNMENT(30000); +} diff --git a/gcc/testsuite/gcc.target/aarch64/bitint-args.c b/gcc/testsuite/gcc.target/aarch64/bitint-args.c new file mode 100644 index 0000000000000000000000000000000000000000..f827b7d3220561ec343a8ca276f9a6f4a4d0f6d4 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bitint-args.c @@ -0,0 +1,101 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c23 -O -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#define CHECK_ARG(N) \ +void f##N(_BitInt(N) *ptr, _BitInt(N) y) \ +{ \ + *ptr = y; \ +} + + +CHECK_ARG(2) +/* +** f2: +** sbfiz (w[0-9]+), w1, 6, 2 +** asr (w[0-9]+), \1, 6 +** strb \2, \[x0\] +** ret +*/ +CHECK_ARG(8) +/* +** f8: +** strb w1, \[x0\] +** ret +*/ +CHECK_ARG(9) +/* +** f9: +** sbfiz (w[0-9]+), w1, 7, 9 +** asr (w[0-9]+), \1, 7 +** strh \2, \[x0\] +** ret +*/ +CHECK_ARG(16) +/* +** f16: +** strh w1, \[x0\] +** ret +*/ +CHECK_ARG(19) +/* +** f19: +** sbfx x([0-9]+), x1, 0, 19 +** str w\1, \[x0\] +** ret +*/ +CHECK_ARG(32) +/* +** f32: +** str w1, \[x0\] +** ret +*/ +CHECK_ARG(42) +/* +** f42: +** sbfx (x[0-9]+), x1, 0, 42 +** str \1, \[x0\] +** ret +*/ +CHECK_ARG(64) +/* +** f64: +** str x1, \[x0\] +** ret +*/ + +CHECK_ARG(65) +/* +** f65: +** extr (x[0-9]+), x3, x2, 1 +** 
asr (x[0-9]+), \1, 63 +** stp x2, \2, \[x0\] +** ret +*/ + +CHECK_ARG(127) +/* +** f127: +** extr (x[0-9]+), x3, x2, 63 +** asr (x[0-9]+), \1, 1 +** stp x2, \2, \[x0\] +** ret +*/ + +CHECK_ARG(128) +/* +** f128: +** stp x2, x3, \[x0\] +** ret +*/ + +CHECK_ARG(129) +/* +** f129: +** ldp (x[0-9]+), (x[0-9]+), \[x1\] +** stp \1, \2, \[x0\] +** ldr (x[0-9]+), \[x1, 16\] +** sbfx (x[0-9]+), \3, 0, 1 +** str \4, \[x0, 16\] +** ret +*/ diff --git a/gcc/testsuite/gcc.target/aarch64/bitint-sizes.c b/gcc/testsuite/gcc.target/aarch64/bitint-sizes.c new file mode 100644 index 0000000000000000000000000000000000000000..bee9abfe91b0dcb1ec335ef9ed02f212f7aa34b7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/bitint-sizes.c @@ -0,0 +1,60 @@ +/* { dg-do run } */ +/* { dg-options "-std=c23" } */ + +static long unsigned int +calc_size (int n) +{ + if (n > 128) + return ((n - 1)/128 + 1) * sizeof(__int128_t); + if (n > 64) + return sizeof(__int128_t); + if (n > 32) + return sizeof(long long); + if (n > 16) + return sizeof(int); + if (n > 8) + return sizeof(short); + else + return sizeof(char); +} + +#define CHECK_SIZE(N) \ + if (sizeof(_BitInt(N)) != calc_size(N)) \ + __builtin_abort (); + +int main (void) +{ + CHECK_SIZE(2); + CHECK_SIZE(3); + CHECK_SIZE(7); + CHECK_SIZE(8); + CHECK_SIZE(9); + CHECK_SIZE(13); + CHECK_SIZE(15); + CHECK_SIZE(16); + CHECK_SIZE(17); + CHECK_SIZE(24); + CHECK_SIZE(31); + CHECK_SIZE(32); + CHECK_SIZE(33); + CHECK_SIZE(42); + CHECK_SIZE(53); + CHECK_SIZE(63); + CHECK_SIZE(64); + CHECK_SIZE(65); + CHECK_SIZE(79); + CHECK_SIZE(96); + CHECK_SIZE(113); + CHECK_SIZE(127); + CHECK_SIZE(128); + CHECK_SIZE(129); + CHECK_SIZE(153); + CHECK_SIZE(255); + CHECK_SIZE(256); + CHECK_SIZE(257); + CHECK_SIZE(353); + CHECK_SIZE(512); + CHECK_SIZE(620); + CHECK_SIZE(1024); + CHECK_SIZE(30000); +} diff --git a/libgcc/config/aarch64/libgcc-softfp.ver b/libgcc/config/aarch64/libgcc-softfp.ver index 
e73f5f9129776d39eb5020ed7398dc59aba2d197..9ba857036abef99913eebe56971eaaabf5e1952e 100644 --- a/libgcc/config/aarch64/libgcc-softfp.ver +++ b/libgcc/config/aarch64/libgcc-softfp.ver @@ -39,3 +39,11 @@ GCC_13.0.0 { __trunctfbf2 __trunchfbf2 } + +%inherit GCC_14.0.0 GCC_13.0.0 +GCC_14.0.0 { + __fixtfbitint + __floatbitintbf + __floatbitinthf + __floatbitinttf +} diff --git a/libgcc/config/aarch64/t-softfp b/libgcc/config/aarch64/t-softfp index 2e32366f891361e2056c680b2e36edb1871c7670..80e7e77a545cc10eeccd84eea092871751c3e139 100644 --- a/libgcc/config/aarch64/t-softfp +++ b/libgcc/config/aarch64/t-softfp @@ -4,7 +4,8 @@ softfp_extensions := sftf dftf hftf bfsf softfp_truncations := tfsf tfdf tfhf tfbf dfbf sfbf hfbf softfp_exclude_libgcc2 := n softfp_extras += fixhfti fixunshfti floattihf floatuntihf \ - floatdibf floatundibf floattibf floatuntibf + floatdibf floatundibf floattibf floatuntibf \ + floatbitinthf floatbitintbf floatbitinttf fixtfbitint TARGET_LIBGCC2_CFLAGS += -Wno-missing-prototypes