diff mbox series

aarch64: Fix ACLE SME streaming mode error in neon-sve-bridge

Message ID 0612833c-daa9-4008-a376-07fb81d80811@arm.com
State New
Headers show
Series aarch64: Fix ACLE SME streaming mode error in neon-sve-bridge | expand

Commit Message

Richard Ball Feb. 1, 2024, 4:25 p.m. UTC
When using LTO, handling the pragma for sme before the pragma
for the neon-sve-bridge caused the following error on svset_neonq,
in the neon-sve-bridge.c test.

error: ACLE function '0' can only be called when SME streaming mode is enabled.

Handling the pragmas the other way around fixes this.

No regressions on aarch64-none-elf.


gcc/ChangeLog:

        * config/aarch64/aarch64-sve-builtins.cc (init_builtins):
        Re-order handle pragma functions for lto.

Comments

Richard Sandiford Feb. 1, 2024, 5:15 p.m. UTC | #1
Richard Ball <richard.ball@arm.com> writes:
> When using LTO, handling the pragma for sme before the pragma
> for the neon-sve-bridge caused the following error on svset_neonq,
> in the neon-sve-bridge.c test.
>
> error: ACLE function '0' can only be called when SME streaming mode is enabled.
>
> Handling the pragmas the other way around fixes this.

Why is that the right fix though?  And where did the weird function
name in the error message come from?

Trying it locally, I think the patch works for the testcase because it
mimics the testcase's particular include order.  But the patch regresses
LTO for:

#include <arm_sme.h>

int
main (void)
{
  __builtin_printf ("%d\n", (int) svcntsb ());
  return 0;
}

because the handle_*() calls no longer match the arm_sme.h include order.

I guess this is where assigning dynamic ids breaks down.  It worked well
for one header file, and arguably for two (given that one required the
other).  But now we need a slightly different approach.

Some ideas for fixes:

(1) Treat arm_neon_sve_bridge.h and arm_sme.h as separate enums in
    aarch64_builtin_class.

(2) Subdivide AARCH64_BUILTIN_SVE internally within arm_sve.h into
    arm_sve.h, arm_neon_sve_bridge.h and arm_sme.h, using the low
    2 bits of the function code.

(3) Manually reserve a number of functions for each header file and assert
    that the reserved number is big enough.  So arm_sve.h function codes
    would start at 0, arm_sme.h at N (for some "nice" N that is bigger
    than the number of functions in arm_sve.h), and arm_neon_sve_bridge.h
    at N+M, etc.

(4) Like (3), but continue to maintain ids dynamically.  Have a way of
    cycling through all the functions in a header file *without* declaring
    them, and just pushing nulls to registered_functions instead.

I think (1) goes against the current design, since aarch64_builtin_class
really selects a framework rather than a header file.  (2) would mean
maintaining separate datastructures for each header file.  (3) could be
wasteful, since we'd either have padding elements in registered_functions,
or we'd need to convert registered_functions to some other datastructure.
So I'm leaning towards (4).

Thanks,
Richard

>
> No regressions on aarch64-none-elf.
>
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-sve-builtins.cc (init_builtins):
>         Re-order handle pragma functions for lto.
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index c2f1486315f02c3e38f808b3124a9b4cf49a4708..10f1df233ca9ae4e4a639a45a25db1f055a7d7c3 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -4511,8 +4511,8 @@ init_builtins ()
>    if (in_lto_p)
>      {
>        handle_arm_sve_h ();
> -      handle_arm_sme_h ();
>        handle_arm_neon_sve_bridge_h ();
> +      handle_arm_sme_h ();
>      }
>  }
>
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index c2f1486315f02c3e38f808b3124a9b4cf49a4708..10f1df233ca9ae4e4a639a45a25db1f055a7d7c3 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -4511,8 +4511,8 @@  init_builtins ()
   if (in_lto_p)
     {
       handle_arm_sve_h ();
-      handle_arm_sme_h ();
       handle_arm_neon_sve_bridge_h ();
+      handle_arm_sme_h ();
     }
 }