
ARM: Prevent the compiler from using NEON registers

Message ID 20210816031437.15856-1-samuel@sholland.org
State Superseded
Delegated to: Tom Rini
Series ARM: Prevent the compiler from using NEON registers

Commit Message

Samuel Holland Aug. 16, 2021, 3:14 a.m. UTC
For ARMv8-A, NEON is standard, so the compiler can use it even when no
special target flags are provided. For example, it can use stores from
NEON registers to zero-initialize large structures. GCC 11 decides to
do this inside the DRAM init code for the Allwinner H6, which breaks
boot on that platform, as NEON is not available in SPL. Fix this by
restricting the compiler to using GPRs only, not vector registers.

Signed-off-by: Samuel Holland <samuel@sholland.org>
---
 arch/arm/config.mk | 1 +
 1 file changed, 1 insertion(+)
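
A minimal C sketch of the kind of code the commit message describes, assuming a hypothetical structure (`struct dram_para_sketch` is made up for illustration and is not the real H6 DRAM parameter block). On AArch64, a compiler is free to implement the `= { 0 }` initializer with stores from vector registers unless it is restricted to general-purpose registers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a large parameter block; not the real U-Boot type. */
struct dram_para_sketch {
	unsigned int clk;
	unsigned int type;
	unsigned char dx_read_delays[4][14];
	unsigned char dx_write_delays[4][14];
	unsigned char ac_delays[31];
};

/* Large enough that a compiler may prefer 16-byte vector stores to clear it. */
static_assert(sizeof(struct dram_para_sketch) > 16,
	      "sketch structure should span more than one vector register");

/*
 * Zero-initialize the structure on the stack, then sum every field.
 * The result is 0 regardless of which store instructions the compiler
 * picked for the initializer; only the generated code differs.
 */
unsigned long sum_after_zero_init(void)
{
	struct dram_para_sketch para = { 0 };
	unsigned long sum = para.clk + para.type;
	size_t i, j;

	for (i = 0; i < 4; i++)
		for (j = 0; j < 14; j++)
			sum += para.dx_read_delays[i][j] +
			       para.dx_write_delays[i][j];

	for (i = 0; i < sizeof(para.ac_delays); i++)
		sum += para.ac_delays[i];

	return sum;
}
```

Compiling this with an AArch64 toolchain at -O2 and comparing the assembly with and without -mgeneral-regs-only should show whether q-register stores are used for the initializer; the exact output depends on the compiler version.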

Comments

Andre Przywara Aug. 16, 2021, 11:34 a.m. UTC | #1
On Sun, 15 Aug 2021 22:14:37 -0500
Samuel Holland <samuel@sholland.org> wrote:

Hi,

In general I think the patch makes sense, and we should use that option
since we also specify -msoft-float.

> For ARMv8-A, NEON is standard,

It should be noted that the ARMv8-A architecture itself treats FP and
AdvSIMD as optional, and little cores like the Cortex-A53 even make this
an integration option [1]. This gives another reason for this patch,
as we cannot assume NEON support for *every* core we are compiling for
(even though most A53s out there seem to include NEON).
 
Anyway, GCC includes both +fp and +simd when the basic (and
probably default) "armv8-a" is used for -march [2], so we must indeed
restrict it explicitly if we want to avoid it.

[1] https://developer.arm.com/documentation/ddi0500/j/Introduction/Implementation-options
[2] https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html (-march=name)
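
As a rough illustration of the point about default feature flags, the sketch below reports at build time whether the compiler considers Advanced SIMD available, using the ACLE predefined macro `__ARM_NEON`. It assumes that GCC leaves this macro undefined when -mgeneral-regs-only is in effect; that is worth verifying against the toolchain in use. `EXPECT_GENERAL_REGS_ONLY` and `COMPILER_MAY_USE_NEON` are hypothetical names introduced only for this example:

```c
#include <assert.h>

/* __ARM_NEON is the ACLE feature macro for Advanced SIMD support. */
#if defined(__ARM_NEON)
#define COMPILER_MAY_USE_NEON 1
#else
#define COMPILER_MAY_USE_NEON 0
#endif

/*
 * Hypothetical build-time guard: a build that expects GPR-only code can
 * define EXPECT_GENERAL_REGS_ONLY and fail early if NEON is still enabled.
 */
#ifdef EXPECT_GENERAL_REGS_ONLY
static_assert(!COMPILER_MAY_USE_NEON,
	      "NEON is still enabled; is -mgeneral-regs-only missing?");
#endif

/* Returns 1 if the compiler may emit NEON instructions, 0 otherwise. */
int compiler_may_use_neon(void)
{
	return COMPILER_MAY_USE_NEON;
}
```

With a plain -march=armv8-a build the function would be expected to return 1, and 0 once the compiler is restricted to general-purpose registers, under the assumption stated above.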

> so the compiler can use it even when no
> special target flags are provided. For example, it can use stores from
> NEON registers to zero-initialize large structures. GCC 11 decides to
> do this inside the DRAM init code for the Allwinner H6, which breaks
> boot on that platform, as NEON is not available in SPL.

And that brings up the question: why? The Cortex cores in all Allwinner
SoCs support NEON, and we always clear CPTR_EL3 in start.S, so it
should be usable.
So I did some experiments, and I guess it's our old friend
"unaligned access" again, because the SIMD instructions themselves work
(movi, str q0). But the SPL runs with the MMU off, so everything
is device memory, and natural alignment is mandatory, even with SCTLR.A
cleared. "stp q0, q0, [x0]" worked when x0 was 16-byte aligned, but
hung when it was not. The same applied to "stur q0, [x0]", which is
used with an unaligned offset in the generated code
(https://tpaste.us/qPEw).
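
The alignment condition observed in these experiments can be sketched in C as follows. This is only an illustration, assuming the 16-byte requirement for q-register stores to device memory reported above; `is_aligned` and `q_store_is_safe` are hypothetical helper names, not existing U-Boot functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align; align must be a power of two. */
int is_aligned(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1)) == 0;
}

/*
 * Assumption taken from the experiments above: with the MMU off, a store
 * of a 16-byte q register only succeeds at a 16-byte aligned address.
 */
int q_store_is_safe(uintptr_t addr)
{
	return is_aligned(addr, 16);
}
```

For example, `q_store_is_safe(0x1000)` holds, while an address such as 0x1008 (8-byte aligned only) does not satisfy the check, which matches the case where the store hung.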

So this deserves some more research, for instance to find out if GCC
ignores -mstrict-align here?

Cheers,
Andre

> Fix this by
> restricting the compiler to using GPRs only, not vector registers.
> 
> Signed-off-by: Samuel Holland <samuel@sholland.org>
> ---
>  arch/arm/config.mk | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/config.mk b/arch/arm/config.mk
> index 16c63e12667..964c6b026ec 100644
> --- a/arch/arm/config.mk
> +++ b/arch/arm/config.mk
> @@ -25,6 +25,7 @@ endif
>  
>  PLATFORM_RELFLAGS += -fno-common -ffixed-r9
>  PLATFORM_RELFLAGS += $(call cc-option, -msoft-float) \
> +		     $(call cc-option,-mgeneral-regs-only) \
>        $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
>  
>  # LLVM support

Patch

diff --git a/arch/arm/config.mk b/arch/arm/config.mk
index 16c63e12667..964c6b026ec 100644
--- a/arch/arm/config.mk
+++ b/arch/arm/config.mk
@@ -25,6 +25,7 @@  endif
 
 PLATFORM_RELFLAGS += -fno-common -ffixed-r9
 PLATFORM_RELFLAGS += $(call cc-option, -msoft-float) \
+		     $(call cc-option,-mgeneral-regs-only) \
       $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,))
 
 # LLVM support