Message ID | bd5ffa8a-63ad-153a-44e8-887064ee63e9@foss.arm.com |
---|---|
State | New |
[resend without HTML formatting]

On 14/07/17 16:29, Thomas Preudhomme wrote:
> Hi Richard,

Hi,

>
> I've committed the requested change as a separate patch to make it easier to
> backport to earlier GCC versions

While looking into backporting r250206 I read the ARM C Language Extension documentation again and realized that __ARM_FEATURE_NUMERIC_MAXMIN is for *vector* min and max instructions and intrinsics. Therefore the previous definition was correct. Sorry for the mistake.

This reverts commit r250206.

2017-07-15 Thomas Preud'homme <thomas.preudhomme@arm.com>

Revert:
2017-07-14 Thomas Preud'homme <thomas.preudhomme@arm.com>

* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_NUMERIC_MAXMIN solely based on TARGET_VFP5.

Best regards,

Thomas

>
> Definition of __ARM_FEATURE_NUMERIC_MAXMIN checks for
> TARGET_ARM_ARCH >= 8 and TARGET_NEON being true in addition to
> TARGET_VFP5. However, instructions covered by this macro are part of
> FPv5 which is available in ARMv7E-M architecture. This patch fixes the
> macro to only check for TARGET_VFP5.
>
> ChangeLog entry is as follows:
>
> *** gcc/ChangeLog ***
>
> * config/arm/arm-c.c (arm_cpu_builtins): Define
> __ARM_FEATURE_NUMERIC_MAXMIN solely based on TARGET_VFP5.
>
> Built and confirmed that the macro is now defined when building with
> -march=armv7e-m+fpv5 -mfloat-abi=hard.
>
> Best regards,
>
> Thomas
>
> On 14/07/17 15:43, Richard Earnshaw (lists) wrote:
>> On 14/07/17 09:20, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> fp-armv8 is currently defined as a double precision FPv5 with 32 D
>>> registers *and* a special FP_ARMv8 bit. However FP for ARMv8 should only
>>> bring 32 D registers on top of FPv5-D16 so this FP_ARMv8 bit is
>>> spurious. As a consequence, many instruction patterns which are guarded
>>> by TARGET_FPU_ARMV8 are unavailable to FPv5-D16 and FPv5-SP-D16.
>>>
>>> This patch gets rid of TARGET_FPU_ARMV8 and rewires all uses to
>>> expressions based on TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE.
>>> It also redefines ISA_FP_ARMv8 to include the D32 capability to
>>> distinguish it from FPv5-D16. Finally, it sets the +fp.sp for ARMv8-R to
>>> enable FPv5-SP-D16 (i.e. FP for ARMv8 with single precision only and 16 D
>>> registers).
>>>
>>> ChangeLog entry is as follows:
>>>
>>> 2017-07-07 Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>> * config/arm/arm-isa.h (isa_bit_FP_ARMv8): Delete enumerator.
>>> (ISA_FP_ARMv8): Define as ISA_FPv5 and ISA_FP_D32.
>>> * config/arm/arm-cpus.in (armv8-r): Define fp.sp as enabling FPv5.
>>> (fp-armv8): Define it as FP_ARMv8 only.
>>> * config/arm/arm.h (TARGET_FPU_ARMV8): Delete.
>>> (TARGET_VFP_FP16INST): Define using TARGET_VFP5 rather than
>>> TARGET_FPU_ARMV8.
>>> * config/arm/arm.c (arm_rtx_costs_internal): Replace checks against
>>> TARGET_FPU_ARMV8 by checks against TARGET_VFP5.
>>> * config/arm/arm-builtins.c (arm_builtin_vectorized_function): Define
>>> first ARM_CHECK_BUILTIN_MODE definition using TARGET_VFP5 rather
>>> than TARGET_FPU_ARMV8.
>>> * config/arm/arm-c.c (arm_cpu_builtins): Likewise for
>>> __ARM_FEATURE_NUMERIC_MAXMIN macro definition.
>>> * config/arm/arm.md (cmov<mode>): Condition on TARGET_VFP5 rather than
>>> TARGET_FPU_ARMV8.
>>> * config/arm/neon.md (neon_vrint): Likewise.
>>> (neon_vcvt): Likewise.
>>> (neon_<fmaxmin_op><mode>): Likewise.
>>> (<fmaxmin><mode>3): Likewise.
>>> * config/arm/vfp.md (l<vrint_pattern><su_optab><mode>si2): Likewise.
>>> * config/arm/predicates.md (arm_cond_move_operator): Check against
>>> TARGET_VFP5 rather than TARGET_FPU_ARMV8 and fix spacing.
>>>
>>> Testing:
>>> * Bootstrapped under ARMv8-A Thumb state and ran testsuite -> no
>>> regression
>>> * built Spec2000 and Spec2006 with -march=armv8-a+fp16 and compared
>>> objdump -> no code generation difference
>>>
>>> Is this ok for trunk?
>>
>> OK with changes mentioned below.
>>
>> R.
>>
>>>
>>> Best regards,
>>>
>>> Thomas
>>>
>>> rewire_mfpu_fparmv8.patch
>>>
>>>
>>> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
>>> index 63ee880822c17eda55dd58438d61cbbba333b2c6..7504ed581c63a657a0dff48442633704bd252b2e 100644
>>> --- a/gcc/config/arm/arm-builtins.c
>>> +++ b/gcc/config/arm/arm-builtins.c
>>> @@ -3098,7 +3098,7 @@ arm_builtin_vectorized_function (unsigned int fn, tree type_out, tree type_in)
>>> NULL_TREE is returned if no such builtin is available. */
>>> #undef ARM_CHECK_BUILTIN_MODE
>>> #define ARM_CHECK_BUILTIN_MODE(C) \
>>> - (TARGET_FPU_ARMV8 \
>>> + (TARGET_VFP5 \
>>> && flag_unsafe_math_optimizations \
>>> && ARM_CHECK_BUILTIN_MODE_1 (C))
>>> diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
>>> index a3daa3220a2bc4220dffdb7ca08ca9419bdac425..9178937b6d9e0fe5d0948701390c4cf01f4f8c7d 100644
>>> --- a/gcc/config/arm/arm-c.c
>>> +++ b/gcc/config/arm/arm-c.c
>>> @@ -96,7 +96,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
>>> || TARGET_ARM_ARCH_ISA_THUMB >=2));
>>> def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
>>> - TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_FPU_ARMV8);
>>> + TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5);
>>
>> This looks wrong (though ACLE is misleading). The MAXMIN property is
>> solely defined by having an FPv5 capable FPU.
>>
>>> def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD);
>>> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
>>> index f35128acb7d68c6a0592355b9d3d56ee8f826aca..e2ff297aed7514073dbb3bf5ee86964f202e5a14 100644
>>> --- a/gcc/config/arm/arm-cpus.in
>>> +++ b/gcc/config/arm/arm-cpus.in
>>> @@ -389,7 +389,7 @@ begin arch armv8-r
>>> option crc add bit_crc32
>>> # fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
>> Please update comment
>>> # note: no fp option for fp-armv8 (d16) + double precision at the moment
>>> - option fp.sp add FP_ARMv8
>>> + option fp.sp add FPv5
>>> option simd add FP_ARMv8 NEON
>>> option crypto add FP_ARMv8 CRYPTO
>>> option nocrypto remove ALL_CRYPTO
>>> @@ -1390,7 +1390,7 @@ begin fpu fpv5-d16
>>> end fpu fpv5-d16
>>> begin fpu fp-armv8
>>> - isa FP_ARMv8 FP_D32
>>> + isa FP_ARMv8
>>> end fpu fp-armv8
>>> begin fpu neon-fp-armv8
>>> diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
>>> index 0d66a0400c517668db023fc66ff43e26d43add51..dbd29eaa52f2007498c2aff6263b8b6c3a70e2c2 100644
>>> --- a/gcc/config/arm/arm-isa.h
>>> +++ b/gcc/config/arm/arm-isa.h
>>> @@ -60,7 +60,6 @@ enum isa_feature
>>> isa_bit_VFPv4, /* Vector floating point v4. */
>>> isa_bit_FPv5, /* Floating point v5. */
>>> isa_bit_lpae, /* ARMv7-A LPAE. */
>>> - isa_bit_FP_ARMv8, /* ARMv8 floating-point extension. */
>>> isa_bit_neon, /* Advanced SIMD instructions. */
>>> isa_bit_fp16conv, /* Conversions to/from fp16 (VFPv3 extension). */
>>> isa_bit_fp_dbl, /* Double precision operations supported. */
>>> @@ -143,7 +142,7 @@ enum isa_feature
>>> default. isa_bit_fp16 is deliberately missing from this list. */
>>> #define ISA_ALL_FPU_INTERNAL \
>>> isa_bit_VFPv2, isa_bit_VFPv3, isa_bit_VFPv4, isa_bit_FPv5, \
>>> - isa_bit_FP_ARMv8, isa_bit_fp16conv, isa_bit_fp_dbl, ISA_ALL_SIMD
>>> + isa_bit_fp16conv, isa_bit_fp_dbl, ISA_ALL_SIMD
>>> /* Similarly, but including fp16 and other extensions that aren't part of
>>>     -mfpu support. */
>>> @@ -154,10 +153,10 @@ enum isa_feature
>>> #define ISA_VFPv3 ISA_VFPv2, isa_bit_VFPv3
>>> #define ISA_VFPv4 ISA_VFPv3, isa_bit_VFPv4, isa_bit_fp16conv
>>> #define ISA_FPv5 ISA_VFPv4, isa_bit_FPv5
>>> -#define ISA_FP_ARMv8 ISA_FPv5, isa_bit_FP_ARMv8
>>> #define ISA_FP_DBL isa_bit_fp_dbl
>>> #define ISA_FP_D32 ISA_FP_DBL, isa_bit_fp_d32
>>> +#define ISA_FP_ARMv8 ISA_FPv5, ISA_FP_D32
>>> #define ISA_NEON ISA_FP_D32, isa_bit_neon
>>> #define ISA_CRYPTO ISA_NEON, isa_bit_crypto
>>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>>> index 315622212a5ce10d0c771535fe31f63c3be16444..4f53583cf0219de4329bc64a47a5a42c550ff354 100644
>>> --- a/gcc/config/arm/arm.h
>>> +++ b/gcc/config/arm/arm.h
>>> @@ -196,10 +196,6 @@ extern tree arm_fp16_type_node;
>>> /* FPU supports fused-multiply-add operations. */
>>> #define TARGET_FMA (bitmap_bit_p (arm_active_target.isa, isa_bit_VFPv4))
>>> -/* FPU is ARMv8 compatible. */
>>> -#define TARGET_FPU_ARMV8 \
>>> - (bitmap_bit_p (arm_active_target.isa, isa_bit_FP_ARMv8))
>>> -
>>> /* FPU supports Crypto extensions. */
>>> #define TARGET_CRYPTO (bitmap_bit_p (arm_active_target.isa, isa_bit_crypto))
>>> @@ -216,7 +212,7 @@ extern tree arm_fp16_type_node;
>>> /* FPU supports the floating point FP16 instructions for ARMv8.2 and
>>> later. */
>>> #define TARGET_VFP_FP16INST \
>>> - (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 && arm_fp16_inst)
>>> + (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP5 && arm_fp16_inst)
>>> /* FPU supports the AdvSIMD FP16 instructions for ARMv8.2 and later. */
>>> #define TARGET_NEON_FP16INST (TARGET_VFP_FP16INST && TARGET_NEON_RDMA)
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index c6101efd555996a4c6db5eaea0130b0940c4cff8..f59132c3f079d10d9e3d920b61037db2f3144eee 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -10755,7 +10755,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>>> {
>>> if (speed_p)
>>> *cost += extra_cost->fp[mode == DFmode].widen;
>>> - if (!TARGET_FPU_ARMV8
>>> + if (!TARGET_VFP5
>>> && GET_MODE (XEXP (x, 0)) == HFmode)
>>> {
>>> /* Pre v8, widening HF->DF is a two-step process, first
>>> @@ -10849,7 +10849,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
>>> return true;
>>> }
>>> else if (GET_MODE_CLASS (mode) == MODE_FLOAT
>>> - && TARGET_FPU_ARMV8)
>>> + && TARGET_VFP5)
>>> {
>>> if (speed_p)
>>> *cost += extra_cost->fp[mode == DFmode].roundint;
>>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>> index e6e1ac54a850c35807d683804f5294fbef1487ad..049a78edefe9f85c6f84a4ecf0158d559e1d5674 100644
>>> --- a/gcc/config/arm/arm.md
>>> +++ b/gcc/config/arm/arm.md
>>> @@ -7879,7 +7879,7 @@
>>> "<F_constraint>")
>>> (match_operand:SDF 4 "s_register_operand"
>>> "<F_constraint>")))]
>>> - "TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 <vfp_double_cond>"
>>> + "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
>>> "*
>>> {
>>> enum arm_cond_code code = maybe_get_arm_condition_code (operands[1]);
>>> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
>>> index 33b25ff3c730544b4376bf318400d703c8813a0a..235c46da1a19712e2924d748545474ed991d9f92 100644
>>> --- a/gcc/config/arm/neon.md
>>> +++ b/gcc/config/arm/neon.md
>>> @@ -751,7 +751,7 @@
>>> (unspec:VCVTF [(match_operand:VCVTF 1
>>> "s_register_operand" "w")]
>>> NEON_VRINT))]
>>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>>> + "TARGET_NEON && TARGET_VFP5"
>>> "vrint<nvrint_variant>.f32\\t%<V_reg>0, %<V_reg>1"
>>> [(set_attr "type" "neon_fp_round_<V_elem_ch><q>")]
>>> )
>>> @@ -761,7 +761,7 @@
>>> (FIXUORS:<V_cmp_result> (unspec:VCVTF
>>> [(match_operand:VCVTF 1 "register_operand" "w")]
>>> NEON_VCVT)))]
>>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>>> + "TARGET_NEON && TARGET_VFP5"
>>> "vcvt<nvrint_variant>.<su>32.f32\\t%<V_reg>0, %<V_reg>1"
>>> [(set_attr "type" "neon_fp_to_int_<V_elem_ch><q>")
>>> (set_attr "predicable" "no")]
>>> @@ -2901,7 +2901,7 @@
>>> (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
>>> (match_operand:VCVTF 2 "s_register_operand" "w")]
>>> VMAXMINFNM))]
>>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>>> + "TARGET_NEON && TARGET_VFP5"
>>> "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
>>> [(set_attr "type" "neon_fp_minmax_s<q>")]
>>> )
>>> @@ -2912,7 +2912,7 @@
>>> (unspec:VCVTF [(match_operand:VCVTF 1 "s_register_operand" "w")
>>> (match_operand:VCVTF 2 "s_register_operand" "w")]
>>> VMAXMINFNM))]
>>> - "TARGET_NEON && TARGET_FPU_ARMV8"
>>> + "TARGET_NEON && TARGET_VFP5"
>>> "<fmaxmin_op>.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
>>> [(set_attr "type" "neon_fp_minmax_s<q>")]
>>> )
>>> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
>>> index afb5d6339a8af362384c93bbb46928635073b74b..3e25cd16b29231d53b4cadce3db0fbb3168cd4c5 100644
>>> --- a/gcc/config/arm/predicates.md
>>> +++ b/gcc/config/arm/predicates.md
>>> @@ -350,9 +350,9 @@
>>> (define_special_predicate "arm_cond_move_operator"
>>> (if_then_else (match_test "arm_restrict_it")
>>> - (and (match_test "TARGET_FPU_ARMV8")
>>> - (match_operand 0 "arm_vsel_comparison_operator"))
>>> - (match_operand 0 "expandable_comparison_operator")))
>>> + (and (match_test "TARGET_VFP5")
>>> + (match_operand 0 "arm_vsel_comparison_operator"))
>>> + (match_operand 0 "expandable_comparison_operator")))
>>> (define_special_predicate "noov_comparison_operator"
>>> (match_code "lt,ge,eq,ne"))
>>> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
>>> index d8f77e2ffe4fdb7c952d6a5ac947d91f89ce259d..23c1d67c9e3707e64a4e206dc62727e4c79ba89c 100644
>>> --- a/gcc/config/arm/vfp.md
>>> +++ b/gcc/config/arm/vfp.md
>>> @@ -1997,7 +1997,7 @@
>>> (FIXUORS:SI (unspec:SDF
>>> [(match_operand:SDF 1
>>> "register_operand" "<F_constraint>")] VCVT)))]
>>> - "TARGET_HARD_FLOAT && TARGET_FPU_ARMV8 <vfp_double_cond>"
>>> + "TARGET_HARD_FLOAT && TARGET_VFP5 <vfp_double_cond>"
>>> "vcvt<vrint_variant>.<su>32.<V_if_elem>\\t%0, %<V_reg>1"
>>> [(set_attr "predicable" "no")
>>> (set_attr "conds" "unconditional")
>>>
>>
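
As a rough illustration of the feature-bit arrangement described in the quoted patch, the sketch below models ISA_FP_ARMv8 as FPv5 plus the D32 capability and evaluates the two candidate conditions for __ARM_FEATURE_NUMERIC_MAXMIN discussed in this thread. Everything prefixed MODEL_ or model_ is hypothetical: GCC tracks these features with enum isa_feature and a bitmap rather than a plain unsigned mask, and the FPU compositions listed here are assumptions drawn from the descriptions above, not copied from arm-cpus.in.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-bit features (GCC uses enum isa_feature instead). */
#define MODEL_BIT_FPV5   (1u << 0)
#define MODEL_BIT_FP_DBL (1u << 1)
#define MODEL_BIT_FP_D32 (1u << 2)
#define MODEL_BIT_NEON   (1u << 3)

/* Feature groups, shaped like the ISA_* definitions after the patch:
   FP for ARMv8 is FPv5 plus D32, with no dedicated FP_ARMv8 bit. */
#define MODEL_ISA_FPV5     MODEL_BIT_FPV5
#define MODEL_ISA_FP_DBL   MODEL_BIT_FP_DBL
#define MODEL_ISA_FP_D32   (MODEL_ISA_FP_DBL | MODEL_BIT_FP_D32)
#define MODEL_ISA_FP_ARMV8 (MODEL_ISA_FPV5 | MODEL_ISA_FP_D32)
#define MODEL_ISA_NEON     (MODEL_ISA_FP_D32 | MODEL_BIT_NEON)

/* Assumed FPU compositions, for illustration only. */
#define MODEL_FPU_FPV5_SP_D16   MODEL_ISA_FPV5
#define MODEL_FPU_FPV5_D16      (MODEL_ISA_FPV5 | MODEL_ISA_FP_DBL)
#define MODEL_FPU_FP_ARMV8      MODEL_ISA_FP_ARMV8
#define MODEL_FPU_NEON_FP_ARMV8 (MODEL_ISA_FP_ARMV8 | MODEL_ISA_NEON)

/* True when every bit in BITS is set in ISA. */
static bool model_has(unsigned isa, unsigned bits)
{
  assert(bits != 0u);
  return (isa & bits) == bits;
}

/* Stand-ins for TARGET_VFP5, TARGET_VFPD32 and TARGET_VFP_DOUBLE. */
bool model_target_vfp5(unsigned isa) { return model_has(isa, MODEL_BIT_FPV5); }
bool model_target_vfpd32(unsigned isa) { return model_has(isa, MODEL_BIT_FP_D32); }
bool model_target_vfp_double(unsigned isa) { return model_has(isa, MODEL_BIT_FP_DBL); }

/* Condition of the form TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5. */
bool model_maxmin_arch8_neon(int arch, unsigned isa)
{
  return arch >= 8 && model_has(isa, MODEL_BIT_NEON) && model_target_vfp5(isa);
}

/* Condition of the form TARGET_VFP5 alone. */
bool model_maxmin_vfp5_only(unsigned isa)
{
  return model_target_vfp5(isa);
}
```

Under this model an ARMv7E-M style target with an FPv5-D16 FPU satisfies only the second condition, which is the difference the thread is arguing about.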
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 9178937b6d9e0fe5d0948701390c4cf01f4f8c7d..6ab50f7ee3320d9b56688dd4c5f1ac80b233e84c 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -96,7 +96,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 		      || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",
-		      TARGET_ARM_ARCH >= 8 && TARGET_NEON && TARGET_VFP5);
+		      TARGET_VFP5);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_SIMD32", TARGET_INT_SIMD);
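
For context on why the exact condition matters to users, here is a minimal sketch of how application code might consume __ARM_FEATURE_NUMERIC_MAXMIN. The helper names min_num and array_min_num and the HAVE_VECTOR_MINNM macro are hypothetical. The extra check on __ARM_NEON is my own assumption, added because the message above notes that the macro concerns *vector* min and max intrinsics. vminnmq_f32, vld1q_f32 and vst1q_f32 are the ACLE Advanced SIMD intrinsics from arm_neon.h. The scalar fallback only mimics the "prefer the number over a quiet NaN" behaviour and does not model signed-zero ordering.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_FEATURE_NUMERIC_MAXMIN) && defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_VECTOR_MINNM 1
#else
#define HAVE_VECTOR_MINNM 0
#endif

/* Scalar fallback: if exactly one operand is NaN, return the other one;
   otherwise return the smaller value. */
static float min_num(float a, float b)
{
  if (a != a)
    return b;
  if (b != b)
    return a;
  return a < b ? a : b;
}

/* Element-wise minimum of A and B, written to DST (N elements each). */
void array_min_num(float *dst, const float *a, const float *b, size_t n)
{
  size_t i = 0;

  assert(n == 0 || (dst != NULL && a != NULL && b != NULL));

#if HAVE_VECTOR_MINNM
  /* Four lanes at a time with the vector minNum intrinsic. */
  for (; i + 4 <= n; i += 4)
    vst1q_f32(dst + i, vminnmq_f32(vld1q_f32(a + i), vld1q_f32(b + i)));
#endif

  /* Remaining elements, or all of them when no vector path is available. */
  for (; i < n; i++)
    dst[i] = min_num(a[i], b[i]);
}
```

If the macro were defined on a target with FPv5 but no Advanced SIMD, code guarded by the macro alone could try to use vector intrinsics that are not available there, which seems to be the concern behind the revert.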