diff mbox

[ARM] Enable __fp16 as a function parameter and return type.

Message ID 5721D5C1.3080000@foss.arm.com
State New
Headers show

Commit Message

Matthew Wahab April 28, 2016, 9:20 a.m. UTC
Hello,

The ARM target supports the half-precision floating point type __fp16
but does not allow its use as a function return or parameter type. This
patch removes that restriction and defines the ACLE macro
__ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
values into and out of functions depends on the level of hardware
support but conforms to the AAPCS (see
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).

This patch enables data movement for HF-mode values using VFP registers,
when they are available, to support passing arguments and return values
through the registers.

This patch also fixes the definition of TARGET_NEON_FP16 which used to
require both neon and fp16 features to be enabled. This was
inadvertently weakened, when the macro definition was changed to use
ARM_FPU_FSET_HAS, to only require one of neon or fp16 to be
enabled. This patch returns to the original
requirements. TARGET_NEON_FP16 is only used in instruction selection for
HF-mode data moves.

Tested for arm-none-eabi with cross-compiled check-gcc and for
arm-none-linux-gnueabihf with native bootstrap and make check.

Ok for trunk?
Matthew

gcc/
2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
	    Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
	    Jiong Wang  <jiong.wang@arm.com>

	* config/arm/arm-c.c (arm_cpu_builtins): Use def_or_undef_macro
	for __ARM_FP16_FORMAT_IEEE and __ARM_FP16_FORMAT_ALTERNATIVE.
	Define __ARM_FP16_ARGS when appropriate.
	* config/arm/arm.c (arm_invalid_parameter_type): Remove
	declaration.
	(arm_invalid_return_type): Likewise.
	(TARGET_INVALID_PARAMETER_TYPE): Remove.
	(TARGET_INVALID_RETURN_TYPE): Remove.
	(aapcs_vfp_sub_candidate): Allow HFmode.
	(aapcs_vfp_allocate): Add comment.  Support HFmode.
	(aapcs_vfp_allocate_return_reg): Likewise.
	(struct aapcs_cp_arg_layout): Slightly reword comments for
	is_return_candidate and allocate_return_reg.
	(output_mov_vfp): Update assert.
	(arm_hard_regno_mode_ok): Remove comment, update HF-mode
	condition.
	(arm_invalid_parameter_type): Remove.
	(amr_invalid_return_type): Remove.
	* config/arm/arm.h (TARGET_NEON_FP16): Fix definition.
	* config/arm/arm.md (*arm32_movhf): Disable for TARGET_VFP.
	* config/arm/vfp.md (*movhf_vfp): Enable for TARGET_VFP.

gcc/testsuite/
2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>

	* g++.dg/ext/arm-fp16/fp16-param-1.c: Update expected output.  Add
	test for __ARM_FP16_ARGS.
	* g++.dg/ext/arm-fp16/fp16-return-1.c: Update expected output.
	* gcc.target/arm/aapcs/neon-vect10.c: New.
	* gcc.target/arm/aapcs/neon-vect9.c: New.
	* gcc.target/arm/aapcs/vfp18.c: New.
	* gcc.target/arm/aapcs/vfp19.c: New.
	* gcc.target/arm/aapcs/vfp20.c: New.
	* gcc.target/arm/aapcs/vfp21.c: New.
	* gcc.target/arm/fp16-aapcs-1.c: New.
	* g++.target/arm/fp16-param-1.c: Update expected output.  Add
	test for __ARM_FP16_ARGS.
	* g++.target/arm/fp16-return-1.c: Update expected output.

Comments

Joseph Myers April 28, 2016, 3:49 p.m. UTC | #1
On Thu, 28 Apr 2016, Matthew Wahab wrote:

> Hello,
> 
> The ARM target supports the half-precision floating point type __fp16
> but does not allow its use as a function return or parameter type. This
> patch removes that restriction and defines the ACLE macro
> __ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
> values into and out of functions depends on the level of hardware
> support but conforms to the AAPCS (see
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).

The sole use of the TARGET_INVALID_PARAMETER_TYPE and 
TARGET_INVALID_RETURN_TYPE hooks was to disallow __fp16 use as a function 
return or parameter type.  Thus, I think this patch should completely 
remove those hooks and poison them in system.h.

This patch addresses one incompatibility of the original __fp16 
specification with the more recent ACLE specification and the 
specification in ISO/IEC TS 18661-3 for how such types should work.  
Another such incompatibility is the peculiar rule in the original 
specification that conversions from double to __fp16 go via float, with 
double rounding.  Do you have plans to eliminate that and move to the 
single-rounding semantics that are in current specifications?

I note that that AAPCS revision says for __fp16, in 7.1.1 Arithmetic 
Types, "In a variadic function call this will be passed as a 
double-precision value.".  I haven't checked what this patch implements, 
but that could be problematic, and different from what's said under 7.2, 
"For variadic functions, float arguments that match the ellipsis (...) are 
converted to type double.".

In TS 18661-3, _Float16 is *not* affected by default argument promotions; 
only float is.  This reflects how the default conversion of float to 
double is a legacy feature; note for example how in C99 and C11 float 
_Imaginary is not promoted to double _Imaginary, and float _Complex is not 
promoted to double _Complex.

Thus it would be better for compatibility with TS 18661-3 to pass __fp16 
values to variadic functions as themselves, unpromoted.  (Formally of 
course the lack of promotion is a language feature not an ABI feature; as 
long as va_arg for _Float16 named works correctly, you could promote at 
the ABI level and then convert back, and the only effect would be that 
sNaNs get quieted, so passing a _Float16 sNaN through variable arguments 
would act as a convertFormat operation instead of a copy operation.  It's 
not clear that having such an ABI-level promotion is a good idea, 
however.)

Now, in the context of the current implementation and current ACLE 
arithmetic on __fp16 values produces float results - the operands are 
promoted at the C language level.  This is different from TS 18661-3, 
where _Float16 arithmetic produces results whose semantics type is 
_Float16 but which, if FLT_EVAL_METHOD is 0, are evaluated with excess 
range and precision to the range and precision of float.  So if __fp16 and 
float are differently passed to variadic functions, you have the issue 
that if the argument is an expression resulting from __fp16 arithmetic, 
the way it is passed depends on whether current ACLE or TS 18661-3 are 
followed.  But if the eventual aim is for __fp16 (when using the IEEE 
format rather than the alternative format) to become just a typedef for 
_Float16, then these issues will need to be addressed.
Matthew Wahab May 3, 2016, 1:19 p.m. UTC | #2
Hello,

On 28/04/16 16:49, Joseph Myers wrote:
> On Thu, 28 Apr 2016, Matthew Wahab wrote:
>
>> The ARM target supports the half-precision floating point type __fp16 but does
>> not allow its use as a function return or parameter type. This patch removes
>> that restriction and defines the ACLE macro __ARM_FP16_ARGS to indicate this.
>> The code generated for passing __fp16 values into and out of functions depends
>> on the level of hardware support but conforms to the AAPCS (see
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).
>
> The sole use of the TARGET_INVALID_PARAMETER_TYPE and TARGET_INVALID_RETURN_TYPE
> hooks was to disallow __fp16 use as a function return or parameter type.  Thus, I
> think this patch should completely remove those hooks and poison them in
> system.h.

This touches the C and C++ front-ends so I'll send a separate patch to do this.

> This patch addresses one incompatibility of the original __fp16 specification with
> the more recent ACLE specification and the specification in ISO/IEC TS 18661-3 for
> how such types should work. Another such incompatibility is the peculiar rule in
> the original specification that conversions from double to __fp16 go via float,
> with double rounding.  Do you have plans to eliminate that and move to the
> single-rounding semantics that are in current specifications?

The patch aims to preserve the __fp16 semantics currently used by GCC, except for the
obvious argument/return value in registers change. This is to avoid any changes to
the values calculated by existing __fp16 code that might be introduced if things like
the conversion rules are changed.

> I note that that AAPCS revision says for __fp16, in 7.1.1 Arithmetic Types, "In a
> variadic function call this will be passed as a double-precision value.".  I
> haven't checked what this patch implements, but that could be problematic, and
> different from what's said under 7.2, "For variadic functions, float arguments
> that match the ellipsis (...) are converted to type double.".

The patch keeps the current GCC behaviour (conversion to double). (There's an 
existing test for this in gcc.target/arm/fp16-variadic-1.c.)

> In TS 18661-3, _Float16 is *not* affected by default argument promotions; only
> float is.  This reflects how the default conversion of float to double is a legacy
> feature; note for example how in C99 and C11 float _Imaginary is not promoted to
> double _Imaginary, and float _Complex is not promoted to double _Complex.
>
> Thus it would be better for compatibility with TS 18661-3 to pass __fp16 values to
> variadic functions as themselves, unpromoted.  (Formally of course the lack of
> promotion is a language feature not an ABI feature; as long as va_arg for _Float16
> named works correctly, you could promote at the ABI level and then convert back,
> and the only effect would be that sNaNs get quieted, so passing a _Float16 sNaN
> through variable arguments would act as a convertFormat operation instead of a
> copy operation.  It's not clear that having such an ABI-level promotion is a good
> idea, however.)
>
> Now, in the context of the current implementation and current ACLE arithmetic on
> __fp16 values produces float results - the operands are promoted at the C language
> level.  This is different from TS 18661-3, where _Float16 arithmetic produces
> results whose semantics type is _Float16 but which, if FLT_EVAL_METHOD is 0, are
> evaluated with excess range and precision to the range and precision of float.  So
> if __fp16 and float are differently passed to variadic functions, you have the
> issue that if the argument is an expression resulting from __fp16 arithmetic, the
> way it is passed depends on whether current ACLE or TS 18661-3 are followed.  But
> if the eventual aim is for __fp16 (when using the IEEE format rather than the
> alternative format) to become just a typedef for _Float16, then these issues will
> need to be addressed.

When _Float16 support is added, the relationship between __fp16 and _FLoat16 is 
something that will need to be considered. At the moment though, there's only the 
__fp16 type and the intention with this patch is to avoid doing anything that changes 
the behaviour of existing code.

Matthew
Joseph Myers May 10, 2016, 8:06 p.m. UTC | #3
On Tue, 3 May 2016, Matthew Wahab wrote:

> > This patch addresses one incompatibility of the original __fp16 
> > specification with the more recent ACLE specification and the 
> > specification in ISO/IEC TS 18661-3 for how such types should work. 
> > Another such incompatibility is the peculiar rule in the original 
> > specification that conversions from double to __fp16 go via float, 
> > with double rounding.  Do you have plans to eliminate that and move to 
> > the single-rounding semantics that are in current specifications?
> 
> The patch aims to preserve the __fp16 semantics currently used by GCC, 
> except for the obvious argument/return value in registers change. This 
> is to avoid any changes to the values calculated by existing __fp16 code 
> that might be introduced if things like the conversion rules are 
> changed.

My question wasn't about what this patch does - it makes sense for each 
patch to do just one thing (so this patch only deals with allowing certain 
cases that were disallowed by the original __fp16 specification but are 
allowed by ACLE).  My question was about future intent - do you intend 
other patches to address other incompatibilities between ACLE and GCC 
__fp16, one incompatibility at a time?
Tejas Belagod May 11, 2016, 3:02 p.m. UTC | #4
On 28/04/16 16:49, Joseph Myers wrote:
> On Thu, 28 Apr 2016, Matthew Wahab wrote:
>
>> Hello,
>>
>> The ARM target supports the half-precision floating point type __fp16
>> but does not allow its use as a function return or parameter type. This
>> patch removes that restriction and defines the ACLE macro
>> __ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
>> values into and out of functions depends on the level of hardware
>> support but conforms to the AAPCS (see
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).
>
> The sole use of the TARGET_INVALID_PARAMETER_TYPE and
> TARGET_INVALID_RETURN_TYPE hooks was to disallow __fp16 use as a function
> return or parameter type.  Thus, I think this patch should completely
> remove those hooks and poison them in system.h.
>
> This patch addresses one incompatibility of the original __fp16
> specification with the more recent ACLE specification and the
> specification in ISO/IEC TS 18661-3 for how such types should work.
> Another such incompatibility is the peculiar rule in the original
> specification that conversions from double to __fp16 go via float, with
> double rounding.  Do you have plans to eliminate that and move to the
> single-rounding semantics that are in current specifications?
>

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf

Section 4.1.2 states that double to fp16 should round only once and it only 
suggests that it is done via a two step hardware instruction rather than an 
emulation library if speed is the priority as pre-ARMv8 architectures do not 
support this in hardware.

http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.pdf

updates this paragraph to reflect ARMv8 hardware feature, but still maintains 
the suggestion of using two-step hardware instruction rather than emulation 
library if speed is priority for pre-ARMv8 architectures.

AFAICS, I don't think it mandates a double-rounding behavior for double to 
__fp16 conversions and I don't see a change in stand between the two versions of 
ACLE on the behavior of __fp16.

We could improve the ACLE spec to include a caveat that a two-step reduction 
could introduce a loss in precision which could result in incompatibility with 
ARMv8 architectures.

> I note that that AAPCS revision says for __fp16, in 7.1.1 Arithmetic
> Types, "In a variadic function call this will be passed as a
> double-precision value.".  I haven't checked what this patch implements,
> but that could be problematic, and different from what's said under 7.2,
> "For variadic functions, float arguments that match the ellipsis (...) are
> converted to type double.".
>
> In TS 18661-3, _Float16 is *not* affected by default argument promotions;
> only float is.  This reflects how the default conversion of float to
> double is a legacy feature; note for example how in C99 and C11 float
> _Imaginary is not promoted to double _Imaginary, and float _Complex is not
> promoted to double _Complex.
>
> Thus it would be better for compatibility with TS 18661-3 to pass __fp16
> values to variadic functions as themselves, unpromoted.  (Formally of
> course the lack of promotion is a language feature not an ABI feature; as
> long as va_arg for _Float16 named works correctly, you could promote at
> the ABI level and then convert back, and the only effect would be that
> sNaNs get quieted, so passing a _Float16 sNaN through variable arguments
> would act as a convertFormat operation instead of a copy operation.  It's
> not clear that having such an ABI-level promotion is a good idea,
> however.)
>
> Now, in the context of the current implementation and current ACLE
> arithmetic on __fp16 values produces float results - the operands are
> promoted at the C language level.  This is different from TS 18661-3,
> where _Float16 arithmetic produces results whose semantics type is
> _Float16 but which, if FLT_EVAL_METHOD is 0, are evaluated with excess
> range and precision to the range and precision of float.  So if __fp16 and
> float are differently passed to variadic functions, you have the issue
> that if the argument is an expression resulting from __fp16 arithmetic,
> the way it is passed depends on whether current ACLE or TS 18661-3 are
> followed.  But if the eventual aim is for __fp16 (when using the IEEE
> format rather than the alternative format) to become just a typedef for
> _Float16, then these issues will need to be addressed.
>

__fp16's compatibility with _Float16 is still under discussion internally.

Thanks,
Tejas.
Joseph Myers May 11, 2016, 3:46 p.m. UTC | #5
On Wed, 11 May 2016, Tejas Belagod wrote:

> AFAICS, I don't think it mandates a double-rounding behavior for double to
> __fp16 conversions and I don't see a change in stand between the two versions
> of ACLE on the behavior of __fp16.

It's not a change between the two versions of ACLE.  It's a change 
relative to the early (pre-ACLE) __fp16 specification (or, at least, a 
clarification thereto in email on 12 Aug 2008) that was used as a basis 
for the original implementation of __fp16 in GCC (and that thus is what's 
currently implemented by GCC and tested for in the testsuite).
Tejas Belagod May 13, 2016, 9:23 a.m. UTC | #6
On 11/05/16 16:46, Joseph Myers wrote:
> On Wed, 11 May 2016, Tejas Belagod wrote:
>
>> AFAICS, I don't think it mandates a double-rounding behavior for double to
>> __fp16 conversions and I don't see a change in stand between the two versions
>> of ACLE on the behavior of __fp16.
>
> It's not a change between the two versions of ACLE.  It's a change
> relative to the early (pre-ACLE) __fp16 specification (or, at least, a
> clarification thereto in email on 12 Aug 2008) that was used as a basis
> for the original implementation of __fp16 in GCC (and that thus is what's
> currently implemented by GCC and tested for in the testsuite).
>

Hi Joseph,

I can't seem to find that email on gcc-patches circa August 2008 - which list 
was it sent to?

Thanks,
Tejas.
Joseph Myers May 13, 2016, 12:10 p.m. UTC | #7
On Fri, 13 May 2016, Tejas Belagod wrote:

> > It's not a change between the two versions of ACLE.  It's a change
> > relative to the early (pre-ACLE) __fp16 specification (or, at least, a
> > clarification thereto in email on 12 Aug 2008) that was used as a basis
> > for the original implementation of __fp16 in GCC (and that thus is what's
> > currently implemented by GCC and tested for in the testsuite).
> > 
> 
> Hi Joseph,
> 
> I can't seem to find that email on gcc-patches circa August 2008 - which list
> was it sent to?

That was an internal discussion between CodeSourcery and ARM.  Email from 
Alasdair Grant, Tue, 12 Aug 2008 08:41:15 +0100.
Ramana Radhakrishnan May 13, 2016, 1:41 p.m. UTC | #8
On Thu, Apr 28, 2016 at 10:20 AM, Matthew Wahab
<matthew.wahab@foss.arm.com> wrote:
> Hello,
>
> The ARM target supports the half-precision floating point type __fp16
> but does not allow its use as a function return or parameter type. This
> patch removes that restriction and defines the ACLE macro
> __ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
> values into and out of functions depends on the level of hardware
> support but conforms to the AAPCS (see
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).
>
> This patch enables data movement for HF-mode values using VFP registers,
> when they are available, to support passing arguments and return values
> through the registers.
>
> This patch also fixes the definition of TARGET_NEON_FP16 which used to
> require both neon and fp16 features to be enabled. This was
> inadvertently weakened, when the macro definition was changed to use
> ARM_FPU_FSET_HAS, to only require one of neon or fp16 to be
> enabled. This patch returns to the original
> requirements. TARGET_NEON_FP16 is only used in instruction selection for
> HF-mode data moves.
>
> Tested for arm-none-eabi with cross-compiled check-gcc and for
> arm-none-linux-gnueabihf with native bootstrap and make check.
>
> Ok for trunk?
> Matthew

This is OK - thanks.

We have to deal with Joseph's points around the issue with double
rounding but I think that's the subject of a separate patch.

regards
Ramana

>
> gcc/
> 2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
>             Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
>             Jiong Wang  <jiong.wang@arm.com>
>
>         * config/arm/arm-c.c (arm_cpu_builtins): Use def_or_undef_macro
>         for __ARM_FP16_FORMAT_IEEE and __ARM_FP16_FORMAT_ALTERNATIVE.
>         Define __ARM_FP16_ARGS when appropriate.
>         * config/arm/arm.c (arm_invalid_parameter_type): Remove
>         declaration.
>         (arm_invalid_return_type): Likewise.
>         (TARGET_INVALID_PARAMETER_TYPE): Remove.
>         (TARGET_INVALID_RETURN_TYPE): Remove.
>         (aapcs_vfp_sub_candidate): Allow HFmode.
>         (aapcs_vfp_allocate): Add comment.  Support HFmode.
>         (aapcs_vfp_allocate_return_reg): Likewise.
>         (struct aapcs_cp_arg_layout): Slightly reword comments for
>         is_return_candidate and allocate_return_reg.
>         (output_mov_vfp): Update assert.
>         (arm_hard_regno_mode_ok): Remove comment, update HF-mode
>         condition.
>         (arm_invalid_parameter_type): Remove.
>         (amr_invalid_return_type): Remove.
>         * config/arm/arm.h (TARGET_NEON_FP16): Fix definition.
>         * config/arm/arm.md (*arm32_movhf): Disable for TARGET_VFP.
>         * config/arm/vfp.md (*movhf_vfp): Enable for TARGET_VFP.
>
> gcc/testsuite/
> 2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
>
>         * g++.dg/ext/arm-fp16/fp16-param-1.c: Update expected output.  Add
>         test for __ARM_FP16_ARGS.
>         * g++.dg/ext/arm-fp16/fp16-return-1.c: Update expected output.
>         * gcc.target/arm/aapcs/neon-vect10.c: New.
>         * gcc.target/arm/aapcs/neon-vect9.c: New.
>         * gcc.target/arm/aapcs/vfp18.c: New.
>         * gcc.target/arm/aapcs/vfp19.c: New.
>         * gcc.target/arm/aapcs/vfp20.c: New.
>         * gcc.target/arm/aapcs/vfp21.c: New.
>         * gcc.target/arm/fp16-aapcs-1.c: New.
>         * g++.target/arm/fp16-param-1.c: Update expected output.  Add
>         test for __ARM_FP16_ARGS.
>         * g++.target/arm/fp16-return-1.c: Update expected output.
Tejas Belagod May 16, 2016, 1:16 p.m. UTC | #9
On 11/05/16 16:46, Joseph Myers wrote:
> On Wed, 11 May 2016, Tejas Belagod wrote:
>
>> AFAICS, I don't think it mandates a double-rounding behavior for double to
>> __fp16 conversions and I don't see a change in stand between the two versions
>> of ACLE on the behavior of __fp16.
>
> It's not a change between the two versions of ACLE.  It's a change
> relative to the early (pre-ACLE) __fp16 specification (or, at least, a
> clarification thereto in email on 12 Aug 2008) that was used as a basis
> for the original implementation of __fp16 in GCC (and that thus is what's
> currently implemented by GCC and tested for in the testsuite).
>

Hi Joseph,

Sorry for the delay in responding.

I've had a conversation with Al and I now have some context. You're right - the 
2008 mail you are referring to is the pre-ACLE behavior. By the time the first 
draft of the first version of ACLE was reviewed by CodeSourcery(circa 2011), it 
already mandated single rounding. No published ACLE has ever allowed double 
rounding.

This meant that when the first draft of ACLE was published in 2011, its pre-ACLE 
implementations in gcc and armcc were already non-conformant, in other words, 
'bug-compatible'.

We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE 
spec, but can't say when exactly.

Thanks,
Tejas.
Ramana Radhakrishnan May 18, 2016, 8:41 a.m. UTC | #10
On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
<tejas.belagod@foss.arm.com> wrote:

>
> We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE
> spec, but can't say when exactly.

Matthew, could you please take a look at this while you are in this area ?

thanks,
Ramana

>
> Thanks,
> Tejas.
Matthew Wahab May 18, 2016, 2:33 p.m. UTC | #11
On 18/05/16 09:41, Ramana Radhakrishnan wrote:
> On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
> <tejas.belagod@foss.arm.com> wrote:
>
>>
>> We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE
>> spec, but can't say when exactly.
>
> Matthew, could you please take a look at this while you are in this area ?

Ok.

Part of this is likely to involve removing TARGET_CONVERT_TO_TYPE from the ARM 
backend. grep doesn't show anywhere that uses this hook, the only other occurrence is 
in the ARC backend. Does the hook ever get used?

Matthew
Ramana Radhakrishnan May 18, 2016, 2:37 p.m. UTC | #12
On 18/05/16 15:33, Matthew Wahab wrote:
> On 18/05/16 09:41, Ramana Radhakrishnan wrote:
>> On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
>> <tejas.belagod@foss.arm.com> wrote:
>>
>>>
>>> We do have plans to fix pre-ACLE behavior of fp16 to conform to current ACLE
>>> spec, but can't say when exactly.
>>
>> Matthew, could you please take a look at this while you are in this area ?
> 
> Ok.
> 
> Part of this is likely to involve removing TARGET_CONVERT_TO_TYPE from the ARM backend. grep doesn't show anywhere that uses this hook, the only other occurrence is in the ARC backend. Does the hook ever get used?
> 
> Matthew
> 
> 

The use in the front-ends / mid-end is usually in lower case - thus grepping for convert_to_type will give you the answer.

grep -r convert_to_type *

gcc/jit/jit-playback.c:  t_ret = targetm.convert_to_type (t_dst_type, t_expr);
gcc/cp/cvt.c:  e1 = targetm.convert_to_type (type, e);
gcc/c/c-convert.c:  ret = targetm.convert_to_type (type, expr);
gcc/config/arm/arm.c:static tree arm_convert_to_type (tree type, tree expr);
gcc/config/arm/arm.c:#define TARGET_CONVERT_TO_TYPE arm_convert_to_type
gcc/config/arm/arm.c:arm_convert_to_type (tree type, tree expr)
gcc/config/avr/avr.c:avr_convert_to_type (tree type, tree expr)
gcc/config/avr/avr.c:#define TARGET_CONVERT_TO_TYPE avr_convert_to_type

I don't think you can poison this as the avr backend appears to use this rather than arc.


regards
Ramana
Joseph Myers May 18, 2016, 3:54 p.m. UTC | #13
On Wed, 18 May 2016, Matthew Wahab wrote:

> On 18/05/16 09:41, Ramana Radhakrishnan wrote:
> > On Mon, May 16, 2016 at 2:16 PM, Tejas Belagod
> > <tejas.belagod@foss.arm.com> wrote:
> > 
> > > 
> > > We do have plans to fix pre-ACLE behavior of fp16 to conform to current
> > > ACLE
> > > spec, but can't say when exactly.
> > 
> > Matthew, could you please take a look at this while you are in this area ?
> 
> Ok.

FWIW, the obvious (to me) approach to doing the conversion without double 
rounding issues while properly respecting exceptions and rounding modes 
would be to set a sticky bit in the double value and ensure its precision 
is no more than that of float before converting to float.  Something like 
(example for little-endian, untested):

union { double d; struct { uint32_t lo, hi; } r; } x;
__fp16 ret;

if (x.r.lo) x.r.hi |= 1;
x.r.lo = 0;
ret = (__fp16) (float) x.d;

By using floating point for the final conversion, you ensure it respects 
the rounding mode and produces the proper exceptions.
Christophe Lyon June 1, 2016, 2:43 p.m. UTC | #14
On 13 May 2016 at 15:41, Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:
> On Thu, Apr 28, 2016 at 10:20 AM, Matthew Wahab
> <matthew.wahab@foss.arm.com> wrote:
>> Hello,
>>
>> The ARM target supports the half-precision floating point type __fp16
>> but does not allow its use as a function return or parameter type. This
>> patch removes that restriction and defines the ACLE macro
>> __ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
>> values into and out of functions depends on the level of hardware
>> support but conforms to the AAPCS (see
>> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).
>>
>> This patch enables data movement for HF-mode values using VFP registers,
>> when they are available, to support passing arguments and return values
>> through the registers.
>>
>> This patch also fixes the definition of TARGET_NEON_FP16 which used to
>> require both neon and fp16 features to be enabled. This was
>> inadvertently weakened, when the macro definition was changed to use
>> ARM_FPU_FSET_HAS, to only require one of neon or fp16 to be
>> enabled. This patch returns to the original
>> requirements. TARGET_NEON_FP16 is only used in instruction selection for
>> HF-mode data moves.
>>
>> Tested for arm-none-eabi with cross-compiled check-gcc and for
>> arm-none-linux-gnueabihf with native bootstrap and make check.
>>
>> Ok for trunk?
>> Matthew
>
> This is OK - thanks.

Hi,

I'm seeing regressions on non-hf targets (arm-none-eabi,
arm-none-linux-gnueabi):
new FAIL:
gcc.target/arm/aapcs/neon-vect10.c execution test
gcc.target/arm/aapcs/neon-vect9.c execution test

I'm using QEMU (2.4.1). You said you tested arm-none-eabi, so I'm
probably missing something?

Christophe


>
> We have to deal with Joseph's points around the issue with double
> rounding but I think that's the subject of a separate patch.
>
> regards
> Ramana
>
>>
>> gcc/
>> 2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
>>             Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
>>             Jiong Wang  <jiong.wang@arm.com>
>>
>>         * config/arm/arm-c.c (arm_cpu_builtins): Use def_or_undef_macro
>>         for __ARM_FP16_FORMAT_IEEE and __ARM_FP16_FORMAT_ALTERNATIVE.
>>         Define __ARM_FP16_ARGS when appropriate.
>>         * config/arm/arm.c (arm_invalid_parameter_type): Remove
>>         declaration.
>>         (arm_invalid_return_type): Likewise.
>>         (TARGET_INVALID_PARAMETER_TYPE): Remove.
>>         (TARGET_INVALID_RETURN_TYPE): Remove.
>>         (aapcs_vfp_sub_candidate): Allow HFmode.
>>         (aapcs_vfp_allocate): Add comment.  Support HFmode.
>>         (aapcs_vfp_allocate_return_reg): Likewise.
>>         (struct aapcs_cp_arg_layout): Slightly reword comments for
>>         is_return_candidate and allocate_return_reg.
>>         (output_mov_vfp): Update assert.
>>         (arm_hard_regno_mode_ok): Remove comment, update HF-mode
>>         condition.
>>         (arm_invalid_parameter_type): Remove.
>>         (amr_invalid_return_type): Remove.
>>         * config/arm/arm.h (TARGET_NEON_FP16): Fix definition.
>>         * config/arm/arm.md (*arm32_movhf): Disable for TARGET_VFP.
>>         * config/arm/vfp.md (*movhf_vfp): Enable for TARGET_VFP.
>>
>> gcc/testsuite/
>> 2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
>>
>>         * g++.dg/ext/arm-fp16/fp16-param-1.c: Update expected output.  Add
>>         test for __ARM_FP16_ARGS.
>>         * g++.dg/ext/arm-fp16/fp16-return-1.c: Update expected output.
>>         * gcc.target/arm/aapcs/neon-vect10.c: New.
>>         * gcc.target/arm/aapcs/neon-vect9.c: New.
>>         * gcc.target/arm/aapcs/vfp18.c: New.
>>         * gcc.target/arm/aapcs/vfp19.c: New.
>>         * gcc.target/arm/aapcs/vfp20.c: New.
>>         * gcc.target/arm/aapcs/vfp21.c: New.
>>         * gcc.target/arm/fp16-aapcs-1.c: New.
>>         * g++.target/arm/fp16-param-1.c: Update expected output.  Add
>>         test for __ARM_FP16_ARGS.
>>         * g++.target/arm/fp16-return-1.c: Update expected output.
Matthew Wahab June 2, 2016, 7:39 a.m. UTC | #15
On 01/06/16 15:43, Christophe Lyon wrote:
> On 13 May 2016 at 15:41, Ramana Radhakrishnan <ramana.gcc@googlemail.com> wrote:
>> On Thu, Apr 28, 2016 at 10:20 AM, Matthew Wahab
>> <matthew.wahab@foss.arm.com> wrote:
>>>
>>> This patch enables data movement for HF-mode values using VFP registers,
>>> when they are available, to support passing arguments and return values
>>> through the registers.
>>>
[..]
>>> HF-mode data moves.
>>>
>>> Tested for arm-none-eabi with cross-compiled check-gcc and for
>>> arm-none-linux-gnueabihf with native bootstrap and make check.
>>>
>>> Ok for trunk?
>>> Matthew
>>
>> This is OK - thanks.
>
> Hi,
>
> I'm seeing regressions on non-hf targets (arm-none-eabi,
> arm-none-linux-gnueabi):
> new FAIL:
> gcc.target/arm/aapcs/neon-vect10.c execution test
> gcc.target/arm/aapcs/neon-vect9.c execution test
>
> I'm using QEMU (2.4.1). You said you tested arm-none-eabi, so I'm
> probably missing something?
>
> Christophe

Hi,

This has also appeared in our internal testing in some configurations. 
It's due to mfloatabi=soft or mfloat-abi=softfp being added to the end 
of the command line for the tests. That overrides the mfloat-abi=hard 
option that the tests need. I'm just finishing up a patch to fix this.

Matthew
diff mbox

Patch

[PATCH] [ARM] Enable __fp16 as a function parameter and return type.

The ARM target supports the half-precision floating point type __fp16
but does not allow its use as a function return or parameter type. This
patch removes that restriction and defines the ACLE macro
__ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
values into and out of functions depends on the level of hardware
support but conforms to the AAPCS (see
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).

This patch enables data movement for HF-mode values using VFP registers,
when they are available, to support passing arguments and return values
through the register.

This patch also fixes the definition of TARGET_NEON_FP16 which used to
require both neon and fp16 features to be enabled. This was
inadvertently weakened, when the macro definition was changed to use
ARM_FPU_FSET_HAS, to only require one of neon or fp16 to be
enabled. This patch returns to the original
requirements. TARGET_NEON_FP16 is only used in instruction selection for
HF-mode data moves.

Tested for arm-none-eabi with cross-compiled check-gcc and for
arm-none-linux-gnueabihf with native bootstrap and make check.

gcc/
2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>
	    Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
	    Jiong Wang  <jiong.wang@arm.com>

	* config/arm/arm-c.c (arm_cpu_builtins): Use def_or_undef_macro
	for __ARM_FP16_FORMAT_IEEE and __ARM_FP16_FORMAT_ALTERNATIVE.
	Define __ARM_FP16_ARGS when appropriate.
	* config/arm/arm.c (arm_invalid_parameter_type): Remove
	declaration.
	(arm_invalid_return_type): Likewise.
	(TARGET_INVALID_PARAMETER_TYPE): Remove.
	(TARGET_INVALID_RETURN_TYPE): Remove.
	(aapcs_vfp_sub_candidate): Allow HFmode.
	(aapcs_vfp_allocate): Add comment.  Support HFmode.
	(aapcs_vfp_allocate_return_reg): Likewise.
	(struct aapcs_cp_arg_layout): Slightly reword comments for
	is_return_candidate and allocate_return_reg.
	(output_mov_vfp): Update assert.
	(arm_hard_regno_mode_ok): Remove comment, update HF-mode
	condition.
	(arm_invalid_parameter_type): Remove.
	(amr_invalid_return_type): Remove.
	* config/arm/arm.h (TARGET_NEON_FP16): Fix definition.
	* config/arm/arm.md (*arm32_movhf): Disable for TARGET_VFP.
	* config/arm/vfp.md (*movhf_vfp): Enable for TARGET_VFP.

gcc/testsuite/
2016-04-27  Matthew Wahab  <matthew.wahab@arm.com>

	* g++.dg/ext/arm-fp16/fp16-param-1.c: Update expected output.  Add
	test for __ARM_FP16_ARGS.
	* g++.dg/ext/arm-fp16/fp16-return-1.c: Update expected output.
	* gcc.target/arm/aapcs/neon-vect10.c: New.
	* gcc.target/arm/aapcs/neon-vect9.c: New.
	* gcc.target/arm/aapcs/vfp18.c: New.
	* gcc.target/arm/aapcs/vfp19.c: New.
	* gcc.target/arm/aapcs/vfp20.c: New.
	* gcc.target/arm/aapcs/vfp21.c: New.
	* gcc.target/arm/fp16-aapcs-1.c: New.
	* g++.target/arm/fp16-param-1.c: Update expected output.  Add
	test for __ARM_FP16_ARGS.
	* g++.target/arm/fp16-return-1.c: Update expected output.
---
 gcc/config/arm/arm-c.c                            | 10 ++--
 gcc/config/arm/arm.c                              | 57 +++++++----------------
 gcc/config/arm/arm.h                              |  3 +-
 gcc/config/arm/arm.md                             |  2 +-
 gcc/config/arm/vfp.md                             |  3 +-
 gcc/testsuite/g++.dg/ext/arm-fp16/fp16-param-1.C  | 12 +++--
 gcc/testsuite/g++.dg/ext/arm-fp16/fp16-return-1.C |  7 ++-
 gcc/testsuite/gcc.target/arm/aapcs/neon-vect10.c  | 31 ++++++++++++
 gcc/testsuite/gcc.target/arm/aapcs/neon-vect9.c   | 23 +++++++++
 gcc/testsuite/gcc.target/arm/aapcs/vfp18.c        | 27 +++++++++++
 gcc/testsuite/gcc.target/arm/aapcs/vfp19.c        | 29 ++++++++++++
 gcc/testsuite/gcc.target/arm/aapcs/vfp20.c        | 21 +++++++++
 gcc/testsuite/gcc.target/arm/aapcs/vfp21.c        | 25 ++++++++++
 gcc/testsuite/gcc.target/arm/fp16-aapcs-1.c       | 17 +++++++
 gcc/testsuite/gcc.target/arm/fp16-param-1.c       | 12 +++--
 gcc/testsuite/gcc.target/arm/fp16-return-1.c      |  7 ++-
 16 files changed, 224 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/neon-vect10.c
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/neon-vect9.c
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/vfp18.c
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/vfp19.c
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/vfp20.c
 create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/vfp21.c
 create mode 100644 gcc/testsuite/gcc.target/arm/fp16-aapcs-1.c

diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 4fbdfc5..b98470f 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -135,10 +135,12 @@  arm_cpu_builtins (struct cpp_reader* pfile)
   else
     cpp_undef (pfile, "__ARM_FP");
 
-  if (arm_fp16_format == ARM_FP16_FORMAT_IEEE)
-    builtin_define ("__ARM_FP16_FORMAT_IEEE");
-  if (arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE)
-    builtin_define ("__ARM_FP16_FORMAT_ALTERNATIVE");
+  def_or_undef_macro (pfile, "__ARM_FP16_FORMAT_IEEE",
+		      arm_fp16_format == ARM_FP16_FORMAT_IEEE);
+  def_or_undef_macro (pfile, "__ARM_FP16_FORMAT_ALTERNATIVE",
+		      arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE);
+  def_or_undef_macro (pfile, "__ARM_FP16_ARGS",
+		      arm_fp16_format != ARM_FP16_FORMAT_NONE);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_FMA", TARGET_FMA);
   def_or_undef_macro (pfile, "__ARM_NEON__", TARGET_NEON);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 71b5143..e1e311f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -249,8 +249,6 @@  static void arm_output_dwarf_dtprel (FILE *, int, rtx) ATTRIBUTE_UNUSED;
 static bool arm_output_addr_const_extra (FILE *, rtx);
 static bool arm_allocate_stack_slots_for_args (void);
 static bool arm_warn_func_return (tree);
-static const char *arm_invalid_parameter_type (const_tree t);
-static const char *arm_invalid_return_type (const_tree t);
 static tree arm_promoted_type (const_tree t);
 static tree arm_convert_to_type (tree type, tree expr);
 static bool arm_scalar_mode_supported_p (machine_mode);
@@ -654,12 +652,6 @@  static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_PREFERRED_RELOAD_CLASS
 #define TARGET_PREFERRED_RELOAD_CLASS arm_preferred_reload_class
 
-#undef TARGET_INVALID_PARAMETER_TYPE
-#define TARGET_INVALID_PARAMETER_TYPE arm_invalid_parameter_type
-
-#undef TARGET_INVALID_RETURN_TYPE
-#define TARGET_INVALID_RETURN_TYPE arm_invalid_return_type
-
 #undef TARGET_PROMOTED_TYPE
 #define TARGET_PROMOTED_TYPE arm_promoted_type
 
@@ -5549,7 +5541,7 @@  aapcs_vfp_sub_candidate (const_tree type, machine_mode *modep)
     {
     case REAL_TYPE:
       mode = TYPE_MODE (type);
-      if (mode != DFmode && mode != SFmode)
+      if (mode != DFmode && mode != SFmode && mode != HFmode)
 	return -1;
 
       if (*modep == VOIDmode)
@@ -5797,11 +5789,16 @@  aapcs_vfp_is_call_candidate (CUMULATIVE_ARGS *pcum, machine_mode mode,
 						&pcum->aapcs_vfp_rcount);
 }
 
+/* Implement the allocate field in aapcs_cp_arg_layout.  See the comment there
+   for the behaviour of this function.  */
+
 static bool
 aapcs_vfp_allocate (CUMULATIVE_ARGS *pcum, machine_mode mode,
 		    const_tree type  ATTRIBUTE_UNUSED)
 {
-  int shift = GET_MODE_SIZE (pcum->aapcs_vfp_rmode) / GET_MODE_SIZE (SFmode);
+  int rmode_size
+    = MAX (GET_MODE_SIZE (pcum->aapcs_vfp_rmode), GET_MODE_SIZE (SFmode));
+  int shift = rmode_size / GET_MODE_SIZE (SFmode);
   unsigned mask = (1 << (shift * pcum->aapcs_vfp_rcount)) - 1;
   int regno;
 
@@ -5850,6 +5847,9 @@  aapcs_vfp_allocate (CUMULATIVE_ARGS *pcum, machine_mode mode,
   return false;
 }
 
+/* Implement the allocate_return_reg field in aapcs_cp_arg_layout.  See the
+   comment there for the behaviour of this function.  */
+
 static rtx
 aapcs_vfp_allocate_return_reg (enum arm_pcs pcs_variant ATTRIBUTE_UNUSED,
 			       machine_mode mode,
@@ -5940,13 +5940,13 @@  static struct
      required for a return from FUNCTION_ARG.  */
   bool (*allocate) (CUMULATIVE_ARGS *, machine_mode, const_tree);
 
-  /* Return true if a result of mode MODE (or type TYPE if MODE is
-     BLKmode) is can be returned in this co-processor's registers.  */
+  /* Return true if a result of mode MODE (or type TYPE if MODE is BLKmode) can
+     be returned in this co-processor's registers.  */
   bool (*is_return_candidate) (enum arm_pcs, machine_mode, const_tree);
 
-  /* Allocate and return an RTX element to hold the return type of a
-     call, this routine must not fail and will only be called if
-     is_return_candidate returned true with the same parameters.  */
+  /* Allocate and return an RTX element to hold the return type of a call.  This
+     routine must not fail and will only be called if is_return_candidate
+     returned true with the same parameters.  */
   rtx (*allocate_return_reg) (enum arm_pcs, machine_mode, const_tree);
 
   /* Finish processing this argument and prepare to start processing
@@ -18608,7 +18608,8 @@  output_move_vfp (rtx *operands)
 
   gcc_assert (REG_P (reg));
   gcc_assert (IS_VFP_REGNUM (REGNO (reg)));
-  gcc_assert (mode == SFmode
+  gcc_assert ((mode == HFmode && TARGET_HARD_FLOAT && TARGET_VFP)
+	      || mode == SFmode
 	      || mode == DFmode
 	      || mode == SImode
 	      || mode == DImode
@@ -23397,10 +23398,8 @@  arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
       if (mode == DFmode)
 	return VFP_REGNO_OK_FOR_DOUBLE (regno);
 
-      /* VFP registers can hold HFmode values, but there is no point in
-	 putting them there unless we have hardware conversion insns. */
       if (mode == HFmode)
-	return TARGET_FP16 && VFP_REGNO_OK_FOR_SINGLE (regno);
+	return VFP_REGNO_OK_FOR_SINGLE (regno);
 
       if (TARGET_NEON)
         return (VALID_NEON_DREG_MODE (mode) && VFP_REGNO_OK_FOR_DOUBLE (regno))
@@ -23604,26 +23603,6 @@  arm_debugger_arg_offset (int value, rtx addr)
   return value;
 }
 
-/* Implement TARGET_INVALID_PARAMETER_TYPE.  */
-
-static const char *
-arm_invalid_parameter_type (const_tree t)
-{
-  if (SCALAR_FLOAT_TYPE_P (t) && TYPE_PRECISION (t) == 16)
-    return N_("function parameters cannot have __fp16 type");
-  return NULL;
-}
-
-/* Implement TARGET_INVALID_PARAMETER_TYPE.  */
-
-static const char *
-arm_invalid_return_type (const_tree t)
-{
-  if (SCALAR_FLOAT_TYPE_P (t) && TYPE_PRECISION (t) == 16)
-    return N_("functions cannot return __fp16 type");
-  return NULL;
-}
-
 /* Implement TARGET_PROMOTED_TYPE.  */
 
 static tree
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index ad123dd..5b1a030 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -194,7 +194,8 @@  extern void (*arm_lang_output_object_attributes_hook)(void);
 /* FPU supports half-precision floating-point with NEON element load/store.  */
 #define TARGET_NEON_FP16						\
   (TARGET_VFP								\
-   && ARM_FPU_FSET_HAS (TARGET_FPU_FEATURES, FPU_FL_NEON | FPU_FL_FP16))
+   && ARM_FPU_FSET_HAS (TARGET_FPU_FEATURES, FPU_FL_NEON)		\
+   && ARM_FPU_FSET_HAS (TARGET_FPU_FEATURES, FPU_FL_FP16))
 
 /* FPU supports VFP half-precision floating-point.  */
 #define TARGET_FP16							\
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 47171b9..adae7cc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -6547,7 +6547,7 @@ 
 (define_insn "*arm32_movhf"
   [(set (match_operand:HF 0 "nonimmediate_operand" "=r,m,r,r")
 	(match_operand:HF 1 "general_operand"	   " m,r,r,F"))]
-  "TARGET_32BIT && !(TARGET_HARD_FLOAT && TARGET_FP16)
+  "TARGET_32BIT && !(TARGET_HARD_FLOAT && TARGET_VFP)
    && (	  s_register_operand (operands[0], HFmode)
        || s_register_operand (operands[1], HFmode))"
   "*
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index ac5f3b8..5e4d96a 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -268,7 +268,8 @@ 
 (define_insn "*movhf_vfp"
   [(set (match_operand:HF 0 "nonimmediate_operand" "=r,m,t,r,t,r,r")
 	(match_operand:HF 1 "general_operand"	   " m,r,t,r,r,t,F"))]
-  "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_FP16 && !TARGET_NEON_FP16
+  "TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP
+   && !TARGET_NEON_FP16
    && (   s_register_operand (operands[0], HFmode)
        || s_register_operand (operands[1], HFmode))"
   "*
diff --git a/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-param-1.C b/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-param-1.C
index 03feb1a..e89da15 100644
--- a/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-param-1.C
+++ b/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-param-1.C
@@ -1,10 +1,14 @@ 
 /* { dg-do compile { target arm*-*-* } } */
 /* { dg-options "-mfp16-format=ieee" } */
 
-/* Functions cannot have parameters of type __fp16.  */
-extern void f (__fp16);		/* { dg-error "parameters cannot have __fp16 type" } */
-extern void (*pf) (__fp16);	/* { dg-error "parameters cannot have __fp16 type" } */
+/* Test that the ACLE macro is defined.  */
+#if __ARM_FP16_ARGS != 1
+#error Unexpected value for __ARM_FP16_ARGS
+#endif
+
+/* Test that __fp16 is supported as a parameter type.  */
+extern void f (__fp16);
+extern void (*pf) (__fp16);
 
-/* These should be OK.  */
 extern void g (__fp16 *);
 extern void (*pg) (__fp16 *);
diff --git a/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-return-1.C b/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-return-1.C
index 406dfac..b96532d 100644
--- a/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-return-1.C
+++ b/gcc/testsuite/g++.dg/ext/arm-fp16/fp16-return-1.C
@@ -1,10 +1,9 @@ 
 /* { dg-do compile { target arm*-*-* } } */
 /* { dg-options "-mfp16-format=ieee" } */
 
-/* Functions cannot return type __fp16.  */
-extern __fp16 f (void);		/* { dg-error "cannot return __fp16" } */
-extern __fp16 (*pf) (void);	/* { dg-error "cannot return __fp16" } */
+/* Test that __fp16 is supported as a return type.  */
+extern __fp16 f (void);
+extern __fp16 (*pf) (void);
 
-/* These should be OK.  */
 extern __fp16 *g (void);
 extern __fp16 *(*pg) (void);
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect10.c b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect10.c
new file mode 100644
index 0000000..680a3b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect10.c
@@ -0,0 +1,31 @@ 
+/* Test AAPCS layout (VFP variant for Neon types) */
+
+/* { dg-do run { target arm_eabi } } */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-add-options arm_neon_fp16 } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define NEON
+#define TESTFILE "neon-vect10.c"
+#include "neon-constants.h"
+
+#include "abitest.h"
+#else
+
+ARG (int32x4_t, i32x4_constvec2, Q0) /* D0, D1.  */
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 3.0f, S4 + 2) /* D2, Q1.  */
+#else
+ARG (__fp16, 3.0f, S4) /* D2, Q1.  */
+#endif
+ARG (int32x4x2_t, i32x4x2_constvec1, Q2) /* Q2, Q3 - D4-D6 , s5-s12.  */
+ARG (double, 12.0, D3) /* Backfill this particular argument.  */
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 5.0f, S5 + 2) /* Backfill in S5.  */
+#else
+ARG (__fp16, 5.0f, S5) /* Backfill in S5.  */
+#endif
+ARG (int32x4x2_t, i32x4x2_constvec2, STACK)
+LAST_ARG (int, 3, R0)
+#endif
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/neon-vect9.c b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect9.c
new file mode 100644
index 0000000..fc2b13b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/neon-vect9.c
@@ -0,0 +1,23 @@ 
+/* Test AAPCS layout (VFP variant for Neon types) */
+
+/* { dg-do run { target arm_eabi } } */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-add-options arm_neon_fp16 } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define NEON
+#define TESTFILE "neon-vect9.c"
+#include "neon-constants.h"
+
+#include "abitest.h"
+#else
+
+ARG (int32x4_t, i32x4_constvec2, Q0) /* D0, D1.  */
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 3.0f, S4 + 2) /* D2, Q1 occupied.  */
+#else
+ARG (__fp16, 3.0f, S4) /* D2, Q1 occupied.  */
+#endif
+LAST_ARG (int, 3, R0)
+#endif
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/vfp18.c b/gcc/testsuite/gcc.target/arm/aapcs/vfp18.c
new file mode 100644
index 0000000..225e9ce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/vfp18.c
@@ -0,0 +1,27 @@ 
+/* Test AAPCS layout (VFP variant) */
+
+/* { dg-do run { target arm_eabi } }  */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-options "-O -mfpu=vfp -mfloat-abi=hard -mfp16-format=ieee" } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define TESTFILE "vfp18.c"
+#include "abitest.h"
+
+#else
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 1.0f, S0 + 2)
+#else
+ARG (__fp16, 1.0f, S0)
+#endif
+ARG (float, 2.0f, S1)
+ARG (double, 4.0, D1)
+ARG (float, 2.0f, S4)
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 1.0f, S5 + 2)
+#else
+ARG (__fp16, 1.0f, S5)
+#endif
+LAST_ARG (int, 3, R0)
+#endif
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/vfp19.c b/gcc/testsuite/gcc.target/arm/aapcs/vfp19.c
new file mode 100644
index 0000000..8928b15
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/vfp19.c
@@ -0,0 +1,29 @@ 
+/* Test AAPCS layout (VFP variant) */
+
+/* { dg-do run { target arm_eabi } } */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-options "-O -mfpu=vfp -mfloat-abi=hard -mfp16-format=ieee" } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define TESTFILE "vfp19.c"
+
+__complex__ x = 1.0+2.0i;
+
+#include "abitest.h"
+#else
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 1.0f, S0 + 2)
+#else
+ARG (__fp16, 1.0f, S0)
+#endif
+ARG (float, 2.0f, S1)
+ARG (__complex__ double, x, D1)
+ARG (float, 3.0f, S6)
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 2.0f, S7 + 2)
+#else
+ARG (__fp16, 2.0f, S7)
+#endif
+LAST_ARG (int, 3, R0)
+#endif
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/vfp20.c b/gcc/testsuite/gcc.target/arm/aapcs/vfp20.c
new file mode 100644
index 0000000..61f0704
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/vfp20.c
@@ -0,0 +1,21 @@ 
+/* Test AAPCS layout (VFP variant) */
+
+/* { dg-do run { target arm_eabi } } */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-options "-O -mfpu=vfp -mfloat-abi=hard -mfp16-format=ieee" } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define TESTFILE "vfp20.c"
+
+#define PCSATTR __attribute__((pcs("aapcs")))
+
+#include "abitest.h"
+#else
+ARG (float, 1.0f, R0)
+ARG (double, 2.0, R2)
+ARG (float, 3.0f, STACK)
+ARG (__fp16, 2.0f, STACK+4)
+LAST_ARG (double, 4.0, STACK+8)
+#endif
+
diff --git a/gcc/testsuite/gcc.target/arm/aapcs/vfp21.c b/gcc/testsuite/gcc.target/arm/aapcs/vfp21.c
new file mode 100644
index 0000000..15dff7d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/aapcs/vfp21.c
@@ -0,0 +1,25 @@ 
+/* Test AAPCS layout (VFP variant) */
+
+/* { dg-do run { target arm_eabi } } */
+/* { dg-require-effective-target arm_neon_fp16_ok } */
+/* { dg-options "-O -mfpu=vfp -mfloat-abi=hard -mfp16-format=ieee" } */
+
+#ifndef IN_FRAMEWORK
+#define VFP
+#define TESTFILE "vfp21.c"
+
+#define PCSATTR __attribute__((pcs("aapcs")))
+
+#include "abitest.h"
+#else
+#if defined (__ARM_BIG_ENDIAN)
+ARG (__fp16, 1.0f, R0 + 2)
+#else
+ARG (__fp16, 1.0f, R0)
+#endif
+ARG (double, 2.0, R2)
+ARG (__fp16, 3.0f, STACK)
+ARG (float, 2.0f, STACK+4)
+LAST_ARG (double, 4.0, STACK+8)
+#endif
+
diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-1.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-1.c
new file mode 100644
index 0000000..5eab3e2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-1.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile }  */
+/* { dg-require-effective-target arm_fp16_ok } */
+/* { dg-options "-mfp16-format=ieee -O2" }  */
+/* { dg-add-options arm_fp16 } */
+
+/* Test __fp16 arguments and return value in registers.  */
+
+__fp16 F (__fp16 a, __fp16 b, __fp16 c)
+{
+  if (a == b)
+    return c;
+  return a;
+}
+
+/* { dg-final { scan-assembler-times {vcvtb\.f32\.f16\ts[0-9]+, s0} 1 } }  */
+/* { dg-final { scan-assembler-times {vcvtb\.f32\.f16\ts[0-9]+, s1} 1 } }  */
+/* { dg-final { scan-assembler-times {vmov\ts0, r[0-9]+} 1 } }  */
diff --git a/gcc/testsuite/gcc.target/arm/fp16-param-1.c b/gcc/testsuite/gcc.target/arm/fp16-param-1.c
index af4845f..9c52730 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-param-1.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-param-1.c
@@ -1,10 +1,14 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mfp16-format=ieee" } */
 
-/* Functions cannot have parameters of type __fp16.  */
-extern void f (__fp16);		/* { dg-error "parameters cannot have __fp16 type" } */
-extern void (*pf) (__fp16);	/* { dg-error "parameters cannot have __fp16 type" } */
+/* Test that the ACLE macro is defined.  */
+#if __ARM_FP16_ARGS != 1
+#error Unexpected value for __ARM_FP16_ARGS
+#endif
+
+/* Test that __fp16 is supported as a parameter type.  */
+extern void f (__fp16);
+extern void (*pf) (__fp16);
 
-/* These should be OK.  */
 extern void g (__fp16 *);
 extern void (*pg) (__fp16 *);
diff --git a/gcc/testsuite/gcc.target/arm/fp16-return-1.c b/gcc/testsuite/gcc.target/arm/fp16-return-1.c
index f763941..f97a8d7 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-return-1.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-return-1.c
@@ -1,10 +1,9 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mfp16-format=ieee" } */
 
-/* Functions cannot return type __fp16.  */
-extern __fp16 f (void);		/* { dg-error "cannot return __fp16" } */
-extern __fp16 (*pf) (void);	/* { dg-error "cannot return __fp16" } */
+/* Test that __fp16 is supported as a return type.  */
+extern __fp16 f (void);
+extern __fp16 (*pf) (void);
 
-/* These should be OK.  */
 extern __fp16 *g (void);
 extern __fp16 *(*pg) (void);
-- 
2.1.4