diff mbox

New regression on ARM Linux

Message ID 551A8ECE.6040802@arm.com
State New
Headers show

Commit Message

Alan Lawrence March 31, 2015, 12:10 p.m. UTC
Richard Earnshaw wrote:
> On 31/03/15 11:45, Richard Biener wrote:
>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>
>>> On 31/03/15 11:36, Richard Biener wrote:
>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>
>>>>> On 31/03/15 11:00, Richard Biener wrote:
>>>>>> On Tue, 31 Mar 2015, Richard Earnshaw wrote:
>>>>>>
>>>>>>> On 31/03/15 08:50, Richard Biener wrote:
>>>>>>>> On Mon, Mar 30, 2015 at 10:13 PM, Richard Biener <rguenther@suse.de> wrote:
>>>>>>>>> On March 30, 2015 6:45:34 PM GMT+02:00, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>>>>>>>>> -O2 was what I first used; it also occurs at -O1. -fno-tree-sra fixes
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> The problem appears to be in laying out arguments, specifically
>>>>>>>>>> varargs. From
>>>>>>>>>> the "good" -fdump-rtl-expand:
>>>>>>>>>>
>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>> S4 A32])
>>>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 19 18 20 2 (set (reg:DF 2 r2)
>>>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 1 r1)
>>>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>             (set (reg:SI 0 r0)
>>>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>> <function_decl 0x2ab50e856d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>> A32])
>>>>>>>>>>
>>>>>>>>>> The struct members are
>>>>>>>>>> reg:SI 113 => int a;
>>>>>>>>>> reg:DF 112 => double b;
>>>>>>>>>> reg:SI 111 => int c;
>>>>>>>>>>
>>>>>>>>>> r0 gets the format string; r1 gets int a; r2+r3 get double b; int c is
>>>>>>>>>> pushed
>>>>>>>>>> into virtual-outgoing-args. In contrast, post-change to
>>>>>>>>>> build_ref_of_offset, we get:
>>>>>>>>>>
>>>>>>>>>> (insn 17 16 18 2 (set (reg:SI 118)
>>>>>>>>>>   (symbol_ref/v/f:SI ("*.LC1") [flags 0x82]  <var_decl 0x2ba57fa8d750
>>>>>>>>>> *.LC1>)) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 18 17 19 2 (set (mem:SI (plus:SI (reg/f:SI 107
>>>>>>>>>> virtual-outgoing-args)
>>>>>>>>>>                 (const_int 8 [0x8])) [0  S4 A64])
>>>>>>>>>>         (reg:SI 111 [ b1$16 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 19 18 20 2 (set (mem:DF (reg/f:SI 107 virtual-outgoing-args) [0
>>>>>>>>>> S8 A64])
>>>>>>>>>>         (reg:DF 112 [ b1$8 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 20 19 21 2 (set (reg:SI 2 r2)
>>>>>>>>>>         (reg:SI 113 [ b1 ])) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (insn 21 20 22 2 (set (reg:SI 0 r0)
>>>>>>>>>>         (reg:SI 118)) reduced.c:14 -1
>>>>>>>>>>      (nil))
>>>>>>>>>> (call_insn 22 21 23 2 (parallel [
>>>>>>>>>>             (set (reg:SI 0 r0)
>>>>>>>>>>                 (call (mem:SI (symbol_ref:SI ("printf") [flags 0x41]
>>>>>>>>>> <function_decl 0x2ba57f843d00 __builtin_printf>) [0 __builtin_printf S4
>>>>>>>>>> A32])
>>>>>>>>>>
>>>>>>>>>> r0 still gets the format string, but 'int b1.a' now goes in r2, and the
>>>>>>>>>>
>>>>>>>>>> double+following int are all pushed into virtual-outgoing-args. This is
>>>>>>>>>> because
>>>>>>>>>> arm_function_arg is fed a 64-bit-aligned int as type of the second
>>>>>>>>>> argument (the
>>>>>>>>>> type constructed by build_ref_for_offset); it then executes
>>>>>>>>>> (aapcs_layout_arg,
>>>>>>>>>> arm.c line ~~5914)
>>>>>>>>>>
>>>>>>>>>>   /* C3 - For double-word aligned arguments, round the NCRN up to the
>>>>>>>>>>      next even number.  */
>>>>>>>>>>   ncrn = pcum->aapcs_ncrn;
>>>>>>>>>>   if ((ncrn & 1) && arm_needs_doubleword_align (mode, type))
>>>>>>>>>>     ncrn++;
>>>>>>>>>>
>>>>>>>>>> Which changes r1 to r2. Passing -fno-tree-sra, or removing from the
>>>>>>>>>> testcase
>>>>>>>>>> "*(cls_struct_16byte *)resp = b1", causes arm_function_arg to be fed a
>>>>>>>>>> 32-bit-aligned int instead, which works as previously.
>>>>>>>>>>
>>>>>>>>>> Passing the same members of that struct in a non-vargs call, works ok -
>>>>>>>>>> I think
>>>>>>>>>> because these use the type of the declared parameters, rather than the
>>>>>>>>>> provided
>>>>>>>>>> arguments, and the former do not have the increased alignment from
>>>>>>>>>> build_ref_for_offset.
>>>>>>>>> It doesn't make sense to use the alignment of passed values.  That looks like bs.
>>>>>>>>>
>>>>>>>>> This means that
>>>>>>>>>
>>>>>>>>> Int I __aligned__(8);
>>>>>>>>>
>>>>>>>>> Is passed differently than int.
>>>>>>>>>
>>>>>>>>> Arm_function_arg needs to be fixed.
>>>>>>>> That is,
>>>>>>>>
>>>>>>>> typedef int myint __attribute__((aligned(8)));
>>>>>>>>
>>>>>>>> int main()
>>>>>>>> {
>>>>>>>>   myint i = 1;
>>>>>>>>   int j = 2;
>>>>>>>>   __builtin_printf("%d %d\n", i, j);
>>>>>>>> }
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> myint i;
>>>>>>>> int j;
>>>>>>>> myint *p = &i;
>>>>>>>> int *q = &j;
>>>>>>>>
>>>>>>>> int main()
>>>>>>>> {
>>>>>>>>   __builtin_printf("%d %d", *p, *q);
>>>>>>>> }
>>>>>>>>
>>>>>>>> should behave the same.  There isn't a printf modifier for an "aligned int"
>>>>>>>> because that sort of thing doesn't make sense.  Special-casing aligned vs.
>>>>>>>> non-aligned values only makes sense for things passed by value on the stack.
>>>>>>>> And then obviously only dependent on the functuion type signature, not
>>>>>>>> on the type of the passed value.
>>>>>>>>
>>>>>>> I think the testcase is ill-formed.  Just because printf doesn't have
>>>>>>> such a modifier, doesn't mean that another variadic function might not
>>>>>>> have a means to detect when an object in the variadic list needs to be
>>>>>>> over-aligned.  As such, the test should really be written as:
>>>>>> A value doesn't have "alignment".  A function may have alignment
>>>>>> requirements on its arguments, clearly printf doesn't.
>>>>>>
>>>>> Values don't.  But types do and variadic functions are special in that
>>>>> they derive their types from the types of the actual parameters passed
>>>>> not from the formals in the prototype.  Any manipulation of the types
>>>>> should be done in the front end, not in the back end.
>>>> The following seems to help the testcase (by luck I'd say?).  It
>>>> makes us drop alignment information from the temporaries that
>>>> SRA creates as memory replacement.
>>>>
>>>> But I find it odd that on ARM passing *((aligned_int *)p) as
>>>> vararg (only as varargs?) changes calling conventions independent
>>>> of the functions type signature.
>>>>
>>>> Richard.
>>>>
>>>> Index: gcc/tree-sra.c
>>>> ===================================================================
>>>> --- gcc/tree-sra.c      (revision 221770)
>>>> +++ gcc/tree-sra.c      (working copy)
>>>> @@ -2012,7 +2012,9 @@ create_access_replacement (struct access
>>>>        DECL_CONTEXT (repl) = current_function_decl;
>>>>      }
>>>>    else
>>>> -    repl = create_tmp_var (access->type, "SR");
>>>> +    repl = create_tmp_var (build_qualified_type
>>>> +                            (TYPE_MAIN_VARIANT (access->type),
>>>> +                             TYPE_QUALS (access->type)), "SR");
>>>>    if (TREE_CODE (access->type) == COMPLEX_TYPE
>>>>        || TREE_CODE (access->type) == VECTOR_TYPE)
>>>>      {
>>>>
>>> I guess we could do something similar (use TYPE_MAIN_VARIANT) in the
>>> backend code - but then we'd always ignore any over-alignment
>>> requirements (or under, for that matter).
>> The question is whether those are even in the scope of AACPS...
>>
> 
> Technically, they aren't.  The argument passing rules are all based on
> the alignment rules for fundamental types (and derived rules for
> composite types based on those fundamental types) written in the AAPCS
> itself.  There's no provision for over-aligning a data type, so all of
> this is off-piste.
> 
> R.

FWIW, I've just tested arm-none-linux-gnueabi with:

the right way to do it however is another question!

--Alan
diff mbox

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 48342d0..37662f4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6020,7 +6020,7 @@  static bool
  arm_needs_doubleword_align (machine_mode mode, const_tree type)
  {
    return (GET_MODE_ALIGNMENT (mode) > PARM_BOUNDARY
-         || (type && TYPE_ALIGN (type) > PARM_BOUNDARY));
+         || (type && TYPE_ALIGN (TYPE_MAIN_VARIANT (type)) > PARM_BOUNDARY));
  }

and found no regressions (and yes this fixes the original case). Whether this is