diff mbox

PATCH: PR rtl-optimization/64037: Miscompilation with -Os and enum class : char parameter

Message ID 20141125125711.GA9855@intel.com
State New
Headers show

Commit Message

H.J. Lu Nov. 25, 2014, 12:57 p.m. UTC
Hi,

The enclosed testcase fails on x86 when compiled with -Os since we pass
a byte parameter with a byte load in caller and read it as an int in
callee.  The reason it only shows up with -Os is x86 backend encodes
a byte load with an int load if -O isn't used.  When a byte load is
used, the upper 24 bits of the register have random value for none
WORD_REGISTER_OPERATIONS targets.

It happens because setup_incoming_promotions in combine.c has

      /* The mode and signedness of the argument before any promotions happen
         (equal to the mode of the pseudo holding it at that stage).  */
      mode1 = TYPE_MODE (TREE_TYPE (arg));
      uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));

      /* The mode and signedness of the argument after any source language and
         TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
      mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
      uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));

      /* The mode and signedness of the argument as it is actually passed,
         after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
                                     TREE_TYPE (cfun->decl), 0);

while they are actually passed in register by assign_parm_setup_reg in
function.c:

  /* Store the parm in a pseudoregister during the function, but we may
     need to do it in a wider mode.  Using 2 here makes the result
     consistent with promote_decl_mode and thus expand_expr_real_1.  */
  promoted_nominal_mode
    = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
                             TREE_TYPE (current_function_decl), 2);

where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
and TYPE_MODE (nominal_type). TREE_TYPE here is

(gdb) call debug_tree (type)
 <enumeral_type 0x7ffff19f85e8 X
    type <integer_type 0x7ffff18a93f0 unsigned char public unsigned
string-flag QI
        size <integer_cst 0x7ffff18a5fa8 constant 8>
        unit size <integer_cst 0x7ffff18a5fc0 constant 1>
        align 8 symtab 0 alias set -1 canonical type 0x7ffff18a93f0
precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
0x7ffff18a5f78 255>>
    static unsigned type_5 QI size <integer_cst 0x7ffff18a5fa8 8> unit
size <integer_cst 0x7ffff18a5fc0 1>
    align 8 symtab 0 alias set -1 canonical type 0x7ffff19f85e8
precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
0x7ffff18a5f78 255>
    values <tree_list 0x7ffff19fb028
        purpose <identifier_node 0x7ffff19f6738 V
            bindings <(nil)>
            local bindings <(nil)>>
        value <const_decl 0x7ffff18c21c0 V type <enumeral_type
0x7ffff19f85e8 X>
            readonly constant used VOID file pr64037.ii line 2 col 3
            align 1 context <enumeral_type 0x7ffff19f85e8 X> initial
<integer_cst 0x7ffff19d8d08 2>>> context <translation_unit_decl
0x7ffff7ff91e0 D.1>
    chain <type_decl 0x7ffff19f5c78 X>>
(gdb) 

and DECL_ARG_TYPE is

(gdb) call debug_tree (type)
 <integer_type 0x7ffff18a9690 int public SI
    size <integer_cst 0x7ffff18a5e70 type <integer_type 0x7ffff18a9150
bitsizetype> constant 32>
    unit size <integer_cst 0x7ffff18a5e88 type <integer_type
0x7ffff18a90a8 sizetype> constant 4>
    align 32 symtab 0 alias set 1 canonical type 0x7ffff18a9690
precision 32 min <integer_cst 0x7ffff18c60c0 -2147483648> max
<integer_cst 0x7ffff18c60d8 2147483647>
    pointer_to_this <pointer_type 0x7ffff18cb930>>
(gdb) 

This mismatch makes combine thinks a byte parameter is passed as int
in register and turns

(insn 9 6 10 2 (set (reg:SI 92 [ b ])
        (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
138 {*zero_extendqisi2}
     (expr_list:REG_DEAD (reg:SI 91 [ b ])
        (nil)))
(insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
A32])
        (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 92 [ b ])
        (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
            (nil))))

into

Trying 9 -> 10: 
Successfully matched this instruction:
(set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
    (reg:SI 91 [ b ])) 
allowing combination of insns 9 and 10
original costs 6 + 4 = 10
replacement cost 4
deferring deletion of insn with uid = 9.
modifying insn i3    10: [r88:SI]=r91:SI
      REG_DEAD r91:SI
      REG_DEAD r88:SI


This patch makes setup_incoming_promotions to match assign_parm_setup_reg.
Tested on Linux/x86-64 without regressions.  OK for trunk and backport?

Thanks.


H.J.
----

Comments

Richard Biener Nov. 25, 2014, 3:01 p.m. UTC | #1
On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
> Hi,
>
> The enclosed testcase fails on x86 when compiled with -Os since we pass
> a byte parameter with a byte load in caller and read it as an int in
> callee.  The reason it only shows up with -Os is x86 backend encodes
> a byte load with an int load if -O isn't used.  When a byte load is
> used, the upper 24 bits of the register have random value for none
> WORD_REGISTER_OPERATIONS targets.
>
> It happens because setup_incoming_promotions in combine.c has
>
>       /* The mode and signedness of the argument before any promotions happen
>          (equal to the mode of the pseudo holding it at that stage).  */
>       mode1 = TYPE_MODE (TREE_TYPE (arg));
>       uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));
>
>       /* The mode and signedness of the argument after any source language and
>          TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
>       mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
>       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>
>       /* The mode and signedness of the argument as it is actually passed,
>          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>       mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>                                      TREE_TYPE (cfun->decl), 0);
>
> while they are actually passed in register by assign_parm_setup_reg in
> function.c:
>
>   /* Store the parm in a pseudoregister during the function, but we may
>      need to do it in a wider mode.  Using 2 here makes the result
>      consistent with promote_decl_mode and thus expand_expr_real_1.  */
>   promoted_nominal_mode
>     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>                              TREE_TYPE (current_function_decl), 2);
>
> where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
> and TYPE_MODE (nominal_type). TREE_TYPE here is

I think the bug is here, not in combine.c.  Can you try going back in history
for both snippets and see if they matched at some point?

Thanks,
Richard.

> (gdb) call debug_tree (type)
>  <enumeral_type 0x7ffff19f85e8 X
>     type <integer_type 0x7ffff18a93f0 unsigned char public unsigned
> string-flag QI
>         size <integer_cst 0x7ffff18a5fa8 constant 8>
>         unit size <integer_cst 0x7ffff18a5fc0 constant 1>
>         align 8 symtab 0 alias set -1 canonical type 0x7ffff18a93f0
> precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
> 0x7ffff18a5f78 255>>
>     static unsigned type_5 QI size <integer_cst 0x7ffff18a5fa8 8> unit
> size <integer_cst 0x7ffff18a5fc0 1>
>     align 8 symtab 0 alias set -1 canonical type 0x7ffff19f85e8
> precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
> 0x7ffff18a5f78 255>
>     values <tree_list 0x7ffff19fb028
>         purpose <identifier_node 0x7ffff19f6738 V
>             bindings <(nil)>
>             local bindings <(nil)>>
>         value <const_decl 0x7ffff18c21c0 V type <enumeral_type
> 0x7ffff19f85e8 X>
>             readonly constant used VOID file pr64037.ii line 2 col 3
>             align 1 context <enumeral_type 0x7ffff19f85e8 X> initial
> <integer_cst 0x7ffff19d8d08 2>>> context <translation_unit_decl
> 0x7ffff7ff91e0 D.1>
>     chain <type_decl 0x7ffff19f5c78 X>>
> (gdb)
>
> and DECL_ARG_TYPE is
>
> (gdb) call debug_tree (type)
>  <integer_type 0x7ffff18a9690 int public SI
>     size <integer_cst 0x7ffff18a5e70 type <integer_type 0x7ffff18a9150
> bitsizetype> constant 32>
>     unit size <integer_cst 0x7ffff18a5e88 type <integer_type
> 0x7ffff18a90a8 sizetype> constant 4>
>     align 32 symtab 0 alias set 1 canonical type 0x7ffff18a9690
> precision 32 min <integer_cst 0x7ffff18c60c0 -2147483648> max
> <integer_cst 0x7ffff18c60d8 2147483647>
>     pointer_to_this <pointer_type 0x7ffff18cb930>>
> (gdb)
>
> This mismatch makes combine thinks a byte parameter is passed as int
> in register and turns
>
> (insn 9 6 10 2 (set (reg:SI 92 [ b ])
>         (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
> 138 {*zero_extendqisi2}
>      (expr_list:REG_DEAD (reg:SI 91 [ b ])
>         (nil)))
> (insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
> A32])
>         (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
>      (expr_list:REG_DEAD (reg:SI 92 [ b ])
>         (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
>             (nil))))
>
> into
>
> Trying 9 -> 10:
> Successfully matched this instruction:
> (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
>     (reg:SI 91 [ b ]))
> allowing combination of insns 9 and 10
> original costs 6 + 4 = 10
> replacement cost 4
> deferring deletion of insn with uid = 9.
> modifying insn i3    10: [r88:SI]=r91:SI
>       REG_DEAD r91:SI
>       REG_DEAD r88:SI
>
>
> This patch makes setup_incoming_promotions to match assign_parm_setup_reg.
> Tested on Linux/x86-64 without regressions.  OK for trunk and backport?
>
> Thanks.
>
>
> H.J.
> ----
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 1808f97..a0449a2 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -1561,8 +1561,8 @@ setup_incoming_promotions (rtx_insn *first)
>        uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>
>        /* The mode and signedness of the argument as it is actually passed,
> -         after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
> -      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
> +         see assign_parm_setup_reg in function.c.  */
> +      mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns1,
>                                      TREE_TYPE (cfun->decl), 0);
>
>        /* The mode of the register in which the argument is being passed.  */
> diff --git a/gcc/testsuite/g++.dg/pr64037.C b/gcc/testsuite/g++.dg/pr64037.C
> new file mode 100644
> index 0000000..e5cd0e2
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/pr64037.C
> @@ -0,0 +1,27 @@
> +// { dg-do run { target i?86-*-* x86_64-*-* } }
> +// { dg-options "-std=c++11 -Os" }
> +
> +enum class X : unsigned char {
> +  V = 2,
> +};
> +
> +static void
> +__attribute__((noinline,noclone))
> +foo(unsigned &out, unsigned a, X b)
> +{
> +  out = static_cast<unsigned>(b);
> +}
> +
> +int main()
> +{
> +  unsigned deadbeef = 0xDEADBEEF;
> +  asm volatile ("" : "+d" (deadbeef), "+c" (deadbeef));
> +
> +  unsigned out;
> +  foo(out, 2, X::V);
> +
> +  if (out != 2)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
Richard Biener Nov. 25, 2014, 3:04 p.m. UTC | #2
On Tue, Nov 25, 2014 at 4:01 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> Hi,
>>
>> The enclosed testcase fails on x86 when compiled with -Os since we pass
>> a byte parameter with a byte load in caller and read it as an int in
>> callee.  The reason it only shows up with -Os is x86 backend encodes
>> a byte load with an int load if -O isn't used.  When a byte load is
>> used, the upper 24 bits of the register have random value for none
>> WORD_REGISTER_OPERATIONS targets.
>>
>> It happens because setup_incoming_promotions in combine.c has
>>
>>       /* The mode and signedness of the argument before any promotions happen
>>          (equal to the mode of the pseudo holding it at that stage).  */
>>       mode1 = TYPE_MODE (TREE_TYPE (arg));
>>       uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));
>>
>>       /* The mode and signedness of the argument after any source language and
>>          TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
>>       mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
>>       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>>
>>       /* The mode and signedness of the argument as it is actually passed,
>>          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>>       mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>>                                      TREE_TYPE (cfun->decl), 0);
>>
>> while they are actually passed in register by assign_parm_setup_reg in
>> function.c:
>>
>>   /* Store the parm in a pseudoregister during the function, but we may
>>      need to do it in a wider mode.  Using 2 here makes the result
>>      consistent with promote_decl_mode and thus expand_expr_real_1.  */
>>   promoted_nominal_mode
>>     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>>                              TREE_TYPE (current_function_decl), 2);
>>
>> where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
>> and TYPE_MODE (nominal_type). TREE_TYPE here is
>
> I think the bug is here, not in combine.c.  Can you try going back in history
> for both snippets and see if they matched at some point?

Oh, and note that I think DECL_ARG_TYPE is sth dangerous - it's meant
to be a source language "ABI" kind-of-thing.  Or rather an optimization
hit.  For example in C when integral promotions happen to call arguments
this can be used to optimize sign-/zero-extensions in the callee.  Unless
something else overrides this (like the target which specifies the real ABI).
IIRC.

Richard.

> Thanks,
> Richard.
>
>> (gdb) call debug_tree (type)
>>  <enumeral_type 0x7ffff19f85e8 X
>>     type <integer_type 0x7ffff18a93f0 unsigned char public unsigned
>> string-flag QI
>>         size <integer_cst 0x7ffff18a5fa8 constant 8>
>>         unit size <integer_cst 0x7ffff18a5fc0 constant 1>
>>         align 8 symtab 0 alias set -1 canonical type 0x7ffff18a93f0
>> precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
>> 0x7ffff18a5f78 255>>
>>     static unsigned type_5 QI size <integer_cst 0x7ffff18a5fa8 8> unit
>> size <integer_cst 0x7ffff18a5fc0 1>
>>     align 8 symtab 0 alias set -1 canonical type 0x7ffff19f85e8
>> precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst
>> 0x7ffff18a5f78 255>
>>     values <tree_list 0x7ffff19fb028
>>         purpose <identifier_node 0x7ffff19f6738 V
>>             bindings <(nil)>
>>             local bindings <(nil)>>
>>         value <const_decl 0x7ffff18c21c0 V type <enumeral_type
>> 0x7ffff19f85e8 X>
>>             readonly constant used VOID file pr64037.ii line 2 col 3
>>             align 1 context <enumeral_type 0x7ffff19f85e8 X> initial
>> <integer_cst 0x7ffff19d8d08 2>>> context <translation_unit_decl
>> 0x7ffff7ff91e0 D.1>
>>     chain <type_decl 0x7ffff19f5c78 X>>
>> (gdb)
>>
>> and DECL_ARG_TYPE is
>>
>> (gdb) call debug_tree (type)
>>  <integer_type 0x7ffff18a9690 int public SI
>>     size <integer_cst 0x7ffff18a5e70 type <integer_type 0x7ffff18a9150
>> bitsizetype> constant 32>
>>     unit size <integer_cst 0x7ffff18a5e88 type <integer_type
>> 0x7ffff18a90a8 sizetype> constant 4>
>>     align 32 symtab 0 alias set 1 canonical type 0x7ffff18a9690
>> precision 32 min <integer_cst 0x7ffff18c60c0 -2147483648> max
>> <integer_cst 0x7ffff18c60d8 2147483647>
>>     pointer_to_this <pointer_type 0x7ffff18cb930>>
>> (gdb)
>>
>> This mismatch makes combine thinks a byte parameter is passed as int
>> in register and turns
>>
>> (insn 9 6 10 2 (set (reg:SI 92 [ b ])
>>         (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) pr64037.ii:9
>> 138 {*zero_extendqisi2}
>>      (expr_list:REG_DEAD (reg:SI 91 [ b ])
>>         (nil)))
>> (insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4
>> A32])
>>         (reg:SI 92 [ b ])) pr64037.ii:9 90 {*movsi_internal}
>>      (expr_list:REG_DEAD (reg:SI 92 [ b ])
>>         (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
>>             (nil))))
>>
>> into
>>
>> Trying 9 -> 10:
>> Successfully matched this instruction:
>> (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
>>     (reg:SI 91 [ b ]))
>> allowing combination of insns 9 and 10
>> original costs 6 + 4 = 10
>> replacement cost 4
>> deferring deletion of insn with uid = 9.
>> modifying insn i3    10: [r88:SI]=r91:SI
>>       REG_DEAD r91:SI
>>       REG_DEAD r88:SI
>>
>>
>> This patch makes setup_incoming_promotions to match assign_parm_setup_reg.
>> Tested on Linux/x86-64 without regressions.  OK for trunk and backport?
>>
>> Thanks.
>>
>>
>> H.J.
>> ----
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 1808f97..a0449a2 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -1561,8 +1561,8 @@ setup_incoming_promotions (rtx_insn *first)
>>        uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>>
>>        /* The mode and signedness of the argument as it is actually passed,
>> -         after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>> -      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>> +         see assign_parm_setup_reg in function.c.  */
>> +      mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns1,
>>                                      TREE_TYPE (cfun->decl), 0);
>>
>>        /* The mode of the register in which the argument is being passed.  */
>> diff --git a/gcc/testsuite/g++.dg/pr64037.C b/gcc/testsuite/g++.dg/pr64037.C
>> new file mode 100644
>> index 0000000..e5cd0e2
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/pr64037.C
>> @@ -0,0 +1,27 @@
>> +// { dg-do run { target i?86-*-* x86_64-*-* } }
>> +// { dg-options "-std=c++11 -Os" }
>> +
>> +enum class X : unsigned char {
>> +  V = 2,
>> +};
>> +
>> +static void
>> +__attribute__((noinline,noclone))
>> +foo(unsigned &out, unsigned a, X b)
>> +{
>> +  out = static_cast<unsigned>(b);
>> +}
>> +
>> +int main()
>> +{
>> +  unsigned deadbeef = 0xDEADBEEF;
>> +  asm volatile ("" : "+d" (deadbeef), "+c" (deadbeef));
>> +
>> +  unsigned out;
>> +  foo(out, 2, X::V);
>> +
>> +  if (out != 2)
>> +    __builtin_abort ();
>> +
>> +  return 0;
>> +}
H.J. Lu Nov. 25, 2014, 4:05 p.m. UTC | #3
On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>> Hi,
>>
>> The enclosed testcase fails on x86 when compiled with -Os since we pass
>> a byte parameter with a byte load in caller and read it as an int in
>> callee.  The reason it only shows up with -Os is x86 backend encodes
>> a byte load with an int load if -O isn't used.  When a byte load is
>> used, the upper 24 bits of the register have random value for none
>> WORD_REGISTER_OPERATIONS targets.
>>
>> It happens because setup_incoming_promotions in combine.c has
>>
>>       /* The mode and signedness of the argument before any promotions happen
>>          (equal to the mode of the pseudo holding it at that stage).  */
>>       mode1 = TYPE_MODE (TREE_TYPE (arg));
>>       uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));
>>
>>       /* The mode and signedness of the argument after any source language and
>>          TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
>>       mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
>>       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>>
>>       /* The mode and signedness of the argument as it is actually passed,
>>          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>>       mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>>                                      TREE_TYPE (cfun->decl), 0);
>>
>> while they are actually passed in register by assign_parm_setup_reg in
>> function.c:
>>
>>   /* Store the parm in a pseudoregister during the function, but we may
>>      need to do it in a wider mode.  Using 2 here makes the result
>>      consistent with promote_decl_mode and thus expand_expr_real_1.  */
>>   promoted_nominal_mode
>>     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>>                              TREE_TYPE (current_function_decl), 2);
>>
>> where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
>> and TYPE_MODE (nominal_type). TREE_TYPE here is
>
> I think the bug is here, not in combine.c.  Can you try going back in history
> for both snippets and see if they matched at some point?
>

The bug was introduced by

https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html

commit 5d93234932c3d8617ce92b77b7013ef6bede9508
Author: shinwell <shinwell@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Sep 20 11:01:18 2007 +0000

      gcc/
      * combine.c: Include cgraph.h.
      (setup_incoming_promotions): Rework to allow more aggressive
      elimination of sign extensions when all call sites of the
      current function are known to lie within the current unit.


    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
138bc75d-0d04-0410-961f-82ee72b054a4

Before this commit, combine.c has

          enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
          int uns = TYPE_UNSIGNED (TREE_TYPE (arg));

          mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
          if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
            {
              rtx x;
              x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
              x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
              record_value_for_reg (reg, first, x);
            }

It matches function.c:

  /* This is not really promoting for a call.  However we need to be
     consistent with assign_parm_find_data_types and expand_expr_real_1.  */
  promoted_nominal_mode
    = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1);

r128618 changed

mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);

to

mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1);

It breaks none WORD_REGISTER_OPERATIONS targets.
H.J. Lu Nov. 25, 2014, 8:06 p.m. UTC | #4
On Tue, Nov 25, 2014 at 7:04 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Nov 25, 2014 at 4:01 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>> Hi,
>>>
>>> The enclosed testcase fails on x86 when compiled with -Os since we pass
>>> a byte parameter with a byte load in caller and read it as an int in
>>> callee.  The reason it only shows up with -Os is x86 backend encodes
>>> a byte load with an int load if -O isn't used.  When a byte load is
>>> used, the upper 24 bits of the register have random value for none
>>> WORD_REGISTER_OPERATIONS targets.
>>>
>>> It happens because setup_incoming_promotions in combine.c has
>>>
>>>       /* The mode and signedness of the argument before any promotions happen
>>>          (equal to the mode of the pseudo holding it at that stage).  */
>>>       mode1 = TYPE_MODE (TREE_TYPE (arg));
>>>       uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));
>>>
>>>       /* The mode and signedness of the argument after any source language and
>>>          TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
>>>       mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
>>>       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>>>
>>>       /* The mode and signedness of the argument as it is actually passed,
>>>          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>>>       mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>>>                                      TREE_TYPE (cfun->decl), 0);
>>>
>>> while they are actually passed in register by assign_parm_setup_reg in
>>> function.c:
>>>
>>>   /* Store the parm in a pseudoregister during the function, but we may
>>>      need to do it in a wider mode.  Using 2 here makes the result
>>>      consistent with promote_decl_mode and thus expand_expr_real_1.  */
>>>   promoted_nominal_mode
>>>     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>>>                              TREE_TYPE (current_function_decl), 2);
>>>
>>> where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
>>> and TYPE_MODE (nominal_type). TREE_TYPE here is
>>
>> I think the bug is here, not in combine.c.  Can you try going back in history
>> for both snippets and see if they matched at some point?
>
> Oh, and note that I think DECL_ARG_TYPE is sth dangerous - it's meant
> to be a source language "ABI" kind-of-thing.  Or rather an optimization
> hit.  For example in C when integral promotions happen to call arguments
> this can be used to optimize sign-/zero-extensions in the callee.  Unless
> something else overrides this (like the target which specifies the real ABI).
> IIRC.
>

PROMOTE_MODE is a performance hint, not an ABI requirement.
i386.h has

#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE)             \
do {                                                    \
  if (((MODE) == HImode && TARGET_PROMOTE_HI_REGS)      \
      || ((MODE) == QImode && TARGET_PROMOTE_QI_REGS))  \
    (MODE) = SImode;                                    \
} while (0)

We may promote QI/HI to SI, depending on optimization.

On the other hand, TARGET_PROMOTE_FUNCTION_MODE is
determined by psABI.

I am enclosing the missing ChangeLog entries.
Richard Biener Nov. 27, 2014, 3:25 p.m. UTC | #5
On Tue, Nov 25, 2014 at 5:05 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Tue, Nov 25, 2014 at 7:01 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Nov 25, 2014 at 1:57 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>> Hi,
>>>
>>> The enclosed testcase fails on x86 when compiled with -Os since we pass
>>> a byte parameter with a byte load in caller and read it as an int in
>>> callee.  The reason it only shows up with -Os is x86 backend encodes
>>> a byte load with an int load if -O isn't used.  When a byte load is
>>> used, the upper 24 bits of the register have random value for none
>>> WORD_REGISTER_OPERATIONS targets.
>>>
>>> It happens because setup_incoming_promotions in combine.c has
>>>
>>>       /* The mode and signedness of the argument before any promotions happen
>>>          (equal to the mode of the pseudo holding it at that stage).  */
>>>       mode1 = TYPE_MODE (TREE_TYPE (arg));
>>>       uns1 = TYPE_UNSIGNED (TREE_TYPE (arg));
>>>
>>>       /* The mode and signedness of the argument after any source language and
>>>          TARGET_PROMOTE_PROTOTYPES-driven promotions.  */
>>>       mode2 = TYPE_MODE (DECL_ARG_TYPE (arg));
>>>       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
>>>
>>>       /* The mode and signedness of the argument as it is actually passed,
>>>          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
>>>       mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
>>>                                      TREE_TYPE (cfun->decl), 0);
>>>
>>> while they are actually passed in register by assign_parm_setup_reg in
>>> function.c:
>>>
>>>   /* Store the parm in a pseudoregister during the function, but we may
>>>      need to do it in a wider mode.  Using 2 here makes the result
>>>      consistent with promote_decl_mode and thus expand_expr_real_1.  */
>>>   promoted_nominal_mode
>>>     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
>>>                              TREE_TYPE (current_function_decl), 2);
>>>
>>> where nominal_type and nominal_mode are set up with TREE_TYPE (parm)
>>> and TYPE_MODE (nominal_type). TREE_TYPE here is
>>
>> I think the bug is here, not in combine.c.  Can you try going back in history
>> for both snippets and see if they matched at some point?
>>
>
> The bug was introduced by
>
> https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html
>
> commit 5d93234932c3d8617ce92b77b7013ef6bede9508
> Author: shinwell <shinwell@138bc75d-0d04-0410-961f-82ee72b054a4>
> Date:   Thu Sep 20 11:01:18 2007 +0000
>
>       gcc/
>       * combine.c: Include cgraph.h.
>       (setup_incoming_promotions): Rework to allow more aggressive
>       elimination of sign extensions when all call sites of the
>       current function are known to lie within the current unit.
>
>
>     git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
> 138bc75d-0d04-0410-961f-82ee72b054a4
>
> Before this commit, combine.c has
>
>           enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
>           int uns = TYPE_UNSIGNED (TREE_TYPE (arg));
>
>           mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
>           if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
>             {
>               rtx x;
>               x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
>               x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
>               record_value_for_reg (reg, first, x);
>             }
>
> It matches function.c:
>
>   /* This is not really promoting for a call.  However we need to be
>      consistent with assign_parm_find_data_types and expand_expr_real_1.  */
>   promoted_nominal_mode
>     = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1);
>
> r128618 changed
>
> mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
>
> to
>
> mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1);
>
> It breaks none WORD_REGISTER_OPERATIONS targets.

Hmm, I think that DECL_ARG_TYPE makes a difference only
for non-WORD_REGISTER_OPERATIONS targets.

But yeah, isolated the above change looks wrong.

Your patch is ok for trunk if nobody objects within 24h and for branches
after a week.

Thanks,
Richard.

> --
> H.J.
diff mbox

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..a0449a2 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1561,8 +1561,8 @@  setup_incoming_promotions (rtx_insn *first)
       uns3 = TYPE_UNSIGNED (DECL_ARG_TYPE (arg));
 
       /* The mode and signedness of the argument as it is actually passed,
-         after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
-      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
+         see assign_parm_setup_reg in function.c.  */
+      mode3 = promote_function_mode (TREE_TYPE (arg), mode1, &uns1,
 				     TREE_TYPE (cfun->decl), 0);
 
       /* The mode of the register in which the argument is being passed.  */
diff --git a/gcc/testsuite/g++.dg/pr64037.C b/gcc/testsuite/g++.dg/pr64037.C
new file mode 100644
index 0000000..e5cd0e2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr64037.C
@@ -0,0 +1,27 @@ 
+// { dg-do run { target i?86-*-* x86_64-*-* } }
+// { dg-options "-std=c++11 -Os" }
+
+enum class X : unsigned char {
+  V = 2,
+};
+
+static void
+__attribute__((noinline,noclone))
+foo(unsigned &out, unsigned a, X b)
+{
+  out = static_cast<unsigned>(b);
+}
+
+int main()
+{
+  unsigned deadbeef = 0xDEADBEEF;
+  asm volatile ("" : "+d" (deadbeef), "+c" (deadbeef));
+
+  unsigned out;
+  foo(out, 2, X::V);
+
+  if (out != 2)
+    __builtin_abort ();
+
+  return 0;
+}