Patchwork PATCH: PR rtl-optimization/55093: [4.8 Regression] [x32] -maddress-mode=long failed

login
register
mail settings
Submitter H.J. Lu
Date Oct. 30, 2012, 6:06 a.m.
Message ID <CAMe9rOq45FqsAy_QFSyUsBVj0SYxjqcLeNm_R8-PMtykuefJeQ@mail.gmail.com>
Download mbox | patch
Permalink /patch/195293/
State New
Headers show

Comments

H.J. Lu - Oct. 30, 2012, 6:06 a.m.
On Mon, Oct 29, 2012 at 3:49 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Oct 29, 2012 at 3:44 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Mon, Oct 29, 2012 at 8:15 AM, Richard Sandiford
>> <rdsandiford@googlemail.com> wrote:
>>> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>> Hi,
>>>>
>>>> This patch changes get_elimination to check register number instead of
>>>> RTX.  Tested on Linux/x32 with -maddress-mode=long.  OK to install?
>>>
>>> FWIW, this doesn't sound right to me, at least not without more justification.
>>> The idea is that things like frame_pointer_rtx are supposed to be unique,
>>> so the original code:
>>>
>>>>    if ((ep = elimination_map[hard_regno]) != NULL)
>>>> -    return ep->from_rtx != reg ? NULL : ep;
>>>> from != hard_regno ? NULL : ep;
>>>
>>> ought to be correct in itself.  reload did the same thing:
>>>
>>>           for (ep = reg_eliminate; ep < &reg_eliminate[NUM_ELIMINABLE_REGS];
>>>                ep++)
>>>             if (ep->from_rtx == x && ep->can_eliminate)
>>>               return plus_constant (Pmode, ep->to_rtx, ep->previous_offset);
>>>
>>> It sounds on the face of it like the bug is elsewhere.
>>>
>>
>> LRA has
>>
>>       if (REG_P (reg) && (ep = get_elimination (reg)) != NULL)
>>         {
>>           rtx to_rtx = replace_p ? ep->to_rtx : ep->from_rtx;
>>
>>           if (! replace_p)
>>             {
>>               offset += (ep->offset - ep->previous_offset);
>>               offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>             }
>>
>>           if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>             to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)), to_rtx);
>>
>> Reload has
>>
>>             rtx to_rtx = ep->to_rtx;
>>             offset += ep->offset;
>>             offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>
>>             if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>               to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)),
>>                                     to_rtx);
>>
>> (gdb) call debug_rtx (ep->to_rtx)
>> (reg/f:DI 7 sp)
>> (gdb) call debug_rtx (ep->from_rtx)
>> (reg/f:DI 16 argp)
>> (gdb)
>>
>> gen_lowpart returns (reg/f:DI 7 sp) for reload and (reg:SI 16 argp)
>> for LRA.   They are caused by
>>
>>   if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
>>       /* We should convert arg register in LRA after the elimination
>>          if it is possible.  */
>>       && xregno == ARG_POINTER_REGNUM
>>       && ! lra_in_progress)
>>     return -1;
>>
>> It doesn't work in this case.
>>
>
> This testcase shows that LRA can't convert arg register after
> the elimination.
>

Here is a patch to remove ra_in_progress check for
ARG_POINTER_REGNUM.  Tested on Linux.x86-64.
OK to install?

Thanks.
Richard Sandiford - Oct. 30, 2012, 10:34 a.m.
"H.J. Lu" <hjl.tools@gmail.com> writes:
>>> LRA has
>>>
>>>       if (REG_P (reg) && (ep = get_elimination (reg)) != NULL)
>>>         {
>>>           rtx to_rtx = replace_p ? ep->to_rtx : ep->from_rtx;
>>>
>>>           if (! replace_p)
>>>             {
>>>               offset += (ep->offset - ep->previous_offset);
>>>               offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>>             }
>>>
>>>           if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>>             to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)), to_rtx);
>>>
>>> Reload has
>>>
>>>             rtx to_rtx = ep->to_rtx;
>>>             offset += ep->offset;
>>>             offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>>
>>>             if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>>               to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)),
>>>                                     to_rtx);
>>>
>>> (gdb) call debug_rtx (ep->to_rtx)
>>> (reg/f:DI 7 sp)
>>> (gdb) call debug_rtx (ep->from_rtx)
>>> (reg/f:DI 16 argp)
>>> (gdb)
>>>
>>> gen_lowpart returns (reg/f:DI 7 sp) for reload and (reg:SI 16 argp)
>>> for LRA.   They are caused by
>>>
>>>   if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
>>>       /* We should convert arg register in LRA after the elimination
>>>          if it is possible.  */
>>>       && xregno == ARG_POINTER_REGNUM
>>>       && ! lra_in_progress)
>>>     return -1;
>>>
>>> It doesn't work in this case.
>>>
>>
>> This testcase shows that LRA can't convert arg register after
>> the elimination.
>>
> Here is a patch to remove ra_in_progress check for
> ARG_POINTER_REGNUM.  Tested on Linux.x86-64.
> OK to install?

Thanks HJ.  This looks good to me.  As well as your testcase, I think
it would be dangerous to reduce this kind of subreg during non-final
elimination in cases where the argument pointer occupies more than one
hard register (like avr IIRC).  We could end up with something like
ARG_POINTER_REGNUM+1, which wouldn't show up as an elimination register
during the rest of LRA.

It's important that we do get rid of the subreg during the final
elimination stage, but I think alter_subreg already handles that case.

Since this code is outside the LRA files: patch is OK if Vlad agrees.

Richard
Vladimir Makarov - Oct. 30, 2012, 3:30 p.m.
On 10/30/2012 06:34 AM, Richard Sandiford wrote:
> "H.J. Lu" <hjl.tools@gmail.com> writes:
>>>> LRA has
>>>>
>>>>        if (REG_P (reg) && (ep = get_elimination (reg)) != NULL)
>>>>          {
>>>>            rtx to_rtx = replace_p ? ep->to_rtx : ep->from_rtx;
>>>>
>>>>            if (! replace_p)
>>>>              {
>>>>                offset += (ep->offset - ep->previous_offset);
>>>>                offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>>>              }
>>>>
>>>>            if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>>>              to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)), to_rtx);
>>>>
>>>> Reload has
>>>>
>>>>              rtx to_rtx = ep->to_rtx;
>>>>              offset += ep->offset;
>>>>              offset = trunc_int_for_mode (offset, GET_MODE (plus_cst_src));
>>>>
>>>>              if (GET_CODE (XEXP (plus_cst_src, 0)) == SUBREG)
>>>>                to_rtx = gen_lowpart (GET_MODE (XEXP (plus_cst_src, 0)),
>>>>                                      to_rtx);
>>>>
>>>> (gdb) call debug_rtx (ep->to_rtx)
>>>> (reg/f:DI 7 sp)
>>>> (gdb) call debug_rtx (ep->from_rtx)
>>>> (reg/f:DI 16 argp)
>>>> (gdb)
>>>>
>>>> gen_lowpart returns (reg/f:DI 7 sp) for reload and (reg:SI 16 argp)
>>>> for LRA.   They are caused by
>>>>
>>>>    if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
>>>>        /* We should convert arg register in LRA after the elimination
>>>>           if it is possible.  */
>>>>        && xregno == ARG_POINTER_REGNUM
>>>>        && ! lra_in_progress)
>>>>      return -1;
>>>>
>>>> It doesn't work in this case.
>>>>
>>> This testcase shows that LRA can't convert arg register after
>>> the elimination.
>>>
>> Here is a patch to remove ra_in_progress check for
>> ARG_POINTER_REGNUM.  Tested on Linux.x86-64.
>> OK to install?
> Thanks HJ.  This looks good to me.  As well as your testcase, I think
> it would be dangerous to reduce this kind of subreg during non-final
> elimination in cases where the argument pointer occupies more than one
> hard register (like avr IIRC).  We could end up with something like
> ARG_POINTER_REGNUM+1, which wouldn't show up as an elimination register
> during the rest of LRA.
>
> It's important that we do get rid of the subreg during the final
> elimination stage, but I think alter_subreg already handles that case.
>
> Since this code is outside the LRA files: patch is OK if Vlad agrees.
>
I added this code for a reason probably to solve some target problems.

So I am not sure but let us try.

It is ok for me to commit the patch if there are no regressions on 
x86/x86-64.

Patch

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index 43d4cb8..c1a7580 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -3494,10 +3494,7 @@  simplify_subreg_regno (unsigned int xregno,
enum machine_mode xmode,
     return -1;

   if (FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
-      /* We should convert arg register in LRA after the elimination
-	 if it is possible.  */
-      && xregno == ARG_POINTER_REGNUM
-      && ! lra_in_progress)
+      && xregno == ARG_POINTER_REGNUM)
     return -1;

   if (xregno == STACK_POINTER_REGNUM
diff --git a/gcc/testsuite/gcc.target/i386/pr55093.c
b/gcc/testsuite/gcc.target/i386/pr55093.c
new file mode 100644
index 0000000..76b4042
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr55093.c
@@ -0,0 +1,80 @@ 
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-options "-O2 -mx32 -maddress-mode=long" } */
+
+typedef union tree_node *tree;
+typedef const union tree_node *const_tree;
+typedef struct {
+  unsigned long long low;
+  long long high;
+} double_int;
+struct real_value {
+};
+struct real_format {
+  int has_signed_zero;
+};
+extern const struct real_format *   real_format_for_mode[];
+extern int real_isnegzero (const struct real_value *);
+enum tree_code { REAL_CST, SSA_NAME };
+struct tree_base {
+  enum tree_code code : 16;
+  union {
+    unsigned int version;
+  }
+  u;
+};
+extern void tree_check_failed (const_tree, const char *, int, const
char *,           ...) __attribute__ ((__noreturn__));
+union tree_node {
+  struct tree_base base;
+};
+inline tree tree_check (tree __t, const char *__f, int __l, const
char *__g, enum tree_code __c) {
+  if (((enum tree_code) (__t)->base.code) != __c)
+    tree_check_failed (__t, __f, __l, __g, __c, 0);
+  return __t;
+}
+struct prop_value_d {
+  int lattice_val;
+  tree value;
+  double_int mask;
+};
+typedef struct prop_value_d prop_value_t;
+static prop_value_t *const_val;
+static void canonicalize_float_value (prop_value_t *);
+typedef void (*ssa_prop_visit_stmt_fn) (prop_value_t);
+typedef void (*ssa_prop_visit_phi_fn) (void);
+typedef void (*ssa_prop_fold_stmt_fn) (void *gsi);
+typedef void (*ssa_prop_get_value_fn) ( prop_value_t *val);
+void ssa_propagate (ssa_prop_visit_stmt_fn, ssa_prop_visit_phi_fn);
+int substitute_and_fold (ssa_prop_get_value_fn, ssa_prop_fold_stmt_fn);
+void ccp_fold_stmt (void *);
+static void get_constant_value (prop_value_t *val) {
+  canonicalize_float_value (val);
+}
+static void canonicalize_float_value (prop_value_t *val) {
+  int mode;
+  struct real_value d;
+  if (val->lattice_val != 1
+      || ((enum tree_code) (val->value)->base.code) != REAL_CST)
+    return;
+  mode = val->lattice_val;
+  if (real_format_for_mode[mode]->has_signed_zero && real_isnegzero (&d))
+    ccp_fold_stmt (0);
+}
+static void set_lattice_value (tree var, prop_value_t new_val) {
+  prop_value_t *old_val = &const_val[(tree_check ((var), "",
+						  0, "",
+						  (SSA_NAME)))->base.u.version];
+  canonicalize_float_value (&new_val);
+  canonicalize_float_value (old_val);
+}
+static void ccp_visit_phi_node (void) {
+  prop_value_t new_val;
+  set_lattice_value (0, new_val);
+}
+static void ccp_visit_stmt (prop_value_t v) {
+  set_lattice_value (0, v);
+}
+unsigned int do_ssa_ccp (void) {
+  ssa_propagate (ccp_visit_stmt, ccp_visit_phi_node);
+  substitute_and_fold (get_constant_value, ccp_fold_stmt);