diff mbox series

[rtl-optimization] : Fix PR88948, store-motion pass creates unrecognisable insn

Message ID CAFULd4a_28=30xqz-uFSyrZG7ZUN6yWRfb6fNCc9T=i5SsH=7Q@mail.gmail.com
State New
Headers show
Series [rtl-optimization] : Fix PR88948, store-motion pass creates unrecognisable insn | expand

Commit Message

Uros Bizjak Jan. 22, 2019, 7:10 p.m. UTC
Hello!

RTL store motion pass transforms:

(insn 19 18 20 4 (parallel [
            (set (mem/c:SI (symbol_ref:SI ("bar_arg") [flags 0x2]
<var_decl 0x7f68136f2b40 bar_arg>) [1 bar_arg+0 S4 A32])
                (fix:SI (reg:DF 89)))
            (clobber (scratch:XF))
        ]) "stdarg-3.c":63:7 156 {fix_truncsi_i387_fisttp}
     (expr_list:REG_DEAD (reg:DF 89)
        (nil)))

to an unrecognisable insn:

(insn 33 18 20 4 (set (reg:SI 93 [ bar_arg ])
        (fix:SI (reg:DF 89))) "stdarg-3.c":63:7 -1
     (expr_list:REG_DEAD (reg:DF 89)
        (nil)))

The problem is with can_assign_to_reg_without_clobbers_p in gcse.c,
where we have:

  /* If the test insn is valid and doesn't need clobbers, and the target also
     has no objections, we're good.  */
  if (icode >= 0
      && (num_clobbers == 0 || !added_clobbers_hard_reg_p (icode))
      && ! (targetm.cannot_copy_insn_p
   && targetm.cannot_copy_insn_p (test_insn)))
    can_assign = true;

The test instruction is created as:

(insn 26 0 0 (set (reg:SI 152)
        (fix:SI (reg:DF 89))) -1
     (nil))

which is (correctly) recognized as

(define_insn "fix_trunc<mode>_i387_fisttp"
  [(set (match_operand:SWI248x 0 "nonimmediate_operand" "=m")
(fix:SWI248x (match_operand 1 "register_operand" "f")))
   (clobber (match_scratch:XF 2 "=&f"))]

However, recog also reports that 1 clobber needs to be added. The
instruction is recognized nevertheless due to "||
!added_clobbers_hard_reg_p (icode)" bypass. The recognized insn
doesn't clobber hard reg, but it also needs a clobber of a scratch reg
to be recognized.

The solution is to recognize the replacement insn in
replace_store_insn. We are sure that the new instruction is valid (it
passed can_assign_to_reg_without_clobbers_p when analysed), but we
need to add clobbers, as is the case with the
{fix_truncsi_i387_fisttp}. The call to recog will provide the number
of necessary clobbers for recognized insn, so we are able to wrap the
original pattern with a PARALLEL comprising required clobbers.

2019-01-22  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/88948
    * store-motion.c: Include insn-config.h and recog.h
    (replace_store_insn): Recognize new store insn and
    add clobbers if needed.

testsuite/ChangeLog:

2019-01-22  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/88948
    * gcc.target/i386/pr88948.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Uros.

Comments

Richard Sandiford Jan. 24, 2019, 11:52 a.m. UTC | #1
Uros Bizjak <ubizjak@gmail.com> writes:
> Hello!
>
> RTL store motion pass transforms:
>
> (insn 19 18 20 4 (parallel [
>             (set (mem/c:SI (symbol_ref:SI ("bar_arg") [flags 0x2]
> <var_decl 0x7f68136f2b40 bar_arg>) [1 bar_arg+0 S4 A32])
>                 (fix:SI (reg:DF 89)))
>             (clobber (scratch:XF))
>         ]) "stdarg-3.c":63:7 156 {fix_truncsi_i387_fisttp}
>      (expr_list:REG_DEAD (reg:DF 89)
>         (nil)))
>
> to an unrecognisable insn:
>
> (insn 33 18 20 4 (set (reg:SI 93 [ bar_arg ])
>         (fix:SI (reg:DF 89))) "stdarg-3.c":63:7 -1
>      (expr_list:REG_DEAD (reg:DF 89)
>         (nil)))
>
> The problem is with can_assign_to_reg_without_clobbers_p in gcse.c,
> where we have:
>
>   /* If the test insn is valid and doesn't need clobbers, and the target also
>      has no objections, we're good.  */
>   if (icode >= 0
>       && (num_clobbers == 0 || !added_clobbers_hard_reg_p (icode))
>       && ! (targetm.cannot_copy_insn_p
>    && targetm.cannot_copy_insn_p (test_insn)))
>     can_assign = true;
>
> The test instruction is created as:
>
> (insn 26 0 0 (set (reg:SI 152)
>         (fix:SI (reg:DF 89))) -1
>      (nil))
>
> which is (correctly) recognized as
>
> (define_insn "fix_trunc<mode>_i387_fisttp"
>   [(set (match_operand:SWI248x 0 "nonimmediate_operand" "=m")
> (fix:SWI248x (match_operand 1 "register_operand" "f")))
>    (clobber (match_scratch:XF 2 "=&f"))]
>
> However, recog also reports that 1 clobber needs to be added. The
> instruction is recognized nevertheless due to "||
> !added_clobbers_hard_reg_p (icode)" bypass. The recognized insn
> doesn't clobber hard reg, but it also needs a clobber of a scratch reg
> to be recognized.
>
> The solution is to recognize the replacement insn in
> replace_store_insn. We are sure that the new instruction is valid (it
> passed can_assign_to_reg_without_clobbers_p when analysed), but we
> need to add clobbers, as is the case with the
> {fix_truncsi_i387_fisttp}. The call to recog will provide the number
> of necessary clobbers for recognized insn, so we are able to wrap the
> original pattern with a PARALLEL comprising required clobbers.
>
> 2019-01-22  Uroš Bizjak  <ubizjak@gmail.com>
>
>     PR target/88948
>     * store-motion.c: Include insn-config.h and recog.h
>     (replace_store_insn): Recognize new store insn and
>     add clobbers if needed.
>
> testsuite/ChangeLog:
>
> 2019-01-22  Uroš Bizjak  <ubizjak@gmail.com>
>
>     PR target/88948
>     * gcc.target/i386/pr88948.c: New test.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for mainline?
>
> Uros.
>
> Index: store-motion.c
> ===================================================================
> --- store-motion.c	(revision 268157)
> +++ store-motion.c	(working copy)
> @@ -27,6 +27,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "df.h"
>  #include "toplev.h"
>  
> +#include "insn-config.h"
> +#include "recog.h"
>  #include "cfgrtl.h"
>  #include "cfganal.h"
>  #include "lcm.h"
> @@ -910,11 +912,28 @@ replace_store_insn (rtx reg, rtx_insn *del, basic_
>  		    struct st_expr *smexpr)
>  {
>    rtx_insn *insn;
> +  rtx pat;
> +  int insn_code_number;
> +  int num_clobbers_to_add = 0;
>    rtx mem, note, set;
>  
>    mem = smexpr->pattern;
>    insn = gen_move_insn (reg, SET_SRC (single_set (del)));
>  
> +  pat = PATTERN (insn);
> +  insn_code_number = recog (pat, insn, &num_clobbers_to_add);
> +
> +  if (num_clobbers_to_add)
> +    {
> +      rtx newpat
> +	= gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (num_clobbers_to_add + 1));
> +
> +      XVECEXP (newpat, 0, 0) = pat;
> +      add_clobbers (newpat, insn_code_number);
> +
> +      PATTERN (insn) = newpat;
> +    }

Another potential problem is that gen_move_insn should only really
be used for general_operands.

process_insert_insn seems to get this right, so it'd probably be
cleanest to split the innards out into something that takes the dest
and source rtxes of the move, then reuse that here.

Thanks,
Richard
Uros Bizjak Jan. 24, 2019, 1:50 p.m. UTC | #2
On Thu, Jan 24, 2019 at 12:52 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:

> Another potential problem is that gen_move_insn should only really
> be used for general_operands.
>
> process_insert_insn seems to get this right, so it'd probably be
> cleanest to split the innards out into something that takes the dest
> and source rtxes of the move, then reuse that here.

Something like the attached patch?

2019-01-24  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/88948
    * rtl.h (prepare_copy_insn): New prototype.
    * gcse.c (prepare_copy_insn): New function, split out from
    process_insert_insn.
    (process_insert_insn): Use prepare_copy_insn.
    * store-motion.c (replace_store_insn): Use prepare_copy_insn
    instead of gen_move_insn.

testsuite/ChangeLog:

2019-01-24  Uroš Bizjak  <ubizjak@gmail.com>

    PR target/88948
    * gcc.target/i386/pr88948.c: New test.

Bootstrap and regression test on x86_64-linux-gnu {,-m32} in progress.

Uros.
diff --git a/gcc/gcse.c b/gcc/gcse.c
index 35716cd38838..6c77671fe311 100644
--- a/gcc/gcse.c
+++ b/gcc/gcse.c
@@ -1963,14 +1963,11 @@ pre_expr_reaches_here_p (basic_block occr_bb, struct gcse_expr *expr, basic_bloc
   return rval;
 }
 
-/* Generate RTL to copy an EXPR to its `reaching_reg' and return it.  */
+/* Generate RTL to copy an EXP to REG and return it.  */
 
-static rtx_insn *
-process_insert_insn (struct gcse_expr *expr)
+rtx_insn *
+prepare_copy_insn (rtx reg, rtx exp)
 {
-  rtx reg = expr->reaching_reg;
-  /* Copy the expression to make sure we don't have any sharing issues.  */
-  rtx exp = copy_rtx (expr->expr);
   rtx_insn *pat;
 
   start_sequence ();
@@ -1996,6 +1993,18 @@ process_insert_insn (struct gcse_expr *expr)
   return pat;
 }
 
+/* Generate RTL to copy an EXPR to its `reaching_reg' and return it.  */
+
+static rtx_insn *
+process_insert_insn (struct gcse_expr *expr)
+{
+  rtx reg = expr->reaching_reg;
+  /* Copy the expression to make sure we don't have any sharing issues.  */
+  rtx exp = copy_rtx (expr->expr);
+
+  return prepare_copy_insn (reg, exp);
+}
+
 /* Add EXPR to the end of basic block BB.
 
    This is used by both the PRE and code hoisting.  */
diff --git a/gcc/rtl.h b/gcc/rtl.h
index 70891e6e3646..f99191983d39 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -4078,6 +4078,9 @@ extern void init_lower_subreg (void);
 /* In gcse.c */
 extern bool can_copy_p (machine_mode);
 extern bool can_assign_to_reg_without_clobbers_p (rtx, machine_mode);
+extern rtx_insn *prepare_copy_insn (rtx, rtx);
+
+/* In cprop.c */
 extern rtx fis_get_condition (rtx_insn *);
 
 /* In ira.c */
diff --git a/gcc/store-motion.c b/gcc/store-motion.c
index 28c4825b343d..a0838f68ba0f 100644
--- a/gcc/store-motion.c
+++ b/gcc/store-motion.c
@@ -912,8 +912,7 @@ replace_store_insn (rtx reg, rtx_insn *del, basic_block bb,
   rtx_insn *insn;
   rtx mem, note, set;
 
-  mem = smexpr->pattern;
-  insn = gen_move_insn (reg, SET_SRC (single_set (del)));
+  insn = prepare_copy_insn (reg, SET_SRC (single_set (del)));
 
   unsigned int i;
   rtx_insn *temp;
@@ -946,6 +945,7 @@ replace_store_insn (rtx reg, rtx_insn *del, basic_block bb,
   /* Now we must handle REG_EQUAL notes whose contents is equal to the mem;
      they are no longer accurate provided that they are reached by this
      definition, so drop them.  */
+  mem = smexpr->pattern;
   for (; insn != NEXT_INSN (BB_END (bb)); insn = NEXT_INSN (insn))
     if (NONDEBUG_INSN_P (insn))
       {
Richard Sandiford Jan. 24, 2019, 2:06 p.m. UTC | #3
Uros Bizjak <ubizjak@gmail.com> writes:
> On Thu, Jan 24, 2019 at 12:52 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>
>> Another potential problem is that gen_move_insn should only really
>> be used for general_operands.
>>
>> process_insert_insn seems to get this right, so it'd probably be
>> cleanest to split the innards out into something that takes the dest
>> and source rtxes of the move, then reuse that here.
>
> Something like the attached patch?
>
> 2019-01-24  Uroš Bizjak  <ubizjak@gmail.com>
>
>     PR target/88948
>     * rtl.h (prepare_copy_insn): New prototype.
>     * gcse.c (prepare_copy_insn): New function, split out from
>     process_insert_insn.
>     (process_insert_insn): Use prepare_copy_insn.
>     * store-motion.c (replace_store_insn): Use prepare_copy_insn
>     instead of gen_move_insn.
>
> testsuite/ChangeLog:
>
> 2019-01-24  Uroš Bizjak  <ubizjak@gmail.com>
>
>     PR target/88948
>     * gcc.target/i386/pr88948.c: New test.
>
> Bootstrap and regression test on x86_64-linux-gnu {,-m32} in progress.

OK if it passes, thanks.

Richard
diff mbox series

Patch

Index: store-motion.c
===================================================================
--- store-motion.c	(revision 268157)
+++ store-motion.c	(working copy)
@@ -27,6 +27,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "df.h"
 #include "toplev.h"
 
+#include "insn-config.h"
+#include "recog.h"
 #include "cfgrtl.h"
 #include "cfganal.h"
 #include "lcm.h"
@@ -910,11 +912,28 @@  replace_store_insn (rtx reg, rtx_insn *del, basic_
 		    struct st_expr *smexpr)
 {
   rtx_insn *insn;
+  rtx pat;
+  int insn_code_number;
+  int num_clobbers_to_add = 0;
   rtx mem, note, set;
 
   mem = smexpr->pattern;
   insn = gen_move_insn (reg, SET_SRC (single_set (del)));
 
+  pat = PATTERN (insn);
+  insn_code_number = recog (pat, insn, &num_clobbers_to_add);
+
+  if (num_clobbers_to_add)
+    {
+      rtx newpat
+	= gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (num_clobbers_to_add + 1));
+
+      XVECEXP (newpat, 0, 0) = pat;
+      add_clobbers (newpat, insn_code_number);
+
+      PATTERN (insn) = newpat;
+    }
+
   unsigned int i;
   rtx_insn *temp;
   FOR_EACH_VEC_ELT_REVERSE (smexpr->antic_stores, i, temp)
Index: testsuite/gcc.target/i386/pr88948.c
===================================================================
--- testsuite/gcc.target/i386/pr88948.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr88948.c	(working copy)
@@ -0,0 +1,5 @@ 
+/* PR rtl-optimization/88948 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fgcse-sm -msse3 -mfpmath=387" } */
+
+#include "../../gcc.c-torture/execute/stdarg-3.c"