diff mbox series

Fix up remove_partial_avx_dependency (PR target/89474)

Message ID 20190225225304.GZ7611@tucnak
State New
Headers show
Series Fix up remove_partial_avx_dependency (PR target/89474) | expand

Commit Message

Jakub Jelinek Feb. 25, 2019, 10:53 p.m. UTC
Hi!

The following patch fixes two issues in the new rpad pass.
One is that the insertion at the start of a basic block didn't work properly
if the basic block didn't contain any non-NOTE/non-DEBUG_INSN instructions.
next_nonnote_nondebug_insn hapilly turns through into another basic block
and the insertion can insert an instruction in between basic blocks or into
a different basic block from where we wanted to emit it.
I believe we want to emit it after CODE_LABEL, after NOTE_INSN_BASIC_BLOCK
and if possible, after debug insns in there, the patch emits it before the
first normal insn in the bb if any, or after the BB_END (thus extending
BB_END).

Another issue is that it is quite weird/dangerous to add the v4sf_const0
pseudo uses in lots of places in the IL, register those changes with df,
then do df_analyze with different flags and finally emit the setter.
I understand the goal was not to do df_analyze etc. in the usual case where
there are no instructions that need this treatment.  This patch does the
df_analyze at the spot we find the first insn, but before we actually change
that instruction, so the changes are after the df_analyze.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR target/89474
	* config/i386/i386.c (remove_partial_avx_dependency): Call
	df_analyze etc. before creation of the v4sf_const0 pseudo, rather than
	after changing possibly many instructions to use that pseudo.  Fix up
	insertion of v4sf_const0 setter at the start of bb.

	* gcc.target/i386/pr89474.c: New test.


	Jakub

Comments

H.J. Lu Feb. 25, 2019, 10:57 p.m. UTC | #1
On Mon, Feb 25, 2019 at 2:53 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The following patch fixes two issues in the new rpad pass.
> One is that the insertion at the start of a basic block didn't work properly
> if the basic block didn't contain any non-NOTE/non-DEBUG_INSN instructions.
> next_nonnote_nondebug_insn hapilly turns through into another basic block
> and the insertion can insert an instruction in between basic blocks or into
> a different basic block from where we wanted to emit it.
> I believe we want to emit it after CODE_LABEL, after NOTE_INSN_BASIC_BLOCK
> and if possible, after debug insns in there, the patch emits it before the
> first normal insn in the bb if any, or after the BB_END (thus extending
> BB_END).
>
> Another issue is that it is quite weird/dangerous to add the v4sf_const0
> pseudo uses in lots of places in the IL, register those changes with df,
> then do df_analyze with different flags and finally emit the setter.
> I understand the goal was not to do df_analyze etc. in the usual case where
> there are no instructions that need this treatment.  This patch does the
> df_analyze at the spot we find the first insn, but before we actually change
> that instruction, so the changes are after the df_analyze.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2019-02-25  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/89474
>         * config/i386/i386.c (remove_partial_avx_dependency): Call
>         df_analyze etc. before creation of the v4sf_const0 pseudo, rather than
>         after changing possibly many instructions to use that pseudo.  Fix up
>         insertion of v4sf_const0 setter at the start of bb.
>
>         * gcc.target/i386/pr89474.c: New test.
>
> --- gcc/config/i386/i386.c.jj   2019-02-22 23:02:47.805117610 +0100
> +++ gcc/config/i386/i386.c      2019-02-25 14:20:05.793608879 +0100
> @@ -2835,7 +2835,14 @@ remove_partial_avx_dependency (void)
>             continue;
>
>           if (!v4sf_const0)
> -           v4sf_const0 = gen_reg_rtx (V4SFmode);
> +           {
> +             calculate_dominance_info (CDI_DOMINATORS);
> +             df_set_flags (DF_DEFER_INSN_RESCAN);
> +             df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> +             df_md_add_problem ();
> +             df_analyze ();
> +             v4sf_const0 = gen_reg_rtx (V4SFmode);
> +           }
>
>           /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF,
>              SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and
> @@ -2883,12 +2890,6 @@ remove_partial_avx_dependency (void)
>
>    if (v4sf_const0)
>      {
> -      calculate_dominance_info (CDI_DOMINATORS);
> -      df_set_flags (DF_DEFER_INSN_RESCAN);
> -      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> -      df_md_add_problem ();
> -      df_analyze ();
> -
>        /* (Re-)discover loops so that bb->loop_father can be used in the
>          analysis below.  */
>        loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
> @@ -2904,11 +2905,23 @@ remove_partial_avx_dependency (void)
>         bb = get_immediate_dominator (CDI_DOMINATORS,
>                                       bb->loop_father->header);
>
> -      insn = BB_HEAD (bb);
> -      if (!NONDEBUG_INSN_P (insn))
> -       insn = next_nonnote_nondebug_insn (insn);
>        set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode));
> -      set_insn = emit_insn_before (set, insn);
> +
> +      insn = BB_HEAD (bb);
> +      while (insn && !NONDEBUG_INSN_P (insn))
> +       {
> +         if (insn == BB_END (bb))
> +           {
> +             insn = NULL;
> +             break;
> +           }
> +         insn = NEXT_INSN (insn);
> +       }
> +      if (insn == BB_HEAD (bb))
> +        set_insn = emit_insn_before (set, insn);
> +      else
> +       set_insn = emit_insn_after (set,
> +                                   insn ? PREV_INSN (insn) : BB_END (bb));
>        df_insn_rescan (set_insn);
>        df_process_deferred_rescans ();
>        loop_optimizer_finalize ();
> --- gcc/testsuite/gcc.target/i386/pr89474.c.jj  2019-02-25 14:21:51.651867104 +0100
> +++ gcc/testsuite/gcc.target/i386/pr89474.c     2019-02-25 14:21:34.373151405 +0100
> @@ -0,0 +1,14 @@
> +/* PR target/89474 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx" } */
> +
> +int a;
> +void foo (double);
> +int baz (void);
> +
> +void
> +bar (void)
> +{
> +  while (baz ())
> +    foo (a);
> +}
>
>         Jakub

Thanks.
Jakub Jelinek Feb. 26, 2019, 8:52 a.m. UTC | #2
Hi!

Honza, you've reviewed H.J.'s patch, could you please review this follow-up
as well?  Thanks.

On Mon, Feb 25, 2019 at 02:57:39PM -0800, H.J. Lu wrote:
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2019-02-25  Jakub Jelinek  <jakub@redhat.com>
> >
> >         PR target/89474
> >         * config/i386/i386.c (remove_partial_avx_dependency): Call
> >         df_analyze etc. before creation of the v4sf_const0 pseudo, rather than
> >         after changing possibly many instructions to use that pseudo.  Fix up
> >         insertion of v4sf_const0 setter at the start of bb.
> >
> >         * gcc.target/i386/pr89474.c: New test.

	Jakub
Richard Biener Feb. 26, 2019, 10:02 a.m. UTC | #3
On Mon, Feb 25, 2019 at 11:53 PM Jakub Jelinek <jakub@redhat.com> wrote:
>
> Hi!
>
> The following patch fixes two issues in the new rpad pass.
> One is that the insertion at the start of a basic block didn't work properly
> if the basic block didn't contain any non-NOTE/non-DEBUG_INSN instructions.
> next_nonnote_nondebug_insn hapilly turns through into another basic block
> and the insertion can insert an instruction in between basic blocks or into
> a different basic block from where we wanted to emit it.
> I believe we want to emit it after CODE_LABEL, after NOTE_INSN_BASIC_BLOCK
> and if possible, after debug insns in there, the patch emits it before the
> first normal insn in the bb if any, or after the BB_END (thus extending
> BB_END).
>
> Another issue is that it is quite weird/dangerous to add the v4sf_const0
> pseudo uses in lots of places in the IL, register those changes with df,
> then do df_analyze with different flags and finally emit the setter.
> I understand the goal was not to do df_analyze etc. in the usual case where
> there are no instructions that need this treatment.  This patch does the
> df_analyze at the spot we find the first insn, but before we actually change
> that instruction, so the changes are after the df_analyze.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2019-02-25  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/89474
>         * config/i386/i386.c (remove_partial_avx_dependency): Call
>         df_analyze etc. before creation of the v4sf_const0 pseudo, rather than
>         after changing possibly many instructions to use that pseudo.  Fix up
>         insertion of v4sf_const0 setter at the start of bb.
>
>         * gcc.target/i386/pr89474.c: New test.
>
> --- gcc/config/i386/i386.c.jj   2019-02-22 23:02:47.805117610 +0100
> +++ gcc/config/i386/i386.c      2019-02-25 14:20:05.793608879 +0100
> @@ -2835,7 +2835,14 @@ remove_partial_avx_dependency (void)
>             continue;
>
>           if (!v4sf_const0)
> -           v4sf_const0 = gen_reg_rtx (V4SFmode);
> +           {
> +             calculate_dominance_info (CDI_DOMINATORS);
> +             df_set_flags (DF_DEFER_INSN_RESCAN);
> +             df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> +             df_md_add_problem ();
> +             df_analyze ();
> +             v4sf_const0 = gen_reg_rtx (V4SFmode);
> +           }
>
>           /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF,
>              SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and
> @@ -2883,12 +2890,6 @@ remove_partial_avx_dependency (void)
>
>    if (v4sf_const0)
>      {
> -      calculate_dominance_info (CDI_DOMINATORS);
> -      df_set_flags (DF_DEFER_INSN_RESCAN);
> -      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
> -      df_md_add_problem ();
> -      df_analyze ();
> -
>        /* (Re-)discover loops so that bb->loop_father can be used in the
>          analysis below.  */
>        loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
> @@ -2904,11 +2905,23 @@ remove_partial_avx_dependency (void)
>         bb = get_immediate_dominator (CDI_DOMINATORS,
>                                       bb->loop_father->header);
>
> -      insn = BB_HEAD (bb);
> -      if (!NONDEBUG_INSN_P (insn))
> -       insn = next_nonnote_nondebug_insn (insn);
>        set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode));
> -      set_insn = emit_insn_before (set, insn);
> +
> +      insn = BB_HEAD (bb);
> +      while (insn && !NONDEBUG_INSN_P (insn))
> +       {
> +         if (insn == BB_END (bb))
> +           {
> +             insn = NULL;
> +             break;
> +           }
> +         insn = NEXT_INSN (insn);
> +       }
> +      if (insn == BB_HEAD (bb))
> +        set_insn = emit_insn_before (set, insn);
> +      else
> +       set_insn = emit_insn_after (set,
> +                                   insn ? PREV_INSN (insn) : BB_END (bb));
>        df_insn_rescan (set_insn);
>        df_process_deferred_rescans ();
>        loop_optimizer_finalize ();
> --- gcc/testsuite/gcc.target/i386/pr89474.c.jj  2019-02-25 14:21:51.651867104 +0100
> +++ gcc/testsuite/gcc.target/i386/pr89474.c     2019-02-25 14:21:34.373151405 +0100
> @@ -0,0 +1,14 @@
> +/* PR target/89474 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx" } */
> +
> +int a;
> +void foo (double);
> +int baz (void);
> +
> +void
> +bar (void)
> +{
> +  while (baz ())
> +    foo (a);
> +}
>
>         Jakub
diff mbox series

Patch

--- gcc/config/i386/i386.c.jj	2019-02-22 23:02:47.805117610 +0100
+++ gcc/config/i386/i386.c	2019-02-25 14:20:05.793608879 +0100
@@ -2835,7 +2835,14 @@  remove_partial_avx_dependency (void)
 	    continue;
 
 	  if (!v4sf_const0)
-	    v4sf_const0 = gen_reg_rtx (V4SFmode);
+	    {
+	      calculate_dominance_info (CDI_DOMINATORS);
+	      df_set_flags (DF_DEFER_INSN_RESCAN);
+	      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
+	      df_md_add_problem ();
+	      df_analyze ();
+	      v4sf_const0 = gen_reg_rtx (V4SFmode);
+	    }
 
 	  /* Convert PARTIAL_XMM_UPDATE_TRUE insns, DF -> SF, SF -> DF,
 	     SI -> SF, SI -> DF, DI -> SF, DI -> DF, to vec_dup and
@@ -2883,12 +2890,6 @@  remove_partial_avx_dependency (void)
 
   if (v4sf_const0)
     {
-      calculate_dominance_info (CDI_DOMINATORS);
-      df_set_flags (DF_DEFER_INSN_RESCAN);
-      df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
-      df_md_add_problem ();
-      df_analyze ();
-
       /* (Re-)discover loops so that bb->loop_father can be used in the
 	 analysis below.  */
       loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
@@ -2904,11 +2905,23 @@  remove_partial_avx_dependency (void)
 	bb = get_immediate_dominator (CDI_DOMINATORS,
 				      bb->loop_father->header);
 
-      insn = BB_HEAD (bb);
-      if (!NONDEBUG_INSN_P (insn))
-	insn = next_nonnote_nondebug_insn (insn);
       set = gen_rtx_SET (v4sf_const0, CONST0_RTX (V4SFmode));
-      set_insn = emit_insn_before (set, insn);
+
+      insn = BB_HEAD (bb);
+      while (insn && !NONDEBUG_INSN_P (insn))
+	{
+	  if (insn == BB_END (bb))
+	    {
+	      insn = NULL;
+	      break;
+	    }
+	  insn = NEXT_INSN (insn);
+	}
+      if (insn == BB_HEAD (bb))
+        set_insn = emit_insn_before (set, insn);
+      else
+	set_insn = emit_insn_after (set,
+				    insn ? PREV_INSN (insn) : BB_END (bb));
       df_insn_rescan (set_insn);
       df_process_deferred_rescans ();
       loop_optimizer_finalize ();
--- gcc/testsuite/gcc.target/i386/pr89474.c.jj	2019-02-25 14:21:51.651867104 +0100
+++ gcc/testsuite/gcc.target/i386/pr89474.c	2019-02-25 14:21:34.373151405 +0100
@@ -0,0 +1,14 @@ 
+/* PR target/89474 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+int a;
+void foo (double);
+int baz (void);
+
+void
+bar (void)
+{
+  while (baz ())
+    foo (a);
+}