diff mbox series

Fix wrong code generated by unroll-and-jam pass

Message ID 4094054.1IzOArtZ34@fomalhaut
State New
Headers show
Series Fix wrong code generated by unroll-and-jam pass | expand

Commit Message

Eric Botcazou Oct. 5, 2022, 3:36 p.m. UTC
Hi,

as shown by the attached testcase, there is a loophole in the unroll-and-jam
pass that can quickly result in wrong code generation.  The code reads:

    if (!compute_data_dependences_for_loop (outer, true, &loop_nest,
				&datarefs, &dependences))
	{
	  if (dump_file && (dump_flags & TDF_DETAILS))
	    fprintf (dump_file, "Cannot analyze data dependencies\n");
	  free_data_refs (datarefs);
	  free_dependence_relations (dependences);
	  continue;
	}

but compute_data_dependences_for_loop may return true even if the analysis is
reported as failing by compute_affine_dependence for some dependence pair:

(compute_affine_dependence
  ref_a: data[_14], stmt_a: data[_14] = i_59;
  ref_b: data[_14], stmt_b: data[_14] = i_59;
Data ref a:
#(Data Ref: 
#  bb: 12 
#  stmt: data[_14] = i_59;
#  ref: data[_14];
#  base_object: data;
#  Access function 0: scev_not_known;
#)
Data ref b:
#(Data Ref: 
#  bb: 12 
#  stmt: data[_14] = i_59;
#  ref: data[_14];
#  base_object: data;
#  Access function 0: scev_not_known;
#)
affine dependence test not usable: access function not affine or constant.
) -> dependence analysis failed

Note that this is a self-dependence pair and the code for them reads:

	  /* Nothing interesting for the self dependencies.  */
	  if (dra == drb)
	    continue;

This means that the pass may reorder "complex" accesses to the same memory
location in successive iterations, which is OK for reads but not for writes.

Proposed fix attached, tested on x86-64/Linux, OK for all active branches?


2022-10-05  Eric Botcazou  <ebotcazou@adacore.com>

	* gimple-loop-jam.cc (tree_loop_unroll_and_jam): Bail out for a self
	dependency that is a write-after-write if the access function is not
	affine or constant.


2022-10-05  Eric Botcazou  <ebotcazou@adacore.com>

	* gcc.c-torture/execute/20221005-1.c: New test.

Comments

Richard Biener Oct. 6, 2022, 8:21 a.m. UTC | #1
On Wed, Oct 5, 2022 at 5:39 PM Eric Botcazou via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> as shown by the attached testcase, there is a loophole in the unroll-and-jam
> pass that can quickly result in wrong code generation.  The code reads:
>
>     if (!compute_data_dependences_for_loop (outer, true, &loop_nest,
>                                 &datarefs, &dependences))
>         {
>           if (dump_file && (dump_flags & TDF_DETAILS))
>             fprintf (dump_file, "Cannot analyze data dependencies\n");
>           free_data_refs (datarefs);
>           free_dependence_relations (dependences);
>           continue;
>         }
>
> but compute_data_dependences_for_loop may return true even if the analysis is
> reported as failing by compute_affine_dependence for some dependence pair:
>
> (compute_affine_dependence
>   ref_a: data[_14], stmt_a: data[_14] = i_59;
>   ref_b: data[_14], stmt_b: data[_14] = i_59;
> Data ref a:
> #(Data Ref:
> #  bb: 12
> #  stmt: data[_14] = i_59;
> #  ref: data[_14];
> #  base_object: data;
> #  Access function 0: scev_not_known;
> #)
> Data ref b:
> #(Data Ref:
> #  bb: 12
> #  stmt: data[_14] = i_59;
> #  ref: data[_14];
> #  base_object: data;
> #  Access function 0: scev_not_known;
> #)
> affine dependence test not usable: access function not affine or constant.
> ) -> dependence analysis failed
>
> Note that this is a self-dependence pair and the code for them reads:
>
>           /* Nothing interesting for the self dependencies.  */
>           if (dra == drb)
>             continue;
>
> This means that the pass may reorder "complex" accesses to the same memory
> location in successive iterations, which is OK for reads but not for writes.
>
> Proposed fix attached, tested on x86-64/Linux, OK for all active branches?

+             if (DR_IS_WRITE (dra)
+                 && !DR_ACCESS_FNS (dra).is_empty ()
+                 && DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
+               {

I'm wondering if testing DR_IS_WRITE (dra) is enough here and whether
the logic also applies to RAW and WAR.  So should it be either
(DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) or DR_IS_WRITE (dra) &&
DR_IS_WRITE (drb)
instead?

Otherwise thanks for catching.

Richard.

>
> 2022-10-05  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gimple-loop-jam.cc (tree_loop_unroll_and_jam): Bail out for a self
>         dependency that is a write-after-write if the access function is not
>         affine or constant.
>
>
> 2022-10-05  Eric Botcazou  <ebotcazou@adacore.com>
>
>         * gcc.c-torture/execute/20221005-1.c: New test.
>
> --
> Eric Botcazou
Eric Botcazou Oct. 6, 2022, 11:09 a.m. UTC | #2
> I'm wondering if testing DR_IS_WRITE (dra) is enough here and whether
> the logic also applies to RAW and WAR.  So should it be either
> (DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) or DR_IS_WRITE (dra) &&
> DR_IS_WRITE (drb) instead?

It's a self-dependence, i.e. dra == drb in the block.  Or do you mean for 
other dependence pairs in general?  For them, I think that the code does the 
proper filtering in adjust_unroll_factor, but I may be wrong of course.
Richard Biener Oct. 6, 2022, 12:21 p.m. UTC | #3
On Thu, Oct 6, 2022 at 1:09 PM Eric Botcazou <botcazou@adacore.com> wrote:
>
> > I'm wondering if testing DR_IS_WRITE (dra) is enough here and whether
> > the logic also applies to RAW and WAR.  So should it be either
> > (DR_IS_WRITE (dra) || DR_IS_WRITE (drb)) or DR_IS_WRITE (dra) &&
> > DR_IS_WRITE (drb) instead?
>
> It's a self-dependence, i.e. dra == drb in the block.  Or do you mean for
> other dependence pairs in general?  For them, I think that the code does the
> proper filtering in adjust_unroll_factor, but I may be wrong of course.

Ah, I see.

The original patch is OK.

thanks,
Richard.

> --
> Eric Botcazou
>
>
diff mbox series

Patch

diff --git a/gcc/gimple-loop-jam.cc b/gcc/gimple-loop-jam.cc
index a8a57d3d384..4f7a6e5bbae 100644
--- a/gcc/gimple-loop-jam.cc
+++ b/gcc/gimple-loop-jam.cc
@@ -545,11 +545,25 @@  tree_loop_unroll_and_jam (void)
 	  /* If the refs are independend there's nothing to do.  */
 	  if (DDR_ARE_DEPENDENT (ddr) == chrec_known)
 	    continue;
+
 	  dra = DDR_A (ddr);
 	  drb = DDR_B (ddr);
-	  /* Nothing interesting for the self dependencies.  */
+
+	  /* Nothing interesting for the self dependencies, except for WAW if
+	     the access function is not affine or constant because we may end
+	     up reordering writes to the same location.  */
 	  if (dra == drb)
-	    continue;
+	    {
+	      if (DR_IS_WRITE (dra)
+		  && !DR_ACCESS_FNS (dra).is_empty ()
+		  && DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
+		{
+		  unroll_factor = 0;
+		  break;
+		}
+	      else
+		continue;
+	    }
 
 	  /* Now check the distance vector, for determining a sensible
 	     outer unroll factor, and for validity of merging the inner