Patchwork PR middle-end/51472: handle TM memmove with non-addressable destinations

login
register
mail settings
Submitter Aldy Hernandez
Date Jan. 4, 2012, 6:20 p.m.
Message ID <4F049864.1030004@gmail.com>
Download mbox | patch
Permalink /patch/134313/
State New
Headers show

Comments

Aldy Hernandez - Jan. 4, 2012, 6:20 p.m.
I fixed this PR, and then it got reopened because the testcase triggered 
a different problem on Alpha, Mips, and other architectures.  The 
problem is actually totally different than the previous fix for 51472, 
and has nothing to do with --param tm-max-aggregate-size.

This problem here is that a load is transformed into a transactional 
memmove in such a way that the subsequent SSA uses are pointing to the 
wrong thing.  For example, originally we have:

     global_var_ssa_999 = global_var;
     *p = global_var_ssa_999;

Through expand_assign_tm -> gimplify_addr -> force_gimple_operand_gsi, 
the above gets transformed into this:

     D.1234 = global_var_ssa_999; <-- BOO HISS! Uninitialized.
     __builtin__ITM_memmoveRtWt (&D.1234, &global_var, 16);
     *p = global_var_ssa_999; <-- Wuuuut?

We should either propagate D.1234 to the uses of global_var_ssa_999, or 
copy D.1234 into global_var_ssa_999 and happily proceed.
   Option B is pretty straightforward, and with the attached patch we 
end up with:

__builtin__ITM_memmoveRtWt (&D.1234, &global_var, 16);
     global_var_ssa_999 = D.1234;

The attached patch fixes the ICE on alpha-linux-gnu as tested with a 
cross-cc1 build.  Fully bootregtested on x86-64 Linux.

OK?
PR middle-end/51472
	* trans-mem.c (expand_assign_tm): Handle TM_MEMMOVE loads correctly.
testsuite/
	PR middle-end/51472
	* gcc.dg/tm/memopt-6.c: Adjust regexp.
Patrick Marlier - Jan. 4, 2012, 6:59 p.m.
On 01/04/2012 01:20 PM, Aldy Hernandez wrote:
> The attached patch fixes the ICE on alpha-linux-gnu as tested with a
> cross-cc1 build.

Note that it was also failing with i686/linux and the patch fixes the ICE.

Patrick.
Richard Henderson - Jan. 4, 2012, 8:48 p.m.
On 01/05/2012 05:20 AM, Aldy Hernandez wrote:
> 	PR middle-end/51472
> 	* trans-mem.c (expand_assign_tm): Handle TM_MEMMOVE loads correctly.
> testsuite/
> 	PR middle-end/51472
> 	* gcc.dg/tm/memopt-6.c: Adjust regexp.

Ok.


r~
Patrick Marlier - Jan. 4, 2012, 11:16 p.m.
On 01/04/2012 01:20 PM, Aldy Hernandez wrote:
> I fixed this PR, and then it got reopened because the testcase triggered
> a different problem on Alpha, Mips, and other architectures. The problem
> is actually totally different than the previous fix for 51472, and has
> nothing to do with --param tm-max-aggregate-size.

Also solve PR/51685 because it is a duplicate.

Patrick.
Richard Guenther - Jan. 5, 2012, 9:20 a.m.
On Wed, Jan 4, 2012 at 7:20 PM, Aldy Hernandez <aldyher@gmail.com> wrote:
> I fixed this PR, and then it got reopened because the testcase triggered a
> different problem on Alpha, Mips, and other architectures.  The problem is
> actually totally different than the previous fix for 51472, and has nothing
> to do with --param tm-max-aggregate-size.
>
> This problem here is that a load is transformed into a transactional memmove
> in such a way that the subsequent SSA uses are pointing to the wrong thing.
>  For example, originally we have:
>
>    global_var_ssa_999 = global_var;
>    *p = global_var_ssa_999;
>
> Through expand_assign_tm -> gimplify_addr -> force_gimple_operand_gsi, the
> above gets transformed into this:
>
>    D.1234 = global_var_ssa_999; <-- BOO HISS! Uninitialized.
>    __builtin__ITM_memmoveRtWt (&D.1234, &global_var, 16);
>    *p = global_var_ssa_999; <-- Wuuuut?
>
> We should either propagate D.1234 to the uses of global_var_ssa_999, or copy
> D.1234 into global_var_ssa_999 and happily proceed.
>  Option B is pretty straightforward, and with the attached patch we end up
> with:
>
> __builtin__ITM_memmoveRtWt (&D.1234, &global_var, 16);
>    global_var_ssa_999 = D.1234;
>
> The attached patch fixes the ICE on alpha-linux-gnu as tested with a
> cross-cc1 build.  Fully bootregtested on x86-64 Linux.
>
> OK?

You want is_gimple_reg () instead of is_gimple_non_addressable () as you
can't simply make something addressable at this point.
is_gimple_non_addressable
looks like a weird redundant predicate to me - it's only used once and its use
should be replaced (and the predicate removed).

Richard.
Aldy Hernandez - Jan. 5, 2012, 1:42 p.m.
On 01/04/12 17:16, Patrick Marlier wrote:
> On 01/04/2012 01:20 PM, Aldy Hernandez wrote:
>> I fixed this PR, and then it got reopened because the testcase triggered
>> a different problem on Alpha, Mips, and other architectures. The problem
>> is actually totally different than the previous fix for 51472, and has
>> nothing to do with --param tm-max-aggregate-size.
>
> Also solve PR/51685 because it is a duplicate.
>
> Patrick.
>

Sweet!  Three for one!  Adjusted PR as duplicate.  Closed the original PR.
Aldy Hernandez - Jan. 5, 2012, 2:32 p.m.
> You want is_gimple_reg () instead of is_gimple_non_addressable () as you
> can't simply make something addressable at this point.
> is_gimple_non_addressable
> looks like a weird redundant predicate to me - it's only used once and its use
> should be replaced (and the predicate removed).

Ok, thanks.  I am testing a patch changing my original patch, and 
cleaning up the other use of is_gimple_non_addressable.
Richard Guenther - Jan. 5, 2012, 3:35 p.m.
On Thu, Jan 5, 2012 at 3:32 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> You want is_gimple_reg () instead of is_gimple_non_addressable () as you
>> can't simply make something addressable at this point.
>> is_gimple_non_addressable
>> looks like a weird redundant predicate to me - it's only used once and its
>> use
>> should be replaced (and the predicate removed).
>
>
> Ok, thanks.  I am testing a patch changing my original patch, and cleaning
> up the other use of is_gimple_non_addressable.

... which is btw a bit weird and probably not easy to change at this point
(a check would be !is_gimple_reg () && !TREE_ADDRESSABLE (), but
during gimplification we make more things addressable, so that isn't really
a good predicate).  So please at most inline it into its single caller.

Thanks,
Richard.

Patch

Index: testsuite/gcc.dg/tm/memopt-6.c
===================================================================
--- testsuite/gcc.dg/tm/memopt-6.c	(revision 182848)
+++ testsuite/gcc.dg/tm/memopt-6.c	(working copy)
@@ -17,5 +17,5 @@  int f()
   return lala.x[i];
 }
 
-/* { dg-final { scan-tree-dump-times "memmoveRtWt \\\(&lala, &lacopy" 1 "tmedge" } } */
+/* { dg-final { scan-tree-dump-times "memmoveRtWt \\\(.*, &lacopy" 1 "tmedge" } } */
 /* { dg-final { cleanup-tree-dump "tmedge" } } */
Index: trans-mem.c
===================================================================
--- trans-mem.c	(revision 182876)
+++ trans-mem.c	(working copy)
@@ -2174,7 +2174,7 @@  expand_assign_tm (struct tm_region *regi
     }
   if (!gcall)
     {
-      tree lhs_addr, rhs_addr;
+      tree lhs_addr, rhs_addr, tmp;
 
       if (load_p)
 	transaction_subcode_ior (region, GTMA_HAVE_LOAD);
@@ -2183,13 +2183,29 @@  expand_assign_tm (struct tm_region *regi
 
       /* ??? Figure out if there's any possible overlap between the LHS
 	 and the RHS and if not, use MEMCPY.  */
-      lhs_addr = gimplify_addr (gsi, lhs);
+
+      if (load_p && is_gimple_non_addressable (lhs))
+	{
+	  tmp = create_tmp_var (TREE_TYPE (lhs), NULL);
+	  lhs_addr = build_fold_addr_expr (tmp);
+	}
+      else
+	{
+	  tmp = NULL_TREE;
+	  lhs_addr = gimplify_addr (gsi, lhs);
+	}
       rhs_addr = gimplify_addr (gsi, rhs);
       gcall = gimple_build_call (builtin_decl_explicit (BUILT_IN_TM_MEMMOVE),
 				 3, lhs_addr, rhs_addr,
 				 TYPE_SIZE_UNIT (TREE_TYPE (lhs)));
       gimple_set_location (gcall, loc);
       gsi_insert_before (gsi, gcall, GSI_SAME_STMT);
+
+      if (tmp)
+	{
+	  gcall = gimple_build_assign (lhs, tmp);
+	  gsi_insert_before (gsi, gcall, GSI_SAME_STMT);
+	}
     }
 
   /* Now that we have the load/store in its instrumented form, add