Patchwork Fix PR tree-optimization/50569

Submitter Eric Botcazou
Date Dec. 10, 2011, 9:31 p.m.
Message ID <201112102231.23872.ebotcazou@adacore.com>
Permalink /patch/130583/
State New

Comments

Eric Botcazou - Dec. 10, 2011, 9:31 p.m.
Hi,

this is a regression present on mainline and 4.6 branch at -O for the SPARC.  
The compiler again generates an unaligned access for the memcpy calls in:

struct event {
    struct {
	unsigned int sec;
    } sent __attribute__((packed));
};

void __attribute__((noinline,noclone)) frob_entry(char *buf)
{
    struct event event;

    __builtin_memcpy(&event, buf, sizeof(event));
    if (event.sent.sec < 64) {
	event.sent.sec = -1U;
	__builtin_memcpy(buf, &event, sizeof(event));
    }
}

Unsurprisingly enough, the trick used in build_ref_for_model (in case this is a 
reference to a component, the function will replicate the last COMPONENT_REF 
of model's expr to access it) isn't sufficient anymore with MEM_REFs around, 
since MEM_REFs can encapsulate an arbitrary number of inner references.

Fixed by extending the trick to chains of COMPONENT_REFs.  Tested on x86/Linux 
and SPARC/Solaris, OK for mainline and 4.6 branch?


2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>

	PR tree-optimization/50569
	* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
	in the expression of MODEL instead of just the last one.


2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>

	* gcc.c-torture/execute/20111210-1.c: New test.

Richard Guenther - Dec. 12, 2011, 9:37 a.m.
On Sat, Dec 10, 2011 at 10:31 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
> Hi,
>
> this is a regression present on mainline and 4.6 branch at -O for the SPARC.
> The compiler again generates an unaligned access for the memcpy calls in:
>
> struct event {
>    struct {
>        unsigned int sec;
>    } sent __attribute__((packed));
> };
>
> void __attribute__((noinline,noclone)) frob_entry(char *buf)
> {
>    struct event event;
>
>    __builtin_memcpy(&event, buf, sizeof(event));
>    if (event.sent.sec < 64) {
>        event.sent.sec = -1U;
>        __builtin_memcpy(buf, &event, sizeof(event));
>    }
> }
>
> Unsurprisingly enough, the trick used in build_ref_for_model (in case this is a
> reference to a component, the function will replicate the last COMPONENT_REF
> of model's expr to access it) isn't sufficient anymore with MEM_REFs around,
> since MEM_REFs can encapsulate an arbitrary number of inner references.
>
> Fixed by extending the trick to chains of COMPONENT_REFs.  Tested on x86/Linux
> and SPARC/Solaris, OK for mainline and 4.6 branch?

Ok for both.

Thanks,
Richard.

>
> 2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>
>
>        PR tree-optimization/50569
>        * tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
>        in the expression of MODEL instead of just the last one.
>
>
> 2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>
>
>        * gcc.c-torture/execute/20111210-1.c: New test.
>
>
> --
> Eric Botcazou

Martin Jambor - Dec. 12, 2011, 9:49 a.m.
Hi,

On Sat, Dec 10, 2011 at 10:31:23PM +0100, Eric Botcazou wrote:
> Hi,
> 
> this is a regression present on mainline and 4.6 branch at -O for the SPARC.  
> The compiler again generates an unaligned access for the memcpy calls in:
> 
> struct event {
>     struct {
> 	unsigned int sec;
>     } sent __attribute__((packed));
> };
> 
> void __attribute__((noinline,noclone)) frob_entry(char *buf)
> {
>     struct event event;
> 
>     __builtin_memcpy(&event, buf, sizeof(event));
>     if (event.sent.sec < 64) {
> 	event.sent.sec = -1U;
> 	__builtin_memcpy(buf, &event, sizeof(event));
>     }
> }

I believe there are many manifestations of this issue; the ones I track
are PR 50052 and PR 50444, the latter of which even has an x86_64 SSE testcase.

> 
> Unsurprisingly enough, the trick used in build_ref_for_model (in case this is a 
> reference to a component, the function will replicate the last COMPONENT_REF 
> of model's expr to access it) isn't sufficient anymore with MEM_REFs around, 
> since MEM_REFs can encapsulate an arbitrary number of inner references.
> 
> Fixed by extending the trick to chains of COMPONENT_REFs.

Well, I can live with this change (though I cannot approve anything).
On the other hand, the real underlying problem is that expander cannot
handle unaligned MEM_REFs where strict alignment is required.  SRA is
of course much more prone to create such situations than anything else
but I wonder whether they can creep up elsewhere too.  It also takes
us in the opposite direction than the one initially intended with
MEM_REFs, doesn't it?

That said, I looked into the expander briefly in summer but given my
level of experience in that area I did not nearly have enough time.  I
still plan to look into this issue in expander but for the same
reasons I cannot guarantee any quick success. So I acknowledge this is
the only working approach to a long-standing difficult bug... and most
probably the most appropriate for the 4.6 branch.

However, since we have them, shouldn't we use stack-based vectors to
handle the stack of COMPONENT_REFs?

Thanks,

Martin


>  Tested on x86/Linux 
> and SPARC/Solaris, OK for mainline and 4.6 branch?
> 
> 
> 2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	PR tree-optimization/50569
> 	* tree-sra.c (build_ref_for_model): Replicate a chain of COMPONENT_REFs
> 	in the expression of MODEL instead of just the last one.
> 
> 
> 2011-12-10  Eric Botcazou  <ebotcazou@adacore.com>
> 
> 	* gcc.c-torture/execute/20111210-1.c: New test.
> 
> 
> -- 
> Eric Botcazou

> /* PR tree-optimization/50569 */
> /* Reported by Paul Koning <pkoning@gcc.gnu.org> */
> /* Reduced testcase by Mikael Pettersson <mikpe@it.uu.se> */
> 
> struct event {
>     struct {
> 	unsigned int sec;
>     } sent __attribute__((packed));
> };
> 
> void __attribute__((noinline,noclone)) frob_entry(char *buf)
> {
>     struct event event;
> 
>     __builtin_memcpy(&event, buf, sizeof(event));
>     if (event.sent.sec < 64) {
> 	event.sent.sec = -1U;
> 	__builtin_memcpy(buf, &event, sizeof(event));
>     }
> }
> 
> int main(void)
> {
>     union {
> 	char buf[1 + sizeof(struct event)];
> 	int align;
>     } u;
> 
>     __builtin_memset(&u, 0, sizeof u);
> 
>     frob_entry(&u.buf[1]);
> 
>     return 0;
> }

> Index: tree-sra.c
> ===================================================================
> --- tree-sra.c	(revision 182102)
> +++ tree-sra.c	(working copy)
> @@ -1493,32 +1493,61 @@ build_ref_for_offset (location_t loc, tr
>  }
>  
>  /* Construct a memory reference to a part of an aggregate BASE at the given
> -   OFFSET and of the same type as MODEL.  In case this is a reference to a
> -   component, the function will replicate the last COMPONENT_REF of model's
> -   expr to access it.  GSI and INSERT_AFTER have the same meaning as in
> -   build_ref_for_offset.  */
> +   OFFSET and of the type of MODEL.  In case this is a chain of references
> +   to component, the function will replicate the chain of COMPONENT_REFs of
> +   the expression of MODEL to access it.  GSI and INSERT_AFTER have the same
> +   meaning as in build_ref_for_offset.  */
>  
>  static tree
>  build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
>  		     struct access *model, gimple_stmt_iterator *gsi,
>  		     bool insert_after)
>  {
> +  tree type = model->type, t;
> +  VEC(tree,heap) *stack = NULL;
> +
>    if (TREE_CODE (model->expr) == COMPONENT_REF)
>      {
> -      tree t, exp_type, fld = TREE_OPERAND (model->expr, 1);
> -      tree cr_offset = component_ref_field_offset (model->expr);
> +      tree expr = model->expr;
> +
> +      /* Create a stack of the COMPONENT_REFs so later we can walk them in
> +	 order from inner to outer.  */
> +      stack = VEC_alloc (tree, heap, 6);
> +
> +      do {
> +	tree field = TREE_OPERAND (expr, 1);
> +	tree cr_offset = component_ref_field_offset (expr);
> +	gcc_assert (cr_offset && host_integerp (cr_offset, 1));
> +
> +	offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
> +	offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
>  
> -      gcc_assert (cr_offset && host_integerp (cr_offset, 1));
> -      offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
> -      offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld));
> -      exp_type = TREE_TYPE (TREE_OPERAND (model->expr, 0));
> -      t = build_ref_for_offset (loc, base, offset, exp_type, gsi, insert_after);
> -      return fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), t, fld,
> -			      TREE_OPERAND (model->expr, 2));
> +	VEC_safe_push (tree, heap, stack, expr);
> +
> +	expr = TREE_OPERAND (expr, 0);
> +	type = TREE_TYPE (expr);
> +      } while (TREE_CODE (expr) == COMPONENT_REF);
>      }
> -  else
> -    return build_ref_for_offset (loc, base, offset, model->type,
> -				 gsi, insert_after);
> +
> +  t = build_ref_for_offset (loc, base, offset, type, gsi, insert_after);
> +
> +  if (TREE_CODE (model->expr) == COMPONENT_REF)
> +    {
> +      unsigned i;
> +      tree expr;
> +
> +      /* Now replicate the chain of COMPONENT_REFs from inner to outer.  */
> +      FOR_EACH_VEC_ELT_REVERSE (tree, stack, i, expr)
> +	{
> +	  tree field = TREE_OPERAND (expr, 1);
> +	  t = fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), t, field,
> +			       TREE_OPERAND (expr, 2));
> +	}
> +
> +      VEC_free (tree, heap, stack);
> +    }
> +
> +  return t;
>  }
>  
>  /* Construct a memory reference consisting of component_refs and array_refs to

Eric Botcazou - Dec. 12, 2011, 11:40 a.m.
> Well, I can live with this change (though I cannot approve anything).
> On the other hand, the real underlying problem is that expander cannot
> handle unaligned MEM_REFs where strict alignment is required.  SRA is
> of course much more prone to create such situations than anything else
> but I wonder whether they can creep up elsewhere too.  It also takes
> us in the opposite direction than the one initially intended with
> MEM_REFs, doesn't it?

Certainly, but we need to fix the regression in a relatively safe manner.

> That said, I looked into the expander briefly in summer but given my
> level of experience in that area I did not nearly have enough time.  I
> still plan to look into this issue in expander but for the same
> reasons I cannot guarantee any quick success. So I acknowledge this is
> the only working approach to a long-standing difficult bug... and most
> probably the most appropriate for the 4.6 branch.

Thanks.  This is still the same very old issue: misalignment cannot be handled 
indirectly (because we don't really have misaligned pointers) so MEM_REFs can 
be used safely only when everything is properly aligned.

> However, since we have them, shouldn't we use stack-based vectors to
> handle the stack of COMPONENT_REFs?

Indeed, I will make the change before installing.

Richard Guenther - Dec. 12, 2011, 1:40 p.m.
On Mon, Dec 12, 2011 at 12:40 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Well, I can live with this change (though I cannot approve anything).
>> On the other hand, the real underlying problem is that expander cannot
>> handle unaligned MEM_REFs where strict alignment is required.  SRA is
>> of course much more prone to create such situations than anything else
>> but I wonder whether they can creep up elsewhere too.  It also takes
>> us in the opposite direction than the one initially intended with
>> MEM_REFs, doesn't it?
>
> Certainly, but we need to fix the regression in a relatively safe manner.
>
>> That said, I looked into the expander briefly in summer but given my
>> level of experience in that area I did not nearly have enough time.  I
>> still plan to look into this issue in expander but for the same
>> reasons I cannot guarantee any quick success. So I acknowledge this is
>> the only working approach to a long-standing difficult bug... and most
>> probably the most appropriate for the 4.6 branch.
>
> Thanks.  This is still the same very old issue: misalignment cannot be handled
> indirectly (because we don't really have misaligned pointers) so MEM_REFs can
> be used safely only when everything is properly aligned.

We do have misaligned accesses: TYPE_ALIGN of TREE_TYPE of
the MEM_REF reflects that.  Similar to how we would do

typedef int myint __attribute__((aligned(1)));
int foo (myint *p)
{
  return *p;
}

which is a testcase that has been "miscompiled" since forever on
STRICT_ALIGNMENT targets (well, maybe apart from now, for those
who implement movmisalign).

The fix is to fix the above testcase (which is a good idea anyway)
and then to make sure the misalignment information is carried over to
the TREE_TYPE of the MEM_REF we create.

Richard.

Patch

Index: tree-sra.c
===================================================================
--- tree-sra.c	(revision 182102)
+++ tree-sra.c	(working copy)
@@ -1493,32 +1493,61 @@  build_ref_for_offset (location_t loc, tr
 }
 
 /* Construct a memory reference to a part of an aggregate BASE at the given
-   OFFSET and of the same type as MODEL.  In case this is a reference to a
-   component, the function will replicate the last COMPONENT_REF of model's
-   expr to access it.  GSI and INSERT_AFTER have the same meaning as in
-   build_ref_for_offset.  */
+   OFFSET and of the type of MODEL.  In case this is a chain of references
+   to component, the function will replicate the chain of COMPONENT_REFs of
+   the expression of MODEL to access it.  GSI and INSERT_AFTER have the same
+   meaning as in build_ref_for_offset.  */
 
 static tree
 build_ref_for_model (location_t loc, tree base, HOST_WIDE_INT offset,
 		     struct access *model, gimple_stmt_iterator *gsi,
 		     bool insert_after)
 {
+  tree type = model->type, t;
+  VEC(tree,heap) *stack = NULL;
+
   if (TREE_CODE (model->expr) == COMPONENT_REF)
     {
-      tree t, exp_type, fld = TREE_OPERAND (model->expr, 1);
-      tree cr_offset = component_ref_field_offset (model->expr);
+      tree expr = model->expr;
+
+      /* Create a stack of the COMPONENT_REFs so later we can walk them in
+	 order from inner to outer.  */
+      stack = VEC_alloc (tree, heap, 6);
+
+      do {
+	tree field = TREE_OPERAND (expr, 1);
+	tree cr_offset = component_ref_field_offset (expr);
+	gcc_assert (cr_offset && host_integerp (cr_offset, 1));
+
+	offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
+	offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (field));
 
-      gcc_assert (cr_offset && host_integerp (cr_offset, 1));
-      offset -= TREE_INT_CST_LOW (cr_offset) * BITS_PER_UNIT;
-      offset -= TREE_INT_CST_LOW (DECL_FIELD_BIT_OFFSET (fld));
-      exp_type = TREE_TYPE (TREE_OPERAND (model->expr, 0));
-      t = build_ref_for_offset (loc, base, offset, exp_type, gsi, insert_after);
-      return fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (fld), t, fld,
-			      TREE_OPERAND (model->expr, 2));
+	VEC_safe_push (tree, heap, stack, expr);
+
+	expr = TREE_OPERAND (expr, 0);
+	type = TREE_TYPE (expr);
+      } while (TREE_CODE (expr) == COMPONENT_REF);
     }
-  else
-    return build_ref_for_offset (loc, base, offset, model->type,
-				 gsi, insert_after);
+
+  t = build_ref_for_offset (loc, base, offset, type, gsi, insert_after);
+
+  if (TREE_CODE (model->expr) == COMPONENT_REF)
+    {
+      unsigned i;
+      tree expr;
+
+      /* Now replicate the chain of COMPONENT_REFs from inner to outer.  */
+      FOR_EACH_VEC_ELT_REVERSE (tree, stack, i, expr)
+	{
+	  tree field = TREE_OPERAND (expr, 1);
+	  t = fold_build3_loc (loc, COMPONENT_REF, TREE_TYPE (field), t, field,
+			       TREE_OPERAND (expr, 2));
+	}
+
+      VEC_free (tree, heap, stack);
+    }
+
+  return t;
 }
 
 /* Construct a memory reference consisting of component_refs and array_refs to