Patchwork [PR,55755] Make SRA create less VIEW_CONVERT_EXPRs

login
register
mail settings
Submitter Martin Jambor
Date Jan. 3, 2013, 5:45 p.m.
Message ID <20130103174527.GA5183@virgil.suse>
Download mbox | patch
Permalink /patch/209287/
State New
Headers show

Comments

Martin Jambor - Jan. 3, 2013, 5:45 p.m.
Hi,

the patch below fixes PR 55755 which was in the compiler for years.
The problem is that a replacement of a bit-field can have a larger
TYPE_SIZE than the type of the field and so creating a V_C_E from it
to the field type may result in invalid gimple.  We do that when we
scalarize only one side of an assignment and get incompatible types on
both sides and the other (non-scalar) side has a child access in the
access tree (regardless if it is to be scalarize or not).

When looking at the issue I realized that the last condition is
completely unnecessary (at least now, the first concepts of the "new"
SRA were a bit different) because the subsequent handling of
sub-replacements will do the right thing (load/store them to the
original aggregate) and removing it is indeed the correct thing to
deal with this bug - if both sides are scalarized, size of both will
grow to mode size, if only one, we can avoid the V_C_E.  I am a little
worried about the contains_bitfld_comp_ref_p and
contains_vce_or_bfcref_p gurads which are there because of Ada PR
46349 (which involves aggregate bit-fields) and which might in theory
lead to the same problem but I'm weary of touching it, at least not in
one commit (I'm testing what happens if I remove them right now), and
this patch does not make the current situation any worse.

In order to make sure we do not mess up when the non-scalar side has
sub-replacements in it, I have added a new testcase.

The patch has passed bootstrap and testing on x86_64-linux on trunk
and the 4.7 and 4.6 branches.  I'd like to commit it to all of them,
perhaps after having it on trunk only for a while.

Thanks,

Martin


2013-01-02  Martin Jambor  <mjambor@suse.cz>

	PR tree-optimization/55755
	* tree-sra.c (sra_modify_assign): Do not check that an access has no
	children when trying to avoid producing a VIEW_CONVERT_EXPR.

testsuite/
	* gcc.dg/torture/pr55755.c: New test.
	* gcc.dg/tree-ssa/sra-13.c: Likewise.
	* gcc.dg/tree-ssa/pr45144.c: Update.
Richard Guenther - Jan. 4, 2013, 10:41 a.m.
On Thu, 3 Jan 2013, Martin Jambor wrote:

> Hi,
> 
> the patch below fixes PR 55755 which was in the compiler for years.
> The problem is that a replacement of a bit-field can have a larger
> TYPE_SIZE than the type of the field and so creating a V_C_E from it
> to the field type may result in invalid gimple.  We do that when we
> scalarize only one side of an assignment and get incompatible types on
> both sides and the other (non-scalar) side has a child access in the
> access tree (regardless if it is to be scalarize or not).
> 
> When looking at the issue I realized that the last condition is
> completely unnecessary (at least now, the first concepts of the "new"
> SRA were a bit different) because the subsequent handling of
> sub-replacements will do the right thing (load/store them to the
> original aggregate) and removing it is indeed the correct thing to
> deal with this bug - if both sides are scalarized, size of both will
> grow to mode size, if only one, we can avoid the V_C_E.  I am a little
> worried about the contains_bitfld_comp_ref_p and
> contains_vce_or_bfcref_p gurads which are there because of Ada PR
> 46349 (which involves aggregate bit-fields) and which might in theory
> lead to the same problem but I'm weary of touching it, at least not in
> one commit (I'm testing what happens if I remove them right now), and
> this patch does not make the current situation any worse.
> 
> In order to make sure we do not mess up when the non-scalar side has
> sub-replacements in it, I have added a new testcase.
> 
> The patch has passed bootstrap and testing on x86_64-linux on trunk
> and the 4.7 and 4.6 branches.  I'd like to commit it to all of them,
> perhaps after having it on trunk only for a while.

Ok for trunk and branches after a while.

Thanks,
Richard.

> Thanks,
> 
> Martin
> 
> 
> 2013-01-02  Martin Jambor  <mjambor@suse.cz>
> 
> 	PR tree-optimization/55755
> 	* tree-sra.c (sra_modify_assign): Do not check that an access has no
> 	children when trying to avoid producing a VIEW_CONVERT_EXPR.
> 
> testsuite/
> 	* gcc.dg/torture/pr55755.c: New test.
> 	* gcc.dg/tree-ssa/sra-13.c: Likewise.
> 	* gcc.dg/tree-ssa/pr45144.c: Update.
> 
> Index: src/gcc/testsuite/gcc.dg/torture/pr55755.c
> ===================================================================
> --- /dev/null
> +++ src/gcc/testsuite/gcc.dg/torture/pr55755.c
> @@ -0,0 +1,43 @@
> +/* { dg-do run } */
> +/* { dg-require-effective-target int32plus } */
> +
> +struct S4
> +{
> +  unsigned f0:24;
> +} __attribute__((__packed__));
> +
> +struct S4 g_10 = {
> +  6210831
> +};
> +
> +struct S5
> +{
> +  int i;
> +  struct S4 l_8[2];
> +}  __attribute__((__packed__));
> +
> +int a, b;
> +
> +struct S4 func_2 (int x)
> +{
> +  struct S5 l = {
> +    0,
> +    {{0}, {0}}
> +  };
> +  l.i = a;
> +  g_10 = l.l_8[1];
> +  for (; x<2; x++) {
> +    struct S4 tmp = {
> +      11936567
> +    };
> +    l.l_8[x] = tmp;
> +  }
> +  b = l.i;
> +  return g_10;
> +}
> +
> +int main (void)
> +{
> +  func_2 (0);
> +  return 0;
> +}
> Index: src/gcc/testsuite/gcc.dg/tree-ssa/sra-13.c
> ===================================================================
> --- /dev/null
> +++ src/gcc/testsuite/gcc.dg/tree-ssa/sra-13.c
> @@ -0,0 +1,114 @@
> +/* Test that SRA replacement can deal with assignments that have
> +   sub-replacements on one side and a single scalar replacement on another.  */
> +/* { dg-do run } */
> +/* { dg-options "-O1" } */
> +
> +struct A
> +{
> +  int i1, i2;
> +};
> +
> +struct B
> +{
> +  long long int l;
> +};
> +
> +union U
> +{
> +  struct A a;
> +  struct B b;
> +};
> +
> +int b, gi;
> +long gl;
> +union U gu1, gu2;
> +
> +int __attribute__ ((noinline, noclone))
> +foo (void)
> +{
> +  union U x, y;
> +  int r;
> +
> +  y = gu1;
> +  if (b)
> +    y.b.l = gl;
> +
> +  x = y;
> +
> +  if (!b)
> +    r = x.a.i1;
> +  else
> +    r = 0;
> +
> +  gu2 = x;
> +  return r;
> +}
> +
> +long long int __attribute__ ((noinline, noclone))
> +bar (void)
> +{
> +  union U x, y;
> +  int r;
> +
> +  y = gu1;
> +  if (b)
> +    y.a.i1 = gi;
> +
> +  x = y;
> +
> +  if (!b)
> +    r = x.b.l;
> +  else
> +    r = 0;
> +
> +  gu2 = x;
> +  return r;
> +}
> +
> +
> +int
> +main (void)
> +{
> +  int r;
> +  long long int s;
> +
> +  b = 0;
> +  gu1.a.i1 = 123;
> +  gu1.a.i2 = 234;
> +  r = foo ();
> +  if (r != 123)
> +    __builtin_abort ();
> +  if (gu2.a.i1 != 123)
> +    __builtin_abort ();
> +  if (gu2.a.i2 != 234)
> +    __builtin_abort ();
> +
> +  b = 1;
> +  gl = 10000001;
> +  gu1.b.l = 10000000;
> +  r = foo ();
> +  if (r != 0)
> +    __builtin_abort ();
> +  if (gu2.b.l != 10000001)
> +    __builtin_abort ();
> +
> +  b = 0;
> +  gu1.b.l = 20000000;
> +  s = bar ();
> +  if (s != 20000000)
> +    __builtin_abort ();
> +  if (gu2.b.l != 20000000)
> +    __builtin_abort ();
> +
> +  b = 1;
> +  gi = 456;
> +  gu1.a.i1 = 123;
> +  gu1.a.i2 = 234;
> +  s = bar ();
> +  if (s != 0)
> +    __builtin_abort ();
> +  if (gu2.a.i1 != 456)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
> Index: src/gcc/tree-sra.c
> ===================================================================
> --- src.orig/gcc/tree-sra.c
> +++ src/gcc/tree-sra.c
> @@ -3087,15 +3087,13 @@ sra_modify_assign (gimple *stmt, gimple_
>  	     ???  This should move to fold_stmt which we simply should
>  	     call after building a VIEW_CONVERT_EXPR here.  */
>  	  if (AGGREGATE_TYPE_P (TREE_TYPE (lhs))
> -	      && !contains_bitfld_comp_ref_p (lhs)
> -	      && !access_has_children_p (lacc))
> +	      && !contains_bitfld_comp_ref_p (lhs))
>  	    {
>  	      lhs = build_ref_for_model (loc, lhs, 0, racc, gsi, false);
>  	      gimple_assign_set_lhs (*stmt, lhs);
>  	    }
>  	  else if (AGGREGATE_TYPE_P (TREE_TYPE (rhs))
> -		   && !contains_vce_or_bfcref_p (rhs)
> -		   && !access_has_children_p (racc))
> +		   && !contains_vce_or_bfcref_p (rhs))
>  	    rhs = build_ref_for_model (loc, rhs, 0, lacc, gsi, false);
>  
>  	  if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
> Index: src/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
> ===================================================================
> --- src.orig/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
> +++ src/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
> @@ -43,5 +43,5 @@ bar (unsigned orig, unsigned *new)
>    *new = foo (&a);
>  }
>  
> -/* { dg-final { scan-tree-dump " = VIEW_CONVERT_EXPR<unsigned int>\\(a\\);" "optimized"} } */
> +/* { dg-final { scan-tree-dump-not "unnamed-unsigned:19" "optimized"} } */
>  /* { dg-final { cleanup-tree-dump "optimized" } } */
> 
>

Patch

Index: src/gcc/testsuite/gcc.dg/torture/pr55755.c
===================================================================
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/torture/pr55755.c
@@ -0,0 +1,43 @@ 
+/* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
+
+struct S4
+{
+  unsigned f0:24;
+} __attribute__((__packed__));
+
+struct S4 g_10 = {
+  6210831
+};
+
+struct S5
+{
+  int i;
+  struct S4 l_8[2];
+}  __attribute__((__packed__));
+
+int a, b;
+
+struct S4 func_2 (int x)
+{
+  struct S5 l = {
+    0,
+    {{0}, {0}}
+  };
+  l.i = a;
+  g_10 = l.l_8[1];
+  for (; x<2; x++) {
+    struct S4 tmp = {
+      11936567
+    };
+    l.l_8[x] = tmp;
+  }
+  b = l.i;
+  return g_10;
+}
+
+int main (void)
+{
+  func_2 (0);
+  return 0;
+}
Index: src/gcc/testsuite/gcc.dg/tree-ssa/sra-13.c
===================================================================
--- /dev/null
+++ src/gcc/testsuite/gcc.dg/tree-ssa/sra-13.c
@@ -0,0 +1,114 @@ 
+/* Test that SRA replacement can deal with assignments that have
+   sub-replacements on one side and a single scalar replacement on another.  */
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+struct A
+{
+  int i1, i2;
+};
+
+struct B
+{
+  long long int l;
+};
+
+union U
+{
+  struct A a;
+  struct B b;
+};
+
+int b, gi;
+long gl;
+union U gu1, gu2;
+
+int __attribute__ ((noinline, noclone))
+foo (void)
+{
+  union U x, y;
+  int r;
+
+  y = gu1;
+  if (b)
+    y.b.l = gl;
+
+  x = y;
+
+  if (!b)
+    r = x.a.i1;
+  else
+    r = 0;
+
+  gu2 = x;
+  return r;
+}
+
+long long int __attribute__ ((noinline, noclone))
+bar (void)
+{
+  union U x, y;
+  int r;
+
+  y = gu1;
+  if (b)
+    y.a.i1 = gi;
+
+  x = y;
+
+  if (!b)
+    r = x.b.l;
+  else
+    r = 0;
+
+  gu2 = x;
+  return r;
+}
+
+
+int
+main (void)
+{
+  int r;
+  long long int s;
+
+  b = 0;
+  gu1.a.i1 = 123;
+  gu1.a.i2 = 234;
+  r = foo ();
+  if (r != 123)
+    __builtin_abort ();
+  if (gu2.a.i1 != 123)
+    __builtin_abort ();
+  if (gu2.a.i2 != 234)
+    __builtin_abort ();
+
+  b = 1;
+  gl = 10000001;
+  gu1.b.l = 10000000;
+  r = foo ();
+  if (r != 0)
+    __builtin_abort ();
+  if (gu2.b.l != 10000001)
+    __builtin_abort ();
+
+  b = 0;
+  gu1.b.l = 20000000;
+  s = bar ();
+  if (s != 20000000)
+    __builtin_abort ();
+  if (gu2.b.l != 20000000)
+    __builtin_abort ();
+
+  b = 1;
+  gi = 456;
+  gu1.a.i1 = 123;
+  gu1.a.i2 = 234;
+  s = bar ();
+  if (s != 0)
+    __builtin_abort ();
+  if (gu2.a.i1 != 456)
+    __builtin_abort ();
+
+  return 0;
+}
Index: src/gcc/tree-sra.c
===================================================================
--- src.orig/gcc/tree-sra.c
+++ src/gcc/tree-sra.c
@@ -3087,15 +3087,13 @@  sra_modify_assign (gimple *stmt, gimple_
 	     ???  This should move to fold_stmt which we simply should
 	     call after building a VIEW_CONVERT_EXPR here.  */
 	  if (AGGREGATE_TYPE_P (TREE_TYPE (lhs))
-	      && !contains_bitfld_comp_ref_p (lhs)
-	      && !access_has_children_p (lacc))
+	      && !contains_bitfld_comp_ref_p (lhs))
 	    {
 	      lhs = build_ref_for_model (loc, lhs, 0, racc, gsi, false);
 	      gimple_assign_set_lhs (*stmt, lhs);
 	    }
 	  else if (AGGREGATE_TYPE_P (TREE_TYPE (rhs))
-		   && !contains_vce_or_bfcref_p (rhs)
-		   && !access_has_children_p (racc))
+		   && !contains_vce_or_bfcref_p (rhs))
 	    rhs = build_ref_for_model (loc, rhs, 0, lacc, gsi, false);
 
 	  if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
Index: src/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
===================================================================
--- src.orig/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
+++ src/gcc/testsuite/gcc.dg/tree-ssa/pr45144.c
@@ -43,5 +43,5 @@  bar (unsigned orig, unsigned *new)
   *new = foo (&a);
 }
 
-/* { dg-final { scan-tree-dump " = VIEW_CONVERT_EXPR<unsigned int>\\(a\\);" "optimized"} } */
+/* { dg-final { scan-tree-dump-not "unnamed-unsigned:19" "optimized"} } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */