Patchwork RFA: better gimplification of compound literals

login
register
mail settings
Submitter Michael Matz
Date June 13, 2012, 11:46 a.m.
Message ID <Pine.LNX.4.64.1206131340560.29474@wotan.suse.de>
Download mbox | patch
Permalink /patch/164638/
State New
Headers show

Comments

Michael Matz - June 13, 2012, 11:46 a.m.
Hi,

On Tue, 12 Jun 2012, Richard Guenther wrote:

> > Ok, I see the C frontend hands us this as
> >
> >  return  VEC_PERM_EXPR < a , b , <<< Unknown tree: compound_literal_expr
> >    v4si D.1712 = { 0, 4, 1, 5 }; >>> > ;
> >
> > and gimplification in some way fails to gimplify it to { 0, 4, 1, 5 }.

Was a non-implemented optimization.  If the compound literal value isn't 
used as lvalue and doesn't have its address taken (and generally fits the 
current predicate) we can as well subst it in place instead of going over 
an intermediate statement.

Regstrapping on x86_64-linux in progress.  Okay if that passes?


Ciao,
Michael.
---------------------
	* gimplify.c (gimplify_compound_literal_expr): Take gimple_test_f
	argument, don't emit assign statement if value is directly usable.
	(gimplify_expr): Adjust.

testsuite/
	* gcc.dg/tree-ssa/vector-4.c: New test.
Richard Guenther - June 13, 2012, 11:57 a.m.
On Wed, Jun 13, 2012 at 1:46 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Tue, 12 Jun 2012, Richard Guenther wrote:
>
>> > Ok, I see the C frontend hands us this as
>> >
>> >  return  VEC_PERM_EXPR < a , b , <<< Unknown tree: compound_literal_expr
>> >    v4si D.1712 = { 0, 4, 1, 5 }; >>> > ;
>> >
>> > and gimplification in some way fails to gimplify it to { 0, 4, 1, 5 }.
>
> Was a non-implemented optimization.  If the compound literal value isn't
> used as lvalue and doesn't have its address taken (and generally fits the
> current predicate) we can as well subst it in place instead of going over
> an intermediate statement.
>
> Regstrapping on x86_64-linux in progress.  Okay if that passes?

Ok.

Thanks,
Richard.

>
> Ciao,
> Michael.
> ---------------------
>        * gimplify.c (gimplify_compound_literal_expr): Take gimple_test_f
>        argument, don't emit assign statement if value is directly usable.
>        (gimplify_expr): Adjust.
>
> testsuite/
>        * gcc.dg/tree-ssa/vector-4.c: New test.
>
> Index: gimplify.c
> ===================================================================
> --- gimplify.c  (revision 188500)
> +++ gimplify.c  (working copy)
> @@ -3796,15 +3796,29 @@ rhs_predicate_for (tree lhs)
>
>  static enum gimplify_status
>  gimplify_compound_literal_expr (tree *expr_p, gimple_seq *pre_p,
> +                               bool (*gimple_test_f) (tree),
>                                fallback_t fallback)
>  {
>   tree decl_s = COMPOUND_LITERAL_EXPR_DECL_EXPR (*expr_p);
>   tree decl = DECL_EXPR_DECL (decl_s);
> +  tree init = DECL_INITIAL (decl);
>   /* Mark the decl as addressable if the compound literal
>      expression is addressable now, otherwise it is marked too late
>      after we gimplify the initialization expression.  */
>   if (TREE_ADDRESSABLE (*expr_p))
>     TREE_ADDRESSABLE (decl) = 1;
> +  /* Otherwise, if we don't need an lvalue and have a literal directly
> +     substitute it.  Check if it matches the gimple predicate, as
> +     otherwise we'd generate a new temporary, and we can as well just
> +     use the decl we already have.  */
> +  else if (!TREE_ADDRESSABLE (decl)
> +          && init
> +          && (fallback & fb_lvalue) == 0
> +          && gimple_test_f (init))
> +    {
> +      *expr_p = init;
> +      return GS_OK;
> +    }
>
>   /* Preliminarily mark non-addressed complex variables as eligible
>      for promotion to gimple registers.  We'll transform their uses
> @@ -7118,7 +7132,8 @@ gimplify_expr (tree *expr_p, gimple_seq
>          break;
>
>        case COMPOUND_LITERAL_EXPR:
> -         ret = gimplify_compound_literal_expr (expr_p, pre_p, fallback);
> +         ret = gimplify_compound_literal_expr (expr_p, pre_p,
> +                                               gimple_test_f, fallback);
>          break;
>
>        case MODIFY_EXPR:
> Index: testsuite/gcc.dg/tree-ssa/vector-4.c
> ===================================================================
> --- testsuite/gcc.dg/tree-ssa/vector-4.c        (revision 0)
> +++ testsuite/gcc.dg/tree-ssa/vector-4.c        (revision 0)
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-w -O1 -fdump-tree-gimple" } */
> +
> +typedef int v4si __attribute__ ((vector_size (16)));
> +
> +v4si vs (v4si a, v4si b)
> +{
> +  return __builtin_shuffle (a, b, (v4si) {0, 4, 1, 5});
> +}
> +
> +/* The compound literal should be placed directly in the vec_perm.  */
> +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR <a, b, { 0, 4, 1, 5 }>;" 1 "gimple"} } */
> +
> +/* { dg-final { cleanup-tree-dump "gimple" } } */
Michael Matz - June 14, 2012, 2:47 p.m.
Hi,

[Honza, for you a question below]

On Wed, 13 Jun 2012, Richard Guenther wrote:

> > Was a non-implemented optimization.  If the compound literal value 
> > isn't used as lvalue and doesn't have its address taken (and generally 
> > fits the current predicate) we can as well subst it in place instead 
> > of going over an intermediate statement.
> >
> > Regstrapping on x86_64-linux in progress.  Okay if that passes?
> 
> Ok.

It doesn't due to a missed folding that nobody else is doing either 
(pr44214-[13].c).  Formerly we had:

   D.1712 = { 2.0e+0, 2.0e+0 };
   *a = *b / D.1712;

which ccp (!) finally folded into

   *a = *b * { 0.5, 0.5 };

Now we start right away with

   *a = *b / { 2.0e+0, 2.0e+0 };

Nothing currently folds this (fold doesn't because it still sees the 
compound literal, and the tree optimizers never change something on that 
statement again, so there's no reason to re-fold).

So, let's finally fold statements in the gimplifier like in the below 
patch.  If you prefer I can conditionalize the folding also on being 
in the generic->gimple lowering, in difference to the other calls into the 
gimplifier from the optimizers.

This exposes some regressions in the testsuite, two of them easily fixed 
in the testcase and the third because we fold before the symtab is ready,
hence this line was crashing:

    if (!from_decl
        ...
        || (symtab_get_node (from_decl)->symbol.in_other_partition))
      return true;

in can_refer_decl_in_current_unit_p.  Honza: I don't understand this 
particular condition.  If we have a from_decl (i.e. the decl we're 
concerned about stems from the initializer of it) and it is in another 
partition, then this says that we can freely refer to the decl itself?  
That doesn't make sense to me.  That decl might for instance also be in 
that other partition and be hidden.  Hence I'm unsure what to return if 
the symtab isn't ready yet.  In the patch below I'm returning false as in 
"don't know".  But I think the current behaviour is wrong?

In any case, this patch is currently in regstrapping on x86-64.  Okay if 
it passes (modulo changes for the above symtab_get_node() issue)?


Ciao,
Michael.
	* gimplify.c (gimplify_modify_expr): Fold generated statements.
	* gimple-fold.c (can_refer_decl_in_current_unit_p): Check
	symtab node existence before access.

testsuite/
	* gcc.dg/debug/dwarf2/inline3.c: Adjust.
	* gcc.dg/tree-ssa/foldstring-1.c: Adjust.

Index: gimplify.c
===================================================================
*** gimplify.c	(revision 188500)
--- gimplify.c	(working copy)
*************** gimplify_modify_expr (tree *expr_p, gimp
*** 4772,4777 ****
--- 4786,4792 ----
    enum gimplify_status ret = GS_UNHANDLED;
    gimple assign;
    location_t loc = EXPR_LOCATION (*expr_p);
+   gimple_stmt_iterator gsi;
  
    gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
  	      || TREE_CODE (*expr_p) == INIT_EXPR);
*************** gimplify_modify_expr (tree *expr_p, gimp
*** 4912,4919 ****
        gimple_set_location (assign, EXPR_LOCATION (*expr_p));
      }
  
-   gimplify_seq_add_stmt (pre_p, assign);
- 
    if (gimplify_ctxp->into_ssa && is_gimple_reg (*to_p))
      {
        /* If we've somehow already got an SSA_NAME on the LHS, then
--- 4927,4932 ----
*************** gimplify_modify_expr (tree *expr_p, gimp
*** 4923,4928 ****
--- 4936,4945 ----
        gimple_set_lhs (assign, *to_p);
      }
  
+   gimplify_seq_add_stmt (pre_p, assign);
+   gsi = gsi_last (*pre_p);
+   fold_stmt (&gsi);
+ 
    if (want_value)
      {
        *expr_p = TREE_THIS_VOLATILE (*to_p) ? *from_p : unshare_expr (*to_p);
Index: gimple-fold.c
===================================================================
*** gimple-fold.c	(revision 188500)
--- gimple-fold.c	(working copy)
*************** can_refer_decl_in_current_unit_p (tree d
*** 61,79 ****
    struct cgraph_node *node;
    symtab_node snode;
  
!   /* We will later output the initializer, so we can reffer to it.
       So we are concerned only when DECL comes from initializer of
       external var.  */
    if (!from_decl
        || TREE_CODE (from_decl) != VAR_DECL
!       || !DECL_EXTERNAL (from_decl)
!       || (symtab_get_node (from_decl)->symbol.in_other_partition))
      return true;
!   /* We are concerned ony about static/external vars and functions.  */
    if ((!TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
        || (TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL))
      return true;
!   /* Weakrefs have somewhat confusing DECL_EXTERNAL flag set; they are always safe.  */
    if (DECL_EXTERNAL (decl)
        && lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
      return true;
--- 61,85 ----
    struct cgraph_node *node;
    symtab_node snode;
  
!   /* We will later output the initializer, so we can refer to it.
       So we are concerned only when DECL comes from initializer of
       external var.  */
    if (!from_decl
        || TREE_CODE (from_decl) != VAR_DECL
!       || !DECL_EXTERNAL (from_decl))
      return true;
!   /* We might be called from fold before symtab is ready, in which case
!      we don't know yet.  */
!   if (!(snode = symtab_get_node (from_decl)))
!     return false;
!   if (snode->symbol.in_other_partition)
!     return true;
!   /* We are concerned only about static/external vars and functions.  */
    if ((!TREE_STATIC (decl) && !DECL_EXTERNAL (decl))
        || (TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL))
      return true;
!   /* Weakrefs have somewhat confusing DECL_EXTERNAL flag set; they
!      are always safe.  */
    if (DECL_EXTERNAL (decl)
        && lookup_attribute ("weakref", DECL_ATTRIBUTES (decl)))
      return true;
Index: testsuite/gcc.dg/debug/dwarf2/inline3.c
===================================================================
*** testsuite/gcc.dg/debug/dwarf2/inline3.c	(revision 188500)
--- testsuite/gcc.dg/debug/dwarf2/inline3.c	(working copy)
***************
*** 1,7 ****
  /* Verify that only one DW_AT_const_value is emitted for baz,
     not for baz abstract DIE and again inside of
     DW_TAG_inlined_subroutine.  */
! /* { dg-options "-O2 -g -dA" } */
  /* { dg-do compile } */
  /* { dg-final { scan-assembler-times " DW_AT_const_value" 1 } } */
  
--- 1,7 ----
  /* Verify that only one DW_AT_const_value is emitted for baz,
     not for baz abstract DIE and again inside of
     DW_TAG_inlined_subroutine.  */
! /* { dg-options "-O2 -g -dA -fmerge-all-constants" } */
  /* { dg-do compile } */
  /* { dg-final { scan-assembler-times " DW_AT_const_value" 1 } } */
  
*************** static inline long
*** 11,16 ****
--- 11,19 ----
  foo (void)
  {
    const struct A baz = { .i = 2, .j = 21 };
+   /* We must make sure that baz isn't optimized away before inlining,
+      otherwise its initializer is also lost.  */
+   const struct A *p = &baz;
    asm volatile ("" : : : "memory");
    return baz.i * baz.j;
  }
Index: testsuite/gcc.dg/tree-ssa/foldstring-1.c
===================================================================
*** testsuite/gcc.dg/tree-ssa/foldstring-1.c	(revision 188500)
--- testsuite/gcc.dg/tree-ssa/foldstring-1.c	(working copy)
***************
*** 1,5 ****
  /* { dg-do compile } */
! /* { dg-options "-O1 -fdump-tree-fre1" } */
  
  void
  arf ()
--- 1,5 ----
  /* { dg-do compile } */
! /* { dg-options "-O1 -fdump-tree-gimple" } */
  
  void
  arf ()
*************** arf ()
*** 7,11 ****
    if (""[0] == 0)
      blah ();
  }
! /* { dg-final { scan-tree-dump-times "= 0;" 1 "fre1"} } */
! /* { dg-final { cleanup-tree-dump "fre1" } } */
--- 7,11 ----
    if (""[0] == 0)
      blah ();
  }
! /* { dg-final { scan-tree-dump-times "= 0;" 1 "gimple"} } */
! /* { dg-final { cleanup-tree-dump "gimple" } } */
Jan Hubicka - June 14, 2012, 2:58 p.m.
> 
>     if (!from_decl
>         ...
>         || (symtab_get_node (from_decl)->symbol.in_other_partition))
>       return true;
> 
> in can_refer_decl_in_current_unit_p.  Honza: I don't understand this 
> particular condition.  If we have a from_decl (i.e. the decl we're 
> concerned about stems from the initializer of it) and it is in another 
> partition, then this says that we can freely refer to the decl itself?  
> That doesn't make sense to me.  That decl might for instance also be in 
> that other partition and be hidden.  Hence I'm unsure what to return if 
> the symtab isn't ready yet.  In the patch below I'm returning false as in 
> "don't know".  But I think the current behaviour is wrong?

If the initializers is from other partition, we know that var promoting will make it
part of the boundary so we can reffer to it.
This is really about external constructor that is in different DSO.  WHOPR partitions
are all in one DSO at the end.

Honza
Michael Matz - June 14, 2012, 3:12 p.m.
Hi,

On Thu, 14 Jun 2012, Jan Hubicka wrote:

> >     if (!from_decl
> >         ...
> >         || (symtab_get_node (from_decl)->symbol.in_other_partition))
> >       return true;
> > 
> > in can_refer_decl_in_current_unit_p.  Honza: I don't understand this 
> > particular condition.  If we have a from_decl (i.e. the decl we're 
> > concerned about stems from the initializer of it) and it is in another 
> > partition, then this says that we can freely refer to the decl itself?  
> > That doesn't make sense to me.  That decl might for instance also be in 
> > that other partition and be hidden.  Hence I'm unsure what to return if 
> > the symtab isn't ready yet.  In the patch below I'm returning false as in 
> > "don't know".  But I think the current behaviour is wrong?
> 
> If the initializers is from other partition, we know that var promoting 
> will make it part of the boundary so we can reffer to it.

Hmm, but why only when it's in a different partition?  And what's "it"?  
The initializer or the element we're concerned about (decl)?  You're 
saying that if the initializer is in a different partition we don't need 
further checks for a certain reason, but we _do_ need further checks if 
it's in the current partition?


Ciao,
Michael.
Michael Matz - June 14, 2012, 3:33 p.m.
Hi,

On Thu, 14 Jun 2012, Michael Matz wrote:

> In any case, this patch is currently in regstrapping on x86-64.  Okay if 
> it passes (modulo changes for the above symtab_get_node() issue)?

After discussion with Honza, consider the patch changed like so:

   if (!from_decl
       || TREE_CODE (from_decl) != VAR_DECL
       || !DECL_EXTERNAL (from_decl)
-      || (symtab_get_node (from_decl)->symbol.in_other_partition))
+      || (flag_ltrans
+         && symtab_get_node (from_decl)->symbol.in_other_partition))
     return true;

Restarted regstrapping the thing on x86_64 again.  Okay if that passes?


Ciao,
Michael.
Jan Hubicka - June 14, 2012, 4:10 p.m.
> Hi,
> 
> On Thu, 14 Jun 2012, Michael Matz wrote:
> 
> > In any case, this patch is currently in regstrapping on x86-64.  Okay if 
> > it passes (modulo changes for the above symtab_get_node() issue)?
> 
> After discussion with Honza, consider the patch changed like so:
> 
>    if (!from_decl
>        || TREE_CODE (from_decl) != VAR_DECL
>        || !DECL_EXTERNAL (from_decl)
> -      || (symtab_get_node (from_decl)->symbol.in_other_partition))
> +      || (flag_ltrans
> +         && symtab_get_node (from_decl)->symbol.in_other_partition))
>      return true;
> 
> Restarted regstrapping the thing on x86_64 again.  Okay if that passes?

This change is fine with me ;)

Thanks,
Honza
> 
> 
> Ciao,
> Michael.
Richard Guenther - June 14, 2012, 5:03 p.m.
On Thu, Jun 14, 2012 at 5:33 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Thu, 14 Jun 2012, Michael Matz wrote:
>
>> In any case, this patch is currently in regstrapping on x86-64.  Okay if
>> it passes (modulo changes for the above symtab_get_node() issue)?
>
> After discussion with Honza, consider the patch changed like so:
>
>   if (!from_decl
>       || TREE_CODE (from_decl) != VAR_DECL
>       || !DECL_EXTERNAL (from_decl)
> -      || (symtab_get_node (from_decl)->symbol.in_other_partition))
> +      || (flag_ltrans
> +         && symtab_get_node (from_decl)->symbol.in_other_partition))
>     return true;
>
> Restarted regstrapping the thing on x86_64 again.  Okay if that passes?

Ok.

But I wonder how the symtab cannot be "ready" when we gimplify - after all
we gimplify only from after cgraph_finalize_compilation_unit ...

Thanks,
Richard.

>
> Ciao,
> Michael.
Michael Matz - June 15, 2012, 11:08 a.m.
Hi,

On Thu, 14 Jun 2012, Richard Guenther wrote:

> > Restarted regstrapping the thing on x86_64 again.  Okay if that 
> > passes?
> 
> Ok.
> 
> But I wonder how the symtab cannot be "ready" when we gimplify - after 
> all we gimplify only from after cgraph_finalize_compilation_unit ...

"Ready" may have been the wrong word.  There is no entry for the vtable 
object in the symtab at that point.  Only for all the functions.  It also 
never is generated later, so if there had ever been a folding for such 
statement before it would have crashed already without my patch.  I don't 
know if that is by intention or an oversight.


Ciao,
Michael.
Jan Hubicka - June 15, 2012, 12:20 p.m.
> On Thu, Jun 14, 2012 at 5:33 PM, Michael Matz <matz@suse.de> wrote:
> > Hi,
> >
> > On Thu, 14 Jun 2012, Michael Matz wrote:
> >
> >> In any case, this patch is currently in regstrapping on x86-64.  Okay if
> >> it passes (modulo changes for the above symtab_get_node() issue)?
> >
> > After discussion with Honza, consider the patch changed like so:
> >
> >   if (!from_decl
> >       || TREE_CODE (from_decl) != VAR_DECL
> >       || !DECL_EXTERNAL (from_decl)
> > -      || (symtab_get_node (from_decl)->symbol.in_other_partition))
> > +      || (flag_ltrans
> > +         && symtab_get_node (from_decl)->symbol.in_other_partition))
> >     return true;
> >
> > Restarted regstrapping the thing on x86_64 again.  Okay if that passes?
> 
> Ok.
> 
> But I wonder how the symtab cannot be "ready" when we gimplify - after all
> we gimplify only from after cgraph_finalize_compilation_unit ...

We build nodies for external declarations only after we see references in them
and we build references from gimplified bodies.
So at this time symtab has all defined symbols but only some external as the
symtab construction goes by.

Honza

Patch

Index: gimplify.c
===================================================================
--- gimplify.c	(revision 188500)
+++ gimplify.c	(working copy)
@@ -3796,15 +3796,29 @@  rhs_predicate_for (tree lhs)
 
 static enum gimplify_status
 gimplify_compound_literal_expr (tree *expr_p, gimple_seq *pre_p,
+				bool (*gimple_test_f) (tree),
 				fallback_t fallback)
 {
   tree decl_s = COMPOUND_LITERAL_EXPR_DECL_EXPR (*expr_p);
   tree decl = DECL_EXPR_DECL (decl_s);
+  tree init = DECL_INITIAL (decl);
   /* Mark the decl as addressable if the compound literal
      expression is addressable now, otherwise it is marked too late
      after we gimplify the initialization expression.  */
   if (TREE_ADDRESSABLE (*expr_p))
     TREE_ADDRESSABLE (decl) = 1;
+  /* Otherwise, if we don't need an lvalue and have a literal directly
+     substitute it.  Check if it matches the gimple predicate, as
+     otherwise we'd generate a new temporary, and we can as well just
+     use the decl we already have.  */
+  else if (!TREE_ADDRESSABLE (decl)
+	   && init
+	   && (fallback & fb_lvalue) == 0
+	   && gimple_test_f (init))
+    {
+      *expr_p = init;
+      return GS_OK;
+    }
 
   /* Preliminarily mark non-addressed complex variables as eligible
      for promotion to gimple registers.  We'll transform their uses
@@ -7118,7 +7132,8 @@  gimplify_expr (tree *expr_p, gimple_seq
 	  break;
 
 	case COMPOUND_LITERAL_EXPR:
-	  ret = gimplify_compound_literal_expr (expr_p, pre_p, fallback);
+	  ret = gimplify_compound_literal_expr (expr_p, pre_p,
+						gimple_test_f, fallback);
 	  break;
 
 	case MODIFY_EXPR:
Index: testsuite/gcc.dg/tree-ssa/vector-4.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/vector-4.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/vector-4.c	(revision 0)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-w -O1 -fdump-tree-gimple" } */
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+v4si vs (v4si a, v4si b)
+{
+  return __builtin_shuffle (a, b, (v4si) {0, 4, 1, 5});
+}
+
+/* The compound literal should be placed directly in the vec_perm.  */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR <a, b, { 0, 4, 1, 5 }>;" 1 "gimple"} } */
+
+/* { dg-final { cleanup-tree-dump "gimple" } } */