diff mbox series

VEC_COND_EXPR: clean up first argument

Message ID 58296208-00ca-b306-3e98-6c995e9c738a@suse.cz
State New
Headers show
Series VEC_COND_EXPR: clean up first argument | expand

Commit Message

Martin Liška June 24, 2020, 7:21 a.m. UTC
Hi.

When expanding a VEC_COND_EXPR it happens that first argument (a SSA_NAME)
that can be no longer used. When that happens we need to remove the SSA_NAME,
otherwise we end up expanding it and for targets like s390x, there's no optab
expansion. We need to remove them at both places as -O0 does not help us
with a dead SSA_NAMEs after vector lowering.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
Cross-compiler for s390x was tested as well for problematic test-case.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

	PR tree-optimization/95745
	PR middle-end/95830
	* gimple-isel.cc (gimple_expand_vec_cond_exprs): Delete dead
	SSA_NAMEs used as the first argument of a VEC_COND_EXPR.  Always
	return 0.
	* tree-vect-generic.c (expand_vector_condition): Remove dead
	SSA_NAMEs used as the firs argument of a VEC_COND_EXPR.
---
  gcc/gimple-isel.cc      | 11 +++++++++--
  gcc/tree-vect-generic.c |  9 ++++++++-
  2 files changed, 17 insertions(+), 3 deletions(-)

Comments

Richard Biener June 24, 2020, 7:43 a.m. UTC | #1
On Wed, Jun 24, 2020 at 9:21 AM Martin Liška <mliska@suse.cz> wrote:
>
> Hi.
>
> When expanding a VEC_COND_EXPR it happens that first argument (a SSA_NAME)
> that can be no longer used. When that happens we need to remove the SSA_NAME,
> otherwise we end up expanding it and for targets like s390x, there's no optab
> expansion. We need to remove them at both places as -O0 does not help us
> with a dead SSA_NAMEs after vector lowering.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> Cross-compiler for s390x was tested as well for problematic test-case.
>
> Ready to be installed?

Hmm, can you instead use simple_dce_from_worklist and simply
record all SSA_NAMEs you end up "forwarding" as possibly dead
in a bitmap?  At least that hashmap traversal looks dangerous
with respect to address-space randomization and gsi_remove
inserting debug stmts and thus eventually allocating debug decls.

Thanks,
Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
>         PR tree-optimization/95745
>         PR middle-end/95830
>         * gimple-isel.cc (gimple_expand_vec_cond_exprs): Delete dead
>         SSA_NAMEs used as the first argument of a VEC_COND_EXPR.  Always
>         return 0.
>         * tree-vect-generic.c (expand_vector_condition): Remove dead
>         SSA_NAMEs used as the firs argument of a VEC_COND_EXPR.
> ---
>   gcc/gimple-isel.cc      | 11 +++++++++--
>   gcc/tree-vect-generic.c |  9 ++++++++-
>   2 files changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
> index 97f92080503..25b893224af 100644
> --- a/gcc/gimple-isel.cc
> +++ b/gcc/gimple-isel.cc
> @@ -178,7 +178,6 @@ gimple_expand_vec_cond_exprs (void)
>   {
>     gimple_stmt_iterator gsi;
>     basic_block bb;
> -  bool cfg_changed = false;
>     hash_map<tree, unsigned int> vec_cond_ssa_name_uses;
>
>     FOR_EACH_BB_FN (bb, cfun)
> @@ -196,7 +195,15 @@ gimple_expand_vec_cond_exprs (void)
>         }
>       }
>
> -  return cfg_changed ? TODO_cleanup_cfg : 0;
> +  for (hash_map<tree, unsigned int>::iterator it = vec_cond_ssa_name_uses.begin ();
> +       it != vec_cond_ssa_name_uses.end (); ++it)
> +    if (num_imm_uses ((*it).first) == 0)
> +      {
> +       gsi = gsi_for_stmt (SSA_NAME_DEF_STMT ((*it).first));
> +       gsi_remove (&gsi, true);
> +      }
> +
> +  return 0;
>   }
>
>   namespace {
> diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
> index 83d399a7898..2479368743d 100644
> --- a/gcc/tree-vect-generic.c
> +++ b/gcc/tree-vect-generic.c
> @@ -954,10 +954,11 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
>     tree comp_index = index;
>     location_t loc = gimple_location (gsi_stmt (*gsi));
>     tree_code code = TREE_CODE (a);
> +  gassign *assign = NULL;
>
>     if (code == SSA_NAME)
>       {
> -      gassign *assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (a));
> +      assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (a));
>         if (assign != NULL
>           && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) == tcc_comparison)
>         {
> @@ -1064,6 +1065,12 @@ expand_vector_condition (gimple_stmt_iterator *gsi)
>     constr = build_constructor (type, v);
>     gimple_assign_set_rhs_from_tree (gsi, constr);
>     update_stmt (gsi_stmt (*gsi));
> +
> +  if (a_is_comparison && num_imm_uses (gimple_assign_lhs (assign)) == 0)
> +    {
> +      gimple_stmt_iterator gsi = gsi_for_stmt (assign);
> +      gsi_remove (&gsi, true);
> +    }
>   }
>
>   static tree
> --
> 2.27.0
>
Martin Liška June 24, 2020, 8:49 a.m. UTC | #2
On 6/24/20 9:43 AM, Richard Biener wrote:
> Hmm, can you instead use simple_dce_from_worklist and simply
> record all SSA_NAMEs you end up "forwarding" as possibly dead
> in a bitmap?  At least that hashmap traversal looks dangerous
> with respect to address-space randomization and gsi_remove
> inserting debug stmts and thus eventually allocating debug decls.

Sure, done in the updated patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
Richard Biener June 24, 2020, 9:09 a.m. UTC | #3
On Wed, Jun 24, 2020 at 10:49 AM Martin Liška <mliska@suse.cz> wrote:
>
> On 6/24/20 9:43 AM, Richard Biener wrote:
> > Hmm, can you instead use simple_dce_from_worklist and simply
> > record all SSA_NAMEs you end up "forwarding" as possibly dead
> > in a bitmap?  At least that hashmap traversal looks dangerous
> > with respect to address-space randomization and gsi_remove
> > inserting debug stmts and thus eventually allocating debug decls.
>
> Sure, done in the updated patch.

You can simplify the patch by eliding the num_imm_uses checks
and by using auto_bitmap.  Why is it necessary to update
the veclower pass btw?  Is that just to avoid useless isels
on dead code?

You also updated veclower "nicely" but still have the hashmap
walk in isel - you should know when you "merged" a condition
into a cond and set the bit there.

Richard.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?
> Thanks,
> Martin
>
Martin Liška June 24, 2020, 9:27 a.m. UTC | #4
On 6/24/20 11:09 AM, Richard Biener wrote:
> On Wed, Jun 24, 2020 at 10:49 AM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 6/24/20 9:43 AM, Richard Biener wrote:
>>> Hmm, can you instead use simple_dce_from_worklist and simply
>>> record all SSA_NAMEs you end up "forwarding" as possibly dead
>>> in a bitmap?  At least that hashmap traversal looks dangerous
>>> with respect to address-space randomization and gsi_remove
>>> inserting debug stmts and thus eventually allocating debug decls.
>>
>> Sure, done in the updated patch.
> 
> You can simplify the patch by eliding the num_imm_uses checks

Really? How can I be sure that a SSA_NAME is not shared among different
VEC_COND_EXPR statements (or even by some other statements)?

> and by using auto_bitmap.

Oh yeah!

>  Why is it necessary to update
> the veclower pass btw?  Is that just to avoid useless isels
> on dead code?

Yes:

   _10 = _9 != { 0, 0, 0, 0 };
   _11 = *a_16(D);
   _12 = *b_17(D);
   _13 = _11 + _12;
   _14 = VEC_COND_EXPR <_10, _13, { 3.0e+0, 3.0e+0, 3.0e+0, 3.0e+0 }>;

is expanded by vectlower to something like:

   _10 = _9 != { 0, 0, 0, 0 };
   _11 = *a_16(D);
   _12 = *b_17(D);
   _67 = BIT_FIELD_REF <_11, 32, 0>;
   _68 = BIT_FIELD_REF <_12, 32, 0>;
   _69 = _67 + _68;
...
   _14 = {_80, _82, _84, _86};
   *a_16(D) = _14;

So one needs to remove: _10 = _9 != { 0, 0, 0, 0 };
Note the ICE happens without an optimization level.


> 
> You also updated veclower "nicely" but still have the hashmap
> walk in isel - you should know when you "merged" a condition
> into a cond and set the bit there.

Isn't the same as before as the first argument can be actually shared in between
multiple GIMPLE statements?

Thanks,
Martin

> 
> Richard.
> 
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>> Thanks,
>> Martin
>>
Richard Biener June 24, 2020, 2:15 p.m. UTC | #5
On Wed, Jun 24, 2020 at 11:27 AM Martin Liška <mliska@suse.cz> wrote:
>
> On 6/24/20 11:09 AM, Richard Biener wrote:
> > On Wed, Jun 24, 2020 at 10:49 AM Martin Liška <mliska@suse.cz> wrote:
> >>
> >> On 6/24/20 9:43 AM, Richard Biener wrote:
> >>> Hmm, can you instead use simple_dce_from_worklist and simply
> >>> record all SSA_NAMEs you end up "forwarding" as possibly dead
> >>> in a bitmap?  At least that hashmap traversal looks dangerous
> >>> with respect to address-space randomization and gsi_remove
> >>> inserting debug stmts and thus eventually allocating debug decls.
> >>
> >> Sure, done in the updated patch.
> >
> > You can simplify the patch by eliding the num_imm_uses checks
>
> Really? How can I be sure that a SSA_NAME is not shared among different
> VEC_COND_EXPR statements (or even by some other statements)?

The bitmap DCE does this check for you.

> > and by using auto_bitmap.
>
> Oh yeah!
>
> >  Why is it necessary to update
> > the veclower pass btw?  Is that just to avoid useless isels
> > on dead code?
>
> Yes:
>
>    _10 = _9 != { 0, 0, 0, 0 };
>    _11 = *a_16(D);
>    _12 = *b_17(D);
>    _13 = _11 + _12;
>    _14 = VEC_COND_EXPR <_10, _13, { 3.0e+0, 3.0e+0, 3.0e+0, 3.0e+0 }>;
>
> is expanded by vectlower to something like:
>
>    _10 = _9 != { 0, 0, 0, 0 };
>    _11 = *a_16(D);
>    _12 = *b_17(D);
>    _67 = BIT_FIELD_REF <_11, 32, 0>;
>    _68 = BIT_FIELD_REF <_12, 32, 0>;
>    _69 = _67 + _68;
> ...
>    _14 = {_80, _82, _84, _86};
>    *a_16(D) = _14;
>
> So one needs to remove: _10 = _9 != { 0, 0, 0, 0 };
> Note the ICE happens without an optimization level.

Ah, OK.  That makes sense.

>
> >
> > You also updated veclower "nicely" but still have the hashmap
> > walk in isel - you should know when you "merged" a condition
> > into a cond and set the bit there.
>
> Isn't the same as before as the first argument can be actually shared in between
> multiple GIMPLE statements?

As said above the bitmap DCE is built for lazy consumption.

Richard.

> Thanks,
> Martin
>
> >
> > Richard.
> >
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >> Thanks,
> >> Martin
> >>
>
Martin Liška June 25, 2020, 7:01 a.m. UTC | #6
On 6/24/20 4:15 PM, Richard Biener wrote:
> On Wed, Jun 24, 2020 at 11:27 AM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 6/24/20 11:09 AM, Richard Biener wrote:
>>> On Wed, Jun 24, 2020 at 10:49 AM Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> On 6/24/20 9:43 AM, Richard Biener wrote:
>>>>> Hmm, can you instead use simple_dce_from_worklist and simply
>>>>> record all SSA_NAMEs you end up "forwarding" as possibly dead
>>>>> in a bitmap?  At least that hashmap traversal looks dangerous
>>>>> with respect to address-space randomization and gsi_remove
>>>>> inserting debug stmts and thus eventually allocating debug decls.
>>>>
>>>> Sure, done in the updated patch.
>>>
>>> You can simplify the patch by eliding the num_imm_uses checks
>>
>> Really? How can I be sure that a SSA_NAME is not shared among different
>> VEC_COND_EXPR statements (or even by some other statements)?
> 
> The bitmap DCE does this check for you.
> 
>>> and by using auto_bitmap.
>>
>> Oh yeah!
>>
>>>   Why is it necessary to update
>>> the veclower pass btw?  Is that just to avoid useless isels
>>> on dead code?
>>
>> Yes:
>>
>>     _10 = _9 != { 0, 0, 0, 0 };
>>     _11 = *a_16(D);
>>     _12 = *b_17(D);
>>     _13 = _11 + _12;
>>     _14 = VEC_COND_EXPR <_10, _13, { 3.0e+0, 3.0e+0, 3.0e+0, 3.0e+0 }>;
>>
>> is expanded by vectlower to something like:
>>
>>     _10 = _9 != { 0, 0, 0, 0 };
>>     _11 = *a_16(D);
>>     _12 = *b_17(D);
>>     _67 = BIT_FIELD_REF <_11, 32, 0>;
>>     _68 = BIT_FIELD_REF <_12, 32, 0>;
>>     _69 = _67 + _68;
>> ...
>>     _14 = {_80, _82, _84, _86};
>>     *a_16(D) = _14;
>>
>> So one needs to remove: _10 = _9 != { 0, 0, 0, 0 };
>> Note the ICE happens without an optimization level.
> 
> Ah, OK.  That makes sense.
> 
>>
>>>
>>> You also updated veclower "nicely" but still have the hashmap
>>> walk in isel - you should know when you "merged" a condition
>>> into a cond and set the bit there.
>>
>> Isn't the same as before as the first argument can be actually shared in between
>> multiple GIMPLE statements?
> 
> As said above the bitmap DCE is built for lazy consumption.

All right, I'm going to push a patch with that changed.

Martin

> 
> Richard.
> 
>> Thanks,
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>> Thanks,
>>>> Martin
>>>>
>>
diff mbox series

Patch

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 97f92080503..25b893224af 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -178,7 +178,6 @@  gimple_expand_vec_cond_exprs (void)
  {
    gimple_stmt_iterator gsi;
    basic_block bb;
-  bool cfg_changed = false;
    hash_map<tree, unsigned int> vec_cond_ssa_name_uses;
  
    FOR_EACH_BB_FN (bb, cfun)
@@ -196,7 +195,15 @@  gimple_expand_vec_cond_exprs (void)
  	}
      }
  
-  return cfg_changed ? TODO_cleanup_cfg : 0;
+  for (hash_map<tree, unsigned int>::iterator it = vec_cond_ssa_name_uses.begin ();
+       it != vec_cond_ssa_name_uses.end (); ++it)
+    if (num_imm_uses ((*it).first) == 0)
+      {
+	gsi = gsi_for_stmt (SSA_NAME_DEF_STMT ((*it).first));
+	gsi_remove (&gsi, true);
+      }
+
+  return 0;
  }
  
  namespace {
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index 83d399a7898..2479368743d 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -954,10 +954,11 @@  expand_vector_condition (gimple_stmt_iterator *gsi)
    tree comp_index = index;
    location_t loc = gimple_location (gsi_stmt (*gsi));
    tree_code code = TREE_CODE (a);
+  gassign *assign = NULL;
  
    if (code == SSA_NAME)
      {
-      gassign *assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (a));
+      assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (a));
        if (assign != NULL
  	  && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) == tcc_comparison)
  	{
@@ -1064,6 +1065,12 @@  expand_vector_condition (gimple_stmt_iterator *gsi)
    constr = build_constructor (type, v);
    gimple_assign_set_rhs_from_tree (gsi, constr);
    update_stmt (gsi_stmt (*gsi));
+
+  if (a_is_comparison && num_imm_uses (gimple_assign_lhs (assign)) == 0)
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (assign);
+      gsi_remove (&gsi, true);
+    }
  }
  
  static tree