diff mbox

Permanent Fix for PR46886

Message ID OF165878BB.FC5B37DD-ONC22579CD.003C03B9-C22579CD.003D7469@il.ibm.com
State New
Headers show

Commit Message

Razya Ladelsky March 26, 2012, 11:11 a.m. UTC
Hi,

This is (hopefully) a permanent fix  to pr46886.c
I removed the condition preventing parallelization of do_while loops, as 
it 
was blocking parallelizing important loops in spec-2006.
The patch fixes the number of iterations for cases where the body could 
appear in the latch, as in pr46886.c.

2012-03-26  Razya Ladelsky  <razya@il.ibm.com>

                 PR tree-optimization/46886
                 * tree-parloops.c (transform_to_exit_first_loop):Set 
number of iterations correctly when the body may appear at the latch.
                 (pallelize_loops): Remove the condition preventing 
do-while loops.
 


Bootstrap and testsuite psss successfully on power7 linux machine.
Ok to commit?
Thanks,
=

Comments

Richard Biener March 26, 2012, 11:23 a.m. UTC | #1
On Mon, 26 Mar 2012, Razya Ladelsky wrote:

> Hi,
> 
> This is (hopefully) a permanent fix  to pr46886.c
> I removed the condition preventing parallelization of do_while loops, as 
> it 
> was blocking parallelizing important loops in spec-2006.
> The patch fixes the number of iterations for cases where the body could 
> appear in the latch, as in pr46886.c.
> 
> 2012-03-26  Razya Ladelsky  <razya@il.ibm.com>
> 
>                  PR tree-optimization/46886
>                  * tree-parloops.c (transform_to_exit_first_loop):Set 
> number of iterations correctly when the body may appear at the latch.
>                  (pallelize_loops): Remove the condition preventing 
> do-while loops.
>  
> 
> 
> Bootstrap and testsuite psss successfully on power7 linux machine.
> Ok to commit?

+  
+  /* if the latch contains more than the one statemnt of control variable 
+     increment then it contains the body.  */
+  if (exit_1->dest == loop->latch && last_and_only_stmt (loop->latch))
     new_rhs = gimple_cond_rhs (cond_stmt);

please check what the comment suggests, thus,

       && last_and_only_stmt (loop->latch) == cond_stmt

tree-parloops.c is quite fragile in what it expects the IL to look like
and tests do not cover what later code expects.  Please do not add to 
that.

@@ -2146,7 +2149,6 @@ parallelize_loops (void)
     return false;
   if (cfun->has_nonlocal_label)
     return false;
-

spurious whitespace change.

@@ -2213,6 +2212,7 @@ parallelize_loops (void)
 	continue;
 
       changed = true;
+       
       if (dump_file && (dump_flags & TDF_DETAILS))
       {
 	if (loop->inner)

Likewise.

Ok with the above change.

Thanks,
Richard.
Razya Ladelsky March 26, 2012, noon UTC | #2
Richard Guenther <rguenther@suse.de> wrote on 26/03/2012 01:23:15 PM:

> From: Richard Guenther <rguenther@suse.de>
> To: Razya Ladelsky/Haifa/IBM@IBMIL
> Cc: gcc-patches@gcc.gnu.org
> Date: 26/03/2012 01:23 PM
> Subject: Re: [PATCH] Permanent Fix for PR46886
> 
> On Mon, 26 Mar 2012, Razya Ladelsky wrote:
> 
> > Hi,
> > 
> > This is (hopefully) a permanent fix  to pr46886.c
> > I removed the condition preventing parallelization of do_while loops, 
as 
> > it 
> > was blocking parallelizing important loops in spec-2006.
> > The patch fixes the number of iterations for cases where the body 
could 
> > appear in the latch, as in pr46886.c.
> > 
> > 2012-03-26  Razya Ladelsky  <razya@il.ibm.com>
> > 
> >                  PR tree-optimization/46886
> >                  * tree-parloops.c (transform_to_exit_first_loop):Set 
> > number of iterations correctly when the body may appear at the latch.
> >                  (pallelize_loops): Remove the condition preventing 
> > do-while loops.
> > 
> > 
> > 
> > Bootstrap and testsuite psss successfully on power7 linux machine.
> > Ok to commit?
> 
> + 
> +  /* if the latch contains more than the one statemnt of control 
variable 
> +     increment then it contains the body.  */
> +  if (exit_1->dest == loop->latch && last_and_only_stmt (loop->latch))
>      new_rhs = gimple_cond_rhs (cond_stmt);
> 
> please check what the comment suggests, thus,
> 
>        && last_and_only_stmt (loop->latch) == cond_stmt
> 

Hi Richard,
The latch has at least one stmt for sure:
the control variable increment (not a cond_stmt) coming from the call to 
canonicalize_loop_ivs(loop, nit, true)  earlier in tree-parloops:

 
/* .....  When BUMP_IN_LATCH is true, the induction
   variable is incremented in the loop latch, otherwise it is
   incremented in the loop header.  Return the induction variable that
   was created.  */

tree
canonicalize_loop_ivs (struct loop *loop, tree *nit, bool bump_in_latch)



> tree-parloops.c is quite fragile in what it expects the IL to look like
> and tests do not cover what later code expects.  Please do not add to 
> that.


I agree.
I have also tested this code on spec2006 benchmarks.
(together with an upcoming patch, 5 of these show speedups with autopar on 
a linux-power7)




> 
> @@ -2146,7 +2149,6 @@ parallelize_loops (void)
>      return false;
>    if (cfun->has_nonlocal_label)
>      return false;
> -
> 
> spurious whitespace change.
> 
> @@ -2213,6 +2212,7 @@ parallelize_loops (void)
>     continue;
> 
>        changed = true;
> + 
>        if (dump_file && (dump_flags & TDF_DETAILS))
>        {
>     if (loop->inner)
> 
> Likewise.
> 
> Ok with the above change.

Ok, thanks,
Razya

> 
> Thanks,
> Richard.
>
Richard Biener March 26, 2012, 2:45 p.m. UTC | #3
On Mon, 26 Mar 2012, Razya Ladelsky wrote:

> > + 
> > +  /* if the latch contains more than the one statemnt of control 
> variable 
> > +     increment then it contains the body.  */
> > +  if (exit_1->dest == loop->latch && last_and_only_stmt (loop->latch))
> >      new_rhs = gimple_cond_rhs (cond_stmt);
> > 
> > please check what the comment suggests, thus,
> > 
> >        && last_and_only_stmt (loop->latch) == cond_stmt
> > 
> 
> 
> Richard,
> There's no simple way to do such a check,
> as the control variable increment stmt is explicitly created in 
> canonicalize_loop_ivs.
> What we do know is that it is put in the latch because we ask for it when 
> we call 
> canonicalize_loop_ivs from parloops.
> 
> One way to get the control variable increment stmt is to use
> canonicalize_loop_ivs return value and look for the stmts that use it.
> 
> Do you think I should add all this functionality to the code in order to 
> assert
> the stmt that is in the latch? or can I skip it ?

No, you do not need to do that.  I see the check is basically
verifying the latch is empty (apart from the newly inserted
IV update).

Well, looking at the code simply checking for an empty latch
(before canonicalize_loop_ivs) is not going to make the case
correct when we do _not_ have an empty latch.  That is, the whole
logic in transform_to_exit_first_loop seems to assume that either
the latch is empty or the header is empty, but this does not seem
to be verified anywhere (AFAIU)?  We copy the loop anyway but
exempt the code in the latch (well, it's not exactly clear
how gimple_duplicate_sese_tail will end up turning the loop
into while (...) style to me).

It might be that the patch is a progression, so it's ok in its
original form.

Richard.
diff mbox

Patch

Index: tree-parloops.c
===================================================================
--- tree-parloops.c	(revision 185775)
+++ tree-parloops.c	(working copy)
@@ -1522,7 +1522,10 @@  transform_to_exit_first_loop (struct loop *loop, h
     the condition, moving the condition to the entry requires
     decrementing one iteration.  */
   exit_1 = EDGE_SUCC (exit->src, EDGE_SUCC (exit->src, 0) == exit); 
-  if (exit_1->dest == loop->latch)
+  
+  /* if the latch contains more than the one statemnt of control variable 
+     increment then it contains the body.  */
+  if (exit_1->dest == loop->latch && last_and_only_stmt (loop->latch))
     new_rhs = gimple_cond_rhs (cond_stmt);
   else
   {
@@ -2146,7 +2149,6 @@  parallelize_loops (void)
     return false;
   if (cfun->has_nonlocal_label)
     return false;
-
   gcc_obstack_init (&parloop_obstack);
   reduction_list = htab_create (10, reduction_info_hash,
 				     reduction_info_eq, free);
@@ -2187,10 +2189,7 @@  parallelize_loops (void)
 	  || loop_has_blocks_with_irreducible_flag (loop)
 	  || (loop_preheader_edge (loop)->src->flags & BB_IRREDUCIBLE_LOOP)
 	  /* FIXME: the check for vector phi nodes could be removed.  */
-	  || loop_has_vector_phi_nodes (loop)
-	  /* FIXME: transform_to_exit_first_loop does not handle not
-	     header-copied loops correctly - see PR46886.  */
-	  || !do_while_loop_p (loop))
+	  || loop_has_vector_phi_nodes (loop))
 	continue;
       estimated = max_stmt_executions_int (loop, false);
       /* FIXME: Bypass this check as graphite doesn't update the
@@ -2213,6 +2212,7 @@  parallelize_loops (void)
 	continue;
 
       changed = true;
+       
       if (dump_file && (dump_flags & TDF_DETAILS))
       {
 	if (loop->inner)