diff mbox

basic_block flags, BB_VISITED

Message ID 8760ov2wbg.fsf@hertz.schwinge.homeip.net
State New
Headers show

Commit Message

Thomas Schwinge Oct. 14, 2016, 8:01 a.m. UTC
Hi!

After the "Add Early VRP" GCC trunk commit r240291 (Kugan CC for your
information), I've been observing all kinds of OpenACC offloading
failures.  I now figured out what's going on.

The "evrp" pass uses basic_block's BB_VISITED flag.  It first clears
these all, gcc/tree-vrp.c:execute_early_vrp:

      FOR_EACH_BB_FN (bb, cfun)
        {
          bb->flags &= ~BB_VISITED;

..., then does its processing, and at the end, clears these again:

      FOR_EACH_BB_FN (bb, cfun)
        bb->flags &= ~BB_VISITED;

I note that this looks slightly different from what
gcc/cfg.c:clear_bb_flags whould be doing, which works from
ENTRY_BLOCK_PTR_FOR_FN onwards:

    /* Clear all basic block flags that do not have to be preserved.  */
    void
    clear_bb_flags (void)
    {
      basic_block bb;
    
      FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
        bb->flags &= BB_FLAGS_TO_PRESERVE;
    }

In the specific case that I've been looking at, "evrp" processing doesn't
change the code other than for shifting some IDs because of adding a
(dummy one-argument) PHI node.  And the problem now exactly is that the
ENTRY_BLOCK_PTR_FOR_FN's BB_VISITED flag is still set, and so
gcc/omp-low.c:oacc_loop_discover_walk will bail out without doing any
processing:

    /* DFS walk of basic blocks BB onwards, creating OpenACC loop
       structures as we go.  By construction these loops are properly
       nested.  */
    
    static void
    oacc_loop_discover_walk (oacc_loop *loop, basic_block bb)
    {
    [...]
      if (bb->flags & BB_VISITED)
        return;
    
     follow:
      bb->flags |= BB_VISITED;
    [...]
    
    /* Discover the OpenACC loops marked up by HEAD and TAIL markers for
       the current function.  */
    
    static oacc_loop *
    oacc_loop_discovery ()
    {
      basic_block bb;
      
      oacc_loop *top = new_oacc_loop_outer (current_function_decl);
      oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
    [...]

Now, gcc/cfg-flags.def says:

       [Most of the basic block] flags may be cleared by clear_bb_flags().  It is generally
       a bad idea to rely on any flags being up-to-date.  */
    [...]
    /* A general visited flag for passes to use.  */
    DEF_BASIC_BLOCK_FLAG(VISITED, 13)

The BB_VISITED flag to me seems to always be of very local use only.
Should we be more strict and strengthen the above "may be"/"bad idea"
wording?

Does the "evrp" pass need to get changes applied, to properly clear
BB_VISITED (ENTRY_BLOCK_PTR_FOR_FN, in particular)?  Or, in contrast, as
we're not to "rely on any flags being up-to-date", should the BB_VISITED
clearing at the end of gcc/tree-vrp.c:execute_early_vrp be removed in
fact?

From (really just) a quick look, I can't tell the exact policy that other
GCC passes are applying/using regarding the basic_block flags...

The following patch does cure the testsuite regressions:


Is something like that what we should do?  Should clear_bb_flags be
called here, or early in gcc/omp-low.c:execute_oacc_device_lower (like
some other GCC passes seem to be doing)?


Grüße
 Thomas

Comments

Bernd Schmidt Oct. 14, 2016, 9:15 a.m. UTC | #1
On 10/14/2016 10:01 AM, Thomas Schwinge wrote:
> After the "Add Early VRP" GCC trunk commit r240291 (Kugan CC for your
> information), I've been observing all kinds of OpenACC offloading
> failures.  I now figured out what's going on.
>
> The "evrp" pass uses basic_block's BB_VISITED flag.  It first clears
> these all, gcc/tree-vrp.c:execute_early_vrp:
>
>       FOR_EACH_BB_FN (bb, cfun)
>         {
>           bb->flags &= ~BB_VISITED;
>
> ..., then does its processing, and at the end, clears these again:
>
>       FOR_EACH_BB_FN (bb, cfun)
>         bb->flags &= ~BB_VISITED;
>
> I note that this looks slightly different from what
> gcc/cfg.c:clear_bb_flags whould be doing, which works from
> ENTRY_BLOCK_PTR_FOR_FN onwards:

So maybe it should just call clear_bb_flags instead of doing the loop 
itself? Ok if that works.


Bernd
Richard Biener Oct. 14, 2016, 9:26 a.m. UTC | #2
On Fri, Oct 14, 2016 at 11:15 AM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 10/14/2016 10:01 AM, Thomas Schwinge wrote:
>>
>> After the "Add Early VRP" GCC trunk commit r240291 (Kugan CC for your
>> information), I've been observing all kinds of OpenACC offloading
>> failures.  I now figured out what's going on.
>>
>> The "evrp" pass uses basic_block's BB_VISITED flag.  It first clears
>> these all, gcc/tree-vrp.c:execute_early_vrp:
>>
>>       FOR_EACH_BB_FN (bb, cfun)
>>         {
>>           bb->flags &= ~BB_VISITED;
>>
>> ..., then does its processing, and at the end, clears these again:
>>
>>       FOR_EACH_BB_FN (bb, cfun)
>>         bb->flags &= ~BB_VISITED;
>>
>> I note that this looks slightly different from what
>> gcc/cfg.c:clear_bb_flags whould be doing, which works from
>> ENTRY_BLOCK_PTR_FOR_FN onwards:
>
>
> So maybe it should just call clear_bb_flags instead of doing the loop
> itself? Ok if that works.

That doesn't generally work, it clears too many flags (it was appearantly
designed for RTL).

Richard.

>
> Bernd
>
Richard Biener Oct. 14, 2016, 9:28 a.m. UTC | #3
On Fri, Oct 14, 2016 at 10:01 AM, Thomas Schwinge
<thomas@codesourcery.com> wrote:
> Hi!
>
> After the "Add Early VRP" GCC trunk commit r240291 (Kugan CC for your
> information), I've been observing all kinds of OpenACC offloading
> failures.  I now figured out what's going on.
>
> The "evrp" pass uses basic_block's BB_VISITED flag.  It first clears
> these all, gcc/tree-vrp.c:execute_early_vrp:
>
>       FOR_EACH_BB_FN (bb, cfun)
>         {
>           bb->flags &= ~BB_VISITED;
>
> ..., then does its processing, and at the end, clears these again:
>
>       FOR_EACH_BB_FN (bb, cfun)
>         bb->flags &= ~BB_VISITED;
>
> I note that this looks slightly different from what
> gcc/cfg.c:clear_bb_flags whould be doing, which works from
> ENTRY_BLOCK_PTR_FOR_FN onwards:
>
>     /* Clear all basic block flags that do not have to be preserved.  */
>     void
>     clear_bb_flags (void)
>     {
>       basic_block bb;
>
>       FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), NULL, next_bb)
>         bb->flags &= BB_FLAGS_TO_PRESERVE;
>     }
>
> In the specific case that I've been looking at, "evrp" processing doesn't
> change the code other than for shifting some IDs because of adding a
> (dummy one-argument) PHI node.  And the problem now exactly is that the
> ENTRY_BLOCK_PTR_FOR_FN's BB_VISITED flag is still set, and so
> gcc/omp-low.c:oacc_loop_discover_walk will bail out without doing any
> processing:

The BB_VISITED flag has indetermined state at the beginning of a pass.
You have to ensure it is cleared yourself.

EVRP clearing it (similar to tree-ssa-propagate.c) is because IRA has
the same issue and crashes on "stale" BB_VISTED flags.

Richard.

>     /* DFS walk of basic blocks BB onwards, creating OpenACC loop
>        structures as we go.  By construction these loops are properly
>        nested.  */
>
>     static void
>     oacc_loop_discover_walk (oacc_loop *loop, basic_block bb)
>     {
>     [...]
>       if (bb->flags & BB_VISITED)
>         return;
>
>      follow:
>       bb->flags |= BB_VISITED;
>     [...]
>
>     /* Discover the OpenACC loops marked up by HEAD and TAIL markers for
>        the current function.  */
>
>     static oacc_loop *
>     oacc_loop_discovery ()
>     {
>       basic_block bb;
>
>       oacc_loop *top = new_oacc_loop_outer (current_function_decl);
>       oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
>     [...]
>
> Now, gcc/cfg-flags.def says:
>
>        [Most of the basic block] flags may be cleared by clear_bb_flags().  It is generally
>        a bad idea to rely on any flags being up-to-date.  */
>     [...]
>     /* A general visited flag for passes to use.  */
>     DEF_BASIC_BLOCK_FLAG(VISITED, 13)
>
> The BB_VISITED flag to me seems to always be of very local use only.
> Should we be more strict and strengthen the above "may be"/"bad idea"
> wording?
>
> Does the "evrp" pass need to get changes applied, to properly clear
> BB_VISITED (ENTRY_BLOCK_PTR_FOR_FN, in particular)?  Or, in contrast, as
> we're not to "rely on any flags being up-to-date", should the BB_VISITED
> clearing at the end of gcc/tree-vrp.c:execute_early_vrp be removed in
> fact?
>
> From (really just) a quick look, I can't tell the exact policy that other
> GCC passes are applying/using regarding the basic_block flags...
>
> The following patch does cure the testsuite regressions:
>
> --- gcc/omp-low.c
> +++ gcc/omp-low.c
> @@ -19342,6 +19342,7 @@ oacc_loop_discovery ()
>    basic_block bb;
>
>    oacc_loop *top = new_oacc_loop_outer (current_function_decl);
> +  clear_bb_flags ();
>    oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
>
>    /* The siblings were constructed in reverse order, reverse them so
>
> Is something like that what we should do?  Should clear_bb_flags be
> called here, or early in gcc/omp-low.c:execute_oacc_device_lower (like
> some other GCC passes seem to be doing)?
>
>
> Grüße
>  Thomas
Bernd Schmidt Oct. 14, 2016, 10:57 a.m. UTC | #4
On 10/14/2016 11:26 AM, Richard Biener wrote:
> On Fri, Oct 14, 2016 at 11:15 AM, Bernd Schmidt <bschmidt@redhat.com> wrote:
>> So maybe it should just call clear_bb_flags instead of doing the loop
>> itself? Ok if that works.
>
> That doesn't generally work, it clears too many flags (it was appearantly
> designed for RTL).

Or maybe it could be argued that more tree-specific flags ought to be in 
BB_FLAGS_TO_PRESERVE, or that it would make more sense to define a more 
specific BB_FLAGS_TO_KEEP.

So maybe just change the loop at the end of evrp, it clearly wants to 
clean up after itself so just make it do it reliably.


Bernd
Nathan Sidwell Oct. 14, 2016, 11 a.m. UTC | #5
On 10/14/16 05:28, Richard Biener wrote:

> The BB_VISITED flag has indetermined state at the beginning of a pass.
> You have to ensure it is cleared yourself.

In that case the openacc (&nvptx?) passes should be modified to clear the flags 
at their start, rather than at their end.

nathan
Richard Biener Oct. 14, 2016, 11:06 a.m. UTC | #6
On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell <nathan@acm.org> wrote:
> On 10/14/16 05:28, Richard Biener wrote:
>
>> The BB_VISITED flag has indetermined state at the beginning of a pass.
>> You have to ensure it is cleared yourself.
>
>
> In that case the openacc (&nvptx?) passes should be modified to clear the
> flags at their start, rather than at their end.

Yes.  But as I said, I ran into IRA ICEs (somewhere in the testsuite) when not
cleaning up after tree-ssa-propagate.c.  So somebody has to fix IRA first.

Richard.

> nathan
Richard Biener Oct. 14, 2016, 11:07 a.m. UTC | #7
On Fri, Oct 14, 2016 at 12:57 PM, Bernd Schmidt <bschmidt@redhat.com> wrote:
> On 10/14/2016 11:26 AM, Richard Biener wrote:
>>
>> On Fri, Oct 14, 2016 at 11:15 AM, Bernd Schmidt <bschmidt@redhat.com>
>> wrote:
>>>
>>> So maybe it should just call clear_bb_flags instead of doing the loop
>>> itself? Ok if that works.
>>
>>
>> That doesn't generally work, it clears too many flags (it was appearantly
>> designed for RTL).
>
>
> Or maybe it could be argued that more tree-specific flags ought to be in
> BB_FLAGS_TO_PRESERVE, or that it would make more sense to define a more
> specific BB_FLAGS_TO_KEEP.
>
> So maybe just change the loop at the end of evrp, it clearly wants to clean
> up after itself so just make it do it reliably.

I thought about adding an argument to clear_bb_flags telling it the
flags to clear.

Richard.

>
> Bernd
>
diff mbox

Patch

--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -19342,6 +19342,7 @@  oacc_loop_discovery ()
   basic_block bb;
   
   oacc_loop *top = new_oacc_loop_outer (current_function_decl);
+  clear_bb_flags ();
   oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun));
 
   /* The siblings were constructed in reverse order, reverse them so