[RFC] Relax tree-if-conv.c trap assumptions.
diff mbox

Message ID CAFiYyc0YjGeoKL-M1rNiEXa8Y9ba+Q2tUc7TOw4NRcH2g=oHJA@mail.gmail.com
State New
Headers show

Commit Message

Richard Biener Oct. 30, 2015, 11:29 a.m. UTC
On Fri, Oct 30, 2015 at 11:21 AM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Andrew,
>
>> -----Original Message-----
>> From: Andrew Pinski [mailto:pinskia@gmail.com]
>> Sent: Friday, October 30, 2015 3:38 PM
>> To: Kumar, Venkataramanan
>> Cc: Richard Beiner (richard.guenther@gmail.com); gcc-patches@gcc.gnu.org
>> Subject: Re: [RFC] [Patch] Relax tree-if-conv.c trap assumptions.
>>
>> On Fri, Oct 30, 2015 at 6:06 PM, Kumar, Venkataramanan
>> <Venkataramanan.Kumar@amd.com> wrote:
>> > Hi Richard,
>> >
>> > I am trying to "if covert the store" in the below test case and later
>> > help it to get vectorized under -Ofast -ftree-loop-if-convert-stores
>> > -fno-common
>> >
>> > #define LEN 4096
>> >  __attribute__((aligned(32))) float array[LEN]; void test() { for (int i = 0; i <
>> LEN; i++) {
>> >    if (array[i] > (float)0.)
>> >                 array[i] =3 ;
>> >
>> > }
>> > }
>> >
>> > Currently in GCC 5.2  does not vectorize it.
>> > https://goo.gl/9nS6Pd
>> >
>> > However ICC seems to vectorize it
>> > https://goo.gl/y1yGHx
>> >
>> > As discussed with you  earlier, to allow "if convert store"  I am checking the
>> following:
>> >
>> > (1) We already  read the reference "array[i]"  unconditionally once .
>> > (2) I am now checking  if we are conditionally writing to memory which is
>> defined as read and write and is bound to the definition we are seeing.
>>
>>
>> I don't think this is thread safe ....
>>
>
> I thought under -ftree-loop-if-convert-stores it is ok to do this transformation.

Yes, that's what we have done in the past here.  Note that I think we should
remove the flag in favor of the --param allow-store-data-races and if-convert
safe stores always (stores to thread-local memory).  Esp. using masked
stores should be always safe.

> Regards,
> Venkat.
>
>> Thanks,
>> Andrew
>>
>> >
>> > With this change, I get able to if convert and the vectorize the case also.
>> >
>> > /build/gcc/xgcc -B ./build/gcc/  ifconv.c -Ofast -fopt-info-vec  -S
>> > -ftree-loop-if-convert-stores -fno-common
>> > ifconv.c:2:63: note: loop vectorized
>> >
>> > Patch
>> > ------
>> > diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c index
>> > f201ab5..6475cc0 100644
>> > --- a/gcc/tree-if-conv.c
>> > +++ b/gcc/tree-if-conv.c
>> > @@ -727,6 +727,34 @@ write_memrefs_written_at_least_once (gimple
>> *stmt,
>> >    return true;
>> > }
>> >
>> > +static bool
>> > +write_memrefs_safe_to_access_unconditionally (gimple *stmt,
>> > +
>> > +vec<data_reference_p> drs) {

{ to the next line

The function has a bad name it should be write_memrefs_writable ()

>> > +  int i;
>> > +  data_reference_p a;
>> > +  bool found = false;
>> > +
>> > +  for (i = 0; drs.iterate (i, &a); i++)
>> > +    {
>> > +      if (DR_STMT (a) == stmt
>> > +               && DR_IS_WRITE (a)
>> > +               && (DR_WRITTEN_AT_LEAST_ONCE (a) == 0)
>> > +               && (DR_RW_UNCONDITIONALLY (a) == 1))
>> > +             {
>> > +               tree base = get_base_address (DR_REF (a));
>> > +               found = false;
>> > +               if (DECL_P (base)
>> > +                   && decl_binds_to_current_def_p (base)
>> > +                   && !TREE_READONLY (base))
>> > +                 {
>> > +                   found = true;

So if the vector ever would contain more than one write you'd return
true if only one of them is not readonly.

IMHO all the routines need refactoring to operate on single DRs which
AFAIK is the only case if-conversion handles anyway (can't if-convert
calls or aggregate assignments or asms).  Ugh, it seems the whole
thing is quadratic, doing linear walks to find the DRs for a stmt ...

A simple


would improve that tremendously ...  (well, given the array of DRs is
sorted by stmt
which is an implementation detail not documented).

Computing all the DR flags in separate loops also doesn't make much sense to me
and just complicates code.

Richard.

>> > +                 }
>> > +             }
>> > +    }
>> > +  return found;
>> > +}
>> > +
>> > /* Return true when the memory references of STMT won't trap in the
>> >     if-converted code.  There are two things that we have to check for:
>> >
>> > @@ -748,8 +776,20 @@ write_memrefs_written_at_least_once (gimple
>> > *stmt, static bool ifcvt_memrefs_wont_trap (gimple *stmt,
>> > vec<data_reference_p> refs) {
>> > -  return write_memrefs_written_at_least_once (stmt, refs)
>> > -    && memrefs_read_or_written_unconditionally (stmt, refs);
>> > +  bool memrefs_written_once, memrefs_read_written_unconditionally;
>> > +  bool memrefs_safe_to_access;
>> > +
>> > +  memrefs_written_once
>> > +             = write_memrefs_written_at_least_once (stmt, refs);
>> > +
>> > +  memrefs_read_written_unconditionally
>> > +             =  memrefs_read_or_written_unconditionally (stmt, refs);
>> > +
>> > +  memrefs_safe_to_access
>> > +             = write_memrefs_safe_to_access_unconditionally (stmt,
>> > + refs);
>> > +
>> > +  return ((memrefs_written_once || memrefs_safe_to_access)
>> > +                && memrefs_read_written_unconditionally);
>> > }
>> >
>> >  /* Wrapper around gimple_could_trap_p refined for the needs of the
>> >
>> >
>> > do I need this function write_memrefs_written_at_least_once anymore?
>> > Please suggest if there a better way to do this.
>> >
>> > Bootstrapped and regression  tested on x86_64.
>> > I am not  adding change log and comments now, as I  wanted to check
>> approach first.
>> >
>> > Regards,
>> > Venkat.
>> >
>> >

Patch
diff mbox

Index: gcc/tree-if-conv.c
===================================================================
--- gcc/tree-if-conv.c  (revision 229572)
+++ gcc/tree-if-conv.c  (working copy)
@@ -612,9 +612,10 @@  memrefs_read_or_written_unconditionally
   data_reference_p a, b;
   tree ca = bb_predicate (gimple_bb (stmt));

-  for (i = 0; drs.iterate (i, &a); i++)
-    if (DR_STMT (a) == stmt)
-      {
+  for (i = gimple_uid (stmt) - 1; drs.iterate (i, &a); i++)
+    {
+    if (DR_STMT (a) != stmt)
+      break;
        bool found = false;
        int x = DR_RW_UNCONDITIONALLY (a);

@@ -684,10 +685,13 @@  write_memrefs_written_at_least_once (gim
   data_reference_p a, b;
   tree ca = bb_predicate (gimple_bb (stmt));

-  for (i = 0; drs.iterate (i, &a); i++)
-    if (DR_STMT (a) == stmt
-       && DR_IS_WRITE (a))
-      {
+  for (i = gimple_uid (stmt) - 1; drs.iterate (i, &a); i++)
+    {
+      if (DR_STMT (a) != stmt)
+       break;
+      if (! DR_IS_WRITE (a))
+       continue;
+
        bool found = false;
        int x = DR_WRITTEN_AT_LEAST_ONCE (a);

@@ -721,7 +725,7 @@  write_memrefs_written_at_least_once (gim
            DR_WRITTEN_AT_LEAST_ONCE (a) = 0;
            return false;
          }
-      }
+    }

   return true;
 }
@@ -1291,6 +1297,7 @@  if_convertible_loop_p_1 (struct loop *lo
          case GIMPLE_CALL:
          case GIMPLE_DEBUG:
          case GIMPLE_COND:
+           gimple_set_uid (gsi_stmt (gsi), 0);
            break;
          default:
            return false;
@@ -1304,6 +1311,8 @@  if_convertible_loop_p_1 (struct loop *lo
       dr->aux = XNEW (struct ifc_dr);
       DR_WRITTEN_AT_LEAST_ONCE (dr) = -1;
       DR_RW_UNCONDITIONALLY (dr) = -1;
+      if (gimple_uid (DR_STMT (dr)) == 0)
+       gimple_set_uid (DR_STMT (dr), i + 1);
     }
   predicate_bbs (loop);