diff mbox

[RFC] PR67326 - relax trap assumption by looking at similar DRS

Message ID CY1PR1201MB109888741196B084079723C78F1A0@CY1PR1201MB1098.namprd12.prod.outlook.com
State New
Headers show

Commit Message

Kumar, Venkataramanan Nov. 20, 2015, 12:02 p.m. UTC
Hi Richard,

As per Jakub suggestion in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67326, the below patch fixes the regression in tree if conversion.
Basically allowing if conversion to happen for a candidate DR, if we find similar DR with same dimensions  and that DR will not trap.

To find similar DRs using hash table to hashing the offset and DR pairs.  
Also reusing  read/written information that was stored for reference tree.  

Also. 
(1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-common. 
Sometimes vectorization flags also triggers if conversion.
(2) Also hashing base DRs for writes only. 

gcc/ChangeLog
2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>
	
	PR tree-optimization/67326
	* tree-if-conv.c  (offset_DR_map): Define.
    	(struct ifc_dr): Add new tree base_predicate field.
    	(hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash offsets, DR pairs
	and hash base ref,  DR pairs  for write type DRs.
	(ifcvt_memrefs_wont_trap):  Guard checks with -ftree-loop-if-convert-stores flag. 
       Check for similar DR that are accessed unconditionally.
       (if_convertible_loop_p_1):  Initialize and delete offset hash maps

gcc/testsuite/ChangeLog
2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>
	* gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.

Regstrapped on x86_64, Ok for trunk?

Regards,
Venkat.

Comments

Jakub Jelinek Nov. 20, 2015, 12:22 p.m. UTC | #1
On Fri, Nov 20, 2015 at 12:02:07PM +0000, Kumar, Venkataramanan wrote:
> Also. 
> (1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-common. 
> Sometimes vectorization flags also triggers if conversion.
> (2) Also hashing base DRs for writes only. 

Let me comment just on formatting, will leave actual review to Richard.
> 
> gcc/ChangeLog
> 2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>

Surname missing.
> 	
> 	PR tree-optimization/67326
> 	* tree-if-conv.c  (offset_DR_map): Define.

Extraneous space.

>     	(struct ifc_dr): Add new tree base_predicate field.

All ChangeLog lines should be indented by a single tab.

>     	(hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash offsets, DR pairs

Too long line.

> 	and hash base ref,  DR pairs  for write type DRs.

Extraneous spaces.

> 	(ifcvt_memrefs_wont_trap):  Guard checks with -ftree-loop-if-convert-stores flag. 

Too long line and extraneous spaces.

>        Check for similar DR that are accessed unconditionally.
>        (if_convertible_loop_p_1):  Initialize and delete offset hash maps

Extraneous space, missing full stop at the end.
> 
> gcc/testsuite/ChangeLog
> 2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>
> 	* gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.

Extraneous space.

> +  if (DR_IS_WRITE (a))
>      {
> -      IFC_DR (a)->predicate = ca;
> -      *base_master_dr = a;
> +      base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);

Missing space before &exist2);

> +      offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
> +      if (!exist3)
> +	*offset_master_dr = a;
> +
> +      if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
> +	DR_RW_UNCONDITIONALLY (*offset_master_dr)
> +		= DR_RW_UNCONDITIONALLY (*master_dr);

Wrong indentation, the = should be below the first underscore in
DR_RW_UNCONDITIONALLY.

> -      if (DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1)
> +      if ((base_master_dr
> +	   && DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1))

Extraneous () pair.

> +  else if (DR_OFFSET (a))
> +    {
> +      offset_dr = offset_DR_map->get (DR_OFFSET (a));
> +      if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)

Likewise.

	Jakub
Richard Biener Nov. 24, 2015, 3:37 p.m. UTC | #2
On Fri, Nov 20, 2015 at 1:02 PM, Kumar, Venkataramanan
<Venkataramanan.Kumar@amd.com> wrote:
> Hi Richard,
>
> As per Jakub suggestion in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67326, the below patch fixes the regression in tree if conversion.
> Basically allowing if conversion to happen for a candidate DR, if we find similar DR with same dimensions  and that DR will not trap.
>
> To find similar DRs using hash table to hashing the offset and DR pairs.
> Also reusing  read/written information that was stored for reference tree.
>
> Also.
> (1) I guard these checks for  -ftree-loop-if-convert-stores and -fno-common.
> Sometimes vectorization flags also triggers if conversion.
> (2) Also hashing base DRs for writes only.
>
> gcc/ChangeLog
> 2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>
>
>         PR tree-optimization/67326
>         * tree-if-conv.c  (offset_DR_map): Define.
>         (struct ifc_dr): Add new tree base_predicate field.
>         (hash_memrefs_baserefs_and_store_DRs_read_written_info): Hash offsets, DR pairs
>         and hash base ref,  DR pairs  for write type DRs.
>         (ifcvt_memrefs_wont_trap):  Guard checks with -ftree-loop-if-convert-stores flag.
>        Check for similar DR that are accessed unconditionally.
>        (if_convertible_loop_p_1):  Initialize and delete offset hash maps
>
> gcc/testsuite/ChangeLog
> 2015-11-19  Venkataramanan  <Venkataramanan.Kumar@amd.com>
>         * gcc.dg/tree-ssa/ifc-pr67326.c:  Add new.
>
> Regstrapped on x86_64, Ok for trunk?

+  if (offset)
+    {
+      offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
+      if (!exist3)
+       *offset_master_dr = a;
+
+      if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
+       DR_RW_UNCONDITIONALLY (*offset_master_dr)
+               = DR_RW_UNCONDITIONALLY (*master_dr);

this is fishy - as far as I can see offset_master globs all _candidates_ and

+  else if (DR_OFFSET (a))
+    {
+      offset_dr = offset_DR_map->get (DR_OFFSET (a));
+      if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)
+          && DR_NUM_DIMENSIONS (a) == DR_NUM_DIMENSIONS (*offset_dr))
+       {
+         tree base_tree = get_base_address (DR_REF (a));
+         if (DECL_P (base_tree)
+             && flag_tree_loop_if_convert_stores
+             && decl_binds_to_current_def_p (base_tree)
+             && !TREE_READONLY (base_tree))
+           return true;
+       }
+    }

where with this that actually checks something (DR_NUM_DIMENSIONS is
not something you
can use to identify two arrays with the same domain) will then
consider DR_DW_UNCONDITIONALLY
ORed from all _candidates_ but not only from those which really have
the same domain.

You need to do the domain check as part of the hash-map hashing/comparing.

Note that there is no bounds info in the data ref info so you need to
  a) consider DR_OFFSET + DR_INIT
  b) verify the access size is the same (TYPE_SIZE_UNIT (TREE_TYPE (dr->ref)))
  c) verify the base objects are of the same size - note this is
somewhat difficult
as the base object for DR_OFFSET/INIT is starting at DR_BASE_ADDRESS so maybe
restrict this to ADDR_EXPR <decl> DR_BASE_ADDRESS cases where you can look
at DECL_SIZE (decl) of both candidates

You can also try using indices (DR_BASE_OBJECT plus DR_ACCESS_FNS when
DR_UNCONSTRAINED_BASE is false).  If the size of DR_BASE_OBJECT matches
and all access functions are equal it should be a compatible enough
case as well.

I'd say you should split out the base_predicate introduction into a
separate patch
(this change looks ok).

Richard.

> Regards,
> Venkat.
>
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c
new file mode 100644
index 0000000..5ca11cd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr67326.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-details -fno-common -ftree-loop-if-convert-stores" } */
+
+float v0[1024];
+float v1[1024];
+float v2[1024];
+float v3[1024];
+
+void condAssign1 () {
+  for (int i=0; i<1024; ++i)
+    v0[i] = (v2[i]>v1[i]) ? v2[i]*v3[i] : v0[i];
+}
+
+void condAssign2 () {
+  for (int i=0; i<1024; ++i)
+    v0[i] = (v2[i]>v1[i]) ? v2[i]*v1[i] : v0[i];
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 2 "ifcvt" } } */
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 01065cb..1b4d220 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -124,6 +124,9 @@  static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map;
 /* Hash table to store base reference, DR pairs.  */
 static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map;
 
+/* Hash table to store DR offset, DR pairs.  */
+static hash_map<tree_operand_hash, data_reference_p> *offset_DR_map;
+
 /* Structure used to predicate basic blocks.  This is attached to the
    ->aux field of the BBs in the loop to be if-converted.  */
 struct bb_predicate {
@@ -589,6 +592,8 @@  struct ifc_dr {
   int rw_unconditionally;
 
   tree predicate;
+
+  tree base_predicate;
 };
 
 #define IFC_DR(DR) ((struct ifc_dr *) (DR)->aux)
@@ -609,11 +614,12 @@  static void
 hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
 {
 
-  data_reference_p *master_dr, *base_master_dr;
+  data_reference_p *master_dr, *base_master_dr, *offset_master_dr;
   tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  tree offset = DR_OFFSET (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
-  bool exist1, exist2;
+  bool exist1, exist2, exist3;
 
   while (TREE_CODE (ref) == COMPONENT_REF
 	 || TREE_CODE (ref) == IMAGPART_EXPR
@@ -636,22 +642,34 @@  hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
   if (is_true_predicate (IFC_DR (*master_dr)->predicate))
     DR_RW_UNCONDITIONALLY (*master_dr) = 1;
 
-  base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);
-
-  if (!exist2)
+  if (DR_IS_WRITE (a))
     {
-      IFC_DR (a)->predicate = ca;
-      *base_master_dr = a;
+      base_master_dr = &baseref_DR_map->get_or_insert (base_ref,&exist2);
+      if (!exist2)
+	{
+	  IFC_DR (a)->base_predicate = ca;
+	  *base_master_dr = a;
+	}
+      else
+	IFC_DR (*base_master_dr)->base_predicate
+	  = fold_or_predicates
+		(EXPR_LOCATION (IFC_DR (*base_master_dr)->base_predicate),
+		 ca, IFC_DR (*base_master_dr)->base_predicate);
+
+      if (is_true_predicate (IFC_DR (*base_master_dr)->base_predicate))
+	DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) = 1;
     }
-  else
-    IFC_DR (*base_master_dr)->predicate
-	= fold_or_predicates
-		(EXPR_LOCATION (IFC_DR (*base_master_dr)->predicate),
-		 ca, IFC_DR (*base_master_dr)->predicate);
 
-  if (DR_IS_WRITE (a)
-      && (is_true_predicate (IFC_DR (*base_master_dr)->predicate)))
-    DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) = 1;
+  if (offset)
+    {
+      offset_master_dr = &offset_DR_map->get_or_insert (offset,&exist3);
+      if (!exist3)
+	*offset_master_dr = a;
+
+      if (DR_RW_UNCONDITIONALLY (*offset_master_dr) != 1)
+	DR_RW_UNCONDITIONALLY (*offset_master_dr)
+		= DR_RW_UNCONDITIONALLY (*master_dr);
+    }
 }
 
 /* Return true when the memory references of STMT won't trap in the
@@ -682,7 +700,7 @@  hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
 static bool
 ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
 {
-  data_reference_p *master_dr, *base_master_dr;
+  data_reference_p *master_dr, *base_master_dr, *offset_dr;
   data_reference_p a = drs[gimple_uid (stmt) - 1];
 
   tree ref_base_a = DR_REF (a);
@@ -698,21 +716,38 @@  ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
   master_dr = ref_DR_map->get (ref_base_a);
   base_master_dr = baseref_DR_map->get (base);
 
-  gcc_assert (master_dr != NULL && base_master_dr != NULL);
+  gcc_assert (master_dr != NULL);
 
   if (DR_RW_UNCONDITIONALLY (*master_dr) == 1)
     {
-      if (DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1)
+      if ((base_master_dr
+	   && DR_WRITTEN_AT_LEAST_ONCE (*base_master_dr) == 1))
 	return true;
       else
 	{
 	  tree base_tree = get_base_address (DR_REF (a));
 	  if (DECL_P (base_tree)
 	      && decl_binds_to_current_def_p (base_tree)
+	      && flag_tree_loop_if_convert_stores
 	      && !TREE_READONLY (base_tree))
 	  return true;
 	}
     }
+  else if (DR_OFFSET (a))
+    {
+      offset_dr = offset_DR_map->get (DR_OFFSET (a));
+      if ((DR_RW_UNCONDITIONALLY (*offset_dr) == 1)
+	   && DR_NUM_DIMENSIONS (a) == DR_NUM_DIMENSIONS (*offset_dr))
+	{
+	  tree base_tree = get_base_address (DR_REF (a));
+	  if (DECL_P (base_tree)
+	      && flag_tree_loop_if_convert_stores
+	      && decl_binds_to_current_def_p (base_tree)
+	      && !TREE_READONLY (base_tree))
+	    return true;
+	}
+    }
+
   return false;
 }
 
@@ -1267,6 +1302,7 @@  if_convertible_loop_p_1 (struct loop *loop,
 
   ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
+  offset_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
   predicate_bbs (loop);
 
@@ -1382,6 +1418,9 @@  if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store)
   delete baseref_DR_map;
   baseref_DR_map = NULL;
 
+  delete offset_DR_map;
+  offset_DR_map = NULL;
+
   return res;
 }