diff mbox series

middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

Message ID patch-17792-tamar@arm.com
State New
Headers show
Series middle-end: Recursively check is_trivially_copyable_or_pair in vec.h | expand

Commit Message

Tamar Christina Oct. 2, 2023, 12:38 p.m. UTC
Hi All,

I recently committed a patch that uses a nested std::pair in the second argument.
It temporarily adds a second ranking variable for sorting and then later drops it.

This hits the newly added assert in vec.h.  This assert made some relaxation for
std::pair but doesn't allow this case through.  The patch allows a recursive
std::pair in the second argument which fixes bootstrap.

It should also still maintain the invariant that was being tested here since
the nested arguments should still be trivially copyable.

Bootstrapped on aarch64-none-linux-gnu, x86_64-linux-gnu, and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	vec.h (struct is_trivially_copyable_or_pair): Check recursively in
	second arg.

--- inline copy of patch -- 
diff --git a/gcc/vec.h b/gcc/vec.h
index d509639292b..dcc18c99bfb 100644



--
diff --git a/gcc/vec.h b/gcc/vec.h
index d509639292b..dcc18c99bfb 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1199,7 +1199,7 @@ namespace vec_detail
   template<typename T, typename U>
   struct is_trivially_copyable_or_pair<std::pair<T, U> >
   : std::integral_constant<bool, std::is_trivially_copyable<T>::value
-    && std::is_trivially_copyable<U>::value> { };
+    && is_trivially_copyable_or_pair<U>::value> { };
 }
 #endif

Comments

Jakub Jelinek Oct. 2, 2023, 1:21 p.m. UTC | #1
On Mon, Oct 02, 2023 at 01:38:53PM +0100, Tamar Christina wrote:
> Hi All,
> 
> I recently committed a patch that uses a nested std::pair in the second argument.
> It temporarily adds a second ranking variable for sorting and then later drops it.
> 
> This hits the newly added assert in vec.h.  This assert made some relaxation for
> std::pair but doesn't allow this case through.  The patch allows a recursive
> std::pair in the second argument which fixes bootstrap.

I must say I still don't understand why using a
struct ifcvt_arg_entry { tree arg; unsigned len, occur; };
with comments describing what the members mean wouldn't be a better fix,
in the sorting function what exactly means x{1,2}.second.first and
x{1,2}.second.second isn't easily understandable, neither from the
identifiers nor from any comments.
Seems because you use 2 separate vectors, one with just tree elements and
another with those tree elements + 2 unsigned values cached from it for the
sorting purpose and then rewrite the original tree vector after sorting, I
don't really see why nested std::pair would be a better match for it than
a named structure.  Furthermore, why populate args first, then compute
the extra 2 integers in another loop pushing to argsKV and finally overwrite
args with sorted values?  Can't the first loop push tree with the 2 integers
already?  And what is the point of not using this structure later on when
both args and argsKV vectors are live until the end of the same function?
Can't you either pass that argsKV to others, having just one vector, or
at least release the other vector when you don't really need it?
Formatting style, swap? arg1 : arg0 isn't correctly formatted, missing space
before ?.

Also, ArgEntry is CamelCase which we (usually) don't use in GCC and
additionally doesn't seem to be unique enough for ODR purposes.
Ditto argsKV.

> It should also still maintain the invariant that was being tested here since
> the nested arguments should still be trivially copyable.
> 
> Bootstrapped on aarch64-none-linux-gnu, x86_64-linux-gnu, and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	vec.h (struct is_trivially_copyable_or_pair): Check recursively in

Missing "* " above.

> 	second arg.

	Jakub
Tamar Christina Oct. 2, 2023, 1:28 p.m. UTC | #2
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Monday, October 2, 2023 2:21 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jwakely@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
> 
> On Mon, Oct 02, 2023 at 01:38:53PM +0100, Tamar Christina wrote:
> > Hi All,
> >
> > I recently committed a patch that uses a nested std::pair in the second
> argument.
> > It temporarily adds a second ranking variable for sorting and then later drops
> it.
> >
> > This hits the newly added assert in vec.h.  This assert made some
> > relaxation for std::pair but doesn't allow this case through.  The
> > patch allows a recursive std::pair in the second argument which fixes
> bootstrap.
> 
> I must say I still don't understand why using a struct ifcvt_arg_entry { tree arg;
> unsigned len, occur; }; with comments describing what the members mean
> wouldn't be a better fix, in the sorting function what exactly means
> x{1,2}.second.first and x{1,2}.second.second isn't easily understandable,
> neither from the identifiers nor from any comments.
> Seems because you use 2 separate vectors, one with just tree elements and
> another with those tree elements + 2 unsigned values cached from it for the
> sorting purpose and then rewrite the original tree vector after sorting, I don't
> really see why nested std::pair would be a better match for it than a named
> structure.  Furthermore, why populate args first, then compute the extra 2
> integers in another loop pushing to argsKV and finally overwrite args with
> sorted values?  Can't the first loop push tree with the 2 integers already?  And
> what is the point of not using this structure later on when both args and
> argsKV vectors are live until the end of the same function?
> Can't you either pass that argsKV to others, having just one vector, or at least
> release the other vector when you don't really need it?
> Formatting style, swap? arg1 : arg0 isn't correctly formatted, missing space
> before ?.
> 
> Also, ArgEntry is CamelCase which we (usually) don't use in GCC and
> additionally doesn't seem to be unique enough for ODR purposes.
> Ditto argsKV.

Ok, since these will take a lot longer to test and do I've reverted the patch
for now since bootstrap was broken.

Thanks,
Tamar

> 
> > It should also still maintain the invariant that was being tested here
> > since the nested arguments should still be trivially copyable.
> >
> > Bootstrapped on aarch64-none-linux-gnu, x86_64-linux-gnu, and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	vec.h (struct is_trivially_copyable_or_pair): Check recursively in
> 
> Missing "* " above.
> 
> > 	second arg.
> 
> 	Jakub
Tamar Christina Oct. 3, 2023, 10:27 a.m. UTC | #3
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Monday, October 2, 2023 2:21 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jwakely@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
> 
> On Mon, Oct 02, 2023 at 01:38:53PM +0100, Tamar Christina wrote:
> > Hi All,
> >
> > I recently committed a patch that uses a nested std::pair in the second
> argument.
> > It temporarily adds a second ranking variable for sorting and then later drops
> it.
> >
> > This hits the newly added assert in vec.h.  This assert made some
> > relaxation for std::pair but doesn't allow this case through.  The
> > patch allows a recursive std::pair in the second argument which fixes
> bootstrap.
> 
> I must say I still don't understand why using a struct ifcvt_arg_entry { tree arg;
> unsigned len, occur; }; with comments describing what the members mean
> wouldn't be a better fix, in the sorting function what exactly means
> x{1,2}.second.first and x{1,2}.second.second isn't easily understandable,
> neither from the identifiers nor from any comments.
> Seems because you use 2 separate vectors, one with just tree elements and
> another with those tree elements + 2 unsigned values cached from it for the
> sorting purpose and then rewrite the original tree vector after sorting, I don't
> really see why nested std::pair would be a better match for it than a named
> structure.  Furthermore, why populate args first, then compute the extra 2
> integers in another loop pushing to argsKV and finally overwrite args with
> sorted values?  Can't the first loop push tree with the 2 integers already?
> what is the point of not using this structure later on when both args and
> argsKV vectors are live until the end of the same function?
> Can't you either pass that argsKV to others, having just one vector, or at least
> release the other vector when you don't really need it?
> Formatting style, swap? arg1 : arg0 isn't correctly formatted, missing space
> before ?.
> 
> Also, ArgEntry is CamelCase which we (usually) don't use in GCC and
> additionally doesn't seem to be unique enough for ODR purposes.
> Ditto argsKV.
> 

Hi All,

This refactors the code to remove the args cache and index lookups
in favor of a single structure. It also again, removes the use of
std::sort as previously requested but avoids the new asserts in
trunk.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-if-conv.cc (typedef struct ifcvt_arg_entry): New.
	(cmp_arg_entry): New.
	(gen_phi_arg_condition, gen_phi_nest_statement,
	predicate_scalar_phi): Use them.

--- inline copy of patch ----

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index a8c915913aed267edfb3ebd2c530aeca7cf51832..f7037bd42494b3982d2efd593ba276812b8d2f4f 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1927,11 +1927,32 @@ gen_simplified_condition (tree cond, scalar_cond_masked_set_type &cond_set)
   return cond;
 }
 
+/* Structure used to track meta-data on PHI arguments used to generate
+   most efficient comparison sequence to slatten a PHI node.  */
+
+typedef struct ifcvt_arg_entry
+{
+  /* The PHI node argument value.  */
+  tree arg;
+
+  /* The number of compares required to reach this PHI node from start of the
+     BB being if-converted.  */
+  unsigned num_compares;
+
+  /* The number of times this PHI node argument appears in the current PHI
+     node.  */
+  unsigned occurs;
+
+  /* The indices at which this PHI arg occurs inside the PHI node.  */
+  vec <int> indexes;
+} ifcvt_arg_entry_t;
+
 /* Produce condition for all occurrences of ARG in PHI node.  Set *INVERT
    as to whether the condition is inverted.  */
 
 static tree
-gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
+gen_phi_arg_condition (gphi *phi, ifcvt_arg_entry_t &arg,
+		       gimple_stmt_iterator *gsi,
 		       scalar_cond_masked_set_type &cond_set, bool *invert)
 {
   int len;
@@ -1941,11 +1962,11 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
   edge e;
 
   *invert = false;
-  len = occur->length ();
+  len = arg.indexes.length ();
   gcc_assert (len > 0);
   for (i = 0; i < len; i++)
     {
-      e = gimple_phi_arg_edge (phi, (*occur)[i]);
+      e = gimple_phi_arg_edge (phi, arg.indexes[i]);
       c = bb_predicate (e->src);
       if (is_true_predicate (c))
 	{
@@ -2010,22 +2031,21 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
 static tree
 gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
 			scalar_cond_masked_set_type &cond_set, tree type,
-			hash_map<tree_operand_hash, auto_vec<int>> &phi_arg_map,
-			gimple **res_stmt, tree lhs0, vec<tree> &args,
-			unsigned idx)
+			gimple **res_stmt, tree lhs0,
+			vec<struct ifcvt_arg_entry> &args, unsigned idx)
 {
   if (idx == args.length ())
-    return args[idx - 1];
+    return args[idx - 1].arg;
 
-  vec<int> *indexes = phi_arg_map.get (args[idx - 1]);
   bool invert;
-  tree cond = gen_phi_arg_condition (phi, indexes, gsi, cond_set, &invert);
-  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-				      res_stmt, lhs0, args, idx + 1);
+  tree cond = gen_phi_arg_condition (phi, args[idx - 1], gsi, cond_set,
+				     &invert);
+  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, res_stmt, lhs0,
+				      args, idx + 1);
 
   unsigned prev = idx;
   unsigned curr = prev - 1;
-  tree arg0 = args[curr];
+  tree arg0 = args[curr].arg;
   tree rhs, lhs;
   if (idx > 1)
     lhs = make_temp_ssa_name (type, NULL, "_ifc_");
@@ -2045,6 +2065,25 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
   return lhs;
 }
 
+static int
+cmp_arg_entry (const void *p1, const void *p2)
+{
+  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
+  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
+
+  if (sval1.num_compares < sval2.num_compares)
+    return -1;
+  else if (sval1.num_compares > sval2.num_compares)
+    return 1;
+
+  if (sval1.occurs < sval2.occurs)
+    return -1;
+  else if (sval1.occurs > sval2.occurs)
+    return 1;
+
+  return 0;
+}
+
 /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
    This routine can handle PHI nodes with more than two arguments.
 
@@ -2167,61 +2206,53 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
   /* Create hashmap for PHI node which contain vector of argument indexes
      having the same value.  */
   bool swap = false;
-  hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
+  hash_map<tree_operand_hash, vec<int> > phi_arg_map;
   unsigned int num_args = gimple_phi_num_args (phi);
   /* Vector of different PHI argument values.  */
-  auto_vec<tree> args (num_args);
+  auto_vec<ifcvt_arg_entry_t> args;
 
-  /* Compute phi_arg_map.  */
+  /* Compute phi_arg_map, determine the list of unique PHI args and the indices
+     where they are in the PHI node.  The indices will be used to determine
+     the conditions to apply and their complexity.  */
   for (i = 0; i < num_args; i++)
     {
       tree arg;
 
       arg = gimple_phi_arg_def (phi, i);
-      if (!phi_arg_map.get (arg))
-	args.quick_push (arg);
       phi_arg_map.get_or_insert (arg).safe_push (i);
     }
 
-  /* Determine element with max number of occurrences and complexity.  Looking at only
-     number of occurrences as a measure for complexity isn't enough as all usages can
-     be unique but the comparisons to reach the PHI node differ per branch.  */
-  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
-  auto_vec<ArgEntry> argsKV;
-  for (i = 0; i < args.length (); i++)
+  /* Determine element with max number of occurrences and complexity.  Looking
+     at only number of occurrences as a measure for complexity isn't enough as
+     all usages can be unique but the comparisons to reach the PHI node differ
+     per branch.  */
+  for (auto entry : phi_arg_map)
     {
       unsigned int len = 0;
-      for (int index : phi_arg_map.get (args[i]))
+      for (int index : entry.second)
 	{
 	  edge e = gimple_phi_arg_edge (phi, index);
 	  len += get_bb_num_predicate_stmts (e->src);
 	}
 
-      unsigned occur = phi_arg_map.get (args[i])->length ();
+      unsigned occur = entry.second.length ();
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
-      argsKV.safe_push ({ args[i], { len, occur }});
+      args.safe_push ({ entry.first, len, occur, entry.second });
     }
 
   /* Sort elements based on rankings ARGS.  */
-  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
-					     const ArgEntry &right) {
-    return left.second < right.second;
-  });
-
-  for (i = 0; i < args.length (); i++)
-    args[i] = argsKV[i].first;
+  args.qsort (cmp_arg_entry);
 
   /* Handle one special case when number of arguments with different values
      is equal 2 and one argument has the only occurrence.  Such PHI can be
      handled as if would have only 2 arguments.  */
-  if (args.length () == 2 && phi_arg_map.get (args[0])->length () == 1)
+  if (args.length () == 2
+      && args[0].indexes.length () == 1)
     {
-      vec<int> *indexes;
-      indexes = phi_arg_map.get (args[0]);
-      index0 = (*indexes)[0];
-      arg0 = args[0];
-      arg1 = args[1];
+      index0 = args[0].indexes[0];
+      arg0 = args[0].arg;
+      arg1 = args[1].arg;
       e = gimple_phi_arg_edge (phi, index0);
       cond = bb_predicate (e->src);
       if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
@@ -2235,8 +2266,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
       if (!(is_cond_scalar_reduction (phi, &reduc, arg0 , arg1,
 				      &op0, &op1, true, &has_nop, &nop_reduc)))
 	rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond),
-				    swap? arg1 : arg0,
-				    swap? arg0 : arg1);
+				    swap ? arg1 : arg0,
+				    swap ? arg0 : arg1);
       else
 	{
 	  /* Convert reduction stmt into vectorizable form.  */
@@ -2252,8 +2283,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
     {
       /* Common case.  */
       tree type = TREE_TYPE (gimple_phi_result (phi));
-      gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-			      &new_stmt, res, args, 1);
+      gen_phi_nest_statement (phi, gsi, cond_set, type, &new_stmt, res,
+			      args, 1);
     }
 
   if (dump_file && (dump_flags & TDF_DETAILS))
Jakub Jelinek Oct. 3, 2023, 11:01 a.m. UTC | #4
On Tue, Oct 03, 2023 at 10:27:16AM +0000, Tamar Christina wrote:
> +/* Structure used to track meta-data on PHI arguments used to generate
> +   most efficient comparison sequence to slatten a PHI node.  */

                                            ^^^ typo (at least, never heard
of this word, and wiktionary doesn't know it either (except for Dannish/Swedish))

> @@ -2045,6 +2065,25 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
>    return lhs;
>  }
>  

Perhaps add a short function comment here?

> +static int
> +cmp_arg_entry (const void *p1, const void *p2)
> +{
> +  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
> +  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
> +
> +  if (sval1.num_compares < sval2.num_compares)
> +    return -1;
> +  else if (sval1.num_compares > sval2.num_compares)
> +    return 1;
> +
> +  if (sval1.occurs < sval2.occurs)
> +    return -1;
> +  else if (sval1.occurs > sval2.occurs)
> +    return 1;
> +
> +  return 0;
> +}
> +

> @@ -2167,61 +2206,53 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
>    /* Create hashmap for PHI node which contain vector of argument indexes
>       having the same value.  */
>    bool swap = false;
> -  hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
> +  hash_map<tree_operand_hash, vec<int> > phi_arg_map;
>    unsigned int num_args = gimple_phi_num_args (phi);
>    /* Vector of different PHI argument values.  */
> -  auto_vec<tree> args (num_args);
> +  auto_vec<ifcvt_arg_entry_t> args;
>  
> -  /* Compute phi_arg_map.  */
> +  /* Compute phi_arg_map, determine the list of unique PHI args and the indices
> +     where they are in the PHI node.  The indices will be used to determine
> +     the conditions to apply and their complexity.  */
>    for (i = 0; i < num_args; i++)
>      {
>        tree arg;
>  
>        arg = gimple_phi_arg_def (phi, i);
> -      if (!phi_arg_map.get (arg))
> -	args.quick_push (arg);
>        phi_arg_map.get_or_insert (arg).safe_push (i);
>      }
>  
> -  /* Determine element with max number of occurrences and complexity.  Looking at only
> -     number of occurrences as a measure for complexity isn't enough as all usages can
> -     be unique but the comparisons to reach the PHI node differ per branch.  */
> -  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
> -  auto_vec<ArgEntry> argsKV;
> -  for (i = 0; i < args.length (); i++)
> +  /* Determine element with max number of occurrences and complexity.  Looking
> +     at only number of occurrences as a measure for complexity isn't enough as
> +     all usages can be unique but the comparisons to reach the PHI node differ
> +     per branch.  */
> +  for (auto entry : phi_arg_map)
>      {
>        unsigned int len = 0;
> -      for (int index : phi_arg_map.get (args[i]))
> +      for (int index : entry.second)
>  	{
>  	  edge e = gimple_phi_arg_edge (phi, index);
>  	  len += get_bb_num_predicate_stmts (e->src);
>  	}
>  
> -      unsigned occur = phi_arg_map.get (args[i])->length ();
> +      unsigned occur = entry.second.length ();
>        if (dump_file && (dump_flags & TDF_DETAILS))
>  	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
> -      argsKV.safe_push ({ args[i], { len, occur }});
> +      args.safe_push ({ entry.first, len, occur, entry.second });
>      }
>  
>    /* Sort elements based on rankings ARGS.  */
> -  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
> -					     const ArgEntry &right) {
> -    return left.second < right.second;
> -  });
> -
> -  for (i = 0; i < args.length (); i++)
> -    args[i] = argsKV[i].first;
> +  args.qsort (cmp_arg_entry);

I admit I don't know what you're using the args vector later on for and
whether its ordering affects code generation, but because you qsort it I
assume it does.  My worry is that a hash_map traversal might not be the same
order on all hosts and similarly qsort doesn't achieve stable sorting
in case num_compares and occurrs members are equal for two or more different
arguments.  Can that ever happen?  We have stablesort method instead of
qsort but that would require consistent ordering in the vector (std::sort
doesn't ensure stable sorting either).

If it is a non-issue, the patch is ok with the above nits fixed.  Otherwise
perhaps we'd need to push in the first loop into the vector (but that
      if (!phi_arg_map.get (arg))
	args.quick_push (arg);
      phi_arg_map.get_or_insert (arg).safe_push (i);
in there was quite inefficient, better would be
      bool existed;
      phi_arg_map.get_or_insert (arg, &existed).safe_push (i);
      if (!existed)
	args.safe_push (ifcvt_arg_entry { arg, 0, 0, vNULL });
or something similar), plus use stablesort.  Or add another compared member
which would be the first position.

	Jakub
Tamar Christina Oct. 3, 2023, 11:41 a.m. UTC | #5
> -----Original Message-----
> From: Jakub Jelinek <jakub@redhat.com>
> Sent: Tuesday, October 3, 2023 12:02 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; jwakely@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
> 
> On Tue, Oct 03, 2023 at 10:27:16AM +0000, Tamar Christina wrote:
> > +/* Structure used to track meta-data on PHI arguments used to generate
> > +   most efficient comparison sequence to slatten a PHI node.  */
> 
>                                             ^^^ typo (at least, never heard of this word, and
> wiktionary doesn't know it either (except for Dannish/Swedish))
> 
> > @@ -2045,6 +2065,25 @@ gen_phi_nest_statement (gphi *phi,
> gimple_stmt_iterator *gsi,
> >    return lhs;
> >  }
> >
> 
> Perhaps add a short function comment here?
> 
> > +static int
> > +cmp_arg_entry (const void *p1, const void *p2) {
> > +  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
> > +  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
> > +
> > +  if (sval1.num_compares < sval2.num_compares)
> > +    return -1;
> > +  else if (sval1.num_compares > sval2.num_compares)
> > +    return 1;
> > +
> > +  if (sval1.occurs < sval2.occurs)
> > +    return -1;
> > +  else if (sval1.occurs > sval2.occurs)
> > +    return 1;
> > +
> > +  return 0;
> > +}
> > +
> 
> > @@ -2167,61 +2206,53 @@ predicate_scalar_phi (gphi *phi,
> gimple_stmt_iterator *gsi)
> >    /* Create hashmap for PHI node which contain vector of argument indexes
> >       having the same value.  */
> >    bool swap = false;
> > -  hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
> > +  hash_map<tree_operand_hash, vec<int> > phi_arg_map;
> >    unsigned int num_args = gimple_phi_num_args (phi);
> >    /* Vector of different PHI argument values.  */
> > -  auto_vec<tree> args (num_args);
> > +  auto_vec<ifcvt_arg_entry_t> args;
> >
> > -  /* Compute phi_arg_map.  */
> > +  /* Compute phi_arg_map, determine the list of unique PHI args and the
> indices
> > +     where they are in the PHI node.  The indices will be used to determine
> > +     the conditions to apply and their complexity.  */
> >    for (i = 0; i < num_args; i++)
> >      {
> >        tree arg;
> >
> >        arg = gimple_phi_arg_def (phi, i);
> > -      if (!phi_arg_map.get (arg))
> > -	args.quick_push (arg);
> >        phi_arg_map.get_or_insert (arg).safe_push (i);
> >      }
> >
> > -  /* Determine element with max number of occurrences and complexity.
> Looking at only
> > -     number of occurrences as a measure for complexity isn't enough as all
> usages can
> > -     be unique but the comparisons to reach the PHI node differ per branch.
> */
> > -  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
> > -  auto_vec<ArgEntry> argsKV;
> > -  for (i = 0; i < args.length (); i++)
> > +  /* Determine element with max number of occurrences and complexity.
> Looking
> > +     at only number of occurrences as a measure for complexity isn't enough
> as
> > +     all usages can be unique but the comparisons to reach the PHI node differ
> > +     per branch.  */
> > +  for (auto entry : phi_arg_map)
> >      {
> >        unsigned int len = 0;
> > -      for (int index : phi_arg_map.get (args[i]))
> > +      for (int index : entry.second)
> >  	{
> >  	  edge e = gimple_phi_arg_edge (phi, index);
> >  	  len += get_bb_num_predicate_stmts (e->src);
> >  	}
> >
> > -      unsigned occur = phi_arg_map.get (args[i])->length ();
> > +      unsigned occur = entry.second.length ();
> >        if (dump_file && (dump_flags & TDF_DETAILS))
> >  	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
> > -      argsKV.safe_push ({ args[i], { len, occur }});
> > +      args.safe_push ({ entry.first, len, occur, entry.second });
> >      }
> >
> >    /* Sort elements based on rankings ARGS.  */
> > -  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
> > -					     const ArgEntry &right) {
> > -    return left.second < right.second;
> > -  });
> > -
> > -  for (i = 0; i < args.length (); i++)
> > -    args[i] = argsKV[i].first;
> > +  args.qsort (cmp_arg_entry);
> 
> I admit I don't know what you're using the args vector later on for and
> whether its ordering affects code generation, but because you qsort it I
> assume it does. My worry is that a hash_map traversal might not be the same
> order on all hosts and similarly qsort doesn't achieve stable sorting in case
> num_compares and occurrs members are equal for two or more different
> arguments.  Can that ever happen?

The order does matter but only for args. The hashmap is only used to collect
the unique values and their locations.  While you can have num_compares and
occurs being the same it wouldn't matter for the optimization as they are both
"equally" as expensive.  It would matter in the case where they are the last 2
entries in the list as we never test the last entry.  So your codegen would
select a different element then to test.  So could potentially affect reproducibility.

> We have stablesort method instead of
> qsort but that would require consistent ordering in the vector (std::sort
> doesn't ensure stable sorting either).
> 
> If it is a non-issue, the patch is ok with the above nits fixed.  Otherwise
> perhaps we'd need to push in the first loop into the vector (but that
>       if (!phi_arg_map.get (arg))
> 	args.quick_push (arg);
>       phi_arg_map.get_or_insert (arg).safe_push (i); in there was quite
> inefficient, better would be
>       bool existed;
>       phi_arg_map.get_or_insert (arg, &existed).safe_push (i);
>       if (!existed)
> 	args.safe_push (ifcvt_arg_entry { arg, 0, 0, vNULL }); or something
> similar), plus use stablesort.  Or add another compared member which would
> be the first position.

Hmm the problem here is that it would make the second loop that fills in the len
quadratic as it has to search for arg in the list.  I suppose I could push a pointer
to the struct instead of `i` in the hashmap and the element into args and update
the pointer as we go along?  Would that work?

Regards,
Tamar

> 
> 	Jakub
Jakub Jelinek Oct. 3, 2023, 11:52 a.m. UTC | #6
On Tue, Oct 03, 2023 at 11:41:01AM +0000, Tamar Christina wrote:
> > We have stablesort method instead of
> > qsort but that would require consistent ordering in the vector (std::sort
> > doesn't ensure stable sorting either).
> > 
> > If it is a non-issue, the patch is ok with the above nits fixed.  Otherwise
> > perhaps we'd need to push in the first loop into the vector (but that
> >       if (!phi_arg_map.get (arg))
> > 	args.quick_push (arg);
> >       phi_arg_map.get_or_insert (arg).safe_push (i); in there was quite
> > inefficient, better would be
> >       bool existed;
> >       phi_arg_map.get_or_insert (arg, &existed).safe_push (i);
> >       if (!existed)
> > 	args.safe_push (ifcvt_arg_entry { arg, 0, 0, vNULL }); or something
> > similar), plus use stablesort.  Or add another compared member which would
> > be the first position.
> 
> Hmm the problem here is that it would make the second loop that fills in the len
> quadratic as it has to search for arg in the list.  I suppose I could push a pointer
> to the struct instead of `i` in the hashmap and the element into args and update
> the pointer as we go along?  Would that work?

Only if the second loop traverses the hashmap elements and for each tries to
find the corresponding vector element.
If instead you do what you've done before in the second loop, walk the
vector and for each arg in there lookup phi_args_map.get (v.arg) (but please
just once, vanilla trunk looks it up twice in
      for (int index : phi_arg_map.get (args[i]))
        {
          edge e = gimple_phi_arg_edge (phi, index);
          len += get_bb_num_predicate_stmts (e->src);
        }
  
      unsigned occur = phi_arg_map.get (args[i])->length ();
), then I don't think it would be quadratic.

	Jakub
Tamar Christina Oct. 5, 2023, 2:01 p.m. UTC | #7
> On Tue, Oct 03, 2023 at 11:41:01AM +0000, Tamar Christina wrote:
> > > We have stablesort method instead of qsort but that would require
> > > consistent ordering in the vector (std::sort doesn't ensure stable
> > > sorting either).
> > >
> > > If it is a non-issue, the patch is ok with the above nits fixed.
> > > Otherwise perhaps we'd need to push in the first loop into the vector (but
> that
> > >       if (!phi_arg_map.get (arg))
> > > 	args.quick_push (arg);
> > >       phi_arg_map.get_or_insert (arg).safe_push (i); in there was
> > > quite inefficient, better would be
> > >       bool existed;
> > >       phi_arg_map.get_or_insert (arg, &existed).safe_push (i);
> > >       if (!existed)
> > > 	args.safe_push (ifcvt_arg_entry { arg, 0, 0, vNULL }); or something
> > > similar), plus use stablesort.  Or add another compared member which
> > > would be the first position.
> >
> > Hmm the problem here is that it would make the second loop that fills
> > in the len quadratic as it has to search for arg in the list.  I
> > suppose I could push a pointer to the struct instead of `i` in the
> > hashmap and the element into args and update the pointer as we go along?
> Would that work?
> 
> Only if the second loop traverses the hashmap elements and for each tries to
> find the corresponding vector element.
> If instead you do what you've done before in the second loop, walk the vector
> and for each arg in there lookup phi_args_map.get (v.arg) (but please just
> once, vanilla trunk looks it up twice in
>       for (int index : phi_arg_map.get (args[i]))
>         {
>           edge e = gimple_phi_arg_edge (phi, index);
>           len += get_bb_num_predicate_stmts (e->src);
>         }
> 
>       unsigned occur = phi_arg_map.get (args[i])->length (); ), then I don't think
> it would be quadratic.

Fair enough, here's the updated patch. It should address all the concerns
and clean up the code 😊

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
	(typedef struct ifcvt_arg_entry): New.
	(cmp_arg_entry): New.
	(gen_phi_arg_condition, gen_phi_nest_statement,
	predicate_scalar_phi): Use them.

--- inline copy of patch ---

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index f76e0d8f2e6e0f59073fa8484b0b2c7a6cdc9783..635fce7a69af254dbc5aa9f829e6a053671d1d2c 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -80,7 +80,6 @@ along with GCC; see the file COPYING3.  If not see
      <L18>:;
 */
 
-#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -1937,11 +1936,32 @@ gen_simplified_condition (tree cond, scalar_cond_masked_set_type &cond_set)
   return cond;
 }
 
+/* Structure used to track meta-data on PHI arguments used to generate
+   most efficient comparison sequence to slatten a PHI node.  */
+
+typedef struct ifcvt_arg_entry
+{
+  /* The PHI node argument value.  */
+  tree arg;
+
+  /* The number of compares required to reach this PHI node from start of the
+     BB being if-converted.  */
+  unsigned num_compares;
+
+  /* The number of times this PHI node argument appears in the current PHI
+     node.  */
+  unsigned occurs;
+
+  /* The indices at which this PHI arg occurs inside the PHI node.  */
+  vec <int> *indexes;
+} ifcvt_arg_entry_t;
+
 /* Produce condition for all occurrences of ARG in PHI node.  Set *INVERT
    as to whether the condition is inverted.  */
 
 static tree
-gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
+gen_phi_arg_condition (gphi *phi, ifcvt_arg_entry_t &arg,
+		       gimple_stmt_iterator *gsi,
 		       scalar_cond_masked_set_type &cond_set, bool *invert)
 {
   int len;
@@ -1951,11 +1971,11 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
   edge e;
 
   *invert = false;
-  len = occur->length ();
+  len = arg.indexes->length ();
   gcc_assert (len > 0);
   for (i = 0; i < len; i++)
     {
-      e = gimple_phi_arg_edge (phi, (*occur)[i]);
+      e = gimple_phi_arg_edge (phi, (*arg.indexes)[i]);
       c = bb_predicate (e->src);
       if (is_true_predicate (c))
 	{
@@ -2020,22 +2040,21 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
 static tree
 gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
 			scalar_cond_masked_set_type &cond_set, tree type,
-			hash_map<tree_operand_hash, auto_vec<int>> &phi_arg_map,
-			gimple **res_stmt, tree lhs0, vec<tree> &args,
-			unsigned idx)
+			gimple **res_stmt, tree lhs0,
+			vec<struct ifcvt_arg_entry> &args, unsigned idx)
 {
   if (idx == args.length ())
-    return args[idx - 1];
+    return args[idx - 1].arg;
 
-  vec<int> *indexes = phi_arg_map.get (args[idx - 1]);
   bool invert;
-  tree cond = gen_phi_arg_condition (phi, indexes, gsi, cond_set, &invert);
-  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-				      res_stmt, lhs0, args, idx + 1);
+  tree cond = gen_phi_arg_condition (phi, args[idx - 1], gsi, cond_set,
+				     &invert);
+  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, res_stmt, lhs0,
+				      args, idx + 1);
 
   unsigned prev = idx;
   unsigned curr = prev - 1;
-  tree arg0 = args[curr];
+  tree arg0 = args[curr].arg;
   tree rhs, lhs;
   if (idx > 1)
     lhs = make_temp_ssa_name (type, NULL, "_ifc_");
@@ -2055,6 +2074,25 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
   return lhs;
 }
 
+static int
+cmp_arg_entry (const void *p1, const void *p2, void * /* data.  */)
+{
+  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
+  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
+
+  if (sval1.num_compares < sval2.num_compares)
+    return -1;
+  else if (sval1.num_compares > sval2.num_compares)
+    return 1;
+
+  if (sval1.occurs < sval2.occurs)
+    return -1;
+  else if (sval1.occurs > sval2.occurs)
+    return 1;
+
+  return 0;
+}
+
 /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
    This routine can handle PHI nodes with more than two arguments.
 
@@ -2180,58 +2218,55 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
   hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
   unsigned int num_args = gimple_phi_num_args (phi);
   /* Vector of different PHI argument values.  */
-  auto_vec<tree> args (num_args);
+  auto_vec<ifcvt_arg_entry_t> args;
 
-  /* Compute phi_arg_map.  */
+  /* Compute phi_arg_map, determine the list of unique PHI args and the indices
+     where they are in the PHI node.  The indices will be used to determine
+     the conditions to apply and their complexity.  */
+  auto_vec<tree> unique_args (num_args);
   for (i = 0; i < num_args; i++)
     {
       tree arg;
 
       arg = gimple_phi_arg_def (phi, i);
       if (!phi_arg_map.get (arg))
-	args.quick_push (arg);
+	unique_args.quick_push (arg);
       phi_arg_map.get_or_insert (arg).safe_push (i);
     }
 
-  /* Determine element with max number of occurrences and complexity.  Looking at only
-     number of occurrences as a measure for complexity isn't enough as all usages can
-     be unique but the comparisons to reach the PHI node differ per branch.  */
-  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
-  auto_vec<ArgEntry> argsKV;
-  for (i = 0; i < args.length (); i++)
+  /* Determine element with max number of occurrences and complexity.  Looking
+     at only number of occurrences as a measure for complexity isn't enough as
+     all usages can be unique but the comparisons to reach the PHI node differ
+     per branch.  */
+  for (auto arg : unique_args)
     {
       unsigned int len = 0;
-      for (int index : phi_arg_map.get (args[i]))
+      vec<int> *indices = phi_arg_map.get (arg);
+      for (int index : *indices)
 	{
 	  edge e = gimple_phi_arg_edge (phi, index);
 	  len += get_bb_num_predicate_stmts (e->src);
 	}
 
-      unsigned occur = phi_arg_map.get (args[i])->length ();
+      unsigned occur = indices->length ();
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
-      argsKV.safe_push ({ args[i], { len, occur }});
+      args.safe_push ({ arg, len, occur, indices });
     }
 
+  unique_args.release ();
   /* Sort elements based on rankings ARGS.  */
-  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
-					     const ArgEntry &right) {
-    return left.second < right.second;
-  });
-
-  for (i = 0; i < args.length (); i++)
-    args[i] = argsKV[i].first;
+  args.stablesort (cmp_arg_entry, NULL);
 
   /* Handle one special case when number of arguments with different values
      is equal 2 and one argument has the only occurrence.  Such PHI can be
      handled as if would have only 2 arguments.  */
-  if (args.length () == 2 && phi_arg_map.get (args[0])->length () == 1)
+  if (args.length () == 2
+      && args[0].indexes->length () == 1)
     {
-      vec<int> *indexes;
-      indexes = phi_arg_map.get (args[0]);
-      index0 = (*indexes)[0];
-      arg0 = args[0];
-      arg1 = args[1];
+      index0 = (*args[0].indexes)[0];
+      arg0 = args[0].arg;
+      arg1 = args[1].arg;
       e = gimple_phi_arg_edge (phi, index0);
       cond = bb_predicate (e->src);
       if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
@@ -2245,8 +2280,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
       if (!(is_cond_scalar_reduction (phi, &reduc, arg0 , arg1,
 				      &op0, &op1, true, &has_nop, &nop_reduc)))
 	rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond),
-				    swap? arg1 : arg0,
-				    swap? arg0 : arg1);
+				    swap ? arg1 : arg0,
+				    swap ? arg0 : arg1);
       else
 	{
 	  /* Convert reduction stmt into vectorizable form.  */
@@ -2262,8 +2297,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
     {
       /* Common case.  */
       tree type = TREE_TYPE (gimple_phi_result (phi));
-      gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-			      &new_stmt, res, args, 1);
+      gen_phi_nest_statement (phi, gsi, cond_set, type, &new_stmt, res,
+			      args, 1);
     }
 
   if (dump_file && (dump_flags & TDF_DETAILS))

> 
> 	Jakub
Jakub Jelinek Oct. 5, 2023, 2:14 p.m. UTC | #8
On Thu, Oct 05, 2023 at 02:01:40PM +0000, Tamar Christina wrote:
> gcc/ChangeLog:
> 
> 	* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
> 	(typedef struct ifcvt_arg_entry): New.
> 	(cmp_arg_entry): New.
> 	(gen_phi_arg_condition, gen_phi_nest_statement,
> 	predicate_scalar_phi): Use them.

> -  /* Compute phi_arg_map.  */
> +  /* Compute phi_arg_map, determine the list of unique PHI args and the indices
> +     where they are in the PHI node.  The indices will be used to determine
> +     the conditions to apply and their complexity.  */
> +  auto_vec<tree> unique_args (num_args);
>    for (i = 0; i < num_args; i++)
>      {
>        tree arg;
>  
>        arg = gimple_phi_arg_def (phi, i);
>        if (!phi_arg_map.get (arg))
> -	args.quick_push (arg);
> +	unique_args.quick_push (arg);
>        phi_arg_map.get_or_insert (arg).safe_push (i);
>      }

I meant instead of using another vector (unique_args) just do
	args.quick_push ({ arg, 0, 0, NULL });
above (to avoid needing another allocation etc.).

> -  /* Determine element with max number of occurrences and complexity.  Looking at only
> -     number of occurrences as a measure for complexity isn't enough as all usages can
> -     be unique but the comparisons to reach the PHI node differ per branch.  */
> -  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
> -  auto_vec<ArgEntry> argsKV;
> -  for (i = 0; i < args.length (); i++)
> +  /* Determine element with max number of occurrences and complexity.  Looking
> +     at only number of occurrences as a measure for complexity isn't enough as
> +     all usages can be unique but the comparisons to reach the PHI node differ
> +     per branch.  */
> +  for (auto arg : unique_args)

And then
  for (auto &entry : args)
here with entry.arg instead of arg and

>      {
>        unsigned int len = 0;
> -      for (int index : phi_arg_map.get (args[i]))
> +      vec<int> *indices = phi_arg_map.get (arg);
> +      for (int index : *indices)
>  	{
>  	  edge e = gimple_phi_arg_edge (phi, index);
>  	  len += get_bb_num_predicate_stmts (e->src);
>  	}
>  
> -      unsigned occur = phi_arg_map.get (args[i])->length ();
> +      unsigned occur = indices->length ();
>        if (dump_file && (dump_flags & TDF_DETAILS))
>  	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
> -      argsKV.safe_push ({ args[i], { len, occur }});
> +      args.safe_push ({ arg, len, occur, indices });

either
      entry.num_compares = len;
      entry.occur = occur;
      entry.indices = indices;
here or just using entry.{num_occurrences,occur,indices} directly
instead of the extra automatic vars.

>      }
>  
> +  unique_args.release ();

Plus drop this.

Though, if Richi or Jeff think this is ok as is, I won't stand against it.

	Jakub
Tamar Christina Oct. 6, 2023, 2:23 a.m. UTC | #9
> 
> On Thu, Oct 05, 2023 at 02:01:40PM +0000, Tamar Christina wrote:
> > gcc/ChangeLog:
> >
> > 	* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
> > 	(typedef struct ifcvt_arg_entry): New.
> > 	(cmp_arg_entry): New.
> > 	(gen_phi_arg_condition, gen_phi_nest_statement,
> > 	predicate_scalar_phi): Use them.
> 
> > -  /* Compute phi_arg_map.  */
> > +  /* Compute phi_arg_map, determine the list of unique PHI args and the
> indices
> > +     where they are in the PHI node.  The indices will be used to determine
> > +     the conditions to apply and their complexity.  */
> > + auto_vec<tree> unique_args (num_args);
> >    for (i = 0; i < num_args; i++)
> >      {
> >        tree arg;
> >
> >        arg = gimple_phi_arg_def (phi, i);
> >        if (!phi_arg_map.get (arg))
> > -	args.quick_push (arg);
> > +	unique_args.quick_push (arg);
> >        phi_arg_map.get_or_insert (arg).safe_push (i);
> >      }
> 
> I meant instead of using another vector (unique_args) just do
> 	args.quick_push ({ arg, 0, 0, NULL });
> above (to avoid needing another allocation etc.).
> 

Arggs.. Sorry I completely misunderstood what you meant.. It should be fixed now.
Also realized I forgot to comment the compare function as you asked before.

--

This refactors the code to remove the args cache and index lookups
in favor of a single structure. It also again, removes the use of
std::sort as previously requested but avoids the new asserts in
trunk.

Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-linux-gnu
and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
	(typedef struct ifcvt_arg_entry): New.
	(cmp_arg_entry): New.
	(gen_phi_arg_condition, gen_phi_nest_statement,
	predicate_scalar_phi): Use them.

--- inline copy of patch ---

diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index f76e0d8f2e6e0f59073fa8484b0b2c7a6cdc9783..a0c11cf5e30b1127166e1a51371f6521a22c45f2 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -80,7 +80,6 @@ along with GCC; see the file COPYING3.  If not see
      <L18>:;
 */
 
-#define INCLUDE_ALGORITHM
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -1937,11 +1936,32 @@ gen_simplified_condition (tree cond, scalar_cond_masked_set_type &cond_set)
   return cond;
 }
 
+/* Structure used to track meta-data on PHI arguments used to generate
+   most efficient comparison sequence to slatten a PHI node.  */
+
+typedef struct ifcvt_arg_entry
+{
+  /* The PHI node argument value.  */
+  tree arg;
+
+  /* The number of compares required to reach this PHI node from start of the
+     BB being if-converted.  */
+  unsigned num_compares;
+
+  /* The number of times this PHI node argument appears in the current PHI
+     node.  */
+  unsigned occurs;
+
+  /* The indices at which this PHI arg occurs inside the PHI node.  */
+  vec <int> *indexes;
+} ifcvt_arg_entry_t;
+
 /* Produce condition for all occurrences of ARG in PHI node.  Set *INVERT
    as to whether the condition is inverted.  */
 
 static tree
-gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
+gen_phi_arg_condition (gphi *phi, ifcvt_arg_entry_t &arg,
+		       gimple_stmt_iterator *gsi,
 		       scalar_cond_masked_set_type &cond_set, bool *invert)
 {
   int len;
@@ -1951,11 +1971,11 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
   edge e;
 
   *invert = false;
-  len = occur->length ();
+  len = arg.indexes->length ();
   gcc_assert (len > 0);
   for (i = 0; i < len; i++)
     {
-      e = gimple_phi_arg_edge (phi, (*occur)[i]);
+      e = gimple_phi_arg_edge (phi, (*arg.indexes)[i]);
       c = bb_predicate (e->src);
       if (is_true_predicate (c))
 	{
@@ -2020,22 +2040,21 @@ gen_phi_arg_condition (gphi *phi, vec<int> *occur, gimple_stmt_iterator *gsi,
 static tree
 gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
 			scalar_cond_masked_set_type &cond_set, tree type,
-			hash_map<tree_operand_hash, auto_vec<int>> &phi_arg_map,
-			gimple **res_stmt, tree lhs0, vec<tree> &args,
-			unsigned idx)
+			gimple **res_stmt, tree lhs0,
+			vec<struct ifcvt_arg_entry> &args, unsigned idx)
 {
   if (idx == args.length ())
-    return args[idx - 1];
+    return args[idx - 1].arg;
 
-  vec<int> *indexes = phi_arg_map.get (args[idx - 1]);
   bool invert;
-  tree cond = gen_phi_arg_condition (phi, indexes, gsi, cond_set, &invert);
-  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-				      res_stmt, lhs0, args, idx + 1);
+  tree cond = gen_phi_arg_condition (phi, args[idx - 1], gsi, cond_set,
+				     &invert);
+  tree arg1 = gen_phi_nest_statement (phi, gsi, cond_set, type, res_stmt, lhs0,
+				      args, idx + 1);
 
   unsigned prev = idx;
   unsigned curr = prev - 1;
-  tree arg0 = args[curr];
+  tree arg0 = args[curr].arg;
   tree rhs, lhs;
   if (idx > 1)
     lhs = make_temp_ssa_name (type, NULL, "_ifc_");
@@ -2055,6 +2074,36 @@ gen_phi_nest_statement (gphi *phi, gimple_stmt_iterator *gsi,
   return lhs;
 }
 
+/* When flattening a PHI node we have a choice of which conditions to test to
+   for all the paths from the start of the dominator block of the BB with the
+   PHI node.  If the PHI node has X arguments we have to only test X - 1
+   conditions as the last one is implicit.  It does matter which conditions we
+   test first.  We should test the shortest condition first (distance here is
+   measures in the number of logical operators in the condition) and the
+   longest one last.  This allows us to skip testing the most expensive
+   condition.  To accomplish this we need to sort the conditions.  P1 and P2
+   are sorted first based on the number of logical operations (num_compares)
+   and then by how often they occur in the PHI node.  */
+
+static int
+cmp_arg_entry (const void *p1, const void *p2, void * /* data.  */)
+{
+  const ifcvt_arg_entry sval1 = *(const ifcvt_arg_entry *)p1;
+  const ifcvt_arg_entry sval2 = *(const ifcvt_arg_entry *)p2;
+
+  if (sval1.num_compares < sval2.num_compares)
+    return -1;
+  else if (sval1.num_compares > sval2.num_compares)
+    return 1;
+
+  if (sval1.occurs < sval2.occurs)
+    return -1;
+  else if (sval1.occurs > sval2.occurs)
+    return 1;
+
+  return 0;
+}
+
 /* Replace a scalar PHI node with a COND_EXPR using COND as condition.
    This routine can handle PHI nodes with more than two arguments.
 
@@ -2180,58 +2229,55 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
   hash_map<tree_operand_hash, auto_vec<int> > phi_arg_map;
   unsigned int num_args = gimple_phi_num_args (phi);
   /* Vector of different PHI argument values.  */
-  auto_vec<tree> args (num_args);
+  auto_vec<ifcvt_arg_entry_t> args;
 
-  /* Compute phi_arg_map.  */
+  /* Compute phi_arg_map, determine the list of unique PHI args and the indices
+     where they are in the PHI node.  The indices will be used to determine
+     the conditions to apply and their complexity.  */
   for (i = 0; i < num_args; i++)
     {
       tree arg;
 
       arg = gimple_phi_arg_def (phi, i);
       if (!phi_arg_map.get (arg))
-	args.quick_push (arg);
+	args.safe_push ({ arg, 0, 0, NULL });
       phi_arg_map.get_or_insert (arg).safe_push (i);
     }
 
-  /* Determine element with max number of occurrences and complexity.  Looking at only
-     number of occurrences as a measure for complexity isn't enough as all usages can
-     be unique but the comparisons to reach the PHI node differ per branch.  */
-  typedef std::pair <tree, std::pair <unsigned, unsigned>> ArgEntry;
-  auto_vec<ArgEntry> argsKV;
-  for (i = 0; i < args.length (); i++)
+  /* Determine element with max number of occurrences and complexity.  Looking
+     at only number of occurrences as a measure for complexity isn't enough as
+     all usages can be unique but the comparisons to reach the PHI node differ
+     per branch.  */
+  for (unsigned i = 0; i < args.length (); i++)
     {
       unsigned int len = 0;
-      for (int index : phi_arg_map.get (args[i]))
+      vec<int> *indices = phi_arg_map.get (args[i].arg);
+      for (int index : *indices)
 	{
 	  edge e = gimple_phi_arg_edge (phi, index);
 	  len += get_bb_num_predicate_stmts (e->src);
 	}
 
-      unsigned occur = phi_arg_map.get (args[i])->length ();
+      unsigned occur = indices->length ();
       if (dump_file && (dump_flags & TDF_DETAILS))
 	fprintf (dump_file, "Ranking %d as len=%d, idx=%d\n", i, len, occur);
-      argsKV.safe_push ({ args[i], { len, occur }});
+      args[i].num_compares = len;
+      args[i].occurs = occur;
+      args[i].indexes = indices;
     }
 
   /* Sort elements based on rankings ARGS.  */
-  std::sort(argsKV.begin(), argsKV.end(), [](const ArgEntry &left,
-					     const ArgEntry &right) {
-    return left.second < right.second;
-  });
-
-  for (i = 0; i < args.length (); i++)
-    args[i] = argsKV[i].first;
+  args.stablesort (cmp_arg_entry, NULL);
 
   /* Handle one special case when number of arguments with different values
      is equal 2 and one argument has the only occurrence.  Such PHI can be
      handled as if would have only 2 arguments.  */
-  if (args.length () == 2 && phi_arg_map.get (args[0])->length () == 1)
+  if (args.length () == 2
+      && args[0].indexes->length () == 1)
     {
-      vec<int> *indexes;
-      indexes = phi_arg_map.get (args[0]);
-      index0 = (*indexes)[0];
-      arg0 = args[0];
-      arg1 = args[1];
+      index0 = (*args[0].indexes)[0];
+      arg0 = args[0].arg;
+      arg1 = args[1].arg;
       e = gimple_phi_arg_edge (phi, index0);
       cond = bb_predicate (e->src);
       if (TREE_CODE (cond) == TRUTH_NOT_EXPR)
@@ -2245,8 +2291,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
       if (!(is_cond_scalar_reduction (phi, &reduc, arg0 , arg1,
 				      &op0, &op1, true, &has_nop, &nop_reduc)))
 	rhs = fold_build_cond_expr (TREE_TYPE (res), unshare_expr (cond),
-				    swap? arg1 : arg0,
-				    swap? arg0 : arg1);
+				    swap ? arg1 : arg0,
+				    swap ? arg0 : arg1);
       else
 	{
 	  /* Convert reduction stmt into vectorizable form.  */
@@ -2262,8 +2308,8 @@ predicate_scalar_phi (gphi *phi, gimple_stmt_iterator *gsi)
     {
       /* Common case.  */
       tree type = TREE_TYPE (gimple_phi_result (phi));
-      gen_phi_nest_statement (phi, gsi, cond_set, type, phi_arg_map,
-			      &new_stmt, res, args, 1);
+      gen_phi_nest_statement (phi, gsi, cond_set, type, &new_stmt, res,
+			      args, 1);
     }
 
   if (dump_file && (dump_flags & TDF_DETAILS))
Jakub Jelinek Oct. 6, 2023, 6:38 a.m. UTC | #10
On Fri, Oct 06, 2023 at 02:23:06AM +0000, Tamar Christina wrote:
> gcc/ChangeLog:
> 
> 	* tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
> 	(typedef struct ifcvt_arg_entry): New.
> 	(cmp_arg_entry): New.
> 	(gen_phi_arg_condition, gen_phi_nest_statement,
> 	predicate_scalar_phi): Use them.

Ok, thanks.

	Jakub
diff mbox series

Patch

--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -1199,7 +1199,7 @@  namespace vec_detail
   template<typename T, typename U>
   struct is_trivially_copyable_or_pair<std::pair<T, U> >
   : std::integral_constant<bool, std::is_trivially_copyable<T>::value
-    && std::is_trivially_copyable<U>::value> { };
+    && is_trivially_copyable_or_pair<U>::value> { };
 }
 #endif