diff mbox series

phiopt: Optimize (x <=> y) cmp z [PR94589]

Message ID 20210504074413.GI1179226@tucnak
State New
Headers show
Series phiopt: Optimize (x <=> y) cmp z [PR94589] | expand

Commit Message

Jakub Jelinek May 4, 2021, 7:44 a.m. UTC
Hi!

genericize_spaceship genericizes i <=> j to approximately
({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; })
for strong ordering and
({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; c; })
for partial ordering.
The C++ standard supports then == or != comparisons of that against
strong/partial ordering enums, or </<=/==/!=/>/>= comparisons of <=> result
against literal 0.

In some cases we already optimize that but in many cases we keep performing
all the 2 or 3 comparisons, compute the spaceship value and then compare
that.

The following patch recognizes those patterns if the <=> operands are
integral types or floating point (the latter only for -ffast-math) and
optimizes it to the single comparison that is needed (plus adds debug stmts
if needed for the spaceship result).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

There are two things I'd like to address in a follow-up:
1) if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
is what I've copied from elsewhere in phiopt, but thinking about it,
alll we care is probably only HONOR_NANS, the matched pattern starts with
== or != comparison and branches to the PHI bb with -1/0/1/2 result if it is
equal, which should be the case for signed zero differences.
2) the pr94589-2.C testcase should be matching just 12 times each, but runs
into operator>=(strong_ordering, unspecified) being defined as
(_M_value&1)==_M_value
rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
but is not a single use but two uses.  I'll need to pattern match that case
specially.

2021-05-04  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94589
	* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Call
	spaceship_replacement.
	(cond_only_block_p, spaceship_replacement): New functions.

	* gcc.dg/pr94589-1.c: New test.
	* gcc.dg/pr94589-2.c: New test.
	* gcc.dg/pr94589-3.c: New test.
	* gcc.dg/pr94589-4.c: New test.
	* g++.dg/opt/pr94589-1.C: New test.
	* g++.dg/opt/pr94589-2.C: New test.
	* g++.dg/opt/pr94589-3.C: New test.
	* g++.dg/opt/pr94589-4.C: New test.


	Jakub

Comments

Jonathan Wakely May 4, 2021, 9:42 a.m. UTC | #1
On 04/05/21 09:44 +0200, Jakub Jelinek wrote:
>Hi!
>
>genericize_spaceship genericizes i <=> j to approximately
>({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; })
>for strong ordering and
>({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; c; })
>for partial ordering.
>The C++ standard supports then == or != comparisons of that against
>strong/partial ordering enums, or </<=/==/!=/>/>= comparisons of <=> result
>against literal 0.
>
>In some cases we already optimize that but in many cases we keep performing
>all the 2 or 3 comparisons, compute the spaceship value and then compare
>that.
>
>The following patch recognizes those patterns if the <=> operands are
>integral types or floating point (the latter only for -ffast-math) and
>optimizes it to the single comparison that is needed (plus adds debug stmts
>if needed for the spaceship result).
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
>There are two things I'd like to address in a follow-up:
>1) if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
>is what I've copied from elsewhere in phiopt, but thinking about it,
>alll we care is probably only HONOR_NANS, the matched pattern starts with
>== or != comparison and branches to the PHI bb with -1/0/1/2 result if it is
>equal, which should be the case for signed zero differences.
>2) the pr94589-2.C testcase should be matching just 12 times each, but runs
>into operator>=(strong_ordering, unspecified) being defined as

Should this say s/strong/partial/  ?
Jakub Jelinek May 4, 2021, 9:56 a.m. UTC | #2
On Tue, May 04, 2021 at 10:42:12AM +0100, Jonathan Wakely wrote:
> > There are two things I'd like to address in a follow-up:
> > 1) if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
> > is what I've copied from elsewhere in phiopt, but thinking about it,
> > alll we care is probably only HONOR_NANS, the matched pattern starts with
> > == or != comparison and branches to the PHI bb with -1/0/1/2 result if it is
> > equal, which should be the case for signed zero differences.
> > 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> > into operator>=(strong_ordering, unspecified) being defined as
> 
> Should this say s/strong/partial/  ?

Yeah, sorry.

	Jakub
Richard Biener May 5, 2021, 11:39 a.m. UTC | #3
On Tue, 4 May 2021, Jakub Jelinek wrote:

> Hi!
> 
> genericize_spaceship genericizes i <=> j to approximately
> ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; })
> for strong ordering and
> ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; c; })
> for partial ordering.
> The C++ standard supports then == or != comparisons of that against
> strong/partial ordering enums, or </<=/==/!=/>/>= comparisons of <=> result
> against literal 0.
> 
> In some cases we already optimize that but in many cases we keep performing
> all the 2 or 3 comparisons, compute the spaceship value and then compare
> that.
> 
> The following patch recognizes those patterns if the <=> operands are
> integral types or floating point (the latter only for -ffast-math) and
> optimizes it to the single comparison that is needed (plus adds debug stmts
> if needed for the spaceship result).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> There are two things I'd like to address in a follow-up:
> 1) if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
> is what I've copied from elsewhere in phiopt, but thinking about it,
> alll we care is probably only HONOR_NANS, the matched pattern starts with
> == or != comparison and branches to the PHI bb with -1/0/1/2 result if it is
> equal, which should be the case for signed zero differences.
> 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> into operator>=(strong_ordering, unspecified) being defined as
> (_M_value&1)==_M_value
> rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
> unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
> but is not a single use but two uses.  I'll need to pattern match that case
> specially.
> 
> 2021-05-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/94589
> 	* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Call
> 	spaceship_replacement.
> 	(cond_only_block_p, spaceship_replacement): New functions.
> 
> 	* gcc.dg/pr94589-1.c: New test.
> 	* gcc.dg/pr94589-2.c: New test.
> 	* gcc.dg/pr94589-3.c: New test.
> 	* gcc.dg/pr94589-4.c: New test.
> 	* g++.dg/opt/pr94589-1.C: New test.
> 	* g++.dg/opt/pr94589-2.C: New test.
> 	* g++.dg/opt/pr94589-3.C: New test.
> 	* g++.dg/opt/pr94589-4.C: New test.
> 
> --- gcc/tree-ssa-phiopt.c.jj	2021-05-02 10:17:49.095397758 +0200
> +++ gcc/tree-ssa-phiopt.c	2021-05-03 17:49:54.233300624 +0200
> @@ -64,6 +64,8 @@ static bool abs_replacement (basic_block
>  			     edge, edge, gimple *, tree, tree);
>  static bool xor_replacement (basic_block, basic_block,
>  			     edge, edge, gimple *, tree, tree);
> +static bool spaceship_replacement (basic_block, basic_block,
> +				   edge, edge, gimple *, tree, tree);
>  static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, basic_block,
>  						      edge, edge, gimple *,
>  						      tree, tree);
> @@ -357,6 +359,8 @@ tree_ssa_phiopt_worker (bool do_store_el
>  	    cfgchanged = true;
>  	  else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
>  	    cfgchanged = true;
> +	  else if (spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> +	    cfgchanged = true;
>  	}
>      }
>  
> @@ -1806,6 +1810,420 @@ minmax_replacement (basic_block cond_bb,
>  
>    return true;
>  }
> +
> +/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> +
> +static bool
> +cond_only_block_p (basic_block bb)
> +{
> +  /* BB must have no executable statements.  */
> +  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +  if (phi_nodes (bb))
> +    return false;
> +  while (!gsi_end_p (gsi))
> +    {
> +      gimple *stmt = gsi_stmt (gsi);
> +      if (is_gimple_debug (stmt))
> +	;
> +      else if (gimple_code (stmt) == GIMPLE_NOP
> +	       || gimple_code (stmt) == GIMPLE_PREDICT
> +	       || gimple_code (stmt) == GIMPLE_COND)
> +	;
> +      else
> +	return false;
> +      gsi_next (&gsi);
> +    }
> +  return true;
> +}
> +
> +/* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
> +   For strong ordering <=> try to match something like:
> +    <bb 2> :
> +    if (x_4(D) != y_5(D))
> +      goto <bb 3>; [INV]
> +    else
> +      goto <bb 6>; [INV]
> +
> +    <bb 3> :
> +    if (x_4(D) < y_5(D))
> +      goto <bb 6>; [INV]
> +    else
> +      goto <bb 4>; [INV]
> +
> +    <bb 4> :
> +
> +    <bb 6> :
> +    # iftmp.0_2 = PHI <1(4), 0(2), -1(3)>
> +    _1 = iftmp.0_2 == 0;
> +
> +   and for partial ordering <=> something like:
> +
> +    <bb 2> :
> +    if (a_3(D) == b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 3>; [50.00%]
> +
> +    <bb 3> [local count: 536870913]:
> +    if (a_3(D) < b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 4>; [50.00%]
> +
> +    <bb 4> [local count: 268435456]:
> +    if (a_3(D) > b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 5>; [50.00%]
> +
> +    <bb 5> [local count: 134217728]:
> +
> +    <bb 6> [local count: 1073741824]:
> +    # SR.27_4 = PHI <0(2), -1(3), 1(4), 2(5)>
> +    _2 = SR.27_4 > 0;  */

Can you in the above IL snippets mark COND_BB and MIDDLE_BB?

> +static bool
> +spaceship_replacement (basic_block cond_bb, basic_block middle_bb,
> +		       edge e0, edge e1, gimple *phi,

This could be a gphi *

> +		       tree arg0, tree arg1)
> +{
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (PHI_RESULT (phi)))
> +      || TYPE_UNSIGNED (TREE_TYPE (PHI_RESULT (phi)))
> +      || !tree_fits_shwi_p (arg0)
> +      || !tree_fits_shwi_p (arg1)
> +      || !IN_RANGE (tree_to_shwi (arg0), -1, 2)
> +      || !IN_RANGE (tree_to_shwi (arg1), -1, 2))
> +    return false;
> +
> +  use_operand_p use_p;
> +  gimple *use_stmt;
> +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (phi)))
> +    return false;
> +  if (!single_imm_use (PHI_RESULT (phi), &use_p, &use_stmt))
> +    return false;
> +  enum tree_code cmp;
> +  tree lhs, rhs;
> +  if (gimple_code (use_stmt) == GIMPLE_COND)
> +    {
> +      cmp = gimple_cond_code (use_stmt);
> +      lhs = gimple_cond_lhs (use_stmt);
> +      rhs = gimple_cond_rhs (use_stmt);
> +    }
> +  else if (is_gimple_assign (use_stmt))
> +    {
> +      if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
> +	{
> +	  cmp = gimple_assign_rhs_code (use_stmt);
> +	  lhs = gimple_assign_rhs1 (use_stmt);
> +	  rhs = gimple_assign_rhs2 (use_stmt);
> +	}
> +      else if (gimple_assign_rhs_code (use_stmt) == COND_EXPR)
> +	{
> +	  tree cond = gimple_assign_rhs1 (use_stmt);
> +	  if (!COMPARISON_CLASS_P (cond))
> +	    return false;
> +	  cmp = TREE_CODE (cond);
> +	  lhs = TREE_OPERAND (cond, 0);
> +	  rhs = TREE_OPERAND (cond, 1);
> +	}
> +      else
> +	return false;
> +    }
> +  else
> +    return false;
> +  switch (cmp)
> +    {
> +    case EQ_EXPR:
> +    case NE_EXPR:
> +    case LT_EXPR:
> +    case GT_EXPR:
> +    case LE_EXPR:
> +    case GE_EXPR:
> +      break;
> +    default:
> +      return false;
> +    }
> +  if (lhs != PHI_RESULT (phi)
> +      || !tree_fits_shwi_p (rhs)
> +      || !IN_RANGE (tree_to_shwi (rhs), -1, 1))
> +    return false;
> +
> +  if (!empty_block_p (middle_bb))
> +    return false;
> +
> +  basic_block phi_bb = gimple_bb (phi);
> +  gcc_assert (phi_bb == e0->dest && phi_bb == e1->dest);
> +  if (!IN_RANGE (EDGE_COUNT (phi_bb->preds), 3, 4))
> +    return false;

I guess this check could be done earlier.

> +
> +  gcond *cond1 = as_a <gcond *> (last_stmt (cond_bb));
> +  enum tree_code cmp1 = gimple_cond_code (cond1);
> +  if (cmp1 != LT_EXPR && cmp1 != GT_EXPR)
> +    return false;
> +  tree lhs1 = gimple_cond_lhs (cond1);
> +  tree rhs1 = gimple_cond_rhs (cond1);
> +  /* The optimization may be unsafe due to NaNs.  */
> +  if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
> +    return false;
> +  if (TREE_CODE (lhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs1))
> +    return false;
> +  if (TREE_CODE (rhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
> +    return false;
> +
> +  if (!single_pred_p (cond_bb) || !cond_only_block_p (cond_bb))
> +    return false;
> +
> +  basic_block cond2_bb = single_pred (cond_bb);
> +  if (EDGE_COUNT (cond2_bb->succs) != 2)
> +    return false;
> +  edge cond2_phi_edge;
> +  if (EDGE_SUCC (cond2_bb, 0)->dest == cond_bb)
> +    {
> +      if (EDGE_SUCC (cond2_bb, 1)->dest != phi_bb)
> +	return false;
> +      cond2_phi_edge = EDGE_SUCC (cond2_bb, 1);
> +    }
> +  else if (EDGE_SUCC (cond2_bb, 0)->dest != phi_bb)
> +    return false;
> +  else
> +    cond2_phi_edge = EDGE_SUCC (cond2_bb, 0);
> +  tree arg2 = gimple_phi_arg_def (phi, cond2_phi_edge->dest_idx);
> +  if (!tree_fits_shwi_p (arg2))
> +    return false;
> +  gimple *cond2 = last_stmt (cond2_bb);
> +  if (cond2 == NULL || gimple_code (cond2) != GIMPLE_COND)
> +    return false;
> +  enum tree_code cmp2 = gimple_cond_code (cond2);
> +  tree lhs2 = gimple_cond_lhs (cond2);
> +  tree rhs2 = gimple_cond_rhs (cond2);
> +  if (lhs2 == lhs1)
> +    {
> +      if (!operand_equal_p (rhs2, rhs1, 0))
> +	return false;
> +    }
> +  else if (lhs2 == rhs1)
> +    {
> +      if (rhs2 != lhs1)
> +	return false;
> +    }
> +  else
> +    return false;
> +  tree arg3 = arg2;
> +  basic_block cond3_bb = cond2_bb;
> +  edge cond3_phi_edge = cond2_phi_edge;
> +  gimple *cond3 = cond2;
> +  enum tree_code cmp3 = cmp2;
> +  tree lhs3 = lhs2;
> +  tree rhs3 = rhs2;
> +  if (EDGE_COUNT (phi_bb->preds) == 4)
> +    {
> +      if (absu_hwi (tree_to_shwi (arg2)) != 1)
> +	return false;
> +      if (e1->flags & EDGE_TRUE_VALUE)
> +	{
> +	  if (tree_to_shwi (arg0) != 2
> +	      || absu_hwi (tree_to_shwi (arg1)) != 1
> +	      || wi::to_widest (arg1) == wi::to_widest (arg2))
> +	    return false;
> +	}
> +      else if (tree_to_shwi (arg1) != 2
> +	       || absu_hwi (tree_to_shwi (arg0)) != 1
> +	       || wi::to_widest (arg0) == wi::to_widest (arg1))
> +	return false;
> +      if (cmp2 != LT_EXPR && cmp2 != GT_EXPR)
> +	return false;
> +      /* if (x < y) goto phi_bb; else fallthru;
> +	 if (x > y) goto phi_bb; else fallthru;
> +	 bbx:;
> +	 phi_bb:;
> +	 is ok, but if x and y are swapped in one of the comparisons,
> +	 or the comparisons are the same and operands not swapped,
> +	 or second goto phi_bb is not the true edge, it is not.  */
> +      if ((lhs2 == lhs1)
> +	  ^ (cmp2 == cmp1)
> +	  ^ ((e1->flags & EDGE_TRUE_VALUE) != 0))
> +	return false;
> +      if ((cond2_phi_edge->flags & EDGE_TRUE_VALUE) == 0)
> +	return false;
> +      if (!single_pred_p (cond2_bb) || !cond_only_block_p (cond2_bb))
> +	return false;
> +      cond3_bb = single_pred (cond2_bb);
> +      if (EDGE_COUNT (cond2_bb->succs) != 2)
> +	return false;
> +      if (EDGE_SUCC (cond3_bb, 0)->dest == cond2_bb)
> +	{
> +	  if (EDGE_SUCC (cond3_bb, 1)->dest != phi_bb)
> +	    return false;
> +	  cond3_phi_edge = EDGE_SUCC (cond3_bb, 1);
> +	}
> +      else if (EDGE_SUCC (cond3_bb, 0)->dest != phi_bb)
> +	return false;
> +      else
> +	cond3_phi_edge = EDGE_SUCC (cond3_bb, 0);
> +      arg3 = gimple_phi_arg_def (phi, cond3_phi_edge->dest_idx);
> +      cond3 = last_stmt (cond3_bb);
> +      if (cond3 == NULL || gimple_code (cond3) != GIMPLE_COND)
> +	return false;
> +      cmp3 = gimple_cond_code (cond3);
> +      lhs3 = gimple_cond_lhs (cond3);
> +      rhs3 = gimple_cond_rhs (cond3);
> +      if (lhs3 == lhs1)
> +	{
> +	  if (!operand_equal_p (rhs3, rhs1, 0))
> +	    return false;
> +	}
> +      else if (lhs3 == rhs1)
> +	{
> +	  if (rhs3 != lhs1)
> +	    return false;
> +	}
> +      else
> +	return false;
> +    }
> +  else if (absu_hwi (tree_to_shwi (arg0)) != 1
> +	   || absu_hwi (tree_to_shwi (arg1)) != 1
> +	   || wi::to_widest (arg0) == wi::to_widest (arg1))
> +    return false;
> +
> +  if (!integer_zerop (arg3) || (cmp3 != EQ_EXPR && cmp3 != NE_EXPR))
> +    return false;
> +  if ((cond3_phi_edge->flags & (cmp3 == EQ_EXPR
> +				? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE)) == 0)
> +    return false;
> +
> +  /* lhs1 one_cmp rhs1 results in PHI_RESULT (phi) of 1.  */
> +  enum tree_code one_cmp;
> +  if ((cmp1 == LT_EXPR)
> +      ^ (!integer_onep ((e1->flags & EDGE_TRUE_VALUE) ? arg1 : arg0)))
> +    one_cmp = LT_EXPR;
> +  else
> +    one_cmp = GT_EXPR;
> +
> +  enum tree_code res_cmp;
> +  switch (cmp)
> +    {
> +    case EQ_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = EQ_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    case NE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = NE_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else
> +	return false;
> +      break;
> +    case LT_EXPR:
> +      if (integer_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else
> +	return false;
> +      break;
> +    case LE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else
> +	return false;
> +      break;
> +    case GT_EXPR:
> +      if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_zerop (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    case GE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  if (gimple_code (use_stmt) == GIMPLE_COND)
> +    {
> +      gcond *use_cond = as_a <gcond *> (use_stmt);
> +      gimple_cond_set_code (use_cond, res_cmp);
> +      gimple_cond_set_lhs (use_cond, lhs1);
> +      gimple_cond_set_rhs (use_cond, rhs1);
> +    }
> +  else if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
> +    {
> +      gimple_assign_set_rhs_code (use_stmt, res_cmp);
> +      gimple_assign_set_rhs1 (use_stmt, lhs1);
> +      gimple_assign_set_rhs2 (use_stmt, rhs1);
> +    }
> +  else
> +    {
> +      tree cond = build2 (res_cmp, TREE_TYPE (gimple_assign_rhs1 (use_stmt)),
> +			  lhs1, rhs1);
> +      gimple_assign_set_rhs1 (use_stmt, cond);
> +    }
> +  update_stmt (use_stmt);
> +  if (cmp3 == EQ_EXPR)
> +    gimple_cond_make_true (as_a <gcond *> (cond3));
> +  else
> +    gimple_cond_make_false (as_a <gcond *> (cond3));
> +  update_stmt (cond3);

I'm missing the removal of the PHI node - that's sth other phiopt 
transforms usually do.

> +
> +  if (MAY_HAVE_DEBUG_BIND_STMTS)
> +    {
> +      use_operand_p use_p;
> +      imm_use_iterator iter;
> +      bool has_debug_uses = false;
> +      FOR_EACH_IMM_USE_FAST (use_p, iter, PHI_RESULT (phi))
> +	{
> +	  gimple *use_stmt = USE_STMT (use_p);
> +	  gcc_assert (is_gimple_debug (use_stmt));
> +	  has_debug_uses = true;
> +	  break;
> +	}
> +
> +      if (has_debug_uses)
> +	{
> +	  gimple_stmt_iterator gsi = gsi_after_labels (gimple_bb (phi));
> +	  tree type = TREE_TYPE (PHI_RESULT (phi));
> +	  tree temp1 = make_node (DEBUG_EXPR_DECL);
> +	  DECL_ARTIFICIAL (temp1) = 1;
> +	  TREE_TYPE (temp1) = type;
> +	  SET_DECL_MODE (temp1, TYPE_MODE (type));
> +	  tree t = build2 (one_cmp, boolean_type_node, lhs1, rhs2);
> +	  t = build3 (COND_EXPR, type, t, build_one_cst (type),
> +		      build_int_cst (type, -1));
> +	  gimple *g = gimple_build_debug_bind (temp1, t, phi);
> +	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	  tree temp2 = make_node (DEBUG_EXPR_DECL);
> +	  DECL_ARTIFICIAL (temp2) = 1;
> +	  TREE_TYPE (temp2) = type;
> +	  SET_DECL_MODE (temp2, TYPE_MODE (type));
> +	  t = build2 (EQ_EXPR, boolean_type_node, lhs1, rhs2);
> +	  t = build3 (COND_EXPR, type, t, build_zero_cst (type), temp1);
> +	  g = gimple_build_debug_bind (temp2, t, phi);
> +	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	  replace_uses_by (PHI_RESULT (phi), temp2);

Please add a comment on what you are generating here.  Do we handle
COND_EXPRs in debug expr expansion at all?  I suppose we need
DW_OP_GNU_spaceship ;)

Otherwise looks OK, not that I like such elaborate matching code
much.

Thanks,
Richard.

> +	}
> +    }
> +
> +  return true;
> +}
>  
>  /* Convert
>  
> --- gcc/testsuite/gcc.dg/pr94589-1.c.jj	2021-05-03 17:32:14.940203230 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-1.c	2021-05-03 17:54:33.081167518 +0200
> @@ -0,0 +1,35 @@
> +/* PR tree-optimization/94589 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g0 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 14 "optimized" } } */
> +
> +#define A __attribute__((noipa))
> +A int f1 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 0; }
> +A int f2 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 0; }
> +A int f3 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > 0; }
> +A int f4 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 0; }
> +A int f5 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 0; }
> +A int f6 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= 0; }
> +A int f7 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == -1; }
> +A int f8 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != -1; }
> +A int f9 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > -1; }
> +A int f10 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= -1; }
> +A int f11 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 1; }
> +A int f12 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 1; }
> +A int f13 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 1; }
> +A int f14 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 1; }
> +A int f15 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 0; }
> +A int f16 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 0; }
> +A int f17 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > 0; }
> +A int f18 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 0; }
> +A int f19 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 0; }
> +A int f20 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= 0; }
> +A int f21 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == -1; }
> +A int f22 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != -1; }
> +A int f23 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > -1; }
> +A int f24 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= -1; }
> +A int f25 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 1; }
> +A int f26 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 1; }
> +A int f27 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 1; }
> +A int f28 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 1; }
> --- gcc/testsuite/gcc.dg/pr94589-2.c.jj	2021-05-03 17:32:17.950169407 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-2.c	2021-05-03 17:56:12.577049606 +0200
> @@ -0,0 +1,35 @@
> +/* PR tree-optimization/94589 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } } */
> +
> +#define A __attribute__((noipa))
> +A int f1 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 0; }
> +A int f2 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 0; }
> +A int f3 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > 0; }
> +A int f4 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 0; }
> +A int f5 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 0; }
> +A int f6 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= 0; }
> +A int f7 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == -1; }
> +A int f8 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != -1; }
> +A int f9 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > -1; }
> +A int f10 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= -1; }
> +A int f11 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 1; }
> +A int f12 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 1; }
> +A int f13 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 1; }
> +A int f14 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 1; }
> +A int f15 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 0; }
> +A int f16 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 0; }
> +A int f17 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > 0; }
> +A int f18 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 0; }
> +A int f19 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 0; }
> +A int f20 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= 0; }
> +A int f21 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == -1; }
> +A int f22 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != -1; }
> +A int f23 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > -1; }
> +A int f24 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= -1; }
> +A int f25 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 1; }
> +A int f26 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 1; }
> +A int f27 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 1; }
> +A int f28 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 1; }
> --- gcc/testsuite/gcc.dg/pr94589-3.c.jj	2021-05-03 17:32:19.998146397 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-3.c	2021-05-03 18:02:41.628678802 +0200
> @@ -0,0 +1,97 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -g" } */
> +
> +#include "pr94589-1.c"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7, 8, 0);
> +  C (f1, 8, 8, 1);
> +  C (f1, 9, 8, 0);
> +  C (f2, 7, 8, 1);
> +  C (f2, 8, 8, 0);
> +  C (f2, 9, 8, 1);
> +  C (f3, 7, 8, 0);
> +  C (f3, 8, 8, 0);
> +  C (f3, 9, 8, 1);
> +  C (f4, 7, 8, 1);
> +  C (f4, 8, 8, 0);
> +  C (f4, 9, 8, 0);
> +  C (f5, 7, 8, 0);
> +  C (f5, 8, 8, 1);
> +  C (f5, 9, 8, 1);
> +  C (f6, 7, 8, 1);
> +  C (f6, 8, 8, 1);
> +  C (f6, 9, 8, 0);
> +  C (f7, 7, 8, 1);
> +  C (f7, 8, 8, 0);
> +  C (f7, 9, 8, 0);
> +  C (f8, 7, 8, 0);
> +  C (f8, 8, 8, 1);
> +  C (f8, 9, 8, 1);
> +  C (f9, 7, 8, 0);
> +  C (f9, 8, 8, 1);
> +  C (f9, 9, 8, 1);
> +  C (f10, 7, 8, 1);
> +  C (f10, 8, 8, 0);
> +  C (f10, 9, 8, 0);
> +  C (f11, 7, 8, 0);
> +  C (f11, 8, 8, 0);
> +  C (f11, 9, 8, 1);
> +  C (f12, 7, 8, 1);
> +  C (f12, 8, 8, 1);
> +  C (f12, 9, 8, 0);
> +  C (f13, 7, 8, 1);
> +  C (f13, 8, 8, 1);
> +  C (f13, 9, 8, 0);
> +  C (f14, 7, 8, 0);
> +  C (f14, 8, 8, 0);
> +  C (f14, 9, 8, 1);
> +  D (f15, 4, 0);
> +  D (f15, 5, 1);
> +  D (f15, 6, 0);
> +  D (f16, 4, 1);
> +  D (f16, 5, 0);
> +  D (f16, 6, 1);
> +  D (f17, 4, 0);
> +  D (f17, 5, 0);
> +  D (f17, 6, 1);
> +  D (f18, 4, 1);
> +  D (f18, 5, 0);
> +  D (f18, 6, 0);
> +  D (f19, 4, 0);
> +  D (f19, 5, 1);
> +  D (f19, 6, 1);
> +  D (f20, 4, 1);
> +  D (f20, 5, 1);
> +  D (f20, 6, 0);
> +  D (f21, 4, 1);
> +  D (f21, 5, 0);
> +  D (f21, 6, 0);
> +  D (f22, 4, 0);
> +  D (f22, 5, 1);
> +  D (f22, 6, 1);
> +  D (f23, 4, 0);
> +  D (f23, 5, 1);
> +  D (f23, 6, 1);
> +  D (f24, 4, 1);
> +  D (f24, 5, 0);
> +  D (f24, 6, 0);
> +  D (f25, 4, 0);
> +  D (f25, 5, 0);
> +  D (f25, 6, 1);
> +  D (f26, 4, 1);
> +  D (f26, 5, 1);
> +  D (f26, 6, 0);
> +  D (f27, 4, 1);
> +  D (f27, 5, 1);
> +  D (f27, 6, 0);
> +  D (f28, 4, 0);
> +  D (f28, 5, 0);
> +  D (f28, 6, 1);
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/pr94589-4.c.jj	2021-05-03 17:32:21.915124851 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-4.c	2021-05-03 18:12:50.748835798 +0200
> @@ -0,0 +1,97 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -g -ffast-math" } */
> +
> +#include "pr94589-2.c"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7.0, 8.0, 0);
> +  C (f1, 8.0, 8.0, 1);
> +  C (f1, 9.0, 8.0, 0);
> +  C (f2, 7.0, 8.0, 1);
> +  C (f2, 8.0, 8.0, 0);
> +  C (f2, 9.0, 8.0, 1);
> +  C (f3, 7.0, 8.0, 0);
> +  C (f3, 8.0, 8.0, 0);
> +  C (f3, 9.0, 8.0, 1);
> +  C (f4, 7.0, 8.0, 1);
> +  C (f4, 8.0, 8.0, 0);
> +  C (f4, 9.0, 8.0, 0);
> +  C (f5, 7.0, 8.0, 0);
> +  C (f5, 8.0, 8.0, 1);
> +  C (f5, 9.0, 8.0, 1);
> +  C (f6, 7.0, 8.0, 1);
> +  C (f6, 8.0, 8.0, 1);
> +  C (f6, 9.0, 8.0, 0);
> +  C (f7, 7.0, 8.0, 1);
> +  C (f7, 8.0, 8.0, 0);
> +  C (f7, 9.0, 8.0, 0);
> +  C (f8, 7.0, 8.0, 0);
> +  C (f8, 8.0, 8.0, 1);
> +  C (f8, 9.0, 8.0, 1);
> +  C (f9, 7.0, 8.0, 0);
> +  C (f9, 8.0, 8.0, 1);
> +  C (f9, 9.0, 8.0, 1);
> +  C (f10, 7.0, 8.0, 1);
> +  C (f10, 8.0, 8.0, 0);
> +  C (f10, 9.0, 8.0, 0);
> +  C (f11, 7.0, 8.0, 0);
> +  C (f11, 8.0, 8.0, 0);
> +  C (f11, 9.0, 8.0, 1);
> +  C (f12, 7.0, 8.0, 1);
> +  C (f12, 8.0, 8.0, 1);
> +  C (f12, 9.0, 8.0, 0);
> +  C (f13, 7.0, 8.0, 1);
> +  C (f13, 8.0, 8.0, 1);
> +  C (f13, 9.0, 8.0, 0);
> +  C (f14, 7.0, 8.0, 0);
> +  C (f14, 8.0, 8.0, 0);
> +  C (f14, 9.0, 8.0, 1);
> +  D (f15, 4.0, 0);
> +  D (f15, 5.0, 1);
> +  D (f15, 6.0, 0);
> +  D (f16, 4.0, 1);
> +  D (f16, 5.0, 0);
> +  D (f16, 6.0, 1);
> +  D (f17, 4.0, 0);
> +  D (f17, 5.0, 0);
> +  D (f17, 6.0, 1);
> +  D (f18, 4.0, 1);
> +  D (f18, 5.0, 0);
> +  D (f18, 6.0, 0);
> +  D (f19, 4.0, 0);
> +  D (f19, 5.0, 1);
> +  D (f19, 6.0, 1);
> +  D (f20, 4.0, 1);
> +  D (f20, 5.0, 1);
> +  D (f20, 6.0, 0);
> +  D (f21, 4.0, 1);
> +  D (f21, 5.0, 0);
> +  D (f21, 6.0, 0);
> +  D (f22, 4.0, 0);
> +  D (f22, 5.0, 1);
> +  D (f22, 6.0, 1);
> +  D (f23, 4.0, 0);
> +  D (f23, 5.0, 1);
> +  D (f23, 6.0, 1);
> +  D (f24, 4.0, 1);
> +  D (f24, 5.0, 0);
> +  D (f24, 6.0, 0);
> +  D (f25, 4.0, 0);
> +  D (f25, 5.0, 0);
> +  D (f25, 6.0, 1);
> +  D (f26, 4.0, 1);
> +  D (f26, 5.0, 1);
> +  D (f26, 6.0, 0);
> +  D (f27, 4.0, 1);
> +  D (f27, 5.0, 1);
> +  D (f27, 6.0, 0);
> +  D (f28, 4.0, 0);
> +  D (f28, 5.0, 0);
> +  D (f28, 6.0, 1);
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/opt/pr94589-1.C.jj	2021-05-03 15:24:02.002046458 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-1.C	2021-05-03 15:23:47.445210269 +0200
> @@ -0,0 +1,33 @@
> +// PR tree-optimization/94589
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O2 -g0 -fdump-tree-optimized" }
> +// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 12 "optimized" } }
> +// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 12 "optimized" } }
> +
> +#include <compare>
> +
> +#define A __attribute__((noipa))
> +A bool f1 (int i, int j) { auto c = i <=> j; return c == 0; }
> +A bool f2 (int i, int j) { auto c = i <=> j; return c != 0; }
> +A bool f3 (int i, int j) { auto c = i <=> j; return c > 0; }
> +A bool f4 (int i, int j) { auto c = i <=> j; return c < 0; }
> +A bool f5 (int i, int j) { auto c = i <=> j; return c >= 0; }
> +A bool f6 (int i, int j) { auto c = i <=> j; return c <= 0; }
> +A bool f7 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::less; }
> +A bool f8 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::less; }
> +A bool f9 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::equal; }
> +A bool f10 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::equal; }
> +A bool f11 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::greater; }
> +A bool f12 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::greater; }
> +A bool f13 (int i) { auto c = i <=> 5; return c == 0; }
> +A bool f14 (int i) { auto c = i <=> 5; return c != 0; }
> +A bool f15 (int i) { auto c = i <=> 5; return c > 0; }
> +A bool f16 (int i) { auto c = i <=> 5; return c < 0; }
> +A bool f17 (int i) { auto c = i <=> 5; return c >= 0; }
> +A bool f18 (int i) { auto c = i <=> 5; return c <= 0; }
> +A bool f19 (int i) { auto c = i <=> 5; return c == std::strong_ordering::less; }
> +A bool f20 (int i) { auto c = i <=> 5; return c != std::strong_ordering::less; }
> +A bool f21 (int i) { auto c = i <=> 5; return c == std::strong_ordering::equal; }
> +A bool f22 (int i) { auto c = i <=> 5; return c != std::strong_ordering::equal; }
> +A bool f23 (int i) { auto c = i <=> 5; return c == std::strong_ordering::greater; }
> +A bool f24 (int i) { auto c = i <=> 5; return c != std::strong_ordering::greater; }
> --- gcc/testsuite/g++.dg/opt/pr94589-2.C.jj	2021-05-03 16:12:12.002540809 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-2.C	2021-05-03 16:16:53.194380966 +0200
> @@ -0,0 +1,33 @@
> +// PR tree-optimization/94589
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" }
> +// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } }
> +// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } }
> +
> +#include <compare>
> +
> +#define A __attribute__((noipa))
> +A bool f1 (double i, double j) { auto c = i <=> j; return c == 0; }
> +A bool f2 (double i, double j) { auto c = i <=> j; return c != 0; }
> +A bool f3 (double i, double j) { auto c = i <=> j; return c > 0; }
> +A bool f4 (double i, double j) { auto c = i <=> j; return c < 0; }
> +A bool f5 (double i, double j) { auto c = i <=> j; return c >= 0; }
> +A bool f6 (double i, double j) { auto c = i <=> j; return c <= 0; }
> +A bool f7 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::less; }
> +A bool f8 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::less; }
> +A bool f9 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::equivalent; }
> +A bool f10 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::equivalent; }
> +A bool f11 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::greater; }
> +A bool f12 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::greater; }
> +A bool f13 (double i) { auto c = i <=> 5.0; return c == 0; }
> +A bool f14 (double i) { auto c = i <=> 5.0; return c != 0; }
> +A bool f15 (double i) { auto c = i <=> 5.0; return c > 0; }
> +A bool f16 (double i) { auto c = i <=> 5.0; return c < 0; }
> +A bool f17 (double i) { auto c = i <=> 5.0; return c >= 0; }
> +A bool f18 (double i) { auto c = i <=> 5.0; return c <= 0; }
> +A bool f19 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::less; }
> +A bool f20 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::less; }
> +A bool f21 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::equivalent; }
> +A bool f22 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::equivalent; }
> +A bool f23 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::greater; }
> +A bool f24 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::greater; }
> --- gcc/testsuite/g++.dg/opt/pr94589-3.C.jj	2021-05-03 16:22:29.898597327 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-3.C	2021-05-03 17:21:23.290556509 +0200
> @@ -0,0 +1,84 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O2 -g" }
> +
> +#include "pr94589-1.C"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7, 8, false);
> +  C (f1, 8, 8, true);
> +  C (f1, 9, 8, false);
> +  C (f2, 7, 8, true);
> +  C (f2, 8, 8, false);
> +  C (f2, 9, 8, true);
> +  C (f3, 7, 8, false);
> +  C (f3, 8, 8, false);
> +  C (f3, 9, 8, true);
> +  C (f4, 7, 8, true);
> +  C (f4, 8, 8, false);
> +  C (f4, 9, 8, false);
> +  C (f5, 7, 8, false);
> +  C (f5, 8, 8, true);
> +  C (f5, 9, 8, true);
> +  C (f6, 7, 8, true);
> +  C (f6, 8, 8, true);
> +  C (f6, 9, 8, false);
> +  C (f7, 7, 8, true);
> +  C (f7, 8, 8, false);
> +  C (f7, 9, 8, false);
> +  C (f8, 7, 8, false);
> +  C (f8, 8, 8, true);
> +  C (f8, 9, 8, true);
> +  C (f9, 7, 8, false);
> +  C (f9, 8, 8, true);
> +  C (f9, 9, 8, false);
> +  C (f10, 7, 8, true);
> +  C (f10, 8, 8, false);
> +  C (f10, 9, 8, true);
> +  C (f11, 7, 8, false);
> +  C (f11, 8, 8, false);
> +  C (f11, 9, 8, true);
> +  C (f12, 7, 8, true);
> +  C (f12, 8, 8, true);
> +  C (f12, 9, 8, false);
> +  D (f13, 4, false);
> +  D (f13, 5, true);
> +  D (f13, 6, false);
> +  D (f14, 4, true);
> +  D (f14, 5, false);
> +  D (f14, 6, true);
> +  D (f15, 4, false);
> +  D (f15, 5, false);
> +  D (f15, 6, true);
> +  D (f16, 4, true);
> +  D (f16, 5, false);
> +  D (f16, 6, false);
> +  D (f17, 4, false);
> +  D (f17, 5, true);
> +  D (f17, 6, true);
> +  D (f18, 4, true);
> +  D (f18, 5, true);
> +  D (f18, 6, false);
> +  D (f19, 4, true);
> +  D (f19, 5, false);
> +  D (f19, 6, false);
> +  D (f20, 4, false);
> +  D (f20, 5, true);
> +  D (f20, 6, true);
> +  D (f21, 4, false);
> +  D (f21, 5, true);
> +  D (f21, 6, false);
> +  D (f22, 4, true);
> +  D (f22, 5, false);
> +  D (f22, 6, true);
> +  D (f23, 4, false);
> +  D (f23, 5, false);
> +  D (f23, 6, true);
> +  D (f24, 4, true);
> +  D (f24, 5, true);
> +  D (f24, 6, false);
> +}
> --- gcc/testsuite/g++.dg/opt/pr94589-4.C.jj	2021-05-03 17:21:40.968351902 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-4.C	2021-05-03 17:23:13.313289481 +0200
> @@ -0,0 +1,84 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O2 -g -ffast-math" }
> +
> +#include "pr94589-2.C"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7.0, 8.0, false);
> +  C (f1, 8.0, 8.0, true);
> +  C (f1, 9.0, 8.0, false);
> +  C (f2, 7.0, 8.0, true);
> +  C (f2, 8.0, 8.0, false);
> +  C (f2, 9.0, 8.0, true);
> +  C (f3, 7.0, 8.0, false);
> +  C (f3, 8.0, 8.0, false);
> +  C (f3, 9.0, 8.0, true);
> +  C (f4, 7.0, 8.0, true);
> +  C (f4, 8.0, 8.0, false);
> +  C (f4, 9.0, 8.0, false);
> +  C (f5, 7.0, 8.0, false);
> +  C (f5, 8.0, 8.0, true);
> +  C (f5, 9.0, 8.0, true);
> +  C (f6, 7.0, 8.0, true);
> +  C (f6, 8.0, 8.0, true);
> +  C (f6, 9.0, 8.0, false);
> +  C (f7, 7.0, 8.0, true);
> +  C (f7, 8.0, 8.0, false);
> +  C (f7, 9.0, 8.0, false);
> +  C (f8, 7.0, 8.0, false);
> +  C (f8, 8.0, 8.0, true);
> +  C (f8, 9.0, 8.0, true);
> +  C (f9, 7.0, 8.0, false);
> +  C (f9, 8.0, 8.0, true);
> +  C (f9, 9.0, 8.0, false);
> +  C (f10, 7.0, 8.0, true);
> +  C (f10, 8.0, 8.0, false);
> +  C (f10, 9.0, 8.0, true);
> +  C (f11, 7.0, 8.0, false);
> +  C (f11, 8.0, 8.0, false);
> +  C (f11, 9.0, 8.0, true);
> +  C (f12, 7.0, 8.0, true);
> +  C (f12, 8.0, 8.0, true);
> +  C (f12, 9.0, 8.0, false);
> +  D (f13, 4.0, false);
> +  D (f13, 5.0, true);
> +  D (f13, 6.0, false);
> +  D (f14, 4.0, true);
> +  D (f14, 5.0, false);
> +  D (f14, 6.0, true);
> +  D (f15, 4.0, false);
> +  D (f15, 5.0, false);
> +  D (f15, 6.0, true);
> +  D (f16, 4.0, true);
> +  D (f16, 5.0, false);
> +  D (f16, 6.0, false);
> +  D (f17, 4.0, false);
> +  D (f17, 5.0, true);
> +  D (f17, 6.0, true);
> +  D (f18, 4.0, true);
> +  D (f18, 5.0, true);
> +  D (f18, 6.0, false);
> +  D (f19, 4.0, true);
> +  D (f19, 5.0, false);
> +  D (f19, 6.0, false);
> +  D (f20, 4.0, false);
> +  D (f20, 5.0, true);
> +  D (f20, 6.0, true);
> +  D (f21, 4.0, false);
> +  D (f21, 5.0, true);
> +  D (f21, 6.0, false);
> +  D (f22, 4.0, true);
> +  D (f22, 5.0, false);
> +  D (f22, 6.0, true);
> +  D (f23, 4.0, false);
> +  D (f23, 5.0, false);
> +  D (f23, 6.0, true);
> +  D (f24, 4.0, true);
> +  D (f24, 5.0, true);
> +  D (f24, 6.0, false);
> +}
> 
> 	Jakub
> 
> 
>
Marc Glisse May 5, 2021, 11:45 a.m. UTC | #4
On Tue, 4 May 2021, Jakub Jelinek via Gcc-patches wrote:

> 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> into operator>=(strong_ordering, unspecified) being defined as
> (_M_value&1)==_M_value
> rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
> unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
> but is not a single use but two uses.  I'll need to pattern match that case
> specially.

Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0, 
that could be worth doing already in GIMPLE.
Martin Sebor May 5, 2021, 3:45 p.m. UTC | #5
On 5/4/21 1:44 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> genericize_spaceship genericizes i <=> j to approximately
> ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else c = 1; c; })
> for strong ordering and
> ({ int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; c; })
> for partial ordering.
> The C++ standard supports then == or != comparisons of that against
> strong/partial ordering enums, or </<=/==/!=/>/>= comparisons of <=> result
> against literal 0.
> 
> In some cases we already optimize that but in many cases we keep performing
> all the 2 or 3 comparisons, compute the spaceship value and then compare
> that.
> 
> The following patch recognizes those patterns if the <=> operands are
> integral types or floating point (the latter only for -ffast-math) and
> optimizes it to the single comparison that is needed (plus adds debug stmts
> if needed for the spaceship result).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Just a few words on readabilty.

I didn't follow all the logic carefully but I found the code very
dense and hard to read, and a little too frugal with comments.
The new spaceship_replacement function is over 340 lines long!  It
could stand to be broken up into a few smaller logical pieces to
be easier to read and understand.  Commenting the pieces would
also help (a lot).  (It may seem crystal clear to you the way it
is but not all of us read code as it was prose.  It's good to
keep that in mind.)

Overall, long functions are easier to read if they make more liberal
use of vertical whitespace than when they look like one uninterrupted
stream of text.  I see that sometimes you separate chunks of code with
blank lines but other times you don't.  I'm not sure I see any pattern
to it but it would help to separate logical blocks with blank lines,
especially after return statements.  I marked up a few spots to give
you a general idea what I mean.

Martin

> 
> There are two things I'd like to address in a follow-up:
> 1) if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
> is what I've copied from elsewhere in phiopt, but thinking about it,
> alll we care is probably only HONOR_NANS, the matched pattern starts with
> == or != comparison and branches to the PHI bb with -1/0/1/2 result if it is
> equal, which should be the case for signed zero differences.
> 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> into operator>=(strong_ordering, unspecified) being defined as
> (_M_value&1)==_M_value
> rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
> unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
> but is not a single use but two uses.  I'll need to pattern match that case
> specially.
> 
> 2021-05-04  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/94589
> 	* tree-ssa-phiopt.c (tree_ssa_phiopt_worker): Call
> 	spaceship_replacement.
> 	(cond_only_block_p, spaceship_replacement): New functions.
> 
> 	* gcc.dg/pr94589-1.c: New test.
> 	* gcc.dg/pr94589-2.c: New test.
> 	* gcc.dg/pr94589-3.c: New test.
> 	* gcc.dg/pr94589-4.c: New test.
> 	* g++.dg/opt/pr94589-1.C: New test.
> 	* g++.dg/opt/pr94589-2.C: New test.
> 	* g++.dg/opt/pr94589-3.C: New test.
> 	* g++.dg/opt/pr94589-4.C: New test.
> 
> --- gcc/tree-ssa-phiopt.c.jj	2021-05-02 10:17:49.095397758 +0200
> +++ gcc/tree-ssa-phiopt.c	2021-05-03 17:49:54.233300624 +0200
> @@ -64,6 +64,8 @@ static bool abs_replacement (basic_block
>   			     edge, edge, gimple *, tree, tree);
>   static bool xor_replacement (basic_block, basic_block,
>   			     edge, edge, gimple *, tree, tree);
> +static bool spaceship_replacement (basic_block, basic_block,
> +				   edge, edge, gimple *, tree, tree);
>   static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, basic_block,
>   						      edge, edge, gimple *,
>   						      tree, tree);
> @@ -357,6 +359,8 @@ tree_ssa_phiopt_worker (bool do_store_el
>   	    cfgchanged = true;
>   	  else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
>   	    cfgchanged = true;
> +	  else if (spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> +	    cfgchanged = true;
>   	}
>       }
>   
> @@ -1806,6 +1810,420 @@ minmax_replacement (basic_block cond_bb,
>   
>     return true;
>   }
> +
> +/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
> +
> +static bool
> +cond_only_block_p (basic_block bb)
> +{
> +  /* BB must have no executable statements.  */
> +  gimple_stmt_iterator gsi = gsi_after_labels (bb);
> +  if (phi_nodes (bb))
> +    return false;
> +  while (!gsi_end_p (gsi))
> +    {
> +      gimple *stmt = gsi_stmt (gsi);
> +      if (is_gimple_debug (stmt))
> +	;
> +      else if (gimple_code (stmt) == GIMPLE_NOP
> +	       || gimple_code (stmt) == GIMPLE_PREDICT
> +	       || gimple_code (stmt) == GIMPLE_COND)
> +	;
> +      else
> +	return false;
> +      gsi_next (&gsi);
> +    }
> +  return true;
> +}
> +
> +/* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
> +   For strong ordering <=> try to match something like:
> +    <bb 2> :
> +    if (x_4(D) != y_5(D))
> +      goto <bb 3>; [INV]
> +    else
> +      goto <bb 6>; [INV]
> +
> +    <bb 3> :
> +    if (x_4(D) < y_5(D))
> +      goto <bb 6>; [INV]
> +    else
> +      goto <bb 4>; [INV]
> +
> +    <bb 4> :
> +
> +    <bb 6> :
> +    # iftmp.0_2 = PHI <1(4), 0(2), -1(3)>
> +    _1 = iftmp.0_2 == 0;
> +
> +   and for partial ordering <=> something like:
> +
> +    <bb 2> :
> +    if (a_3(D) == b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 3>; [50.00%]
> +
> +    <bb 3> [local count: 536870913]:
> +    if (a_3(D) < b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 4>; [50.00%]
> +
> +    <bb 4> [local count: 268435456]:
> +    if (a_3(D) > b_5(D))
> +      goto <bb 6>; [50.00%]
> +    else
> +      goto <bb 5>; [50.00%]
> +
> +    <bb 5> [local count: 134217728]:
> +
> +    <bb 6> [local count: 1073741824]:
> +    # SR.27_4 = PHI <0(2), -1(3), 1(4), 2(5)>
> +    _2 = SR.27_4 > 0;  */

The code below uses variable names like lhs1, rhs1, lhs2, and rhs2
that correspond to some of the operands above.  Annotating the code
above with the names of the variables would make the function easier
to understand.

> +
> +static bool
> +spaceship_replacement (basic_block cond_bb, basic_block middle_bb,
> +		       edge e0, edge e1, gimple *phi,
> +		       tree arg0, tree arg1)
> +{
> +  if (!INTEGRAL_TYPE_P (TREE_TYPE (PHI_RESULT (phi)))
> +      || TYPE_UNSIGNED (TREE_TYPE (PHI_RESULT (phi)))
> +      || !tree_fits_shwi_p (arg0)
> +      || !tree_fits_shwi_p (arg1)
> +      || !IN_RANGE (tree_to_shwi (arg0), -1, 2)
> +      || !IN_RANGE (tree_to_shwi (arg1), -1, 2))
> +    return false;
> +
> +  use_operand_p use_p;
> +  gimple *use_stmt;
> +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (phi)))
> +    return false;
> +  if (!single_imm_use (PHI_RESULT (phi), &use_p, &use_stmt))
> +    return false;

Add a blank line here and a comment below.

> +  enum tree_code cmp;
> +  tree lhs, rhs;
> +  if (gimple_code (use_stmt) == GIMPLE_COND)
> +    {
> +      cmp = gimple_cond_code (use_stmt);
> +      lhs = gimple_cond_lhs (use_stmt);
> +      rhs = gimple_cond_rhs (use_stmt);
> +    }
> +  else if (is_gimple_assign (use_stmt))
> +    {
> +      if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
> +	{
> +	  cmp = gimple_assign_rhs_code (use_stmt);
> +	  lhs = gimple_assign_rhs1 (use_stmt);
> +	  rhs = gimple_assign_rhs2 (use_stmt);
> +	}
> +      else if (gimple_assign_rhs_code (use_stmt) == COND_EXPR)
> +	{
> +	  tree cond = gimple_assign_rhs1 (use_stmt);
> +	  if (!COMPARISON_CLASS_P (cond))
> +	    return false;
> +	  cmp = TREE_CODE (cond);
> +	  lhs = TREE_OPERAND (cond, 0);
> +	  rhs = TREE_OPERAND (cond, 1);
> +	}
> +      else
> +	return false;
> +    }
> +  else
> +    return false;

Add a blank line here.

> +  switch (cmp)
> +    {
> +    case EQ_EXPR:
> +    case NE_EXPR:
> +    case LT_EXPR:
> +    case GT_EXPR:
> +    case LE_EXPR:
> +    case GE_EXPR:
> +      break;
> +    default:
> +      return false;
> +    }

Add a blank line here.

> +  if (lhs != PHI_RESULT (phi)
> +      || !tree_fits_shwi_p (rhs)
> +      || !IN_RANGE (tree_to_shwi (rhs), -1, 1))
> +    return false;
> +
> +  if (!empty_block_p (middle_bb))
> +    return false;
> +
> +  basic_block phi_bb = gimple_bb (phi);
> +  gcc_assert (phi_bb == e0->dest && phi_bb == e1->dest);
> +  if (!IN_RANGE (EDGE_COUNT (phi_bb->preds), 3, 4))
> +    return false;

Helpful blank lines above.

> +
> +  gcond *cond1 = as_a <gcond *> (last_stmt (cond_bb));
> +  enum tree_code cmp1 = gimple_cond_code (cond1);
> +  if (cmp1 != LT_EXPR && cmp1 != GT_EXPR)
> +    return false;

Add a blank line here.

We define lhs and rhs above and lhs1 and rhs1 below.  A comment
explaining what the variables correspond to would help (as might
choosing better names).

> +  tree lhs1 = gimple_cond_lhs (cond1);
> +  tree rhs1 = gimple_cond_rhs (cond1);
> +  /* The optimization may be unsafe due to NaNs.  */
> +  if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
> +    return false;
> +  if (TREE_CODE (lhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs1))
> +    return false;
> +  if (TREE_CODE (rhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
> +    return false;
> +
> +  if (!single_pred_p (cond_bb) || !cond_only_block_p (cond_bb))
> +    return false;
> +

Helpful blank lines above.

The code below could use a comment.

> +  basic_block cond2_bb = single_pred (cond_bb);
> +  if (EDGE_COUNT (cond2_bb->succs) != 2)
> +    return false;
> +  edge cond2_phi_edge;
> +  if (EDGE_SUCC (cond2_bb, 0)->dest == cond_bb)
> +    {
> +      if (EDGE_SUCC (cond2_bb, 1)->dest != phi_bb)
> +	return false;
> +      cond2_phi_edge = EDGE_SUCC (cond2_bb, 1);
> +    }
> +  else if (EDGE_SUCC (cond2_bb, 0)->dest != phi_bb)
> +    return false;
> +  else
> +    cond2_phi_edge = EDGE_SUCC (cond2_bb, 0);

Add blank line here.

The code below could use a comment.  Same for lhs2 and rhs2 (picking
better names might also help).

> +  tree arg2 = gimple_phi_arg_def (phi, cond2_phi_edge->dest_idx);
> +  if (!tree_fits_shwi_p (arg2))
> +    return false;
> +  gimple *cond2 = last_stmt (cond2_bb);
> +  if (cond2 == NULL || gimple_code (cond2) != GIMPLE_COND)
> +    return false;
> +  enum tree_code cmp2 = gimple_cond_code (cond2);
> +  tree lhs2 = gimple_cond_lhs (cond2);
> +  tree rhs2 = gimple_cond_rhs (cond2);
> +  if (lhs2 == lhs1)
> +    {
> +      if (!operand_equal_p (rhs2, rhs1, 0))
> +	return false;
> +    }
> +  else if (lhs2 == rhs1)
> +    {
> +      if (rhs2 != lhs1)
> +	return false;
> +    }
> +  else
> +    return false;

Add a blank line here.

The big if statement below could use a comment.

> +  tree arg3 = arg2;
> +  basic_block cond3_bb = cond2_bb;
> +  edge cond3_phi_edge = cond2_phi_edge;
> +  gimple *cond3 = cond2;
> +  enum tree_code cmp3 = cmp2;
> +  tree lhs3 = lhs2;
> +  tree rhs3 = rhs2;
> +  if (EDGE_COUNT (phi_bb->preds) == 4)
> +    {
> +      if (absu_hwi (tree_to_shwi (arg2)) != 1)
> +	return false;

Add a blank line here.

> +      if (e1->flags & EDGE_TRUE_VALUE)
> +	{
> +	  if (tree_to_shwi (arg0) != 2
> +	      || absu_hwi (tree_to_shwi (arg1)) != 1
> +	      || wi::to_widest (arg1) == wi::to_widest (arg2))
> +	    return false;
> +	}
> +      else if (tree_to_shwi (arg1) != 2
> +	       || absu_hwi (tree_to_shwi (arg0)) != 1
> +	       || wi::to_widest (arg0) == wi::to_widest (arg1))
> +	return false;

Add a blank line here.

> +      if (cmp2 != LT_EXPR && cmp2 != GT_EXPR)
> +	return false;

Add a blank line here.

> +      /* if (x < y) goto phi_bb; else fallthru;
> +	 if (x > y) goto phi_bb; else fallthru;
> +	 bbx:;
> +	 phi_bb:;
> +	 is ok, but if x and y are swapped in one of the comparisons,
> +	 or the comparisons are the same and operands not swapped,
> +	 or second goto phi_bb is not the true edge, it is not.  */
> +      if ((lhs2 == lhs1)
> +	  ^ (cmp2 == cmp1)
> +	  ^ ((e1->flags & EDGE_TRUE_VALUE) != 0))
> +	return false;

Add a blank line here and a comment for code below?

> +      if ((cond2_phi_edge->flags & EDGE_TRUE_VALUE) == 0)
> +	return false;
> +      if (!single_pred_p (cond2_bb) || !cond_only_block_p (cond2_bb))
> +	return false;
> +      cond3_bb = single_pred (cond2_bb);
> +      if (EDGE_COUNT (cond2_bb->succs) != 2)
> +	return false;
> +      if (EDGE_SUCC (cond3_bb, 0)->dest == cond2_bb)
> +	{
> +	  if (EDGE_SUCC (cond3_bb, 1)->dest != phi_bb)
> +	    return false;
> +	  cond3_phi_edge = EDGE_SUCC (cond3_bb, 1);
> +	}
> +      else if (EDGE_SUCC (cond3_bb, 0)->dest != phi_bb)
> +	return false;
> +      else
> +	cond3_phi_edge = EDGE_SUCC (cond3_bb, 0);

Add a blank line here and comment for code below.

> +      arg3 = gimple_phi_arg_def (phi, cond3_phi_edge->dest_idx);
> +      cond3 = last_stmt (cond3_bb);
> +      if (cond3 == NULL || gimple_code (cond3) != GIMPLE_COND)
> +	return false;
> +      cmp3 = gimple_cond_code (cond3);
> +      lhs3 = gimple_cond_lhs (cond3);
> +      rhs3 = gimple_cond_rhs (cond3);
> +      if (lhs3 == lhs1)
> +	{
> +	  if (!operand_equal_p (rhs3, rhs1, 0))
> +	    return false;
> +	}
> +      else if (lhs3 == rhs1)
> +	{
> +	  if (rhs3 != lhs1)
> +	    return false;
> +	}
> +      else
> +	return false;
> +    }
> +  else if (absu_hwi (tree_to_shwi (arg0)) != 1
> +	   || absu_hwi (tree_to_shwi (arg1)) != 1
> +	   || wi::to_widest (arg0) == wi::to_widest (arg1))
> +    return false;
> +
> +  if (!integer_zerop (arg3) || (cmp3 != EQ_EXPR && cmp3 != NE_EXPR))
> +    return false;
> +  if ((cond3_phi_edge->flags & (cmp3 == EQ_EXPR
> +				? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE)) == 0)
> +    return false;
> +

Helpful blank lines above.

> +  /* lhs1 one_cmp rhs1 results in PHI_RESULT (phi) of 1.  */
> +  enum tree_code one_cmp;
> +  if ((cmp1 == LT_EXPR)
> +      ^ (!integer_onep ((e1->flags & EDGE_TRUE_VALUE) ? arg1 : arg0)))
> +    one_cmp = LT_EXPR;
> +  else
> +    one_cmp = GT_EXPR;
> +

The switch statement below could use a comment explaining what it
does.  Better yet, move it to its own function with such a comment.

> +  enum tree_code res_cmp;
> +  switch (cmp)
> +    {
> +    case EQ_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = EQ_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    case NE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = NE_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else
> +	return false;
> +      break;
> +    case LT_EXPR:
> +      if (integer_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else
> +	return false;
> +      break;
> +    case LE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
> +      else if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
> +      else
> +	return false;
> +      break;
> +    case GT_EXPR:
> +      if (integer_minus_onep (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_zerop (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    case GE_EXPR:
> +      if (integer_zerop (rhs))
> +	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
> +      else if (integer_onep (rhs))
> +	res_cmp = one_cmp;
> +      else
> +	return false;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +

Add a comment for code below.

> +  if (gimple_code (use_stmt) == GIMPLE_COND)
> +    {
> +      gcond *use_cond = as_a <gcond *> (use_stmt);
> +      gimple_cond_set_code (use_cond, res_cmp);
> +      gimple_cond_set_lhs (use_cond, lhs1);
> +      gimple_cond_set_rhs (use_cond, rhs1);
> +    }
> +  else if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
> +    {
> +      gimple_assign_set_rhs_code (use_stmt, res_cmp);
> +      gimple_assign_set_rhs1 (use_stmt, lhs1);
> +      gimple_assign_set_rhs2 (use_stmt, rhs1);
> +    }
> +  else
> +    {
> +      tree cond = build2 (res_cmp, TREE_TYPE (gimple_assign_rhs1 (use_stmt)),
> +			  lhs1, rhs1);
> +      gimple_assign_set_rhs1 (use_stmt, cond);
> +    }
> +  update_stmt (use_stmt);
> +  if (cmp3 == EQ_EXPR)
> +    gimple_cond_make_true (as_a <gcond *> (cond3));
> +  else
> +    gimple_cond_make_false (as_a <gcond *> (cond3));
> +  update_stmt (cond3);
> +

Add a comment.

> +  if (MAY_HAVE_DEBUG_BIND_STMTS)
> +    {
> +      use_operand_p use_p;
> +      imm_use_iterator iter;
> +      bool has_debug_uses = false;
> +      FOR_EACH_IMM_USE_FAST (use_p, iter, PHI_RESULT (phi))
> +	{
> +	  gimple *use_stmt = USE_STMT (use_p);
> +	  gcc_assert (is_gimple_debug (use_stmt));
> +	  has_debug_uses = true;
> +	  break;
> +	}
> +

Add a comment.

> +      if (has_debug_uses)
> +	{
> +	  gimple_stmt_iterator gsi = gsi_after_labels (gimple_bb (phi));
> +	  tree type = TREE_TYPE (PHI_RESULT (phi));
> +	  tree temp1 = make_node (DEBUG_EXPR_DECL);
> +	  DECL_ARTIFICIAL (temp1) = 1;
> +	  TREE_TYPE (temp1) = type;
> +	  SET_DECL_MODE (temp1, TYPE_MODE (type));
> +	  tree t = build2 (one_cmp, boolean_type_node, lhs1, rhs2);
> +	  t = build3 (COND_EXPR, type, t, build_one_cst (type),
> +		      build_int_cst (type, -1));
> +	  gimple *g = gimple_build_debug_bind (temp1, t, phi);
> +	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	  tree temp2 = make_node (DEBUG_EXPR_DECL);
> +	  DECL_ARTIFICIAL (temp2) = 1;
> +	  TREE_TYPE (temp2) = type;
> +	  SET_DECL_MODE (temp2, TYPE_MODE (type));
> +	  t = build2 (EQ_EXPR, boolean_type_node, lhs1, rhs2);
> +	  t = build3 (COND_EXPR, type, t, build_zero_cst (type), temp1);
> +	  g = gimple_build_debug_bind (temp2, t, phi);
> +	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	  replace_uses_by (PHI_RESULT (phi), temp2);
> +	}
> +    }
> +
> +  return true;
> +}
>   
>   /* Convert
>   
> --- gcc/testsuite/gcc.dg/pr94589-1.c.jj	2021-05-03 17:32:14.940203230 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-1.c	2021-05-03 17:54:33.081167518 +0200
> @@ -0,0 +1,35 @@
> +/* PR tree-optimization/94589 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g0 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 14 "optimized" } } */
> +
> +#define A __attribute__((noipa))
> +A int f1 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 0; }
> +A int f2 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 0; }
> +A int f3 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > 0; }
> +A int f4 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 0; }
> +A int f5 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 0; }
> +A int f6 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= 0; }
> +A int f7 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == -1; }
> +A int f8 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != -1; }
> +A int f9 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > -1; }
> +A int f10 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= -1; }
> +A int f11 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 1; }
> +A int f12 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 1; }
> +A int f13 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 1; }
> +A int f14 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 1; }
> +A int f15 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 0; }
> +A int f16 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 0; }
> +A int f17 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > 0; }
> +A int f18 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 0; }
> +A int f19 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 0; }
> +A int f20 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= 0; }
> +A int f21 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == -1; }
> +A int f22 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != -1; }
> +A int f23 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > -1; }
> +A int f24 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= -1; }
> +A int f25 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 1; }
> +A int f26 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 1; }
> +A int f27 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 1; }
> +A int f28 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 1; }
> --- gcc/testsuite/gcc.dg/pr94589-2.c.jj	2021-05-03 17:32:17.950169407 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-2.c	2021-05-03 17:56:12.577049606 +0200
> @@ -0,0 +1,35 @@
> +/* PR tree-optimization/94589 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } } */
> +
> +#define A __attribute__((noipa))
> +A int f1 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 0; }
> +A int f2 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 0; }
> +A int f3 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > 0; }
> +A int f4 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 0; }
> +A int f5 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 0; }
> +A int f6 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= 0; }
> +A int f7 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == -1; }
> +A int f8 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != -1; }
> +A int f9 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > -1; }
> +A int f10 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= -1; }
> +A int f11 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 1; }
> +A int f12 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 1; }
> +A int f13 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 1; }
> +A int f14 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 1; }
> +A int f15 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 0; }
> +A int f16 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 0; }
> +A int f17 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > 0; }
> +A int f18 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 0; }
> +A int f19 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 0; }
> +A int f20 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= 0; }
> +A int f21 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == -1; }
> +A int f22 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != -1; }
> +A int f23 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > -1; }
> +A int f24 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= -1; }
> +A int f25 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 1; }
> +A int f26 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 1; }
> +A int f27 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 1; }
> +A int f28 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 1; }
> --- gcc/testsuite/gcc.dg/pr94589-3.c.jj	2021-05-03 17:32:19.998146397 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-3.c	2021-05-03 18:02:41.628678802 +0200
> @@ -0,0 +1,97 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -g" } */
> +
> +#include "pr94589-1.c"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7, 8, 0);
> +  C (f1, 8, 8, 1);
> +  C (f1, 9, 8, 0);
> +  C (f2, 7, 8, 1);
> +  C (f2, 8, 8, 0);
> +  C (f2, 9, 8, 1);
> +  C (f3, 7, 8, 0);
> +  C (f3, 8, 8, 0);
> +  C (f3, 9, 8, 1);
> +  C (f4, 7, 8, 1);
> +  C (f4, 8, 8, 0);
> +  C (f4, 9, 8, 0);
> +  C (f5, 7, 8, 0);
> +  C (f5, 8, 8, 1);
> +  C (f5, 9, 8, 1);
> +  C (f6, 7, 8, 1);
> +  C (f6, 8, 8, 1);
> +  C (f6, 9, 8, 0);
> +  C (f7, 7, 8, 1);
> +  C (f7, 8, 8, 0);
> +  C (f7, 9, 8, 0);
> +  C (f8, 7, 8, 0);
> +  C (f8, 8, 8, 1);
> +  C (f8, 9, 8, 1);
> +  C (f9, 7, 8, 0);
> +  C (f9, 8, 8, 1);
> +  C (f9, 9, 8, 1);
> +  C (f10, 7, 8, 1);
> +  C (f10, 8, 8, 0);
> +  C (f10, 9, 8, 0);
> +  C (f11, 7, 8, 0);
> +  C (f11, 8, 8, 0);
> +  C (f11, 9, 8, 1);
> +  C (f12, 7, 8, 1);
> +  C (f12, 8, 8, 1);
> +  C (f12, 9, 8, 0);
> +  C (f13, 7, 8, 1);
> +  C (f13, 8, 8, 1);
> +  C (f13, 9, 8, 0);
> +  C (f14, 7, 8, 0);
> +  C (f14, 8, 8, 0);
> +  C (f14, 9, 8, 1);
> +  D (f15, 4, 0);
> +  D (f15, 5, 1);
> +  D (f15, 6, 0);
> +  D (f16, 4, 1);
> +  D (f16, 5, 0);
> +  D (f16, 6, 1);
> +  D (f17, 4, 0);
> +  D (f17, 5, 0);
> +  D (f17, 6, 1);
> +  D (f18, 4, 1);
> +  D (f18, 5, 0);
> +  D (f18, 6, 0);
> +  D (f19, 4, 0);
> +  D (f19, 5, 1);
> +  D (f19, 6, 1);
> +  D (f20, 4, 1);
> +  D (f20, 5, 1);
> +  D (f20, 6, 0);
> +  D (f21, 4, 1);
> +  D (f21, 5, 0);
> +  D (f21, 6, 0);
> +  D (f22, 4, 0);
> +  D (f22, 5, 1);
> +  D (f22, 6, 1);
> +  D (f23, 4, 0);
> +  D (f23, 5, 1);
> +  D (f23, 6, 1);
> +  D (f24, 4, 1);
> +  D (f24, 5, 0);
> +  D (f24, 6, 0);
> +  D (f25, 4, 0);
> +  D (f25, 5, 0);
> +  D (f25, 6, 1);
> +  D (f26, 4, 1);
> +  D (f26, 5, 1);
> +  D (f26, 6, 0);
> +  D (f27, 4, 1);
> +  D (f27, 5, 1);
> +  D (f27, 6, 0);
> +  D (f28, 4, 0);
> +  D (f28, 5, 0);
> +  D (f28, 6, 1);
> +  return 0;
> +}
> --- gcc/testsuite/gcc.dg/pr94589-4.c.jj	2021-05-03 17:32:21.915124851 +0200
> +++ gcc/testsuite/gcc.dg/pr94589-4.c	2021-05-03 18:12:50.748835798 +0200
> @@ -0,0 +1,97 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -g -ffast-math" } */
> +
> +#include "pr94589-2.c"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7.0, 8.0, 0);
> +  C (f1, 8.0, 8.0, 1);
> +  C (f1, 9.0, 8.0, 0);
> +  C (f2, 7.0, 8.0, 1);
> +  C (f2, 8.0, 8.0, 0);
> +  C (f2, 9.0, 8.0, 1);
> +  C (f3, 7.0, 8.0, 0);
> +  C (f3, 8.0, 8.0, 0);
> +  C (f3, 9.0, 8.0, 1);
> +  C (f4, 7.0, 8.0, 1);
> +  C (f4, 8.0, 8.0, 0);
> +  C (f4, 9.0, 8.0, 0);
> +  C (f5, 7.0, 8.0, 0);
> +  C (f5, 8.0, 8.0, 1);
> +  C (f5, 9.0, 8.0, 1);
> +  C (f6, 7.0, 8.0, 1);
> +  C (f6, 8.0, 8.0, 1);
> +  C (f6, 9.0, 8.0, 0);
> +  C (f7, 7.0, 8.0, 1);
> +  C (f7, 8.0, 8.0, 0);
> +  C (f7, 9.0, 8.0, 0);
> +  C (f8, 7.0, 8.0, 0);
> +  C (f8, 8.0, 8.0, 1);
> +  C (f8, 9.0, 8.0, 1);
> +  C (f9, 7.0, 8.0, 0);
> +  C (f9, 8.0, 8.0, 1);
> +  C (f9, 9.0, 8.0, 1);
> +  C (f10, 7.0, 8.0, 1);
> +  C (f10, 8.0, 8.0, 0);
> +  C (f10, 9.0, 8.0, 0);
> +  C (f11, 7.0, 8.0, 0);
> +  C (f11, 8.0, 8.0, 0);
> +  C (f11, 9.0, 8.0, 1);
> +  C (f12, 7.0, 8.0, 1);
> +  C (f12, 8.0, 8.0, 1);
> +  C (f12, 9.0, 8.0, 0);
> +  C (f13, 7.0, 8.0, 1);
> +  C (f13, 8.0, 8.0, 1);
> +  C (f13, 9.0, 8.0, 0);
> +  C (f14, 7.0, 8.0, 0);
> +  C (f14, 8.0, 8.0, 0);
> +  C (f14, 9.0, 8.0, 1);
> +  D (f15, 4.0, 0);
> +  D (f15, 5.0, 1);
> +  D (f15, 6.0, 0);
> +  D (f16, 4.0, 1);
> +  D (f16, 5.0, 0);
> +  D (f16, 6.0, 1);
> +  D (f17, 4.0, 0);
> +  D (f17, 5.0, 0);
> +  D (f17, 6.0, 1);
> +  D (f18, 4.0, 1);
> +  D (f18, 5.0, 0);
> +  D (f18, 6.0, 0);
> +  D (f19, 4.0, 0);
> +  D (f19, 5.0, 1);
> +  D (f19, 6.0, 1);
> +  D (f20, 4.0, 1);
> +  D (f20, 5.0, 1);
> +  D (f20, 6.0, 0);
> +  D (f21, 4.0, 1);
> +  D (f21, 5.0, 0);
> +  D (f21, 6.0, 0);
> +  D (f22, 4.0, 0);
> +  D (f22, 5.0, 1);
> +  D (f22, 6.0, 1);
> +  D (f23, 4.0, 0);
> +  D (f23, 5.0, 1);
> +  D (f23, 6.0, 1);
> +  D (f24, 4.0, 1);
> +  D (f24, 5.0, 0);
> +  D (f24, 6.0, 0);
> +  D (f25, 4.0, 0);
> +  D (f25, 5.0, 0);
> +  D (f25, 6.0, 1);
> +  D (f26, 4.0, 1);
> +  D (f26, 5.0, 1);
> +  D (f26, 6.0, 0);
> +  D (f27, 4.0, 1);
> +  D (f27, 5.0, 1);
> +  D (f27, 6.0, 0);
> +  D (f28, 4.0, 0);
> +  D (f28, 5.0, 0);
> +  D (f28, 6.0, 1);
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/opt/pr94589-1.C.jj	2021-05-03 15:24:02.002046458 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-1.C	2021-05-03 15:23:47.445210269 +0200
> @@ -0,0 +1,33 @@
> +// PR tree-optimization/94589
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O2 -g0 -fdump-tree-optimized" }
> +// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 12 "optimized" } }
> +// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 12 "optimized" } }
> +
> +#include <compare>
> +
> +#define A __attribute__((noipa))
> +A bool f1 (int i, int j) { auto c = i <=> j; return c == 0; }
> +A bool f2 (int i, int j) { auto c = i <=> j; return c != 0; }
> +A bool f3 (int i, int j) { auto c = i <=> j; return c > 0; }
> +A bool f4 (int i, int j) { auto c = i <=> j; return c < 0; }
> +A bool f5 (int i, int j) { auto c = i <=> j; return c >= 0; }
> +A bool f6 (int i, int j) { auto c = i <=> j; return c <= 0; }
> +A bool f7 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::less; }
> +A bool f8 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::less; }
> +A bool f9 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::equal; }
> +A bool f10 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::equal; }
> +A bool f11 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::greater; }
> +A bool f12 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::greater; }
> +A bool f13 (int i) { auto c = i <=> 5; return c == 0; }
> +A bool f14 (int i) { auto c = i <=> 5; return c != 0; }
> +A bool f15 (int i) { auto c = i <=> 5; return c > 0; }
> +A bool f16 (int i) { auto c = i <=> 5; return c < 0; }
> +A bool f17 (int i) { auto c = i <=> 5; return c >= 0; }
> +A bool f18 (int i) { auto c = i <=> 5; return c <= 0; }
> +A bool f19 (int i) { auto c = i <=> 5; return c == std::strong_ordering::less; }
> +A bool f20 (int i) { auto c = i <=> 5; return c != std::strong_ordering::less; }
> +A bool f21 (int i) { auto c = i <=> 5; return c == std::strong_ordering::equal; }
> +A bool f22 (int i) { auto c = i <=> 5; return c != std::strong_ordering::equal; }
> +A bool f23 (int i) { auto c = i <=> 5; return c == std::strong_ordering::greater; }
> +A bool f24 (int i) { auto c = i <=> 5; return c != std::strong_ordering::greater; }
> --- gcc/testsuite/g++.dg/opt/pr94589-2.C.jj	2021-05-03 16:12:12.002540809 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-2.C	2021-05-03 16:16:53.194380966 +0200
> @@ -0,0 +1,33 @@
> +// PR tree-optimization/94589
> +// { dg-do compile { target c++20 } }
> +// { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" }
> +// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } }
> +// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } }
> +
> +#include <compare>
> +
> +#define A __attribute__((noipa))
> +A bool f1 (double i, double j) { auto c = i <=> j; return c == 0; }
> +A bool f2 (double i, double j) { auto c = i <=> j; return c != 0; }
> +A bool f3 (double i, double j) { auto c = i <=> j; return c > 0; }
> +A bool f4 (double i, double j) { auto c = i <=> j; return c < 0; }
> +A bool f5 (double i, double j) { auto c = i <=> j; return c >= 0; }
> +A bool f6 (double i, double j) { auto c = i <=> j; return c <= 0; }
> +A bool f7 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::less; }
> +A bool f8 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::less; }
> +A bool f9 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::equivalent; }
> +A bool f10 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::equivalent; }
> +A bool f11 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::greater; }
> +A bool f12 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::greater; }
> +A bool f13 (double i) { auto c = i <=> 5.0; return c == 0; }
> +A bool f14 (double i) { auto c = i <=> 5.0; return c != 0; }
> +A bool f15 (double i) { auto c = i <=> 5.0; return c > 0; }
> +A bool f16 (double i) { auto c = i <=> 5.0; return c < 0; }
> +A bool f17 (double i) { auto c = i <=> 5.0; return c >= 0; }
> +A bool f18 (double i) { auto c = i <=> 5.0; return c <= 0; }
> +A bool f19 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::less; }
> +A bool f20 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::less; }
> +A bool f21 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::equivalent; }
> +A bool f22 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::equivalent; }
> +A bool f23 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::greater; }
> +A bool f24 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::greater; }
> --- gcc/testsuite/g++.dg/opt/pr94589-3.C.jj	2021-05-03 16:22:29.898597327 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-3.C	2021-05-03 17:21:23.290556509 +0200
> @@ -0,0 +1,84 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O2 -g" }
> +
> +#include "pr94589-1.C"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7, 8, false);
> +  C (f1, 8, 8, true);
> +  C (f1, 9, 8, false);
> +  C (f2, 7, 8, true);
> +  C (f2, 8, 8, false);
> +  C (f2, 9, 8, true);
> +  C (f3, 7, 8, false);
> +  C (f3, 8, 8, false);
> +  C (f3, 9, 8, true);
> +  C (f4, 7, 8, true);
> +  C (f4, 8, 8, false);
> +  C (f4, 9, 8, false);
> +  C (f5, 7, 8, false);
> +  C (f5, 8, 8, true);
> +  C (f5, 9, 8, true);
> +  C (f6, 7, 8, true);
> +  C (f6, 8, 8, true);
> +  C (f6, 9, 8, false);
> +  C (f7, 7, 8, true);
> +  C (f7, 8, 8, false);
> +  C (f7, 9, 8, false);
> +  C (f8, 7, 8, false);
> +  C (f8, 8, 8, true);
> +  C (f8, 9, 8, true);
> +  C (f9, 7, 8, false);
> +  C (f9, 8, 8, true);
> +  C (f9, 9, 8, false);
> +  C (f10, 7, 8, true);
> +  C (f10, 8, 8, false);
> +  C (f10, 9, 8, true);
> +  C (f11, 7, 8, false);
> +  C (f11, 8, 8, false);
> +  C (f11, 9, 8, true);
> +  C (f12, 7, 8, true);
> +  C (f12, 8, 8, true);
> +  C (f12, 9, 8, false);
> +  D (f13, 4, false);
> +  D (f13, 5, true);
> +  D (f13, 6, false);
> +  D (f14, 4, true);
> +  D (f14, 5, false);
> +  D (f14, 6, true);
> +  D (f15, 4, false);
> +  D (f15, 5, false);
> +  D (f15, 6, true);
> +  D (f16, 4, true);
> +  D (f16, 5, false);
> +  D (f16, 6, false);
> +  D (f17, 4, false);
> +  D (f17, 5, true);
> +  D (f17, 6, true);
> +  D (f18, 4, true);
> +  D (f18, 5, true);
> +  D (f18, 6, false);
> +  D (f19, 4, true);
> +  D (f19, 5, false);
> +  D (f19, 6, false);
> +  D (f20, 4, false);
> +  D (f20, 5, true);
> +  D (f20, 6, true);
> +  D (f21, 4, false);
> +  D (f21, 5, true);
> +  D (f21, 6, false);
> +  D (f22, 4, true);
> +  D (f22, 5, false);
> +  D (f22, 6, true);
> +  D (f23, 4, false);
> +  D (f23, 5, false);
> +  D (f23, 6, true);
> +  D (f24, 4, true);
> +  D (f24, 5, true);
> +  D (f24, 6, false);
> +}
> --- gcc/testsuite/g++.dg/opt/pr94589-4.C.jj	2021-05-03 17:21:40.968351902 +0200
> +++ gcc/testsuite/g++.dg/opt/pr94589-4.C	2021-05-03 17:23:13.313289481 +0200
> @@ -0,0 +1,84 @@
> +// { dg-do run { target c++20 } }
> +// { dg-options "-O2 -g -ffast-math" }
> +
> +#include "pr94589-2.C"
> +
> +#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
> +#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
> +
> +int
> +main ()
> +{
> +  C (f1, 7.0, 8.0, false);
> +  C (f1, 8.0, 8.0, true);
> +  C (f1, 9.0, 8.0, false);
> +  C (f2, 7.0, 8.0, true);
> +  C (f2, 8.0, 8.0, false);
> +  C (f2, 9.0, 8.0, true);
> +  C (f3, 7.0, 8.0, false);
> +  C (f3, 8.0, 8.0, false);
> +  C (f3, 9.0, 8.0, true);
> +  C (f4, 7.0, 8.0, true);
> +  C (f4, 8.0, 8.0, false);
> +  C (f4, 9.0, 8.0, false);
> +  C (f5, 7.0, 8.0, false);
> +  C (f5, 8.0, 8.0, true);
> +  C (f5, 9.0, 8.0, true);
> +  C (f6, 7.0, 8.0, true);
> +  C (f6, 8.0, 8.0, true);
> +  C (f6, 9.0, 8.0, false);
> +  C (f7, 7.0, 8.0, true);
> +  C (f7, 8.0, 8.0, false);
> +  C (f7, 9.0, 8.0, false);
> +  C (f8, 7.0, 8.0, false);
> +  C (f8, 8.0, 8.0, true);
> +  C (f8, 9.0, 8.0, true);
> +  C (f9, 7.0, 8.0, false);
> +  C (f9, 8.0, 8.0, true);
> +  C (f9, 9.0, 8.0, false);
> +  C (f10, 7.0, 8.0, true);
> +  C (f10, 8.0, 8.0, false);
> +  C (f10, 9.0, 8.0, true);
> +  C (f11, 7.0, 8.0, false);
> +  C (f11, 8.0, 8.0, false);
> +  C (f11, 9.0, 8.0, true);
> +  C (f12, 7.0, 8.0, true);
> +  C (f12, 8.0, 8.0, true);
> +  C (f12, 9.0, 8.0, false);
> +  D (f13, 4.0, false);
> +  D (f13, 5.0, true);
> +  D (f13, 6.0, false);
> +  D (f14, 4.0, true);
> +  D (f14, 5.0, false);
> +  D (f14, 6.0, true);
> +  D (f15, 4.0, false);
> +  D (f15, 5.0, false);
> +  D (f15, 6.0, true);
> +  D (f16, 4.0, true);
> +  D (f16, 5.0, false);
> +  D (f16, 6.0, false);
> +  D (f17, 4.0, false);
> +  D (f17, 5.0, true);
> +  D (f17, 6.0, true);
> +  D (f18, 4.0, true);
> +  D (f18, 5.0, true);
> +  D (f18, 6.0, false);
> +  D (f19, 4.0, true);
> +  D (f19, 5.0, false);
> +  D (f19, 6.0, false);
> +  D (f20, 4.0, false);
> +  D (f20, 5.0, true);
> +  D (f20, 6.0, true);
> +  D (f21, 4.0, false);
> +  D (f21, 5.0, true);
> +  D (f21, 6.0, false);
> +  D (f22, 4.0, true);
> +  D (f22, 5.0, false);
> +  D (f22, 6.0, true);
> +  D (f23, 4.0, false);
> +  D (f23, 5.0, false);
> +  D (f23, 6.0, true);
> +  D (f24, 4.0, true);
> +  D (f24, 5.0, true);
> +  D (f24, 6.0, false);
> +}
> 
> 	Jakub
>
Jakub Jelinek May 5, 2021, 4:52 p.m. UTC | #6
On Wed, May 05, 2021 at 01:45:29PM +0200, Marc Glisse wrote:
> On Tue, 4 May 2021, Jakub Jelinek via Gcc-patches wrote:
> 
> > 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> > into operator>=(strong_ordering, unspecified) being defined as
> > (_M_value&1)==_M_value
> > rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
> > unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
> > but is not a single use but two uses.  I'll need to pattern match that case
> > specially.
> 
> Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0,
> that could be worth doing already in GIMPLE.

Apparently it is
  /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or
     constant folding if x/y is a constant.  */
  if ((code == EQ || code == NE)
      && (op0code == AND || op0code == IOR)
      && !side_effects_p (op1)
      && op1 != CONST0_RTX (cmp_mode))
    {
      /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to
         (eq/ne (and (not y) x) 0).  */
...
      /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to
         (eq/ne (and (not x) y) 0).  */
Yes, doing that on GIMPLE for the case where the not argument is constant
would simplify the phiopt follow-up (it would be single imm use then).

	Jakub
Jakub Jelinek May 6, 2021, 10:22 a.m. UTC | #7
On Wed, May 05, 2021 at 06:52:27PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Wed, May 05, 2021 at 01:45:29PM +0200, Marc Glisse wrote:
> > On Tue, 4 May 2021, Jakub Jelinek via Gcc-patches wrote:
> > 
> > > 2) the pr94589-2.C testcase should be matching just 12 times each, but runs
> > > into operator>=(strong_ordering, unspecified) being defined as
> > > (_M_value&1)==_M_value
> > > rather than _M_value>=0.  When not honoring NaNs, the 2 case should be
> > > unreachable and so (_M_value&1)==_M_value is then equivalent to _M_value>=0,
> > > but is not a single use but two uses.  I'll need to pattern match that case
> > > specially.
> > 
> > Somewhere in RTL (_M_value&1)==_M_value is turned into (_M_value&-2)==0,
> > that could be worth doing already in GIMPLE.
> 
> Apparently it is
>   /* Simplify eq/ne (and/ior x y) x/y) for targets with a BICS instruction or
>      constant folding if x/y is a constant.  */
>   if ((code == EQ || code == NE)
>       && (op0code == AND || op0code == IOR)
>       && !side_effects_p (op1)
>       && op1 != CONST0_RTX (cmp_mode))
>     {
>       /* Both (eq/ne (and x y) x) and (eq/ne (ior x y) y) simplify to
>          (eq/ne (and (not y) x) 0).  */
> ...
>       /* Both (eq/ne (and x y) y) and (eq/ne (ior x y) x) simplify to
>          (eq/ne (and (not x) y) 0).  */
> Yes, doing that on GIMPLE for the case where the not argument is constant
> would simplify the phiopt follow-up (it would be single imm use then).

Though, (x&1) == x is equivalent to both (x&~1)==0 and to x < 2U
and from the latter two it isn't obvious which one is better/more canonical.
On aarch64 I don't see differences in number of insns nor in their size:
  10:	13001c00 	sxtb	w0, w0
  14:	721f781f 	tst	w0, #0xfffffffe
  18:	1a9f17e0 	cset	w0, eq  // eq = none
  1c:	d65f03c0 	ret
vs.
  20:	12001c00 	and	w0, w0, #0xff
  24:	7100041f 	cmp	w0, #0x1
  28:	1a9f87e0 	cset	w0, ls  // ls = plast
  2c:	d65f03c0 	ret
On x86_64 same number of insns, but the comparison is shorter (note, the
spaceship result is a struct with signed char based enum):
  10:	31 c0                	xor    %eax,%eax
  12:	81 e7 fe 00 00 00    	and    $0xfe,%edi
  18:	0f 94 c0             	sete   %al
  1b:	c3                   	retq   
  1c:	0f 1f 40 00          	nopl   0x0(%rax)
vs.
  20:	31 c0                	xor    %eax,%eax
  22:	40 80 ff 01          	cmp    $0x1,%dil
  26:	0f 96 c0             	setbe  %al
  29:	c3                   	retq   
Generally, I'd think that the comparison should be better because it
will be most common in user code that way and VRP will be able to do the
best thing for it.

Before we decide how to optimize it, can't we change libstdc++
to use the more efficient implementation?

I.e. either following, or { return (__v._M_value & ~1) == 0; } ?

2021-05-06  Jakub Jelinek  <jakub@redhat.com>

	* compare (operator>=(partial_ordering, __cmp_cat::__unspec),
	operator<=(__cmp_cat::__unspec, partial_ordering)): Simplify to
	(unsigned char) __v._M_value < 2.

--- gcc/compare.jj	2021-03-10 17:36:41.288491012 +0100
+++ gcc/compare	2021-05-06 11:44:37.519553192 +0200
@@ -107,7 +107,7 @@ namespace std
 
     friend constexpr bool
     operator>=(partial_ordering __v, __cmp_cat::__unspec) noexcept
-    { return __cmp_cat::type(__v._M_value & 1) == __v._M_value; }
+    { return static_cast<unsigned char>(__v._M_value) < 2; }
 
     friend constexpr bool
     operator< (__cmp_cat::__unspec, partial_ordering __v) noexcept
@@ -119,7 +119,7 @@ namespace std
 
     friend constexpr bool
     operator<=(__cmp_cat::__unspec, partial_ordering __v) noexcept
-    { return __cmp_cat::type(__v._M_value & 1) == __v._M_value; }
+    { return static_cast<unsigned char>(__v._M_value) < 2; }
 
     friend constexpr bool
     operator>=(__cmp_cat::__unspec, partial_ordering __v) noexcept


	Jakub
Marc Glisse May 6, 2021, 7:42 p.m. UTC | #8
On Thu, 6 May 2021, Jakub Jelinek via Gcc-patches wrote:

> Though, (x&1) == x is equivalent to both (x&~1)==0 and to x < 2U
> and from the latter two it isn't obvious which one is better/more canonical.
> On aarch64 I don't see differences in number of insns nor in their size:
>  10:	13001c00 	sxtb	w0, w0
>  14:	721f781f 	tst	w0, #0xfffffffe
>  18:	1a9f17e0 	cset	w0, eq  // eq = none
>  1c:	d65f03c0 	ret
> vs.
>  20:	12001c00 	and	w0, w0, #0xff
>  24:	7100041f 	cmp	w0, #0x1
>  28:	1a9f87e0 	cset	w0, ls  // ls = plast
>  2c:	d65f03c0 	ret
> On x86_64 same number of insns, but the comparison is shorter (note, the
> spaceship result is a struct with signed char based enum):
>  10:	31 c0                	xor    %eax,%eax
>  12:	81 e7 fe 00 00 00    	and    $0xfe,%edi
>  18:	0f 94 c0             	sete   %al
>  1b:	c3                   	retq
>  1c:	0f 1f 40 00          	nopl   0x0(%rax)
> vs.
>  20:	31 c0                	xor    %eax,%eax
>  22:	40 80 ff 01          	cmp    $0x1,%dil
>  26:	0f 96 c0             	setbe  %al
>  29:	c3                   	retq
> Generally, I'd think that the comparison should be better because it
> will be most common in user code that way and VRP will be able to do the
> best thing for it.

We can probably do it in 2 steps, first something like

(for cmp (eq ne)
  (simplify
   (cmp (bit_and:c @0 @1) @0)
   (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))

to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a 
mask 111...000 (I thought we already had a function to detect such masks, 
or the 000...111, but I can't find them anymore).

I agree that the comparison seems preferable, although if X is signed, the 
way GIMPLE represents types will add an inconvenient cast. And I think VRP 
already manages to use the bit test to derive a range.
Jakub Jelinek May 14, 2021, 5:26 p.m. UTC | #9
On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
> We can probably do it in 2 steps, first something like
> 
> (for cmp (eq ne)
>  (simplify
>   (cmp (bit_and:c @0 @1) @0)
>   (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))
> 
> to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a
> mask 111...000 (I thought we already had a function to detect such masks, or
> the 000...111, but I can't find them anymore).
> 
> I agree that the comparison seems preferable, although if X is signed, the
> way GIMPLE represents types will add an inconvenient cast. And I think VRP
> already manages to use the bit test to derive a range.

I've tried the second step, but it unfortunately regresses
+FAIL: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"
+FAIL: gcc.dg/tree-ssa/loop-42.c scan-tree-dump-not ivcanon "under assumptions "
so maybe it is better to keep these cases as the users wrote them.

So posting this patch just for archival purposes.

2021-05-13  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/94589
	* match.pd ((X & (-1 << N)) == 0 -> X <= (1 << N) - 1U): New
	simplification.

	* gcc.dg/tree-ssa/pr94589-2.c: New test.

--- gcc/match.pd.jj	2021-05-12 09:45:55.832976540 +0200
+++ gcc/match.pd	2021-05-13 19:51:17.458507854 +0200
@@ -4792,6 +4792,22 @@ (define_operator_list COND_TERNARY
   (cmp (bit_and:cs @0 @2) (bit_and:cs @1 @2))
   (cmp (bit_and (bit_xor @0 @1) @2) { build_zero_cst (TREE_TYPE (@2)); })))
 
+#if GIMPLE
+/* (X & (-1 << N)) == 0 becomes X <= (1 << N) - 1.  */
+(for cmp (eq ne)
+     ncmp (le gt)
+ (simplify
+  (cmp:c (bit_and:cs @0 INTEGER_CST@1) integer_zerop)
+  (with { tree utype = NULL_TREE;
+	  int tz = wi::ctz (wi::to_wide (@1));
+	  int prec = TYPE_PRECISION (TREE_TYPE (@1));
+	  if (tz && wi::eq_p (wi::shifted_mask (tz, prec - tz, false, prec),
+			      wi::to_wide (@1)))
+	    utype = unsigned_type_for (TREE_TYPE (@0)); }
+   (if (utype)
+    (ncmp (convert:utype @0) (convert:utype (bit_not @1)))))))
+#endif
+
 /* (X < 0) != (Y < 0) into (X ^ Y) < 0.
    (X >= 0) != (Y >= 0) into (X ^ Y) < 0.
    (X < 0) == (Y < 0) into (X ^ Y) >= 0.
--- gcc/testsuite/gcc.dg/tree-ssa/pr94589-2.c.jj	2021-05-13 19:55:21.201854003 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr94589-2.c	2021-05-13 19:54:59.804086977 +0200
@@ -0,0 +1,21 @@
+/* PR tree-optimization/94589 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int x)
+{
+  return (x & (-1 << 7)) == 0;
+/* { dg-final { scan-tree-dump " = \\\(unsigned int\\\) x_" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " & -128" "optimized" } } */
+/* { dg-final { scan-tree-dump " <= 127" "optimized" } } */
+}
+
+int
+bar (int y)
+{
+  return (y & (-1 << 12)) != 0;
+/* { dg-final { scan-tree-dump " = \\\(unsigned int\\\) y_" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " & -4096" "optimized" } } */
+/* { dg-final { scan-tree-dump " > 4095" "optimized" } } */
+}


	Jakub
Marc Glisse May 15, 2021, 10:09 a.m. UTC | #10
On Fri, 14 May 2021, Jakub Jelinek via Gcc-patches wrote:

> On Thu, May 06, 2021 at 09:42:41PM +0200, Marc Glisse wrote:
>> We can probably do it in 2 steps, first something like
>>
>> (for cmp (eq ne)
>>  (simplify
>>   (cmp (bit_and:c @0 @1) @0)
>>   (cmp (@0 (bit_not! @1)) { build_zero_cst (TREE_TYPE (@0)); })))
>>
>> to get rid of the double use, and then simplify X&C==0 to X<=~C if C is a
>> mask 111...000 (I thought we already had a function to detect such masks, or
>> the 000...111, but I can't find them anymore).
>>
>> I agree that the comparison seems preferable, although if X is signed, the
>> way GIMPLE represents types will add an inconvenient cast. And I think VRP
>> already manages to use the bit test to derive a range.
>
> I've tried the second step, but it unfortunately regresses
> +FAIL: gcc.dg/ipa/propbits-2.c scan-tree-dump-not optimized "fail_test"
> +FAIL: gcc.dg/tree-ssa/loop-42.c scan-tree-dump-not ivcanon "under assumptions "
> so maybe it is better to keep these cases as the users wrote them.

Thank you for trying it, the patch looks good (it would probably also work 
on GENERIC), but indeed it is just a canonicalization, not necessarily 
worth breaking other stuff. This seems to point to another missed 
optimization. Looking at propbits-2.c, if I rewrite the condition in f1 as

   if ((unsigned)x <= 0xf)

I see later in VRP

   # RANGE [0, 4294967295] NONZERO 15
   x.2_1 = (unsigned intD.9) x_4(D);
   if (x.2_1 <= 15)

where we fail to derive a range from the nonzero bits. Maybe VRP expects 
IPA-CP should have done it already and doesn't bother recomputing it.

I understand this may not be a priority though, that's fine.

I didn't look at loop-42.c.
diff mbox series

Patch

--- gcc/tree-ssa-phiopt.c.jj	2021-05-02 10:17:49.095397758 +0200
+++ gcc/tree-ssa-phiopt.c	2021-05-03 17:49:54.233300624 +0200
@@ -64,6 +64,8 @@  static bool abs_replacement (basic_block
 			     edge, edge, gimple *, tree, tree);
 static bool xor_replacement (basic_block, basic_block,
 			     edge, edge, gimple *, tree, tree);
+static bool spaceship_replacement (basic_block, basic_block,
+				   edge, edge, gimple *, tree, tree);
 static bool cond_removal_in_popcount_clz_ctz_pattern (basic_block, basic_block,
 						      edge, edge, gimple *,
 						      tree, tree);
@@ -357,6 +359,8 @@  tree_ssa_phiopt_worker (bool do_store_el
 	    cfgchanged = true;
 	  else if (minmax_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
 	    cfgchanged = true;
+	  else if (spaceship_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
+	    cfgchanged = true;
 	}
     }
 
@@ -1806,6 +1810,420 @@  minmax_replacement (basic_block cond_bb,
 
   return true;
 }
+
+/* Return true if the only executable statement in BB is a GIMPLE_COND.  */
+
+static bool
+cond_only_block_p (basic_block bb)
+{
+  /* BB must have no executable statements.  */
+  gimple_stmt_iterator gsi = gsi_after_labels (bb);
+  if (phi_nodes (bb))
+    return false;
+  while (!gsi_end_p (gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+      if (is_gimple_debug (stmt))
+	;
+      else if (gimple_code (stmt) == GIMPLE_NOP
+	       || gimple_code (stmt) == GIMPLE_PREDICT
+	       || gimple_code (stmt) == GIMPLE_COND)
+	;
+      else
+	return false;
+      gsi_next (&gsi);
+    }
+  return true;
+}
+
+/* Attempt to optimize (x <=> y) cmp 0 and similar comparisons.
+   For strong ordering <=> try to match something like:
+    <bb 2> :
+    if (x_4(D) != y_5(D))
+      goto <bb 3>; [INV]
+    else
+      goto <bb 6>; [INV]
+
+    <bb 3> :
+    if (x_4(D) < y_5(D))
+      goto <bb 6>; [INV]
+    else
+      goto <bb 4>; [INV]
+
+    <bb 4> :
+
+    <bb 6> :
+    # iftmp.0_2 = PHI <1(4), 0(2), -1(3)>
+    _1 = iftmp.0_2 == 0;
+
+   and for partial ordering <=> something like:
+
+    <bb 2> :
+    if (a_3(D) == b_5(D))
+      goto <bb 6>; [50.00%]
+    else
+      goto <bb 3>; [50.00%]
+
+    <bb 3> [local count: 536870913]:
+    if (a_3(D) < b_5(D))
+      goto <bb 6>; [50.00%]
+    else
+      goto <bb 4>; [50.00%]
+
+    <bb 4> [local count: 268435456]:
+    if (a_3(D) > b_5(D))
+      goto <bb 6>; [50.00%]
+    else
+      goto <bb 5>; [50.00%]
+
+    <bb 5> [local count: 134217728]:
+
+    <bb 6> [local count: 1073741824]:
+    # SR.27_4 = PHI <0(2), -1(3), 1(4), 2(5)>
+    _2 = SR.27_4 > 0;  */
+
+static bool
+spaceship_replacement (basic_block cond_bb, basic_block middle_bb,
+		       edge e0, edge e1, gimple *phi,
+		       tree arg0, tree arg1)
+{
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (PHI_RESULT (phi)))
+      || TYPE_UNSIGNED (TREE_TYPE (PHI_RESULT (phi)))
+      || !tree_fits_shwi_p (arg0)
+      || !tree_fits_shwi_p (arg1)
+      || !IN_RANGE (tree_to_shwi (arg0), -1, 2)
+      || !IN_RANGE (tree_to_shwi (arg1), -1, 2))
+    return false;
+
+  use_operand_p use_p;
+  gimple *use_stmt;
+  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (PHI_RESULT (phi)))
+    return false;
+  if (!single_imm_use (PHI_RESULT (phi), &use_p, &use_stmt))
+    return false;
+  enum tree_code cmp;
+  tree lhs, rhs;
+  if (gimple_code (use_stmt) == GIMPLE_COND)
+    {
+      cmp = gimple_cond_code (use_stmt);
+      lhs = gimple_cond_lhs (use_stmt);
+      rhs = gimple_cond_rhs (use_stmt);
+    }
+  else if (is_gimple_assign (use_stmt))
+    {
+      if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
+	{
+	  cmp = gimple_assign_rhs_code (use_stmt);
+	  lhs = gimple_assign_rhs1 (use_stmt);
+	  rhs = gimple_assign_rhs2 (use_stmt);
+	}
+      else if (gimple_assign_rhs_code (use_stmt) == COND_EXPR)
+	{
+	  tree cond = gimple_assign_rhs1 (use_stmt);
+	  if (!COMPARISON_CLASS_P (cond))
+	    return false;
+	  cmp = TREE_CODE (cond);
+	  lhs = TREE_OPERAND (cond, 0);
+	  rhs = TREE_OPERAND (cond, 1);
+	}
+      else
+	return false;
+    }
+  else
+    return false;
+  switch (cmp)
+    {
+    case EQ_EXPR:
+    case NE_EXPR:
+    case LT_EXPR:
+    case GT_EXPR:
+    case LE_EXPR:
+    case GE_EXPR:
+      break;
+    default:
+      return false;
+    }
+  if (lhs != PHI_RESULT (phi)
+      || !tree_fits_shwi_p (rhs)
+      || !IN_RANGE (tree_to_shwi (rhs), -1, 1))
+    return false;
+
+  if (!empty_block_p (middle_bb))
+    return false;
+
+  basic_block phi_bb = gimple_bb (phi);
+  gcc_assert (phi_bb == e0->dest && phi_bb == e1->dest);
+  if (!IN_RANGE (EDGE_COUNT (phi_bb->preds), 3, 4))
+    return false;
+
+  gcond *cond1 = as_a <gcond *> (last_stmt (cond_bb));
+  enum tree_code cmp1 = gimple_cond_code (cond1);
+  if (cmp1 != LT_EXPR && cmp1 != GT_EXPR)
+    return false;
+  tree lhs1 = gimple_cond_lhs (cond1);
+  tree rhs1 = gimple_cond_rhs (cond1);
+  /* The optimization may be unsafe due to NaNs.  */
+  if (HONOR_NANS (TREE_TYPE (lhs1)) || HONOR_SIGNED_ZEROS (TREE_TYPE (lhs1)))
+    return false;
+  if (TREE_CODE (lhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs1))
+    return false;
+  if (TREE_CODE (rhs1) == SSA_NAME && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs1))
+    return false;
+
+  if (!single_pred_p (cond_bb) || !cond_only_block_p (cond_bb))
+    return false;
+
+  basic_block cond2_bb = single_pred (cond_bb);
+  if (EDGE_COUNT (cond2_bb->succs) != 2)
+    return false;
+  edge cond2_phi_edge;
+  if (EDGE_SUCC (cond2_bb, 0)->dest == cond_bb)
+    {
+      if (EDGE_SUCC (cond2_bb, 1)->dest != phi_bb)
+	return false;
+      cond2_phi_edge = EDGE_SUCC (cond2_bb, 1);
+    }
+  else if (EDGE_SUCC (cond2_bb, 0)->dest != phi_bb)
+    return false;
+  else
+    cond2_phi_edge = EDGE_SUCC (cond2_bb, 0);
+  tree arg2 = gimple_phi_arg_def (phi, cond2_phi_edge->dest_idx);
+  if (!tree_fits_shwi_p (arg2))
+    return false;
+  gimple *cond2 = last_stmt (cond2_bb);
+  if (cond2 == NULL || gimple_code (cond2) != GIMPLE_COND)
+    return false;
+  enum tree_code cmp2 = gimple_cond_code (cond2);
+  tree lhs2 = gimple_cond_lhs (cond2);
+  tree rhs2 = gimple_cond_rhs (cond2);
+  if (lhs2 == lhs1)
+    {
+      if (!operand_equal_p (rhs2, rhs1, 0))
+	return false;
+    }
+  else if (lhs2 == rhs1)
+    {
+      if (rhs2 != lhs1)
+	return false;
+    }
+  else
+    return false;
+  tree arg3 = arg2;
+  basic_block cond3_bb = cond2_bb;
+  edge cond3_phi_edge = cond2_phi_edge;
+  gimple *cond3 = cond2;
+  enum tree_code cmp3 = cmp2;
+  tree lhs3 = lhs2;
+  tree rhs3 = rhs2;
+  if (EDGE_COUNT (phi_bb->preds) == 4)
+    {
+      if (absu_hwi (tree_to_shwi (arg2)) != 1)
+	return false;
+      if (e1->flags & EDGE_TRUE_VALUE)
+	{
+	  if (tree_to_shwi (arg0) != 2
+	      || absu_hwi (tree_to_shwi (arg1)) != 1
+	      || wi::to_widest (arg1) == wi::to_widest (arg2))
+	    return false;
+	}
+      else if (tree_to_shwi (arg1) != 2
+	       || absu_hwi (tree_to_shwi (arg0)) != 1
+	       || wi::to_widest (arg0) == wi::to_widest (arg1))
+	return false;
+      if (cmp2 != LT_EXPR && cmp2 != GT_EXPR)
+	return false;
+      /* if (x < y) goto phi_bb; else fallthru;
+	 if (x > y) goto phi_bb; else fallthru;
+	 bbx:;
+	 phi_bb:;
+	 is ok, but if x and y are swapped in one of the comparisons,
+	 or the comparisons are the same and operands not swapped,
+	 or second goto phi_bb is not the true edge, it is not.  */
+      if ((lhs2 == lhs1)
+	  ^ (cmp2 == cmp1)
+	  ^ ((e1->flags & EDGE_TRUE_VALUE) != 0))
+	return false;
+      if ((cond2_phi_edge->flags & EDGE_TRUE_VALUE) == 0)
+	return false;
+      if (!single_pred_p (cond2_bb) || !cond_only_block_p (cond2_bb))
+	return false;
+      cond3_bb = single_pred (cond2_bb);
+      if (EDGE_COUNT (cond2_bb->succs) != 2)
+	return false;
+      if (EDGE_SUCC (cond3_bb, 0)->dest == cond2_bb)
+	{
+	  if (EDGE_SUCC (cond3_bb, 1)->dest != phi_bb)
+	    return false;
+	  cond3_phi_edge = EDGE_SUCC (cond3_bb, 1);
+	}
+      else if (EDGE_SUCC (cond3_bb, 0)->dest != phi_bb)
+	return false;
+      else
+	cond3_phi_edge = EDGE_SUCC (cond3_bb, 0);
+      arg3 = gimple_phi_arg_def (phi, cond3_phi_edge->dest_idx);
+      cond3 = last_stmt (cond3_bb);
+      if (cond3 == NULL || gimple_code (cond3) != GIMPLE_COND)
+	return false;
+      cmp3 = gimple_cond_code (cond3);
+      lhs3 = gimple_cond_lhs (cond3);
+      rhs3 = gimple_cond_rhs (cond3);
+      if (lhs3 == lhs1)
+	{
+	  if (!operand_equal_p (rhs3, rhs1, 0))
+	    return false;
+	}
+      else if (lhs3 == rhs1)
+	{
+	  if (rhs3 != lhs1)
+	    return false;
+	}
+      else
+	return false;
+    }
+  else if (absu_hwi (tree_to_shwi (arg0)) != 1
+	   || absu_hwi (tree_to_shwi (arg1)) != 1
+	   || wi::to_widest (arg0) == wi::to_widest (arg1))
+    return false;
+
+  if (!integer_zerop (arg3) || (cmp3 != EQ_EXPR && cmp3 != NE_EXPR))
+    return false;
+  if ((cond3_phi_edge->flags & (cmp3 == EQ_EXPR
+				? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE)) == 0)
+    return false;
+
+  /* lhs1 one_cmp rhs1 results in PHI_RESULT (phi) of 1.  */
+  enum tree_code one_cmp;
+  if ((cmp1 == LT_EXPR)
+      ^ (!integer_onep ((e1->flags & EDGE_TRUE_VALUE) ? arg1 : arg0)))
+    one_cmp = LT_EXPR;
+  else
+    one_cmp = GT_EXPR;
+
+  enum tree_code res_cmp;
+  switch (cmp)
+    {
+    case EQ_EXPR:
+      if (integer_zerop (rhs))
+	res_cmp = EQ_EXPR;
+      else if (integer_minus_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
+      else if (integer_onep (rhs))
+	res_cmp = one_cmp;
+      else
+	return false;
+      break;
+    case NE_EXPR:
+      if (integer_zerop (rhs))
+	res_cmp = NE_EXPR;
+      else if (integer_minus_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
+      else if (integer_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
+      else
+	return false;
+      break;
+    case LT_EXPR:
+      if (integer_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
+      else if (integer_zerop (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
+      else
+	return false;
+      break;
+    case LE_EXPR:
+      if (integer_zerop (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GE_EXPR : LE_EXPR;
+      else if (integer_minus_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? GT_EXPR : LT_EXPR;
+      else
+	return false;
+      break;
+    case GT_EXPR:
+      if (integer_minus_onep (rhs))
+	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
+      else if (integer_zerop (rhs))
+	res_cmp = one_cmp;
+      else
+	return false;
+      break;
+    case GE_EXPR:
+      if (integer_zerop (rhs))
+	res_cmp = one_cmp == LT_EXPR ? LE_EXPR : GE_EXPR;
+      else if (integer_onep (rhs))
+	res_cmp = one_cmp;
+      else
+	return false;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (gimple_code (use_stmt) == GIMPLE_COND)
+    {
+      gcond *use_cond = as_a <gcond *> (use_stmt);
+      gimple_cond_set_code (use_cond, res_cmp);
+      gimple_cond_set_lhs (use_cond, lhs1);
+      gimple_cond_set_rhs (use_cond, rhs1);
+    }
+  else if (gimple_assign_rhs_class (use_stmt) == GIMPLE_BINARY_RHS)
+    {
+      gimple_assign_set_rhs_code (use_stmt, res_cmp);
+      gimple_assign_set_rhs1 (use_stmt, lhs1);
+      gimple_assign_set_rhs2 (use_stmt, rhs1);
+    }
+  else
+    {
+      tree cond = build2 (res_cmp, TREE_TYPE (gimple_assign_rhs1 (use_stmt)),
+			  lhs1, rhs1);
+      gimple_assign_set_rhs1 (use_stmt, cond);
+    }
+  update_stmt (use_stmt);
+  if (cmp3 == EQ_EXPR)
+    gimple_cond_make_true (as_a <gcond *> (cond3));
+  else
+    gimple_cond_make_false (as_a <gcond *> (cond3));
+  update_stmt (cond3);
+
+  if (MAY_HAVE_DEBUG_BIND_STMTS)
+    {
+      use_operand_p use_p;
+      imm_use_iterator iter;
+      bool has_debug_uses = false;
+      FOR_EACH_IMM_USE_FAST (use_p, iter, PHI_RESULT (phi))
+	{
+	  gimple *use_stmt = USE_STMT (use_p);
+	  gcc_assert (is_gimple_debug (use_stmt));
+	  has_debug_uses = true;
+	  break;
+	}
+
+      if (has_debug_uses)
+	{
+	  gimple_stmt_iterator gsi = gsi_after_labels (gimple_bb (phi));
+	  tree type = TREE_TYPE (PHI_RESULT (phi));
+	  tree temp1 = make_node (DEBUG_EXPR_DECL);
+	  DECL_ARTIFICIAL (temp1) = 1;
+	  TREE_TYPE (temp1) = type;
+	  SET_DECL_MODE (temp1, TYPE_MODE (type));
+	  tree t = build2 (one_cmp, boolean_type_node, lhs1, rhs2);
+	  t = build3 (COND_EXPR, type, t, build_one_cst (type),
+		      build_int_cst (type, -1));
+	  gimple *g = gimple_build_debug_bind (temp1, t, phi);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	  tree temp2 = make_node (DEBUG_EXPR_DECL);
+	  DECL_ARTIFICIAL (temp2) = 1;
+	  TREE_TYPE (temp2) = type;
+	  SET_DECL_MODE (temp2, TYPE_MODE (type));
+	  t = build2 (EQ_EXPR, boolean_type_node, lhs1, rhs2);
+	  t = build3 (COND_EXPR, type, t, build_zero_cst (type), temp1);
+	  g = gimple_build_debug_bind (temp2, t, phi);
+	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
+	  replace_uses_by (PHI_RESULT (phi), temp2);
+	}
+    }
+
+  return true;
+}
 
 /* Convert
 
--- gcc/testsuite/gcc.dg/pr94589-1.c.jj	2021-05-03 17:32:14.940203230 +0200
+++ gcc/testsuite/gcc.dg/pr94589-1.c	2021-05-03 17:54:33.081167518 +0200
@@ -0,0 +1,35 @@ 
+/* PR tree-optimization/94589 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g0 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 14 "optimized" } } */
+
+#define A __attribute__((noipa))
+A int f1 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 0; }
+A int f2 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 0; }
+A int f3 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > 0; }
+A int f4 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 0; }
+A int f5 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 0; }
+A int f6 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= 0; }
+A int f7 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == -1; }
+A int f8 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != -1; }
+A int f9 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c > -1; }
+A int f10 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c <= -1; }
+A int f11 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c == 1; }
+A int f12 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c != 1; }
+A int f13 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c < 1; }
+A int f14 (int i, int j) { int c = i == j ? 0 : i < j ? -1 : 1; return c >= 1; }
+A int f15 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 0; }
+A int f16 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 0; }
+A int f17 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > 0; }
+A int f18 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 0; }
+A int f19 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 0; }
+A int f20 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= 0; }
+A int f21 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == -1; }
+A int f22 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != -1; }
+A int f23 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c > -1; }
+A int f24 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c <= -1; }
+A int f25 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c == 1; }
+A int f26 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c != 1; }
+A int f27 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c < 1; }
+A int f28 (int i) { int c = i == 5 ? 0 : i < 5 ? -1 : 1; return c >= 1; }
--- gcc/testsuite/gcc.dg/pr94589-2.c.jj	2021-05-03 17:32:17.950169407 +0200
+++ gcc/testsuite/gcc.dg/pr94589-2.c	2021-05-03 17:56:12.577049606 +0200
@@ -0,0 +1,35 @@ 
+/* PR tree-optimization/94589 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } } */
+
+#define A __attribute__((noipa))
+A int f1 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 0; }
+A int f2 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 0; }
+A int f3 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > 0; }
+A int f4 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 0; }
+A int f5 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 0; }
+A int f6 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= 0; }
+A int f7 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == -1; }
+A int f8 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != -1; }
+A int f9 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c > -1; }
+A int f10 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c <= -1; }
+A int f11 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c == 1; }
+A int f12 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c != 1; }
+A int f13 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c < 1; }
+A int f14 (double i, double j) { int c; if (i == j) c = 0; else if (i < j) c = -1; else if (i > j) c = 1; else c = 2; return c >= 1; }
+A int f15 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 0; }
+A int f16 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 0; }
+A int f17 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > 0; }
+A int f18 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 0; }
+A int f19 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 0; }
+A int f20 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= 0; }
+A int f21 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == -1; }
+A int f22 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != -1; }
+A int f23 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c > -1; }
+A int f24 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c <= -1; }
+A int f25 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c == 1; }
+A int f26 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c != 1; }
+A int f27 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c < 1; }
+A int f28 (double i) { int c; if (i == 5.0) c = 0; else if (i < 5.0) c = -1; else if (i > 5.0) c = 1; else c = 2; return c >= 1; }
--- gcc/testsuite/gcc.dg/pr94589-3.c.jj	2021-05-03 17:32:19.998146397 +0200
+++ gcc/testsuite/gcc.dg/pr94589-3.c	2021-05-03 18:02:41.628678802 +0200
@@ -0,0 +1,97 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -g" } */
+
+#include "pr94589-1.c"
+
+#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
+#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
+
+int
+main ()
+{
+  C (f1, 7, 8, 0);
+  C (f1, 8, 8, 1);
+  C (f1, 9, 8, 0);
+  C (f2, 7, 8, 1);
+  C (f2, 8, 8, 0);
+  C (f2, 9, 8, 1);
+  C (f3, 7, 8, 0);
+  C (f3, 8, 8, 0);
+  C (f3, 9, 8, 1);
+  C (f4, 7, 8, 1);
+  C (f4, 8, 8, 0);
+  C (f4, 9, 8, 0);
+  C (f5, 7, 8, 0);
+  C (f5, 8, 8, 1);
+  C (f5, 9, 8, 1);
+  C (f6, 7, 8, 1);
+  C (f6, 8, 8, 1);
+  C (f6, 9, 8, 0);
+  C (f7, 7, 8, 1);
+  C (f7, 8, 8, 0);
+  C (f7, 9, 8, 0);
+  C (f8, 7, 8, 0);
+  C (f8, 8, 8, 1);
+  C (f8, 9, 8, 1);
+  C (f9, 7, 8, 0);
+  C (f9, 8, 8, 1);
+  C (f9, 9, 8, 1);
+  C (f10, 7, 8, 1);
+  C (f10, 8, 8, 0);
+  C (f10, 9, 8, 0);
+  C (f11, 7, 8, 0);
+  C (f11, 8, 8, 0);
+  C (f11, 9, 8, 1);
+  C (f12, 7, 8, 1);
+  C (f12, 8, 8, 1);
+  C (f12, 9, 8, 0);
+  C (f13, 7, 8, 1);
+  C (f13, 8, 8, 1);
+  C (f13, 9, 8, 0);
+  C (f14, 7, 8, 0);
+  C (f14, 8, 8, 0);
+  C (f14, 9, 8, 1);
+  D (f15, 4, 0);
+  D (f15, 5, 1);
+  D (f15, 6, 0);
+  D (f16, 4, 1);
+  D (f16, 5, 0);
+  D (f16, 6, 1);
+  D (f17, 4, 0);
+  D (f17, 5, 0);
+  D (f17, 6, 1);
+  D (f18, 4, 1);
+  D (f18, 5, 0);
+  D (f18, 6, 0);
+  D (f19, 4, 0);
+  D (f19, 5, 1);
+  D (f19, 6, 1);
+  D (f20, 4, 1);
+  D (f20, 5, 1);
+  D (f20, 6, 0);
+  D (f21, 4, 1);
+  D (f21, 5, 0);
+  D (f21, 6, 0);
+  D (f22, 4, 0);
+  D (f22, 5, 1);
+  D (f22, 6, 1);
+  D (f23, 4, 0);
+  D (f23, 5, 1);
+  D (f23, 6, 1);
+  D (f24, 4, 1);
+  D (f24, 5, 0);
+  D (f24, 6, 0);
+  D (f25, 4, 0);
+  D (f25, 5, 0);
+  D (f25, 6, 1);
+  D (f26, 4, 1);
+  D (f26, 5, 1);
+  D (f26, 6, 0);
+  D (f27, 4, 1);
+  D (f27, 5, 1);
+  D (f27, 6, 0);
+  D (f28, 4, 0);
+  D (f28, 5, 0);
+  D (f28, 6, 1);
+  return 0;
+}
--- gcc/testsuite/gcc.dg/pr94589-4.c.jj	2021-05-03 17:32:21.915124851 +0200
+++ gcc/testsuite/gcc.dg/pr94589-4.c	2021-05-03 18:12:50.748835798 +0200
@@ -0,0 +1,97 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -g -ffast-math" } */
+
+#include "pr94589-2.c"
+
+#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
+#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
+
+int
+main ()
+{
+  C (f1, 7.0, 8.0, 0);
+  C (f1, 8.0, 8.0, 1);
+  C (f1, 9.0, 8.0, 0);
+  C (f2, 7.0, 8.0, 1);
+  C (f2, 8.0, 8.0, 0);
+  C (f2, 9.0, 8.0, 1);
+  C (f3, 7.0, 8.0, 0);
+  C (f3, 8.0, 8.0, 0);
+  C (f3, 9.0, 8.0, 1);
+  C (f4, 7.0, 8.0, 1);
+  C (f4, 8.0, 8.0, 0);
+  C (f4, 9.0, 8.0, 0);
+  C (f5, 7.0, 8.0, 0);
+  C (f5, 8.0, 8.0, 1);
+  C (f5, 9.0, 8.0, 1);
+  C (f6, 7.0, 8.0, 1);
+  C (f6, 8.0, 8.0, 1);
+  C (f6, 9.0, 8.0, 0);
+  C (f7, 7.0, 8.0, 1);
+  C (f7, 8.0, 8.0, 0);
+  C (f7, 9.0, 8.0, 0);
+  C (f8, 7.0, 8.0, 0);
+  C (f8, 8.0, 8.0, 1);
+  C (f8, 9.0, 8.0, 1);
+  C (f9, 7.0, 8.0, 0);
+  C (f9, 8.0, 8.0, 1);
+  C (f9, 9.0, 8.0, 1);
+  C (f10, 7.0, 8.0, 1);
+  C (f10, 8.0, 8.0, 0);
+  C (f10, 9.0, 8.0, 0);
+  C (f11, 7.0, 8.0, 0);
+  C (f11, 8.0, 8.0, 0);
+  C (f11, 9.0, 8.0, 1);
+  C (f12, 7.0, 8.0, 1);
+  C (f12, 8.0, 8.0, 1);
+  C (f12, 9.0, 8.0, 0);
+  C (f13, 7.0, 8.0, 1);
+  C (f13, 8.0, 8.0, 1);
+  C (f13, 9.0, 8.0, 0);
+  C (f14, 7.0, 8.0, 0);
+  C (f14, 8.0, 8.0, 0);
+  C (f14, 9.0, 8.0, 1);
+  D (f15, 4.0, 0);
+  D (f15, 5.0, 1);
+  D (f15, 6.0, 0);
+  D (f16, 4.0, 1);
+  D (f16, 5.0, 0);
+  D (f16, 6.0, 1);
+  D (f17, 4.0, 0);
+  D (f17, 5.0, 0);
+  D (f17, 6.0, 1);
+  D (f18, 4.0, 1);
+  D (f18, 5.0, 0);
+  D (f18, 6.0, 0);
+  D (f19, 4.0, 0);
+  D (f19, 5.0, 1);
+  D (f19, 6.0, 1);
+  D (f20, 4.0, 1);
+  D (f20, 5.0, 1);
+  D (f20, 6.0, 0);
+  D (f21, 4.0, 1);
+  D (f21, 5.0, 0);
+  D (f21, 6.0, 0);
+  D (f22, 4.0, 0);
+  D (f22, 5.0, 1);
+  D (f22, 6.0, 1);
+  D (f23, 4.0, 0);
+  D (f23, 5.0, 1);
+  D (f23, 6.0, 1);
+  D (f24, 4.0, 1);
+  D (f24, 5.0, 0);
+  D (f24, 6.0, 0);
+  D (f25, 4.0, 0);
+  D (f25, 5.0, 0);
+  D (f25, 6.0, 1);
+  D (f26, 4.0, 1);
+  D (f26, 5.0, 1);
+  D (f26, 6.0, 0);
+  D (f27, 4.0, 1);
+  D (f27, 5.0, 1);
+  D (f27, 6.0, 0);
+  D (f28, 4.0, 0);
+  D (f28, 5.0, 0);
+  D (f28, 6.0, 1);
+  return 0;
+}
--- gcc/testsuite/g++.dg/opt/pr94589-1.C.jj	2021-05-03 15:24:02.002046458 +0200
+++ gcc/testsuite/g++.dg/opt/pr94589-1.C	2021-05-03 15:23:47.445210269 +0200
@@ -0,0 +1,33 @@ 
+// PR tree-optimization/94589
+// { dg-do compile { target c++20 } }
+// { dg-options "-O2 -g0 -fdump-tree-optimized" }
+// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 12 "optimized" } }
+// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[45]" 12 "optimized" } }
+
+#include <compare>
+
+#define A __attribute__((noipa))
+A bool f1 (int i, int j) { auto c = i <=> j; return c == 0; }
+A bool f2 (int i, int j) { auto c = i <=> j; return c != 0; }
+A bool f3 (int i, int j) { auto c = i <=> j; return c > 0; }
+A bool f4 (int i, int j) { auto c = i <=> j; return c < 0; }
+A bool f5 (int i, int j) { auto c = i <=> j; return c >= 0; }
+A bool f6 (int i, int j) { auto c = i <=> j; return c <= 0; }
+A bool f7 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::less; }
+A bool f8 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::less; }
+A bool f9 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::equal; }
+A bool f10 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::equal; }
+A bool f11 (int i, int j) { auto c = i <=> j; return c == std::strong_ordering::greater; }
+A bool f12 (int i, int j) { auto c = i <=> j; return c != std::strong_ordering::greater; }
+A bool f13 (int i) { auto c = i <=> 5; return c == 0; }
+A bool f14 (int i) { auto c = i <=> 5; return c != 0; }
+A bool f15 (int i) { auto c = i <=> 5; return c > 0; }
+A bool f16 (int i) { auto c = i <=> 5; return c < 0; }
+A bool f17 (int i) { auto c = i <=> 5; return c >= 0; }
+A bool f18 (int i) { auto c = i <=> 5; return c <= 0; }
+A bool f19 (int i) { auto c = i <=> 5; return c == std::strong_ordering::less; }
+A bool f20 (int i) { auto c = i <=> 5; return c != std::strong_ordering::less; }
+A bool f21 (int i) { auto c = i <=> 5; return c == std::strong_ordering::equal; }
+A bool f22 (int i) { auto c = i <=> 5; return c != std::strong_ordering::equal; }
+A bool f23 (int i) { auto c = i <=> 5; return c == std::strong_ordering::greater; }
+A bool f24 (int i) { auto c = i <=> 5; return c != std::strong_ordering::greater; }
--- gcc/testsuite/g++.dg/opt/pr94589-2.C.jj	2021-05-03 16:12:12.002540809 +0200
+++ gcc/testsuite/g++.dg/opt/pr94589-2.C	2021-05-03 16:16:53.194380966 +0200
@@ -0,0 +1,33 @@ 
+// PR tree-optimization/94589
+// { dg-do compile { target c++20 } }
+// { dg-options "-O2 -g0 -ffast-math -fdump-tree-optimized" }
+// { dg-final { scan-tree-dump-times "\[ij]_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) \[ij]_\[0-9]+\\(D\\)" 14 "optimized" } }
+// { dg-final { scan-tree-dump-times "i_\[0-9]+\\(D\\) (?:<|<=|==|!=|>|>=) 5\\.0" 14 "optimized" } }
+
+#include <compare>
+
+#define A __attribute__((noipa))
+A bool f1 (double i, double j) { auto c = i <=> j; return c == 0; }
+A bool f2 (double i, double j) { auto c = i <=> j; return c != 0; }
+A bool f3 (double i, double j) { auto c = i <=> j; return c > 0; }
+A bool f4 (double i, double j) { auto c = i <=> j; return c < 0; }
+A bool f5 (double i, double j) { auto c = i <=> j; return c >= 0; }
+A bool f6 (double i, double j) { auto c = i <=> j; return c <= 0; }
+A bool f7 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::less; }
+A bool f8 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::less; }
+A bool f9 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::equivalent; }
+A bool f10 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::equivalent; }
+A bool f11 (double i, double j) { auto c = i <=> j; return c == std::partial_ordering::greater; }
+A bool f12 (double i, double j) { auto c = i <=> j; return c != std::partial_ordering::greater; }
+A bool f13 (double i) { auto c = i <=> 5.0; return c == 0; }
+A bool f14 (double i) { auto c = i <=> 5.0; return c != 0; }
+A bool f15 (double i) { auto c = i <=> 5.0; return c > 0; }
+A bool f16 (double i) { auto c = i <=> 5.0; return c < 0; }
+A bool f17 (double i) { auto c = i <=> 5.0; return c >= 0; }
+A bool f18 (double i) { auto c = i <=> 5.0; return c <= 0; }
+A bool f19 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::less; }
+A bool f20 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::less; }
+A bool f21 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::equivalent; }
+A bool f22 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::equivalent; }
+A bool f23 (double i) { auto c = i <=> 5.0; return c == std::partial_ordering::greater; }
+A bool f24 (double i) { auto c = i <=> 5.0; return c != std::partial_ordering::greater; }
--- gcc/testsuite/g++.dg/opt/pr94589-3.C.jj	2021-05-03 16:22:29.898597327 +0200
+++ gcc/testsuite/g++.dg/opt/pr94589-3.C	2021-05-03 17:21:23.290556509 +0200
@@ -0,0 +1,84 @@ 
+// { dg-do run { target c++20 } }
+// { dg-options "-O2 -g" }
+
+#include "pr94589-1.C"
+
+#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
+#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
+
+int
+main ()
+{
+  C (f1, 7, 8, false);
+  C (f1, 8, 8, true);
+  C (f1, 9, 8, false);
+  C (f2, 7, 8, true);
+  C (f2, 8, 8, false);
+  C (f2, 9, 8, true);
+  C (f3, 7, 8, false);
+  C (f3, 8, 8, false);
+  C (f3, 9, 8, true);
+  C (f4, 7, 8, true);
+  C (f4, 8, 8, false);
+  C (f4, 9, 8, false);
+  C (f5, 7, 8, false);
+  C (f5, 8, 8, true);
+  C (f5, 9, 8, true);
+  C (f6, 7, 8, true);
+  C (f6, 8, 8, true);
+  C (f6, 9, 8, false);
+  C (f7, 7, 8, true);
+  C (f7, 8, 8, false);
+  C (f7, 9, 8, false);
+  C (f8, 7, 8, false);
+  C (f8, 8, 8, true);
+  C (f8, 9, 8, true);
+  C (f9, 7, 8, false);
+  C (f9, 8, 8, true);
+  C (f9, 9, 8, false);
+  C (f10, 7, 8, true);
+  C (f10, 8, 8, false);
+  C (f10, 9, 8, true);
+  C (f11, 7, 8, false);
+  C (f11, 8, 8, false);
+  C (f11, 9, 8, true);
+  C (f12, 7, 8, true);
+  C (f12, 8, 8, true);
+  C (f12, 9, 8, false);
+  D (f13, 4, false);
+  D (f13, 5, true);
+  D (f13, 6, false);
+  D (f14, 4, true);
+  D (f14, 5, false);
+  D (f14, 6, true);
+  D (f15, 4, false);
+  D (f15, 5, false);
+  D (f15, 6, true);
+  D (f16, 4, true);
+  D (f16, 5, false);
+  D (f16, 6, false);
+  D (f17, 4, false);
+  D (f17, 5, true);
+  D (f17, 6, true);
+  D (f18, 4, true);
+  D (f18, 5, true);
+  D (f18, 6, false);
+  D (f19, 4, true);
+  D (f19, 5, false);
+  D (f19, 6, false);
+  D (f20, 4, false);
+  D (f20, 5, true);
+  D (f20, 6, true);
+  D (f21, 4, false);
+  D (f21, 5, true);
+  D (f21, 6, false);
+  D (f22, 4, true);
+  D (f22, 5, false);
+  D (f22, 6, true);
+  D (f23, 4, false);
+  D (f23, 5, false);
+  D (f23, 6, true);
+  D (f24, 4, true);
+  D (f24, 5, true);
+  D (f24, 6, false);
+}
--- gcc/testsuite/g++.dg/opt/pr94589-4.C.jj	2021-05-03 17:21:40.968351902 +0200
+++ gcc/testsuite/g++.dg/opt/pr94589-4.C	2021-05-03 17:23:13.313289481 +0200
@@ -0,0 +1,84 @@ 
+// { dg-do run { target c++20 } }
+// { dg-options "-O2 -g -ffast-math" }
+
+#include "pr94589-2.C"
+
+#define C(fn, i, j, r) if (fn (i, j) != r) __builtin_abort ()
+#define D(fn, i, r) if (fn (i) != r) __builtin_abort ()
+
+int
+main ()
+{
+  C (f1, 7.0, 8.0, false);
+  C (f1, 8.0, 8.0, true);
+  C (f1, 9.0, 8.0, false);
+  C (f2, 7.0, 8.0, true);
+  C (f2, 8.0, 8.0, false);
+  C (f2, 9.0, 8.0, true);
+  C (f3, 7.0, 8.0, false);
+  C (f3, 8.0, 8.0, false);
+  C (f3, 9.0, 8.0, true);
+  C (f4, 7.0, 8.0, true);
+  C (f4, 8.0, 8.0, false);
+  C (f4, 9.0, 8.0, false);
+  C (f5, 7.0, 8.0, false);
+  C (f5, 8.0, 8.0, true);
+  C (f5, 9.0, 8.0, true);
+  C (f6, 7.0, 8.0, true);
+  C (f6, 8.0, 8.0, true);
+  C (f6, 9.0, 8.0, false);
+  C (f7, 7.0, 8.0, true);
+  C (f7, 8.0, 8.0, false);
+  C (f7, 9.0, 8.0, false);
+  C (f8, 7.0, 8.0, false);
+  C (f8, 8.0, 8.0, true);
+  C (f8, 9.0, 8.0, true);
+  C (f9, 7.0, 8.0, false);
+  C (f9, 8.0, 8.0, true);
+  C (f9, 9.0, 8.0, false);
+  C (f10, 7.0, 8.0, true);
+  C (f10, 8.0, 8.0, false);
+  C (f10, 9.0, 8.0, true);
+  C (f11, 7.0, 8.0, false);
+  C (f11, 8.0, 8.0, false);
+  C (f11, 9.0, 8.0, true);
+  C (f12, 7.0, 8.0, true);
+  C (f12, 8.0, 8.0, true);
+  C (f12, 9.0, 8.0, false);
+  D (f13, 4.0, false);
+  D (f13, 5.0, true);
+  D (f13, 6.0, false);
+  D (f14, 4.0, true);
+  D (f14, 5.0, false);
+  D (f14, 6.0, true);
+  D (f15, 4.0, false);
+  D (f15, 5.0, false);
+  D (f15, 6.0, true);
+  D (f16, 4.0, true);
+  D (f16, 5.0, false);
+  D (f16, 6.0, false);
+  D (f17, 4.0, false);
+  D (f17, 5.0, true);
+  D (f17, 6.0, true);
+  D (f18, 4.0, true);
+  D (f18, 5.0, true);
+  D (f18, 6.0, false);
+  D (f19, 4.0, true);
+  D (f19, 5.0, false);
+  D (f19, 6.0, false);
+  D (f20, 4.0, false);
+  D (f20, 5.0, true);
+  D (f20, 6.0, true);
+  D (f21, 4.0, false);
+  D (f21, 5.0, true);
+  D (f21, 6.0, false);
+  D (f22, 4.0, true);
+  D (f22, 5.0, false);
+  D (f22, 6.0, true);
+  D (f23, 4.0, false);
+  D (f23, 5.0, false);
+  D (f23, 6.0, true);
+  D (f24, 4.0, true);
+  D (f24, 5.0, true);
+  D (f24, 6.0, false);
+}