Patchwork RFA: Fix tree-optimization/55524

Submitter Joern Rennecke
Date April 9, 2013, 4:24 p.m.
Message ID <20130409122433.lbfstq22o0g4g4ko-nzlynne@webmail.spamcop.net>
Permalink /patch/235140/
State New

Comments

Joern Rennecke - April 9, 2013, 4:24 p.m.
Quoting Richard Biener <richard.guenther@gmail.com>:

> I don't see that.  It's merely a complication of optimal handling of
> a * b +- c * d vs. just a * b +- c.  The pass does simple pattern matching
> only, not doing a global optimal transform, so adding another special-case
> is reasonable.  Special-casing just for single-use 2nd multiplication
> simplifies the cases for example.

I have attached a version of the patch that uses this simpler test.
Currently bootstrapping / regtesting on i686-pc-linux-gnu.
gcc:
2013-04-09  Joern Rennecke <joern.rennecke@embecosm.com>

	PR tree-optimization/55524
	* tree-ssa-math-opts.c
	(convert_mult_to_fma): Don't use an fms construct
	when we don't have an fms operation, but fnma, and it looks
	likely that we'll be able to use the latter.

gcc/testsuite:
2013-04-09  Joern Rennecke <joern.rennecke@embecosm.com>

	PR tree-optimization/55524
	* gcc.target/epiphany/fnma-1.c: New test.
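
For readers following the thread, here is a minimal sketch in plain C of the three shapes that `a * b - c * d` can take after the pass. It is not GCC code: all function names are hypothetical, and the fused operations are modelled with ordinary arithmetic, so the single rounding of a real fused instruction is not reproduced.

```c
/* Models of the operation shapes under discussion.  Names are
   hypothetical; plain arithmetic stands in for the fused instructions,
   so single rounding is NOT modelled.  */
static float fma_op(float x, float y, float z)  { return x * y + z; }
static float fms_op(float x, float y, float z)  { return x * y - z; }
static float fnma_op(float x, float y, float z) { return -(x * y) + z; }

/* Fuse the first multiplication: requires an fms operation.  */
static float fuse_first(float a, float b, float c, float d)
{
    float t = c * d;            /* separate multiply */
    return fms_op(a, b, t);     /* a * b - t */
}

/* Fuse the second multiplication: requires an fnma operation.  */
static float fuse_second(float a, float b, float c, float d)
{
    float t = a * b;            /* separate multiply */
    return fnma_op(c, d, t);    /* -(c * d) + t */
}

/* Fuse the first multiplication on a target without fms:
   an extra negate is needed before a plain fma.  */
static float negate_then_fma(float a, float b, float c, float d)
{
    float t = c * d;            /* separate multiply */
    float n = -t;               /* extra negate */
    return fma_op(a, b, n);     /* a * b + n */
}
```

On a target that has fnma but no fms, `fuse_second` needs two operations where `negate_then_fma` needs three, which appears to be the case the patch is meant to favor.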
Richard Guenther - April 10, 2013, 8:23 a.m.
On Tue, Apr 9, 2013 at 6:24 PM, Joern Rennecke
<joern.rennecke@embecosm.com> wrote:
> Quoting Richard Biener <richard.guenther@gmail.com>:
>
>> I don't see that.  It's merely a complication of optimal handling of
>> a * b +- c * d vs. just a * b +- c.  The pass does simple pattern matching
>> only, not doing a global optimal transform, so adding another special-case
>> is reasonable.  Special-casing just for single-use 2nd multiplication
>> simplifies the cases for example.
>
>
> I have attached a version of the patch that uses this simpler test.
> Currently bootstrapping / regtesting on i686-pc-linux-gnu.

Ok if the testing succeeds.

Thanks,
Richard.

>
> gcc:
> 2013-04-09  Joern Rennecke <joern.rennecke@embecosm.com>
>
>         PR tree-optimization/55524
>         * tree-ssa-math-opts.c
>         (convert_mult_to_fma): Don't use an fms construct
>         when we don't have an fms operation, but fnma, and it looks
>         likely that we'll be able to use the latter.
>
> gcc/testsuite:
> 2013-04-09  Joern Rennecke <joern.rennecke@embecosm.com>
>
>         PR tree-optimization/55524
>         * gcc.target/epiphany/fnma-1.c: New test.
>
> Index: testsuite/gcc.target/epiphany/fnma-1.c
> ===================================================================
> --- testsuite/gcc.target/epiphany/fnma-1.c      (revision 0)
> +++ testsuite/gcc.target/epiphany/fnma-1.c      (working copy)
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times "fmsub\[ \ta-zA-Z0-9\]*," 1 } } */
> +
> +float
> +f (float ar, float ai, float br, float bi)
> +{
> +  return ar * br - ai * bi;
> +}
> Index: tree-ssa-math-opts.c
> ===================================================================
> --- tree-ssa-math-opts.c        (revision 197578)
> +++ tree-ssa-math-opts.c        (working copy)
> @@ -2570,6 +2570,24 @@ convert_mult_to_fma (gimple mul_stmt, tr
>           return false;
>         }
>
> +      /* If the subtrahend (gimple_assign_rhs2 (use_stmt)) is computed
> +        by a MULT_EXPR that we'll visit later, we might be able to
> +        get a more profitable match with fnma.
> +        OTOH, if we don't, a negate / fma pair has likely lower latency
> +        than a mult / subtract pair.  */
> +      if (use_code == MINUS_EXPR && !negate_p
> +         && gimple_assign_rhs1 (use_stmt) == result
> +         && optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing
> +         && optab_handler (fnma_optab, TYPE_MODE (type)) != CODE_FOR_nothing)
> +       {
> +         tree rhs2 = gimple_assign_rhs2 (use_stmt);
> +         gimple stmt2 = SSA_NAME_DEF_STMT (rhs2);
> +
> +         if (has_single_use (rhs2)
> +             && gimple_assign_rhs_code (stmt2) == MULT_EXPR)
> +           return false;
> +       }
> +
>        /* We can't handle a * b + a * b.  */
>        if (gimple_assign_rhs1 (use_stmt) == gimple_assign_rhs2 (use_stmt))
>         return false;
>

Patch

Index: testsuite/gcc.target/epiphany/fnma-1.c
===================================================================
--- testsuite/gcc.target/epiphany/fnma-1.c	(revision 0)
+++ testsuite/gcc.target/epiphany/fnma-1.c	(working copy)
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "fmsub\[ \ta-zA-Z0-9\]*," 1 } } */
+
+float
+f (float ar, float ai, float br, float bi)
+{
+  return ar * br - ai * bi;
+}
Index: tree-ssa-math-opts.c
===================================================================
--- tree-ssa-math-opts.c	(revision 197578)
+++ tree-ssa-math-opts.c	(working copy)
@@ -2570,6 +2570,24 @@  convert_mult_to_fma (gimple mul_stmt, tr
 	  return false;
 	}
 
+      /* If the subtrahend (gimple_assign_rhs2 (use_stmt)) is computed
+	 by a MULT_EXPR that we'll visit later, we might be able to
+	 get a more profitable match with fnma.
+	 OTOH, if we don't, a negate / fma pair has likely lower latency
+	 than a mult / subtract pair.  */
+      if (use_code == MINUS_EXPR && !negate_p
+	  && gimple_assign_rhs1 (use_stmt) == result
+	  && optab_handler (fms_optab, TYPE_MODE (type)) == CODE_FOR_nothing
+	  && optab_handler (fnma_optab, TYPE_MODE (type)) != CODE_FOR_nothing)
+	{
+	  tree rhs2 = gimple_assign_rhs2 (use_stmt);
+	  gimple stmt2 = SSA_NAME_DEF_STMT (rhs2);
+
+	  if (has_single_use (rhs2)
+	      && gimple_assign_rhs_code (stmt2) == MULT_EXPR)
+	    return false;
+	}
+
       /* We can't handle a * b + a * b.  */
       if (gimple_assign_rhs1 (use_stmt) == gimple_assign_rhs2 (use_stmt))
 	return false;
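
The condition added by this hunk can be read as a single predicate. The following is a standalone sketch of that reading, assuming each GIMPLE query is reduced to a boolean; the struct, its field names, and `defer_to_fnma` are hypothetical and do not exist in GCC.

```c
#include <stdbool.h>

/* Hypothetical flattening of the facts the new check consults.  */
struct fma_candidate {
    bool use_is_minus;          /* use_code == MINUS_EXPR */
    bool negate_p;              /* the product is negated before the use */
    bool mul_is_minuend;        /* gimple_assign_rhs1 (use_stmt) == result */
    bool target_has_fms;        /* fms_optab handler is available */
    bool target_has_fnma;       /* fnma_optab handler is available */
    bool subtrahend_single_use; /* has_single_use (rhs2) */
    bool subtrahend_is_mult;    /* rhs2 is defined by a MULT_EXPR */
};

/* Returns true when convert_mult_to_fma would give up on this
   multiplication (return false in the patch) so that the second
   multiplication can later be matched as fnma.  */
static bool defer_to_fnma(const struct fma_candidate *c)
{
    if (c->use_is_minus && !c->negate_p && c->mul_is_minuend
        && !c->target_has_fms && c->target_has_fnma)
        return c->subtrahend_single_use && c->subtrahend_is_mult;
    return false;
}
```

For the new test, `ar * br - ai * bi` on a target with fnma but without fms sets every field except `negate_p` and `target_has_fms`, so the first multiplication is left alone and the second one becomes the fusion candidate.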