Patchwork [(3/7)] Widening multiply-and-accumulate pattern matching

login
register
mail settings
Submitter Andrew Stubbs
Date June 23, 2011, 2:40 p.m.
Message ID <4E03504B.9060305@codesourcery.com>
Download mbox | patch
Permalink /patch/101635/
State New
Headers show

Comments

Andrew Stubbs - June 23, 2011, 2:40 p.m.
There are many cases where the widening_mult pass does not recognise 
widening multiply-and-accumulate cases simply because there is a type 
conversion step between the multiply and add statements.

This patch should rectify that simply by looking beyond those conversions.

OK?

Andrew
Richard Guenther - June 23, 2011, 4:26 p.m.
On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs <andrew.stubbs@linaro.org> wrote:
> There are many cases where the widening_mult pass does not recognise
> widening multiply-and-accumulate cases simply because there is a type
> conversion step between the multiply and add statements.
>
> This patch should rectify that simply by looking beyond those conversions.

That's surely wrong for (int)(short)int_var.  You have to constrain
the conversions
you look through properly.

Richard.

> OK?
>
> Andrew
>
>
Janis Johnson - June 23, 2011, 9:42 p.m.
On 06/23/2011 07:40 AM, Andrew Stubbs wrote:

+++ b/gcc/testsuite/gcc.target/arm/umlal-1.c
+/* { dg-final { scan-assembler "umlal" } } */

Don't use the name of the instruction as the test name or the scan
will always pass, because the file name shows up in assembly output.

See http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01823.html for a
proposed effective target that can be used in this test.

Janis
Andrew Stubbs - June 24, 2011, 8:05 a.m.
On 23/06/11 17:26, Richard Guenther wrote:
> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs<andrew.stubbs@linaro.org>  wrote:
>> There are many cases where the widening_mult pass does not recognise
>> widening multiply-and-accumulate cases simply because there is a type
>> conversion step between the multiply and add statements.
>>
>> This patch should rectify that simply by looking beyond those conversions.
>
> That's surely wrong for (int)(short)int_var.  You have to constrain
> the conversions
> you look through properly.

To be clear, it only skips past NOP_EXPR. Is it not the case that what 
you're describing would need a CONVERT_EXPR?

Andrew
Richard Guenther - June 24, 2011, 8:28 a.m.
On Fri, Jun 24, 2011 at 10:05 AM, Andrew Stubbs
<andrew.stubbs@linaro.org> wrote:
> On 23/06/11 17:26, Richard Guenther wrote:
>>
>> On Thu, Jun 23, 2011 at 4:40 PM, Andrew Stubbs<andrew.stubbs@linaro.org>
>>  wrote:
>>>
>>> There are many cases where the widening_mult pass does not recognise
>>> widening multiply-and-accumulate cases simply because there is a type
>>> conversion step between the multiply and add statements.
>>>
>>> This patch should rectify that simply by looking beyond those
>>> conversions.
>>
>> That's surely wrong for (int)(short)int_var.  You have to constrain
>> the conversions
>> you look through properly.
>
> To be clear, it only skips past NOP_EXPR. Is it not the case that what
> you're describing would need a CONVERT_EXPR?

NOP_EXPR is the same as CONVERT_EXPR.

Richard.

> Andrew
>
Stubbs, Andrew - June 24, 2011, 1:46 p.m.
On 24/06/11 09:28, Richard Guenther wrote:
>> >  To be clear, it only skips past NOP_EXPR. Is it not the case that what
>> >  you're describing would need a CONVERT_EXPR?
> NOP_EXPR is the same as CONVERT_EXPR.

Are you sure?

I thought this was safe because the internals manual says:

   NOP_EXPR
   These nodes are used to represent conversions that do not require any
   code-generation ....

   CONVERT_EXPR
   These nodes are similar to NOP_EXPRs, but are used in those
   situations where code may need to be generated ....

So, I tried this example:

int
foo (int a, short b, short c)
{
   int bc = b * c;
   return a + (short)bc;
}

Both before and after my patch, GCC gives:

         mul     r2, r1, r2
         sxtah   r0, r0, r2

(where, SXTAH means sign-extend the third operand from HImode to SImode 
and add to the second operand.)

The dump after the widening_mult pass is:

foo (int a, short int b, short int c)
{
   int bc;
   int D.2018;
   short int D.2017;
   int D.2016;
   int D.2015;
   int D.2014;

<bb 2>:
   D.2014_2 = (int) b_1(D);
   D.2015_4 = (int) c_3(D);
   bc_5 = b_1(D) w* c_3(D);
   D.2017_6 = (short int) bc_5;
   D.2018_7 = (int) D.2017_6;
   D.2016_9 = D.2018_7 + a_8(D);
   return D.2016_9;

}

Where you can clearly see that the addition has not been recognised as a 
multiply-and-accumulate.

When I step through convert_plusminus_to_widen, I can see that the 
reason it has not matched is because "D.2017_6 = (short int) bc_5" is 
encoded with a CONVERT_EXPR, just as the manual said it would be.

So, according to the manual, and my (admittedly limited) experiments, 
skipping over NOP_EXPR does appear to be safe.

But you say that it isn't safe. So now I'm confused. :(

I can certainly add checks to make sure that the skipped operations 
actually don't make any important changes to the value, but do I need to?

Andrew
Richard Guenther - June 24, 2011, 3:47 p.m.
On Fri, Jun 24, 2011 at 3:46 PM, Stubbs, Andrew
<Andrew_Stubbs@mentor.com> wrote:
> On 24/06/11 09:28, Richard Guenther wrote:
>>> >  To be clear, it only skips past NOP_EXPR. Is it not the case that what
>>> >  you're describing would need a CONVERT_EXPR?
>> NOP_EXPR is the same as CONVERT_EXPR.
>
> Are you sure?

Yes, definitely.  They are synonyms of each other (an unfinished merging
process), the usual check for them is via CONVERT_EXPR_P.

> I thought this was safe because the internals manual says:
>
>   NOP_EXPR
>   These nodes are used to represent conversions that do not require any
>   code-generation ....
>
>   CONVERT_EXPR
>   These nodes are similar to NOP_EXPRs, but are used in those
>   situations where code may need to be generated ....

Which is wrong (sorry).

> So, I tried this example:
>
> int
> foo (int a, short b, short c)
> {
>   int bc = b * c;
>   return a + (short)bc;
> }
>
> Both before and after my patch, GCC gives:
>
>         mul     r2, r1, r2
>         sxtah   r0, r0, r2
>
> (where, SXTAH means sign-extend the third operand from HImode to SImode
> and add to the second operand.)
>
> The dump after the widening_mult pass is:
>
> foo (int a, short int b, short int c)
> {
>   int bc;
>   int D.2018;
>   short int D.2017;
>   int D.2016;
>   int D.2015;
>   int D.2014;
>
> <bb 2>:
>   D.2014_2 = (int) b_1(D);
>   D.2015_4 = (int) c_3(D);
>   bc_5 = b_1(D) w* c_3(D);
>   D.2017_6 = (short int) bc_5;
>   D.2018_7 = (int) D.2017_6;
>   D.2016_9 = D.2018_7 + a_8(D);
>   return D.2016_9;
>
> }
>
> Where you can clearly see that the addition has not been recognised as a
> multiply-and-accumulate.
>
> When I step through convert_plusminus_to_widen, I can see that the
> reason it has not matched is because "D.2017_6 = (short int) bc_5" is
> encoded with a CONVERT_EXPR, just as the manual said it would be.

A NOP_EXPR in this place would be valid as well.  The merging hasn't
been completed and at least the C frontend still generates CONVERT_EXPRs
in some cases.

> So, according to the manual, and my (admittedly limited) experiments,
> skipping over NOP_EXPR does appear to be safe.
>
> But you say that it isn't safe. So now I'm confused. :(
>
> I can certainly add checks to make sure that the skipped operations
> actually don't make any important changes to the value, but do I need to?

Yes.

Thanks,
Richard.

> Andrew
>
Stubbs, Andrew - June 24, 2011, 4:58 p.m.
On 24/06/11 16:47, Richard Guenther wrote:
>> >  I can certainly add checks to make sure that the skipped operations
>> >  actually don't make any important changes to the value, but do I need to?
> Yes.

Ok, I'll go away and do that then.

BTW, I see useless_type_conversion_p, but that's not quite what I want. 
Is there an equivalent existing function to determine whether a 
conversion changes the logical/arithmetic meaning of a type?

I mean, conversion to a wider mode is not "useless", but it is harmless, 
whereas conversion to a narrower mode may truncate the value.

Andrew
Richard Guenther - June 25, 2011, 9:34 a.m.
On Fri, Jun 24, 2011 at 6:58 PM, Stubbs, Andrew
<Andrew_Stubbs@mentor.com> wrote:
> On 24/06/11 16:47, Richard Guenther wrote:
>>> >  I can certainly add checks to make sure that the skipped operations
>>> >  actually don't make any important changes to the value, but do I need to?
>> Yes.
>
> Ok, I'll go away and do that then.
>
> BTW, I see useless_type_conversion_p, but that's not quite what I want.
> Is there an equivalent existing function to determine whether a
> conversion changes the logical/arithmetic meaning of a type?
>
> I mean, conversion to a wider mode is not "useless", but it is harmless,
> whereas conversion to a narrower mode may truncate the value.

Well, you have to decide that for the concrete situation based on
the signedness and precision of the types involved.  All such
conversions change the logical/arithmetic meaning of a type if
seen in the right context.

Richard.

> Andrew
>

Patch

2011-06-23  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* tree-ssa-math-opts.c (convert_plusminus_to_widen): Look for
	multiply statement beyond NOP_EXPR statements.

	gcc/testsuite/
	* gcc.target/arm/umlal-1.c: New file.

--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/umlal-1.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=armv7-a" } */
+
+long long
+foo (long long a, char *b, char *c)
+{
+  return a + *b * *c;
+}
+
+/* { dg-final { scan-assembler "umlal" } } */
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2114,26 +2114,39 @@  convert_plusminus_to_widen (gimple_stmt_iterator *gsi, gimple stmt,
   else
     wmult_code = WIDEN_MULT_PLUS_EXPR;
 
-  rhs1 = gimple_assign_rhs1 (stmt);
-  rhs2 = gimple_assign_rhs2 (stmt);
-
-  if (TREE_CODE (rhs1) == SSA_NAME)
+  rhs1_stmt = stmt;
+  do
     {
-      rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
-      if (is_gimple_assign (rhs1_stmt))
-	rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+      rhs1_code = ERROR_MARK;
+      rhs1 = gimple_assign_rhs1 (rhs1_stmt);
+
+      if (TREE_CODE (rhs1) == SSA_NAME)
+	{
+	  rhs1_stmt = SSA_NAME_DEF_STMT (rhs1);
+	  if (is_gimple_assign (rhs1_stmt))
+	    rhs1_code = gimple_assign_rhs_code (rhs1_stmt);
+	}
+      else
+	return false;
     }
-  else
-    return false;
+  while (rhs1_code == NOP_EXPR);
 
-  if (TREE_CODE (rhs2) == SSA_NAME)
+  rhs2_stmt = stmt;
+  do
     {
-      rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
-      if (is_gimple_assign (rhs2_stmt))
-	rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+      rhs2_code = ERROR_MARK;
+      rhs2 = gimple_assign_rhs2 (rhs2_stmt);
+
+      if (rhs2 && TREE_CODE (rhs2) == SSA_NAME)
+	{
+	  rhs2_stmt = SSA_NAME_DEF_STMT (rhs2);
+	  if (is_gimple_assign (rhs2_stmt))
+	    rhs2_code = gimple_assign_rhs_code (rhs2_stmt);
+	}
+      else
+	return false;
     }
-  else
-    return false;
+  while (rhs2_code == NOP_EXPR);
 
   if (code == PLUS_EXPR && rhs1_code == MULT_EXPR)
     {