diff mbox series

[v2,rs6000] Add a combine pattern for CA minus one [PR95737]

Message ID c94c55ff-38dd-d15b-05f6-6bba18aff5b0@linux.ibm.com
State New
Headers show
Series [v2,rs6000] Add a combine pattern for CA minus one [PR95737] | expand

Commit Message

HAO CHEN GUI Jan. 20, 2022, 7:35 a.m. UTC
Hi,
   This patch adds a combine pattern for "CA minus one". As CA only has two
values (0 or 1), we could convert following pattern
      (sign_extend:DI (plus:SI (reg:SI 98 ca)
                (const_int -1 [0xffffffffffffffff]))))
to
       (plus:DI (reg:DI 98 ca)
            (const_int -1 [0xffffffffffffffff])))
   With this patch, one unnecessary sign extend is eliminated.

   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-01-20 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.md (extenddi_ca_minus_one): Define.

gcc/testsuite/
	* gcc.target/powerpc/pr95737.c: New.


patch.diff

Comments

David Edelsohn Jan. 20, 2022, 6:46 p.m. UTC | #1
On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>
> Hi,
>    This patch adds a combine pattern for "CA minus one". As CA only has two
> values (0 or 1), we could convert following pattern
>       (sign_extend:DI (plus:SI (reg:SI 98 ca)
>                 (const_int -1 [0xffffffffffffffff]))))
> to
>        (plus:DI (reg:DI 98 ca)
>             (const_int -1 [0xffffffffffffffff])))
>    With this patch, one unnecessary sign extend is eliminated.
>
>    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
>
> ChangeLog
> 2022-01-20 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
>         * config/rs6000/rs6000.md (extenddi_ca_minus_one): Define.
>
> gcc/testsuite/
>         * gcc.target/powerpc/pr95737.c: New.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6ecb0bd6142..1d8b212962f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -2358,6 +2358,19 @@ (define_insn "subf<mode>3_carry_in_xx"
>    "subfe %0,%0,%0"
>    [(set_attr "type" "add")])
>
> +(define_insn_and_split "*extenddi_ca_minus_one"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
> +       (sign_extend:DI (plus:SI (reg:SI CA_REGNO)
> +                                (const_int -1))))]
> +  ""
> +  "#"
> +  ""
> +  [(parallel [(set (match_dup 0)
> +                  (plus:DI (reg:DI CA_REGNO)
> +                           (const_int -1)))
> +             (clobber (reg:DI CA_REGNO))])]
> +  ""
> +)
>
>  (define_insn "@neg<mode>2"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> new file mode 100644
> index 00000000000..94320f23423
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> @@ -0,0 +1,10 @@
> +/* PR target/95737 */
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */

Why does the testcase force power8? This testcase is not specific to
Power8 or later.

> +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
> +
> +
> +unsigned long long negativeLessThan (unsigned long long a, unsigned long long b)
> +{
> +   return -(a < b);
> +}

If you're only testing for lp64, the testcase could use "long" instead
of "long long".

This is okay with those changes.

Thanks, David
Segher Boessenkool Jan. 20, 2022, 9:42 p.m. UTC | #2
Hi!

On Thu, Jan 20, 2022 at 01:46:48PM -0500, David Edelsohn wrote:
> On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
> >    This patch adds a combine pattern for "CA minus one". As CA only has two
> > values (0 or 1), we could convert following pattern
> >       (sign_extend:DI (plus:SI (reg:SI 98 ca)
> >                 (const_int -1 [0xffffffffffffffff]))))
> > to
> >        (plus:DI (reg:DI 98 ca)
> >             (const_int -1 [0xffffffffffffffff])))
> >    With this patch, one unnecessary sign extend is eliminated.
> >
> >    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> > Is this okay for trunk? Any recommendations? Thanks a lot.

There are ten gazillion similar things we could make extra backend
patterns for, and we still would not cover a majority of cases.

If instead we got some generic way to handle this we could cover many
more cases, for much less effort.

We need both widening modes from SI to DI, amd narrowing modes from DI
to SI.  Both are useful in certain cases; it is not like using wider
modes is always better, in some cases narrower modes is better (in cases
where we can let the generated code then generate whatever bits in the
high half of the word, for example; a typical example is addition in an
unsigned int).

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
> > @@ -0,0 +1,10 @@
> > +/* PR target/95737 */
> > +/* { dg-do compile { target lp64 } } */
> > +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
> 
> Why does the testcase force power8? This testcase is not specific to
> Power8 or later.

Yes, and we should generate the same code on older machines.

> > +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
> > +
> > +
> > +unsigned long long negativeLessThan (unsigned long long a, unsigned long long b)
> > +{
> > +   return -(a < b);
> > +}
> 
> If you're only testing for lp64, the testcase could use "long" instead
> of "long long".

The testcase really needs "powerpc64", if that would mean "test if
-mpowerpc64 is (implicitly) used".  But that is not what it currently
means (it is something akin to "powerpc64_hw", instead).

So we test lp64, which is set if and only if -m64 was used.  It is
reasonable coverage, no one cares much for -m32 -mpowerpc64 .


Segher
HAO CHEN GUI Jan. 21, 2022, 6:31 a.m. UTC | #3
Thanks so much for your advice. Please see my comments.

On 21/1/2022 上午 5:42, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jan 20, 2022 at 01:46:48PM -0500, David Edelsohn wrote:
>> On Thu, Jan 20, 2022 at 2:36 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>>>    This patch adds a combine pattern for "CA minus one". As CA only has two
>>> values (0 or 1), we could convert following pattern
>>>       (sign_extend:DI (plus:SI (reg:SI 98 ca)
>>>                 (const_int -1 [0xffffffffffffffff]))))
>>> to
>>>        (plus:DI (reg:DI 98 ca)
>>>             (const_int -1 [0xffffffffffffffff])))
>>>    With this patch, one unnecessary sign extend is eliminated.
>>>
>>>    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>>> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> There are ten gazillion similar things we could make extra backend
> patterns for, and we still would not cover a majority of cases.
> 
> If instead we got some generic way to handle this we could cover many
> more cases, for much less effort.
Could we add an additional pass to exam the finally generated instructions
and its used registers to decide which extension is unnecessary?
> 
> We need both widening modes from SI to DI, amd narrowing modes from DI
> to SI.  Both are useful in certain cases; it is not like using wider
> modes is always better, in some cases narrower modes is better (in cases
> where we can let the generated code then generate whatever bits in the
> high half of the word, for example; a typical example is addition in an
> unsigned int).
Just for this case, converting CA from DI to SI is supported in simplify_rtx.
The original comparison result is in DI mode. But it's truncated to SI mode as
C standard requires.

Trying 8 -> 11:
    8: {r127:DI=ca:DI-0x1;clobber ca:DI;}
      REG_DEAD ca:DI
      REG_UNUSED ca:DI
   11: r128:SI=r127:DI#0
      REG_DEAD r127:DI
Successfully matched this instruction:
(set (reg:SI 128)
    (plus:SI (reg:SI 98 ca)
        (const_int -1 [0xffffffffffffffff])))
allowing combination of insns 8 and 11
original costs 4 + 4 = 8
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3    11: {r128:SI=ca:SI-0x1;clobber ca:SI;}
      REG_UNUSED ca:SI
deferring rescan insn with uid = 11.

The C standard type promotion requirement and 64-bit return value are the
root cause of such problem, I think.
> 
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
>>> @@ -0,0 +1,10 @@
>>> +/* PR target/95737 */
>>> +/* { dg-do compile { target lp64 } } */
>>> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
>>
>> Why does the testcase force power8? This testcase is not specific to
>> Power8 or later.
> 
> Yes, and we should generate the same code on older machines.
> 
>>> +/* { dg-final { scan-assembler-not {\mextsw\M} } } */
>>> +
>>> +
>>> +unsigned long long negativeLessThan (unsigned long long a, unsigned long long b)
>>> +{
>>> +   return -(a < b);
>>> +}
>>
>> If you're only testing for lp64, the testcase could use "long" instead
>> of "long long".
> 
> The testcase really needs "powerpc64", if that would mean "test if
> -mpowerpc64 is (implicitly) used".  But that is not what it currently
> means (it is something akin to "powerpc64_hw", instead).
> 
> So we test lp64, which is set if and only if -m64 was used.  It is
> reasonable coverage, no one cares much for -m32 -mpowerpc64 .
> 
> 
> Segher
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6ecb0bd6142..1d8b212962f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2358,6 +2358,19 @@  (define_insn "subf<mode>3_carry_in_xx"
   "subfe %0,%0,%0"
   [(set_attr "type" "add")])

+(define_insn_and_split "*extenddi_ca_minus_one"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
+	(sign_extend:DI (plus:SI (reg:SI CA_REGNO)
+				 (const_int -1))))]
+  ""
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+		   (plus:DI (reg:DI CA_REGNO)
+			    (const_int -1)))
+	      (clobber (reg:DI CA_REGNO))])]
+  ""
+)

 (define_insn "@neg<mode>2"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr95737.c b/gcc/testsuite/gcc.target/powerpc/pr95737.c
new file mode 100644
index 00000000000..94320f23423
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr95737.c
@@ -0,0 +1,10 @@ 
+/* PR target/95737 */
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
+/* { dg-final { scan-assembler-not {\mextsw\M} } } */
+
+
+unsigned long long negativeLessThan (unsigned long long a, unsigned long long b)
+{
+   return -(a < b);
+}