diff mbox series

[v2,rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736]

Message ID 41da7001-549d-c7ae-fa6b-534a8faf673e@linux.ibm.com
State New
Headers show
Series [v2,rs6000] Fix ICE on expand bcd<bcd_add_sub>_<code>_<mode> [PR100736] | expand

Commit Message

HAO CHEN GUI May 26, 2022, 7:35 a.m. UTC
Hi,
  This patch fixes the ICE reported in PR100736. It removes the condition
check of finite math only flag not setting in "*<code><mode>_cc" pattern.
With or without this flag, we still can use "cror" to check if either
two bits of CC is set or not for "fp_two" codes. We don't need a reverse
comparison (implemented by crnot) here when the finite math flag is set,
as the latency of "cror" and "crnor" are the same.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
	finite math only flag not setting.

gcc/testsuite/
	* gcc.target/powerpc/pr100736.c: New.


patch.diff

Comments

HAO CHEN GUI May 30, 2022, 1:26 a.m. UTC | #1
Hi,
   Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595661.html
Thanks.

On 26/5/2022 下午 3:35, HAO CHEN GUI wrote:
> Hi,
>   This patch fixes the ICE reported in PR100736. It removes the condition
> check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> With or without this flag, we still can use "cror" to check if either
> two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> comparison (implemented by crnot) here when the finite math flag is set,
> as the latency of "cror" and "crnor" are the same.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
> 	finite math only flag not setting.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/pr100736.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index fdfbc6566a5..a6f9cbc9b8b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
>  		  (const_int 0)))]
> -  "!flag_finite_math_only"
> +  ""
>    "#"
> -  "&& 1"
> +  ""
>    [(pc)]
>  {
>    rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..32cb6df6cd9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +
> +typedef __attribute__ ((altivec (vector__))) unsigned char v;
> +
> +int foo (v a, v b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {\mcror\M} } } */
>
Kewen.Lin May 30, 2022, 10:12 a.m. UTC | #2
Hi Haochen,

on 2022/5/26 15:35, HAO CHEN GUI wrote:
> Hi,
>   This patch fixes the ICE reported in PR100736. It removes the condition
> check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> With or without this flag, we still can use "cror" to check if either
> two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> comparison (implemented by crnot) here when the finite math flag is set,
> as the latency of "cror" and "crnor" are the same.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-05-26 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	* config/rs6000/rs6000.md (*<code><mode>_cc): Remove condition of
> 	finite math only flag not setting.
> 
> gcc/testsuite/
> 	* gcc.target/powerpc/pr100736.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index fdfbc6566a5..a6f9cbc9b8b 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
>    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
>  		  (const_int 0)))]
> -  "!flag_finite_math_only"
> +  ""
>    "#"
> -  "&& 1"
> +  ""

Segher added this hunk, not sure if he prefer to keep the condition unchanged
and update the expansion side, looking forward to his comments.  :)

>    [(pc)]
>  {
>    rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> new file mode 100644
> index 00000000000..32cb6df6cd9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
> +
> +typedef __attribute__ ((altivec (vector__))) unsigned char v;
> +
> +int foo (v a, v b)
> +{
> +  return __builtin_vec_bcdsub_ge (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {\mcror\M} } } */
> 

The case of PR100736 fails with ICE as reported, maybe we can remove this dg-final check,
since as you noted in the description above either "cror" or "crnor" are acceptable,
this extra check could probably make this case fragile.

BR,
Kewen
Segher Boessenkool May 31, 2022, 11:56 p.m. UTC | #3
Hi!

On Mon, May 30, 2022 at 06:12:26PM +0800, Kewen.Lin wrote:
> on 2022/5/26 15:35, HAO CHEN GUI wrote:
> >   This patch fixes the ICE reported in PR100736. It removes the condition
> > check of finite math only flag not setting in "*<code><mode>_cc" pattern.
> > With or without this flag, we still can use "cror" to check if either
> > two bits of CC is set or not for "fp_two" codes. We don't need a reverse
> > comparison (implemented by crnot) here when the finite math flag is set,
> > as the latency of "cror" and "crnor" are the same.

> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -12995,9 +12995,9 @@ (define_insn_and_split "*<code><mode>_cc"
> >    [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> >  	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
> >  		  (const_int 0)))]
> > -  "!flag_finite_math_only"
> > +  ""
> >    "#"
> > -  "&& 1"
> > +  ""
> 
> Segher added this hunk, not sure if he prefer to keep the condition unchanged
> and update the expansion side, looking forward to his comments.  :)

It's not clear to me how this can ever happen without finite_math_only?
The patch is safe, sure, but it may the real problem is elsewhere.

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */

The usual flag to use would be -ffast-math :-)

> > +/* { dg-final { scan-assembler {\mcror\M} } } */
> 
> The case of PR100736 fails with ICE as reported, maybe we can remove this dg-final check,
> since as you noted in the description above either "cror" or "crnor" are acceptable,
> this extra check could probably make this case fragile.

Check for \mcrn?or\M then?  But, is crnor something we want here ever?

The reason we do not have cror for finte-math-only is that comparisons
can only (validly :-) ) return LT, GT, or EQ then, and we can branch on
that without twiddling CRF bits first.  Is this not true for BCD
compares, is that what the problem is?  Or, is our builtin expansion
returning something invalid?  Or something else :-)


Segher
Segher Boessenkool June 1, 2022, 10:05 p.m. UTC | #4
Hi!

On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
> It's not clear to me how this can ever happen without finite_math_only?
> The patch is safe, sure, but it may the real problem is elsewhere.

So, it is incorrect the RTL for our bcd{add,sub} insns uses CCFP at all.

CCFP stands for the result of a 4-way comparison, regular float
comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
is set if an overflow happened -- but still exactly one of bits 0..2 is
set then!  (If one of the inputs is an invalid number it sets bits 0..3
to 0001 though.)

So it would be much more correct and sensible to use regular integer
comparison results here, so, CC.

Does that fix the problem?


Segher
HAO CHEN GUI June 2, 2022, 5:30 a.m. UTC | #5
Segher,
  Does BCD comparison return false when either operand is invalid coding?
If yes, the result could be 3-way. We can check gt and eq bits for ge.
We still can't use crnot to only check lt bit as there could be invalid
coding.
  Also, do you think finite-math-only excludes invalid coding? Seems GCC
doesn't clear define it.

Thanks.
Gui Haochen


On 2/6/2022 上午 6:05, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
>> It's not clear to me how this can ever happen without finite_math_only?
>> The patch is safe, sure, but it may the real problem is elsewhere.
> 
> So, it is incorrect the RTL for our bcd{add,sub} insns uses CCFP at all.
> 
> CCFP stands for the result of a 4-way comparison, regular float
> comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
> Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
> is set if an overflow happened -- but still exactly one of bits 0..2 is
> set then!  (If one of the inputs is an invalid number it sets bits 0..3
> to 0001 though.)
> 
> So it would be much more correct and sensible to use regular integer
> comparison results here, so, CC.
> 
> Does that fix the problem?
> 
> 
> Segher
Segher Boessenkool June 2, 2022, 9:04 a.m. UTC | #6
Hi!

On Thu, Jun 02, 2022 at 01:30:04PM +0800, HAO CHEN GUI wrote:
> Segher,
>   Does BCD comparison return false when either operand is invalid coding?

It sets all of LT, GT, and EQ to 0 (it normally sets exactly one of them
to 1).  It sets bit 3 (the "SO" bit usually) to 1.

That is what the machine insns do.  What the builtins do is undefined as
far as I know?  If So we can do whatever is most convenient, so, not
handle it specifically at all, just go with what falls out.

> If yes, the result could be 3-way. We can check gt and eq bits for ge.

You can check the LT bit, instead: it is only one branch insn, and also
only one setbc[r] insn (it can be slightly more expensive if you can use
only older insns).

> We still can't use crnot to only check lt bit as there could be invalid
> coding.
>   Also, do you think finite-math-only excludes invalid coding? Seems GCC
> doesn't clear define it.

This is not floating-point code at all, it should not be influenced at
all by finite-math-only!


Segher
HAO CHEN GUI June 6, 2022, 8:14 a.m. UTC | #7
On 2/6/2022 下午 5:04, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Jun 02, 2022 at 01:30:04PM +0800, HAO CHEN GUI wrote:
>> Segher,
>>   Does BCD comparison return false when either operand is invalid coding?
> 
> It sets all of LT, GT, and EQ to 0 (it normally sets exactly one of them
> to 1).  It sets bit 3 (the "SO" bit usually) to 1.
> 
> That is what the machine insns do.  What the builtins do is undefined as
> far as I know?  If So we can do whatever is most convenient, so, not
> handle it specifically at all, just go with what falls out.

We defined the following unordered BCD builtins in rs6000-builtin.def.
They check the bit 3 for overflow.

  const signed int __builtin_bcdadd_ov_v1ti (vsq, vsq, const int<1>);
    BCDADD_OV_V1TI bcdadd_unordered_v1ti {}

  const signed int __builtin_bcdadd_ov_v16qi (vsc, vsc, const int<1>);
    BCDADD_OV_V16QI bcdadd_unordered_v16qi {}

Also Xlc defines three BCD builtins for overflow and invalid coding.
https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-bcd-test-add-subtract-overflow
Shall GCC keep up with Xlc? Please advise.

Thanks
Gui Haochen

> 
>> If yes, the result could be 3-way. We can check gt and eq bits for ge.
> 
> You can check the LT bit, instead: it is only one branch insn, and also
> only one setbc[r] insn (it can be slightly more expensive if you can use
> only older insns).
> 
>> We still can't use crnot to only check lt bit as there could be invalid
>> coding.
>>   Also, do you think finite-math-only excludes invalid coding? Seems GCC
>> doesn't clear define it.
> 
> This is not floating-point code at all, it should not be influenced at
> all by finite-math-only!
> 
> 
> Segher
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index fdfbc6566a5..a6f9cbc9b8b 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -12995,9 +12995,9 @@  (define_insn_and_split "*<code><mode>_cc"
   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
 	(fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y")
 		  (const_int 0)))]
-  "!flag_finite_math_only"
+  ""
   "#"
-  "&& 1"
+  ""
   [(pc)]
 {
   rtx cc = rs6000_emit_fp_cror (<CODE>, <MODE>mode, operands[1]);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c
new file mode 100644
index 00000000000..32cb6df6cd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */
+
+typedef __attribute__ ((altivec (vector__))) unsigned char v;
+
+int foo (v a, v b)
+{
+  return __builtin_vec_bcdsub_ge (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {\mcror\M} } } */