diff mbox series

combine: Don't simplify high part of paradoxical-SUBREG-of-MEM on machines that sign-extend loads [PR113010]

Message ID 20240222205918.3871954-1-gkm@rivosinc.com
State New
Headers show
Series combine: Don't simplify high part of paradoxical-SUBREG-of-MEM on machines that sign-extend loads [PR113010] | expand

Commit Message

Greg McGary Feb. 22, 2024, 8:59 p.m. UTC
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.

2024-02-22  Greg McGary  <gkm@rivosinc.com>

        PR rtl-optimization/113010
        * combine.cc (simplify_comparison): Don't simplify high part
	of paradoxical-SUBREG-of-MEM on machines that sign-extend loads

        * gcc.c-torture/execute/pr113010.c: New test.
---
 gcc/combine.cc                                 | 10 ++++++++--
 gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +++++++++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c

Comments

Jakub Jelinek Feb. 22, 2024, 9:08 p.m. UTC | #1
On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
> The sign bit of a sign-extending load cannot be known until runtime,
> so don't attempt to simplify it in the combiner.
> 
> 2024-02-22  Greg McGary  <gkm@rivosinc.com>
> 
>         PR rtl-optimization/113010
>         * combine.cc (simplify_comparison): Don't simplify high part
> 	of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
> 
>         * gcc.c-torture/execute/pr113010.c: New test.
> ---
>  gcc/combine.cc                                 | 10 ++++++++--
>  gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +++++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
> 
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 812553c091e..736206242e1 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
>  	    }
>  
>  	  /* If the inner mode is narrower and we are extracting the low part,
> -	     we can treat the SUBREG as if it were a ZERO_EXTEND.  */
> +	     we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
>  	  if (paradoxical_subreg_p (op0))
> -	    ;
> +	    {
> +	      /* ... except we can't treat as ZERO_EXTEND when a machine
> +		 automatically sign-extends loads. */
> +	      if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
> +		  && load_extend_op (inner_mode) == SIGN_EXTEND)
> +		break;

That doesn't feel sufficient.  Like in the PR112758 patch, I believe
for WORD_REGISTER_OPERATIONS you should treat it as a ZERO_EXTEND only
if MEM_P (SUBREG_REG (op0)) && load_extend_op (inner_mode) == ZERO_EXTEND
or if GET_MODE_PRECISION (inner_mode) is known to be >= BITS_PER_WORD.

	Jakub
Greg McGary Feb. 23, 2024, 7:36 p.m. UTC | #2
On 2/22/24 2:08 PM, Jakub Jelinek wrote:
> On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
>> The sign bit of a sign-extending load cannot be known until runtime,
>> so don't attempt to simplify it in the combiner.
>>
>> 2024-02-22  Greg McGary  <gkm@rivosinc.com>
>>
>>          PR rtl-optimization/113010
>>          * combine.cc (simplify_comparison): Don't simplify high part
>> 	of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
>>
>>          * gcc.c-torture/execute/pr113010.c: New test.
>> ---
>>   gcc/combine.cc                                 | 10 ++++++++--
>>   gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +++++++++
>>   2 files changed, 17 insertions(+), 2 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
>>
>> diff --git a/gcc/combine.cc b/gcc/combine.cc
>> index 812553c091e..736206242e1 100644
>> --- a/gcc/combine.cc
>> +++ b/gcc/combine.cc
>> @@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
>>   	    }
>>   
>>   	  /* If the inner mode is narrower and we are extracting the low part,
>> -	     we can treat the SUBREG as if it were a ZERO_EXTEND.  */
>> +	     we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
>>   	  if (paradoxical_subreg_p (op0))
>> -	    ;
>> +	    {
>> +	      /* ... except we can't treat as ZERO_EXTEND when a machine
>> +		 automatically sign-extends loads. */
>> +	      if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
>> +		  && load_extend_op (inner_mode) == SIGN_EXTEND)
>> +		break;
> That doesn't feel sufficient.  Like in the PR112758 patch, I believe
> for WORD_REGISTER_OPERATIONS you should treat it as a ZERO_EXTEND only
> if MEM_P (SUBREG_REG (op0)) && load_extend_op (inner_mode) == ZERO_EXTEND
> or if GET_MODE_PRECISION (inner_mode) is known to be >= BITS_PER_WORD.
>
> 	Jakub

The gist of your comment is that my patch was missing the test for width
of inner_mode vs. BITS_PER_WORD. Here's a revision. Looks good to you?

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 812553c091e..4626f2edae9 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12550,9 +12550,17 @@ simplify_comparison (enum rtx_code code, rtx 
*pop0, rtx *pop1)
             }

           /* If the inner mode is narrower and we are extracting the 
low part,
-            we can treat the SUBREG as if it were a ZERO_EXTEND. */
+            we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
           if (paradoxical_subreg_p (op0))
-           ;
+           {
+             /* ... except we can't do that for loads on machines
+                that don't automatically zero-extend loads. */
+             if (WORD_REGISTER_OPERATIONS
+                 && GET_MODE_PRECISION (inner_mode) < BITS_PER_WORD
+                 && MEM_P (SUBREG_REG (op0))
+                 && load_extend_op (inner_mode) != ZERO_EXTEND)
+               break;
+           }
           else if (subreg_lowpart_p (op0)
                    && GET_MODE_CLASS (mode) == MODE_INT
                    && is_int_mode (GET_MODE (SUBREG_REG (op0)), 
&inner_mode)
diff mbox series

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 812553c091e..736206242e1 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12550,9 +12550,15 @@  simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 	    }
 
 	  /* If the inner mode is narrower and we are extracting the low part,
-	     we can treat the SUBREG as if it were a ZERO_EXTEND.  */
+	     we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
 	  if (paradoxical_subreg_p (op0))
-	    ;
+	    {
+	      /* ... except we can't treat as ZERO_EXTEND when a machine
+		 automatically sign-extends loads. */
+	      if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
+		  && load_extend_op (inner_mode) == SIGN_EXTEND)
+		break;
+	    }
 	  else if (subreg_lowpart_p (op0)
 		   && GET_MODE_CLASS (mode) == MODE_INT
 		   && is_int_mode (GET_MODE (SUBREG_REG (op0)), &inner_mode)
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
new file mode 100644
index 00000000000..a95c613c1df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
@@ -0,0 +1,9 @@ 
+int minus_1 = -1;
+
+int
+main ()
+{
+  if ((0, 0xfffffffful) >= minus_1)
+    __builtin_abort ();
+  return 0;
+}