diff mbox

[AArch64,PR65139] use clobber with match_scratch for aarch64_lshr_sisd_or_int_<mode>3

Message ID 552EED9F.3060901@linaro.org
State New
Headers show

Commit Message

Kugan Vivekanandarajah April 15, 2015, 11 p.m. UTC
On 16/04/15 08:32, Jakub Jelinek wrote:
> On Thu, Apr 16, 2015 at 08:27:24AM +1000, Kugan wrote:
>> +    if (<CODE> == LSHIFTRT)
>> +      {
>> +        emit_insn (gen_aarch64_lshr_sisd_or_int_<mode>3 (operands[0], operands[1], operands[2]));
> 
> That is way too long line, please wrap it.
> 
>> +        DONE;
>> +      }
>>    }
>>  )
>>  
>> @@ -3361,11 +3367,12 @@
>>  )
>>  
>>  ;; Logical right shift using SISD or Integer instruction
>> -(define_insn "*aarch64_lshr_sisd_or_int_<mode>3"
>> -  [(set (match_operand:GPI 0 "register_operand" "=w,&w,r")
>> +(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
>> +  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
>>          (lshiftrt:GPI
>>            (match_operand:GPI 1 "register_operand" "w,w,r")
>> -          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
>> +          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))
> 
> Though, this one too...
> 

Fixed in the attached patch.

Thanks,
Kugan

Comments

Richard Earnshaw April 18, 2015, 2:07 p.m. UTC | #1
On 16/04/15 00:00, Kugan wrote:
> 
> 
> On 16/04/15 08:32, Jakub Jelinek wrote:
>> On Thu, Apr 16, 2015 at 08:27:24AM +1000, Kugan wrote:
>>> +    if (<CODE> == LSHIFTRT)
>>> +      {
>>> +        emit_insn (gen_aarch64_lshr_sisd_or_int_<mode>3 (operands[0], operands[1], operands[2]));
>>
>> That is way too long line, please wrap it.
>>
>>> +        DONE;
>>> +      }
>>>    }
>>>  )
>>>  
>>> @@ -3361,11 +3367,12 @@
>>>  )
>>>  
>>>  ;; Logical right shift using SISD or Integer instruction
>>> -(define_insn "*aarch64_lshr_sisd_or_int_<mode>3"
>>> -  [(set (match_operand:GPI 0 "register_operand" "=w,&w,r")
>>> +(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
>>> +  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
>>>          (lshiftrt:GPI
>>>            (match_operand:GPI 1 "register_operand" "w,w,r")
>>> -          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
>>> +          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))
>>
>> Though, this one too...
>>
> 
> Fixed in the attached patch.
> 
> Thanks,
> Kugan
> 


Not ok.

You need to ensure that your scratch register cannot overlap op1, since
the scratch is written before op1 is read.

R.
Jakub Jelinek April 18, 2015, 3:13 p.m. UTC | #2
On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
> You need to ensure that your scratch register cannot overlap op1, since
> the scratch is written before op1 is read.

-   (clobber (match_scratch:QI 3 "=X,w,X"))]
+   (clobber (match_scratch:QI 3 "=X,&w,X"))]

incremental diff should ensure that, right?

	Jakub
Richard Earnshaw April 18, 2015, 5:21 p.m. UTC | #3
On 18/04/15 16:13, Jakub Jelinek wrote:
> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>> You need to ensure that your scratch register cannot overlap op1, since
>> the scratch is written before op1 is read.
> 
> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
> 
> incremental diff should ensure that, right?
> 
> 	Jakub
> 


Sorry, where in the patch is that hunk?

I see just:

+   (clobber (match_scratch:QI 3 "=X,w,X"))]

And why would early clobbering the scratch be notably better than the
original?

R.
Maxim Kuvyrkov April 18, 2015, 6:17 p.m. UTC | #4
> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw <Richard.Earnshaw@foss.arm.com> wrote:
> 
> On 18/04/15 16:13, Jakub Jelinek wrote:
>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>> You need to ensure that your scratch register cannot overlap op1, since
>>> the scratch is written before op1 is read.
>> 
>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>> 
>> incremental diff should ensure that, right?
>> 
>> 	Jakub
>> 
> 
> 
> Sorry, where in the patch is that hunk?
> 
> I see just:
> 
> +   (clobber (match_scratch:QI 3 "=X,w,X"))]

Jakub's suggestion is an incremental patch on top of Kugan's.

> 
> And why would early clobbering the scratch be notably better than the
> original?
> 

It will still be better.  With this patch we want to allow RA freedom to optimally handle both of the following cases:

1. operand[1] dies after the instruction.  In this case we want operand[0] and operand[1] to be assigned to the same reg, and operand[3] to be assigned to a different register to provide a temporary.  In this case we don't care whether operand[3] is early-clobber or not.  This case is not optimally handled with current insn patterns.

2. operand[1] lives on after the instruction.  In this case we want operand[0] and operand[3] to be assigned to the same reg, and not clobber operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is in a different register from what operand[0] and operand[3] were assigned to.  This case should be handled equally well before and after the patch.

My understanding is that Kugan's patch with Jakub's fix on top satisfy both of these cases.
 
--
Maxim Kuvyrkov
www.linaro.org
Richard Earnshaw April 21, 2015, 1:25 p.m. UTC | #5
On 18/04/15 19:17, Maxim Kuvyrkov wrote:
>> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw <Richard.Earnshaw@foss.arm.com> wrote:
>>
>> On 18/04/15 16:13, Jakub Jelinek wrote:
>>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote:
>>>> You need to ensure that your scratch register cannot overlap op1, since
>>>> the scratch is written before op1 is read.
>>>
>>> -   (clobber (match_scratch:QI 3 "=X,w,X"))]
>>> +   (clobber (match_scratch:QI 3 "=X,&w,X"))]
>>>
>>> incremental diff should ensure that, right?
>>>
>>> 	Jakub
>>>
>>
>>
>> Sorry, where in the patch is that hunk?
>>
>> I see just:
>>
>> +   (clobber (match_scratch:QI 3 "=X,w,X"))]
> 
> Jakub's suggestion is an incremental patch on top of Kugan's.
> 

Ah, sorry, I though he was implying it was already in the patch somewhere.

>>
>> And why would early clobbering the scratch be notably better than the
>> original?
>>
> 
> It will still be better.  With this patch we want to allow RA freedom to optimally handle both of the following cases:
> 
> 1. operand[1] dies after the instruction.  In this case we want operand[0] and operand[1] to be assigned to the same reg, and operand[3] to be assigned to a different register to provide a temporary.  In this case we don't care whether operand[3] is early-clobber or not.  This case is not optimally handled with current insn patterns.
> 
> 2. operand[1] lives on after the instruction.  In this case we want operand[0] and operand[3] to be assigned to the same reg, and not clobber operand[1].  By marking operand[3] early-clobber we ensure that operand[1] is in a different register from what operand[0] and operand[3] were assigned to.  This case should be handled equally well before and after the patch.
> 
> My understanding is that Kugan's patch with Jakub's fix on top satisfy both of these cases.
>  

I still don't think it handles all cases efficiently.  If we really want
the result in a different register from both of the inputs, then now we
need two registers for the results, one for the result and another for
the temporary.  In that case we could have used the result register as
the scratch, but now we can't.

Maybe we can provide two alternatives, one that early-clobbers the
result register but doesn't need a scratch and one that doesn't
early-clobber the result, but does need a scratch.

So something like

(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
  [(set (match_operand:GPI 0 "register_operand" "=w,&w,w,r")
         (lshiftrt:GPI
           (match_operand:GPI 1 "register_operand" "w,w,w,r")
           (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>"
                              "Us<cmode>,w,w,rUs<cmode>")))
   (clobber (match_scratch:QI 3 "=X,X,w,X"))]

... but I haven't tested any of that.

I would also note the conversation in
https://gcc.gnu.org/ml/gcc/2015-04/msg00240.html.  That seems to suggest
we should be wary of using scratch sequences since the register
allocator doesn't account for them properly.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
>
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 534a862..72a9f05 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3277,6 +3277,14 @@ 
 	    DONE;
           }
       }
+
+    if (<CODE> == LSHIFTRT)
+      {
+        emit_insn (gen_aarch64_lshr_sisd_or_int_<mode>3 (operands[0],
+                                                         operands[1],
+                                                         operands[2]));
+        DONE;
+      }
   }
 )
 
@@ -3361,11 +3369,13 @@ 
 )
 
 ;; Logical right shift using SISD or Integer instruction
-(define_insn "*aarch64_lshr_sisd_or_int_<mode>3"
-  [(set (match_operand:GPI 0 "register_operand" "=w,&w,r")
+(define_insn "aarch64_lshr_sisd_or_int_<mode>3"
+  [(set (match_operand:GPI 0 "register_operand" "=w,w,r")
         (lshiftrt:GPI
           (match_operand:GPI 1 "register_operand" "w,w,r")
-          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "Us<cmode>,w,rUs<cmode>")))]
+          (match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>"
+                              "Us<cmode>,w,rUs<cmode>")))
+   (clobber (match_scratch:QI 3 "=X,w,X"))]
   ""
   "@
    ushr\t%<rtn>0<vas>, %<rtn>1<vas>, %2
@@ -3379,30 +3389,28 @@ 
   [(set (match_operand:DI 0 "aarch64_simd_register")
         (lshiftrt:DI
            (match_operand:DI 1 "aarch64_simd_register")
-           (match_operand:QI 2 "aarch64_simd_register")))]
+           (match_operand:QI 2 "aarch64_simd_register")))
+   (clobber (match_scratch:QI 3))]
   "TARGET_SIMD && reload_completed"
   [(set (match_dup 3)
         (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
    (set (match_dup 0)
         (unspec:DI [(match_dup 1) (match_dup 3)] UNSPEC_SISD_USHL))]
-  {
-    operands[3] = gen_lowpart (QImode, operands[0]);
-  }
+  ""
 )
 
 (define_split
   [(set (match_operand:SI 0 "aarch64_simd_register")
         (lshiftrt:SI
            (match_operand:SI 1 "aarch64_simd_register")
-           (match_operand:QI 2 "aarch64_simd_register")))]
+           (match_operand:QI 2 "aarch64_simd_register")))
+   (clobber (match_scratch:QI 3))]
   "TARGET_SIMD && reload_completed"
   [(set (match_dup 3)
         (unspec:QI [(match_dup 2)] UNSPEC_SISD_NEG))
    (set (match_dup 0)
         (unspec:SI [(match_dup 1) (match_dup 3)] UNSPEC_USHL_2S))]
-  {
-    operands[3] = gen_lowpart (QImode, operands[0]);
-  }
+  ""
 )
 
 ;; Arithmetic right shift using SISD or Integer instruction