Patchwork Improve andq $0xffffffff, %reg handling (PR target/53110)

login
register
mail settings
Submitter Uros Bizjak
Date April 30, 2012, 12:54 p.m.
Message ID <CAFULd4aXx3Lbweow+5FoFeHA-3ud2r-6ckz7XEzK3jWazvG18A@mail.gmail.com>
Download mbox | patch
Permalink /patch/155859/
State New
Headers show

Comments

Uros Bizjak - April 30, 2012, 12:54 p.m.
On Thu, Apr 26, 2012 at 5:34 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Thu, Apr 26, 2012 at 5:28 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>
>>> > We have a splitter for reg1 = reg2 & 0xffffffff, but only if regnums
>>> > are different.  But movl %edi, %edi is a cheaper variant of
>>> > andq $0xffffffff, %rdi even with the same register and doesn't clobber
>>> > flags, so this patch attempts to expand it as a zero extension early.
>>> >
>>> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>>> >
>>> > 2012-04-25  Jakub Jelinek  <jakub@redhat.com>
>>> >
>>> >        PR target/53110
>>> >        * config/i386/i386.md (and<mode>3): For andq $0xffffffff, reg
>>> >        instead expand it as zero extension.
>>>
>>> Perhaps we can go all the way and also convert ANDs with $0xff and
>>> $0xffff to relevant zero_extend patterns, like in the referred
>>> splitter.
>>
>> I wasn't sure about 0xffff, on which CPUs it would be a win and on which it
>> would not.  0xffffffff is a win always.

It is always a win, especially when loading from memory. In the latter
case, no execution units are used, only load (memory read) unit.

> My recent changes to zero_extend expanders should handle this
> automatically, and will undo generation of zero_extend pattern. Please
> see zero_extend<mode>si2_and expander, and how it handles
> TARGET_ZERO_EXTEND_WITH_AND targets.

Attached patch implements this idea. In addition, it fixes the
splitter to not change output mode of zero_extension from HImode and
QImode from DImode to SImode. Although they generate the same
instruction, I think we should better keep original mode here.

2012-04-30  Uros Bizjak  <ubizjak@gmail.com>

	* config/i386/i386.md (and<mode>3): Expand masking operations with
	0xff, 0xffff or 0xffffffff immediates to corresponding zero_extend RTX.
	(and splitter): Split to DImode zero_extend RTX for DImode operand[0].

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} and committed to mainline SVN.

Uros.
Jakub Jelinek - April 30, 2012, 1:10 p.m.
On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote:
> > My recent changes to zero_extend expanders should handle this
> > automatically, and will undo generation of zero_extend pattern. Please
> > see zero_extend<mode>si2_and expander, and how it handles
> > TARGET_ZERO_EXTEND_WITH_AND targets.
> 
> Attached patch implements this idea. In addition, it fixes the
> splitter to not change output mode of zero_extension from HImode and
> QImode from DImode to SImode. Although they generate the same
> instruction, I think we should better keep original mode here.

Thanks.  I was trying this morning slightly different patch for the same,
but strangely it failed bootstrap, and didn't get around to analysing
why a mem store had (zero_extend (subreg (reg))) on a RHS.

> +  operands[1] = gen_lowpart (mode, operands[1]);
> +
> +  if (GET_MODE (operands[0]) == DImode)
> +    insn = (mode == SImode)
> +	   ? gen_zero_extendsidi2
> +	   : (mode == HImode)
> +	   ? gen_zero_extendhidi2
> +	   : gen_zero_extendqidi2;
> +  else if (GET_MODE (operands[0]) == SImode)
> +    insn = (mode == HImode)
> +	   ? gen_zero_extendhisi2
> +	   : gen_zero_extendqisi2;
> +  else if (GET_MODE (operands[0]) == HImode)
> +    insn = gen_zero_extendqihi2;
>    else
> -    ix86_expand_binary_operator (AND, <MODE>mode, operands);
> +    gcc_unreachable ();
> +
> +  emit_insn (insn (operands[0], operands[1]));

IMHO you should use <MODE>mode instead of GET_MODE (operands[0])
in all of the above, then the compiler can actually optimize
it at compile time.

	Jakub
Uros Bizjak - April 30, 2012, 1:34 p.m.
On Mon, Apr 30, 2012 at 3:10 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote:
>> > My recent changes to zero_extend expanders should handle this
>> > automatically, and will undo generation of zero_extend pattern. Please
>> > see zero_extend<mode>si2_and expander, and how it handles
>> > TARGET_ZERO_EXTEND_WITH_AND targets.
>>
>> Attached patch implements this idea. In addition, it fixes the
>> splitter to not change output mode of zero_extension from HImode and
>> QImode from DImode to SImode. Although they generate the same
>> instruction, I think we should better keep original mode here.
>
> Thanks.  I was trying this morning slightly different patch for the same,
> but strangely it failed bootstrap, and didn't get around to analysing
> why a mem store had (zero_extend (subreg (reg))) on a RHS.

Maybe it was due to slightly wrong splitter? I was changing both, the
expander and the splitter in parallel, and didn't see any failure you
mentioned...

>> +  operands[1] = gen_lowpart (mode, operands[1]);
>> +
>> +  if (GET_MODE (operands[0]) == DImode)
>> +    insn = (mode == SImode)
>> +        ? gen_zero_extendsidi2
>> +        : (mode == HImode)
>> +        ? gen_zero_extendhidi2
>> +        : gen_zero_extendqidi2;
>> +  else if (GET_MODE (operands[0]) == SImode)
>> +    insn = (mode == HImode)
>> +        ? gen_zero_extendhisi2
>> +        : gen_zero_extendqisi2;
>> +  else if (GET_MODE (operands[0]) == HImode)
>> +    insn = gen_zero_extendqihi2;
>>    else
>> -    ix86_expand_binary_operator (AND, <MODE>mode, operands);
>> +    gcc_unreachable ();
>> +
>> +  emit_insn (insn (operands[0], operands[1]));
>
> IMHO you should use <MODE>mode instead of GET_MODE (operands[0])
> in all of the above, then the compiler can actually optimize
> it at compile time.

You are right, I will change this in an incremental patch.

Thanks,
Uros.
Uros Bizjak - April 30, 2012, 1:47 p.m.
On Mon, Apr 30, 2012 at 3:34 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Mon, Apr 30, 2012 at 3:10 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Mon, Apr 30, 2012 at 02:54:05PM +0200, Uros Bizjak wrote:
>>> > My recent changes to zero_extend expanders should handle this
>>> > automatically, and will undo generation of zero_extend pattern. Please
>>> > see zero_extend<mode>si2_and expander, and how it handles
>>> > TARGET_ZERO_EXTEND_WITH_AND targets.
>>>
>>> Attached patch implements this idea. In addition, it fixes the
>>> splitter to not change output mode of zero_extension from HImode and
>>> QImode from DImode to SImode. Although they generate the same
>>> instruction, I think we should better keep original mode here.
>>
>> Thanks.  I was trying this morning slightly different patch for the same,
>> but strangely it failed bootstrap, and didn't get around to analysing
>> why a mem store had (zero_extend (subreg (reg))) on a RHS.
>
> Maybe it was due to slightly wrong splitter? I was changing both, the
> expander and the splitter in parallel, and didn't see any failure you
> mentioned...

In fact, the problem was already in your original patch. There was a
check for REG_P (operands[1]), but it should check for REG_P
(operands[0]). There is no zero_extended store, only load.

Uros.

Patch

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 186954)
+++ config/i386/i386.md	(working copy)
@@ -7695,14 +7695,45 @@ 
 		  (match_operand:SWIM 2 "<general_szext_operand>")))]
   ""
 {
-  if (<MODE>mode == DImode
-      && GET_CODE (operands[2]) == CONST_INT
-      && INTVAL (operands[2]) == (HOST_WIDE_INT) 0xffffffff
-      && REG_P (operands[1]))
-    emit_insn (gen_zero_extendsidi2 (operands[0],
-				     gen_lowpart (SImode, operands[1])));
+  enum machine_mode mode = GET_MODE (operands[1]);
+  rtx (*insn) (rtx, rtx);
+
+  if (CONST_INT_P (operands[2]) && REG_P (operands[0]))
+    {
+      HOST_WIDE_INT ival = INTVAL (operands[2]);
+
+      if (ival == (HOST_WIDE_INT) 0xffffffff)
+	mode = SImode;
+      else if (ival == 0xffff)
+	mode = HImode;
+      else if (ival == 0xff)
+	mode = QImode;
+      }
+
+  if (mode == GET_MODE (operands[1]))
+    {
+      ix86_expand_binary_operator (AND, <MODE>mode, operands);
+      DONE;
+    }
+
+  operands[1] = gen_lowpart (mode, operands[1]);
+
+  if (GET_MODE (operands[0]) == DImode)
+    insn = (mode == SImode)
+	   ? gen_zero_extendsidi2
+	   : (mode == HImode)
+	   ? gen_zero_extendhidi2
+	   : gen_zero_extendqidi2;
+  else if (GET_MODE (operands[0]) == SImode)
+    insn = (mode == HImode)
+	   ? gen_zero_extendhisi2
+	   : gen_zero_extendqisi2;
+  else if (GET_MODE (operands[0]) == HImode)
+    insn = gen_zero_extendqihi2;
   else
-    ix86_expand_binary_operator (AND, <MODE>mode, operands);
+    gcc_unreachable ();
+
+  emit_insn (insn (operands[0], operands[1]));
   DONE;
 })
 
@@ -7839,32 +7870,38 @@ 
    && true_regnum (operands[0]) != true_regnum (operands[1])"
   [(const_int 0)]
 {
+  HOST_WIDE_INT ival = INTVAL (operands[2]);
   enum machine_mode mode;
+  rtx (*insn) (rtx, rtx);
 
-  if (INTVAL (operands[2]) == (HOST_WIDE_INT) 0xffffffff)
+  if (ival == (HOST_WIDE_INT) 0xffffffff)
     mode = SImode;
-  else if (INTVAL (operands[2]) == 0xffff)
+  else if (ival == 0xffff)
     mode = HImode;
   else
     {
-      gcc_assert (INTVAL (operands[2]) == 0xff);
+      gcc_assert (ival == 0xff);
       mode = QImode;
     }
 
   operands[1] = gen_lowpart (mode, operands[1]);
 
-  if (mode == SImode)
-    emit_insn (gen_zero_extendsidi2 (operands[0], operands[1]));
+  if (GET_MODE (operands[0]) == DImode)
+    insn = (mode == SImode)
+	   ? gen_zero_extendsidi2
+	   : (mode == HImode)
+	   ? gen_zero_extendhidi2
+	   : gen_zero_extendqidi2;
   else
     {
-      rtx (*insn) (rtx, rtx);
-
       /* Zero extend to SImode to avoid partial register stalls.  */
       operands[0] = gen_lowpart (SImode, operands[0]);
 
-      insn = (mode == HImode) ? gen_zero_extendhisi2 : gen_zero_extendqisi2;
-      emit_insn (insn (operands[0], operands[1]));
+      insn = (mode == HImode)
+	     ? gen_zero_extendhisi2
+	     : gen_zero_extendqisi2;
     }
+  emit_insn (insn (operands[0], operands[1]));
   DONE;
 })