diff mbox

combine: Omit redundant AND in change_zero_ext.

Message ID 20161214100147.GA3427@linux.vnet.ibm.com
State New
Headers show

Commit Message

Dominik Vogt Dec. 14, 2016, 10:01 a.m. UTC
This is another micro-optimisation in change_zero_ext.  If an

  (and (lshiftrt ... (N)) (M))

generated by change_zero_ext is equivalent to just

  (lshiftrt ... (N))

(because the AND constant selects the N rightmost bits of the
result), strip off the AND.

_But_ I'm still not completely convinced whether this is a good
idea.  It may become necessary to add md patterns to deal with
just the LSHIFTRT.  On the other hand it saves the need for
another special case in change_zero_ext, and a less obvious, very
specific risbg pattern on s390 

--

Bootstrapped and regression tested on s390x and s390.  (Targets
with risbg-like instructions (Power, others?) may need some
tuning.)

Ciao

Dominik ^_^  ^_^

Comments

Segher Boessenkool Dec. 14, 2016, 7:32 p.m. UTC | #1
On Wed, Dec 14, 2016 at 11:01:47AM +0100, Dominik Vogt wrote:
> This is another micro-optimisation in change_zero_ext.  If an
> 
>   (and (lshiftrt ... (N)) (M))
> 
> generated by change_zero_ext is equivalent to just
> 
>   (lshiftrt ... (N))
> 
> (because the AND constant selects the N rightmost bits of the
> result), strip off the AND.
> 
> _But_ I'm still not completely convinced whether this is a good
> idea.  It may become necessary to add md patterns to deal with
> just the LSHIFTRT.  On the other hand it saves the need for
> another special case in change_zero_ext, and a less obvious, very
> specific risbg pattern on s390 

For PowerPC we should already have all such patterns with a "bare" shift
(they can be created in other ways, too).

> Bootstrapped and regression tested on s390x and s390.  (Targets
> with risbg-like instructions (Power, others?) may need some
> tuning.)

But, it is also possible I missed some.  So please wait until I have
tested it.


> diff --git a/gcc/combine.c b/gcc/combine.c
> index 19851a2..5ebf31c 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11280,8 +11280,13 @@ change_zero_ext (rtx pat)
>        else
>  	continue;
>  
> -      wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> -      x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> +      if (!(GET_CODE (x) == LSHIFTRT
> +	    && CONST_INT_P (XEXP (x, 1))
> +	    && size + INTVAL (XEXP (x, 1)) == GET_MODE_PRECISION (mode)))
> +	{
> +	  wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> +	  x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> +	}

One could argue that this should have been an lshiftrt in the first place
then, not a zero_ext*.  Hrm.


Segher
Dominik Vogt Dec. 15, 2016, 8:55 a.m. UTC | #2
On Wed, Dec 14, 2016 at 01:32:48PM -0600, Segher Boessenkool wrote:
> On Wed, Dec 14, 2016 at 11:01:47AM +0100, Dominik Vogt wrote:
> > This is another micro-optimisation in change_zero_ext.  If an
> > 
> >   (and (lshiftrt ... (N)) (M))
> > 
> > generated by change_zero_ext is equivalent to just
> > 
> >   (lshiftrt ... (N))
> > 
> > (because the AND constant selects the N rightmost bits of the
> > result), strip off the AND.
> > 
> > _But_ I'm still not completely convinced whether this is a good
> > idea.  It may become necessary to add md patterns to deal with
> > just the LSHIFTRT.  On the other hand it saves the need for
> > another special case in change_zero_ext, and a less obvious, very
> > specific risbg pattern on s390 
> 
> For PowerPC we should already have all such patterns with a "bare" shift
> (they can be created in other ways, too).
> 
> > Bootstrapped and regression tested on s390x and s390.  (Targets
> > with risbg-like instructions (Power, others?) may need some
> > tuning.)
> 
> But, it is also possible I missed some.  So please wait until I have
> tested it.
> 
> 
> > diff --git a/gcc/combine.c b/gcc/combine.c
> > index 19851a2..5ebf31c 100644
> > --- a/gcc/combine.c
> > +++ b/gcc/combine.c
> > @@ -11280,8 +11280,13 @@ change_zero_ext (rtx pat)
> >        else
> >  	continue;
> >  
> > -      wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> > -      x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> > +      if (!(GET_CODE (x) == LSHIFTRT
> > +	    && CONST_INT_P (XEXP (x, 1))
> > +	    && size + INTVAL (XEXP (x, 1)) == GET_MODE_PRECISION (mode)))
> > +	{
> > +	  wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> > +	  x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> > +	}
> 
> One could argue that this should have been an lshiftrt in the first place
> then, not a zero_ext*.  Hrm.

This one

  void g2(ui64 *pl, i32 seed)
  {
    seed = 69607 * seed;
    pl[0] = (seed >> 8) & 0xffffff;
  }

generates

  (zero_extract:DI (reg:SI 75 [ seed ])
    (const_int 24 [0x18])
    (const_int 0 [0]))

on s390x.

Ciao

Dominik ^_^  ^_^
Segher Boessenkool Dec. 15, 2016, 9:16 a.m. UTC | #3
On Thu, Dec 15, 2016 at 09:55:52AM +0100, Dominik Vogt wrote:
> > > Bootstrapped and regression tested on s390x and s390.  (Targets
> > > with risbg-like instructions (Power, others?) may need some
> > > tuning.)
> > 
> > But, it is also possible I missed some.  So please wait until I have
> > tested it.
> > 
> > > diff --git a/gcc/combine.c b/gcc/combine.c
> > > index 19851a2..5ebf31c 100644
> > > --- a/gcc/combine.c
> > > +++ b/gcc/combine.c
> > > @@ -11280,8 +11280,13 @@ change_zero_ext (rtx pat)
> > >        else
> > >  	continue;
> > >  
> > > -      wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> > > -      x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> > > +      if (!(GET_CODE (x) == LSHIFTRT
> > > +	    && CONST_INT_P (XEXP (x, 1))
> > > +	    && size + INTVAL (XEXP (x, 1)) == GET_MODE_PRECISION (mode)))
> > > +	{
> > > +	  wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
> > > +	  x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
> > > +	}
> > 
> > One could argue that this should have been an lshiftrt in the first place
> > then, not a zero_ext*.  Hrm.
> 
> This one
> 
>   void g2(ui64 *pl, i32 seed)
>   {
>     seed = 69607 * seed;
>     pl[0] = (seed >> 8) & 0xffffff;
>   }
> 
> generates
> 
>   (zero_extract:DI (reg:SI 75 [ seed ])
>     (const_int 24 [0x18])
>     (const_int 0 [0]))
> 
> on s390x.

Ah, right, it changes mode as well.  I see.

Tested on powerpc64-linux {-m32,-m64}, no new failures.  The patch is
okay for trunk.  Thanks!


Segher
Andreas Krebbel Dec. 19, 2016, 9:56 a.m. UTC | #4
On Wed, Dec 14, 2016 at 11:01:47AM +0100, Dominik Vogt wrote:
> gcc/ChangeLog-change_zero_ext-2
> 
> 	* combine.c (change_zero_ext): Skip generation of redundant AND.

Applied. Thanks!

-Andreas-
diff mbox

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 19851a2..5ebf31c 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -11280,8 +11280,13 @@  change_zero_ext (rtx pat)
       else
 	continue;
 
-      wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
-      x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
+      if (!(GET_CODE (x) == LSHIFTRT
+	    && CONST_INT_P (XEXP (x, 1))
+	    && size + INTVAL (XEXP (x, 1)) == GET_MODE_PRECISION (mode)))
+	{
+	  wide_int mask = wi::mask (size, false, GET_MODE_PRECISION (mode));
+	  x = gen_rtx_AND (mode, x, immed_wide_int_const (mask, mode));
+	}
 
       SUBST (**iter, x);
       changed = true;