diff mbox

combine: Replace sign_extend with zero_extend more often.

Message ID 20161214123913.GA348@linux.vnet.ibm.com
State New
Headers show

Commit Message

Dominik Vogt Dec. 14, 2016, 12:39 p.m. UTC
There may be a slight imprecision in expand_compound_operation.
When it encounters a SIGN_EXTEND where it's already known that the
sign bit is zero, it may replace that with a ZERO_EXTEND (and
tries to simplify that further).  However, the pattern is only
replaced if the new set_src_cost() is _lower_ than the old cost.

The patch changes that to "not higher than", assuming that the
ZERO_EXTEND form is generally preferrable unless there is a reason
to believe it's not (i.e. its cost is higher).  The comment atop
this code block seems to support this:

  /* Convert sign extension to zero extension, if we know that the high
     bit is not set, as this is easier to optimize.  It will be converted
     back to cheaper alternative in make_extraction.  */

On s390[x] this gets rid of some SIGN_EXTENDs completely.

(The patched code uses the cheaper of both replacement patterns.)

--

The patch hasn't got a lot of testing yet as I'd like to hear your
opinion on the patch first.

Ciao

Dominik ^_^  ^_^

Comments

Segher Boessenkool Dec. 15, 2016, 10:32 a.m. UTC | #1
On Wed, Dec 14, 2016 at 01:39:13PM +0100, Dominik Vogt wrote:
> There may be a slight imprecision in expand_compound_operation.
> When it encounters a SIGN_EXTEND where it's already known that the
> sign bit is zero, it may replace that with a ZERO_EXTEND (and
> tries to simplify that further).  However, the pattern is only
> replaced if the new set_src_cost() is _lower_ than the old cost.
> 
> The patch changes that to "not higher than", assuming that the
> ZERO_EXTEND form is generally preferrable unless there is a reason
> to believe it's not (i.e. its cost is higher).  The comment atop
> this code block seems to support this:
> 
>   /* Convert sign extension to zero extension, if we know that the high
>      bit is not set, as this is easier to optimize.  It will be converted
>      back to cheaper alternative in make_extraction.  */
> 
> On s390[x] this gets rid of some SIGN_EXTENDs completely.
> 
> (The patched code uses the cheaper of both replacement patterns.)

That looks fine.  But see below.

> The patch hasn't got a lot of testing yet as I'd like to hear your
> opinion on the patch first.

I am testing it on powerpc.  Please also test on x86?

> gcc/ChangeLog-signextend-1
> 
> 	* combine.c (expand_compound_operation): Substitute ZERO_EXTEND for
> 	SIGN_EXTEND if the costs are equal or lower.
> 	Choose the cheapest replacement.

>        /* Make sure this is a profitable operation.  */
>        if (set_src_cost (x, mode, optimize_this_for_speed_p)
> -          > set_src_cost (temp2, mode, optimize_this_for_speed_p))
> -       return temp2;
> -      else if (set_src_cost (x, mode, optimize_this_for_speed_p)
> -               > set_src_cost (temp, mode, optimize_this_for_speed_p))
> -       return temp;
> -      else
> -       return x;
> +	  >= set_src_cost (temp2, mode, optimize_this_for_speed_p))
> +	x = temp2;
> +      if (set_src_cost (x, mode, optimize_this_for_speed_p)
> +	  >= set_src_cost (temp, mode, optimize_this_for_speed_p))
> +	x = temp;
> +      return x;
>      }

So this prefers the zero_extend version over the expand_compound_operation
version, I wonder if that is a good idea.


Segher
Dominik Vogt Dec. 15, 2016, 12:57 p.m. UTC | #2
On Thu, Dec 15, 2016 at 04:32:34AM -0600, Segher Boessenkool wrote:
> On Wed, Dec 14, 2016 at 01:39:13PM +0100, Dominik Vogt wrote:
> > There may be a slight imprecision in expand_compound_operation.
> > When it encounters a SIGN_EXTEND where it's already known that the
> > sign bit is zero, it may replace that with a ZERO_EXTEND (and
> > tries to simplify that further).  However, the pattern is only
> > replaced if the new set_src_cost() is _lower_ than the old cost.
> > 
> > The patch changes that to "not higher than", assuming that the
> > ZERO_EXTEND form is generally preferrable unless there is a reason
> > to believe it's not (i.e. its cost is higher).  The comment atop
> > this code block seems to support this:
> > 
> >   /* Convert sign extension to zero extension, if we know that the high
> >      bit is not set, as this is easier to optimize.  It will be converted
> >      back to cheaper alternative in make_extraction.  */
> > 
> > On s390[x] this gets rid of some SIGN_EXTENDs completely.
> > 
> > (The patched code uses the cheaper of both replacement patterns.)
> 
> That looks fine.  But see below.
> 
> > The patch hasn't got a lot of testing yet as I'd like to hear your
> > opinion on the patch first.
> 
> I am testing it on powerpc.  Please also test on x86?
> 
> > gcc/ChangeLog-signextend-1
> > 
> > 	* combine.c (expand_compound_operation): Substitute ZERO_EXTEND for
> > 	SIGN_EXTEND if the costs are equal or lower.
> > 	Choose the cheapest replacement.
> 
> >        /* Make sure this is a profitable operation.  */
> >        if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > -          > set_src_cost (temp2, mode, optimize_this_for_speed_p))
> > -       return temp2;
> > -      else if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > -               > set_src_cost (temp, mode, optimize_this_for_speed_p))
> > -       return temp;
> > -      else
> > -       return x;
> > +	  >= set_src_cost (temp2, mode, optimize_this_for_speed_p))
> > +	x = temp2;
> > +      if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > +	  >= set_src_cost (temp, mode, optimize_this_for_speed_p))
> > +	x = temp;
> > +      return x;
> >      }
> 
> So this prefers the zero_extend version over the expand_compound_operation
> version, I wonder if that is a good idea.

Maybe this is a little less disruptive:

      int ctemp = set_src_cost (temp, mode, optimize_this_for_speed_p);
      int ctemp2 = set_src_cost (temp2, mode, optimize_this_for_speed_p);

      /* Make sure this is a profitable operation.  */
      if (MIN (ctemp, ctemp2)
	  <= set_src_cost (x, mode, optimize_this_for_speed_p))
	x = (ctemp < ctemp2) ? temp : temp2;
      return x;

Ciao

Dominik ^_^  ^_^
Segher Boessenkool Dec. 15, 2016, 3:52 p.m. UTC | #3
On Thu, Dec 15, 2016 at 01:57:06PM +0100, Dominik Vogt wrote:
> > > The patch hasn't got a lot of testing yet as I'd like to hear your
> > > opinion on the patch first.
> > 
> > I am testing it on powerpc.  Please also test on x86?
> > 
> > > gcc/ChangeLog-signextend-1
> > > 
> > > 	* combine.c (expand_compound_operation): Substitute ZERO_EXTEND for
> > > 	SIGN_EXTEND if the costs are equal or lower.
> > > 	Choose the cheapest replacement.
> > 
> > >        /* Make sure this is a profitable operation.  */
> > >        if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > > -          > set_src_cost (temp2, mode, optimize_this_for_speed_p))
> > > -       return temp2;
> > > -      else if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > > -               > set_src_cost (temp, mode, optimize_this_for_speed_p))
> > > -       return temp;
> > > -      else
> > > -       return x;
> > > +	  >= set_src_cost (temp2, mode, optimize_this_for_speed_p))
> > > +	x = temp2;
> > > +      if (set_src_cost (x, mode, optimize_this_for_speed_p)
> > > +	  >= set_src_cost (temp, mode, optimize_this_for_speed_p))
> > > +	x = temp;
> > > +      return x;
> > >      }
> > 
> > So this prefers the zero_extend version over the expand_compound_operation
> > version, I wonder if that is a good idea.
> 
> Maybe this is a little less disruptive:
> 
>       int ctemp = set_src_cost (temp, mode, optimize_this_for_speed_p);
>       int ctemp2 = set_src_cost (temp2, mode, optimize_this_for_speed_p);
> 
>       /* Make sure this is a profitable operation.  */
>       if (MIN (ctemp, ctemp2)
> 	  <= set_src_cost (x, mode, optimize_this_for_speed_p))
> 	x = (ctemp < ctemp2) ? temp : temp2;
>       return x;

Or just swap the temp and temp2 cases in your original patch.  Which btw
tested fine on powerpc64-linux {-m32,-m64}.


Segher
diff mbox

Patch

diff --git a/gcc/combine.c b/gcc/combine.c
index 1456290..4ecfb4b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -7090,13 +7090,12 @@  expand_compound_operation (rtx x)
 
       /* Make sure this is a profitable operation.  */
       if (set_src_cost (x, mode, optimize_this_for_speed_p)
-          > set_src_cost (temp2, mode, optimize_this_for_speed_p))
-       return temp2;
-      else if (set_src_cost (x, mode, optimize_this_for_speed_p)
-               > set_src_cost (temp, mode, optimize_this_for_speed_p))
-       return temp;
-      else
-       return x;
+	  >= set_src_cost (temp2, mode, optimize_this_for_speed_p))
+	x = temp2;
+      if (set_src_cost (x, mode, optimize_this_for_speed_p)
+	  >= set_src_cost (temp, mode, optimize_this_for_speed_p))
+	x = temp;
+      return x;
     }
 
   /* We can optimize some special cases of ZERO_EXTEND.  */