Patchwork Fix up rotate expansion

login
register
mail settings
Submitter Jakub Jelinek
Date May 10, 2013, 2:53 p.m.
Message ID <20130510145355.GR1377@tucnak.redhat.com>
Download mbox | patch
Permalink /patch/243005/
State New
Headers show

Comments

Jakub Jelinek - May 10, 2013, 2:53 p.m.
Hi!

Our rotate expansion if rotate optab isn't present for the respective
mode looks unsafe for rotations by variable count if that count could
be 0, because then it invokes right or left shift by bitsize.
While on most targets the hw will probably do the right thing
(it is fine if x << 32 will either yield x or 0, in both cases
that ored together with x >> 0 aka x will still yield x), perhaps gcc
could try to optimize based on the fact that undefined behavior can't
happen, so IMHO it is better to generate a safer sequence.

Ok for trunk?

2013-05-10  Jakub Jelinek  <jakub@redhat.com>

	* expmed.c (expand_shift_1): For rotations by const0_rtx just
	return shifted.  Use (-op1) & (prec - 1) as other_amount
	instead of prec - op1.


	Jakub
Jeff Law - May 10, 2013, 11:18 p.m.
On 05/10/2013 08:53 AM, Jakub Jelinek wrote:
> Hi!
>
> Our rotate expansion if rotate optab isn't present for the respective
> mode looks unsafe for rotations by variable count if that count could
> be 0, because then it invokes right or left shift by bitsize.
> While on most targets the hw will probably do the right thing
> (it is fine if x << 32 will either yield x or 0, in both cases
> that ored together with x >> 0 aka x will still yield x), perhaps gcc
> could try to optimize based on the fact that undefined behavior can't
> happen, so IMHO it is better to generate a safer sequence.
>
> Ok for trunk?
>
> 2013-05-10  Jakub Jelinek  <jakub@redhat.com>
>
> 	* expmed.c (expand_shift_1): For rotations by const0_rtx just
> 	return shifted.  Use (-op1) & (prec - 1) as other_amount
> 	instead of prec - op1.
Found by inspection?  Presumably the rotate was synthesized by GCC from 
some other set of operations.  To be optimizable, we'd have to prove the 
original sequence triggered undefined behaviour.

Seems that we ought to have a testcase, even though it probably means 
scanning the tree dumps to pick up the undefined behaviour.  Approved 
with a testcase.


Out of curiosity, do we recognize the original sequence created by 
expand_shift_1 as a rotate if they appeared in the original source?



jeff
Jakub Jelinek - May 11, 2013, 7:05 a.m.
On Fri, May 10, 2013 at 05:18:34PM -0600, Jeff Law wrote:
> On 05/10/2013 08:53 AM, Jakub Jelinek wrote:
> >Hi!
> >
> >Our rotate expansion if rotate optab isn't present for the respective
> >mode looks unsafe for rotations by variable count if that count could
> >be 0, because then it invokes right or left shift by bitsize.
> >While on most targets the hw will probably do the right thing
> >(it is fine if x << 32 will either yield x or 0, in both cases
> >that ored together with x >> 0 aka x will still yield x), perhaps gcc
> >could try to optimize based on the fact that undefined behavior can't
> >happen, so IMHO it is better to generate a safer sequence.
> >
> >Ok for trunk?
> >
> >2013-05-10  Jakub Jelinek  <jakub@redhat.com>
> >
> >	* expmed.c (expand_shift_1): For rotations by const0_rtx just
> >	return shifted.  Use (-op1) & (prec - 1) as other_amount
> >	instead of prec - op1.

> Found by inspection?

Yes.

>  Presumably the rotate was synthesized by GCC
> from some other set of operations.  To be optimizable, we'd have to
> prove the original sequence triggered undefined behaviour.

See http://gcc.gnu.org/PR57157 and
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00453.html
Until recently only the bit_rotate: code in fold-const.c was pattern
matching {L,R}ROTATE_EXPR and handled only the
(x << y) {+|^} (x >> (b - y))
patterns which is indeed undefined behavior for y == 0.
But since the above change trunk also handles the
(x << y) {+|^} (x >> ((-y) & (b - 1))
pattern which is valid even for y = 0, thus the above patch adjusts
what we generate.  The info whether rotation count 0 was valid or not is
unfortunately lost, adding two new {L,R}ROTATE0_EXPR tree codes might be
overkill for this.

> Seems that we ought to have a testcase, even though it probably
> means scanning the tree dumps to pick up the undefined behaviour.
> Approved with a testcase.

I have added lots of testcases recently, for rotation by zero perhaps
something similar to rotate-1a.c from above can be added as rotate-2b.c
and rotate-4b.c, and test zero rotation.

	Jakub

Patch

--- gcc/expmed.c.jj	2013-05-07 10:26:46.000000000 +0200
+++ gcc/expmed.c	2013-05-10 11:54:34.659987038 +0200
@@ -2166,7 +2166,8 @@  expand_shift_1 (enum tree_code code, enu
 	    {
 	      /* If we have been unable to open-code this by a rotation,
 		 do it as the IOR of two shifts.  I.e., to rotate A
-		 by N bits, compute (A << N) | ((unsigned) A >> (C - N))
+		 by N bits, compute
+		 (A << N) | ((unsigned) A >> ((-N) & (C - 1)))
 		 where C is the bitsize of A.
 
 		 It is theoretically possible that the target machine might
@@ -2181,14 +2182,22 @@  expand_shift_1 (enum tree_code code, enu
 	      rtx temp1;
 
 	      new_amount = op1;
-	      if (CONST_INT_P (op1))
+	      if (op1 == const0_rtx)
+		return shifted;
+	      else if (CONST_INT_P (op1))
 		other_amount = GEN_INT (GET_MODE_BITSIZE (mode)
 					- INTVAL (op1));
 	      else
-		other_amount
-		  = simplify_gen_binary (MINUS, GET_MODE (op1),
-					 GEN_INT (GET_MODE_PRECISION (mode)),
-					 op1);
+		{
+		  other_amount
+		    = simplify_gen_unary (NEG, GET_MODE (op1),
+					  op1, GET_MODE (op1));
+		  other_amount
+		    = simplify_gen_binary (AND, GET_MODE (op1),
+					   other_amount,
+					   GEN_INT (GET_MODE_PRECISION (mode)
+						    - 1));
+		}
 
 	      shifted = force_reg (mode, shifted);