
[C/C++] Don't convert RHS of a shift-expression to int (PR c/63862)

Message ID 20141126173944.GW29446@redhat.com
State New

Commit Message

Marek Polacek Nov. 26, 2014, 5:39 p.m. UTC
Both C11 and C++14 standards specify that integral promotions are
performed on both operands of a shift-expression.  This we do just
fine.  But then we convert the right operand to integer_type_node.
Not only is this unnecessary, it can also be harmful, because e.g. for
void
foo (unsigned int x)
{
  if (-1 >> x != -1)
    bar ();
}
with (int) x we lose the information that x is nonnegative, which means
that tree_expr_nonnegative_p cannot fold this expression.  Another problem
with the conversion is that we weren't able to detect e.g. a shift
by 0x100000000ULL, since after the conversion to int it becomes 0.
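
As a concrete illustration of that last point (a made-up example of
mine, not part of the patch; the new shift-7.c test below exercises
the same thing with a non-constant count):

unsigned long long
baz (unsigned long long i)
{
  /* The count is out of range for any 64-bit type, but the old
     conversion first truncated it to (int) 0x100000000ULL == 0,
     so neither the FE diagnostics nor ubsan saw an invalid count.  */
  return i << 0x100000000ULL;
}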

This patch does away with the conversion to integer type for both
C and C++ FEs.  It exposed a ubsan bug where I was using an incorrect
type for a MINUS_EXPR.  With this patch, the XFAIL in shiftopt-1.c
is no longer needed.

Joseph, is that C FE part ok?
Jason, is that C++ FE part ok?
Jakub, is that ubsan part ok?

Bootstrapped/regtested on ppc64-linux; ubsan.exp also tested on x86_64.

2014-11-26  Marek Polacek  <polacek@redhat.com>

	PR c/63862
c-family/
	* c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
	to op1_utype.
c/
	* c-typeck.c (build_binary_op) <RSHIFT_EXPR, LSHIFT_EXPR>: Don't
	convert the right operand to integer type.
cp/
	* typeck.c (cp_build_binary_op) <RSHIFT_EXPR, LSHIFT_EXPR>: Don't
	convert the right operand to integer type.
testsuite/
	* gcc.c-torture/execute/shiftopt-1.c: Don't XFAIL anymore.
	* c-c++-common/ubsan/shift-7.c: New test.


	Marek

Comments

Jakub Jelinek Nov. 26, 2014, 5:50 p.m. UTC | #1
On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
> Both C11 and C++14 standards specify that integral promotions are
> performed on both operands of a shift-expression.  This we do just
> fine.  But then we convert the right operand to integer_type_node.
> Not only is this unnecessary, it can also be harmful, because e.g. for
> void
> foo (unsigned int x)
> {
>   if (-1 >> x != -1)
>     bar ();
> }
> with (int) x we lose the information that x is nonnegative, which means
> that tree_expr_nonnegative_p cannot fold this expression.  Another problem
> with the conversion is that we weren't able to detect e.g. a shift
> by 0x100000000ULL, since after the conversion to int it becomes 0.

Have you checked what the expander does with it?  Say trying to
shift something by __int128 count or similar might upset it.
I also wonder whether we make similar assumptions in the middle-end.
It might make a difference (sometimes positive, sometimes negative)
for vectorization too.

> 2014-11-26  Marek Polacek  <polacek@redhat.com>
> 
> 	PR c/63862
> c-family/
> 	* c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
> 	to op1_utype.

This part is ok regardless of the rest.

	Jakub
Joseph Myers Nov. 26, 2014, 6:18 p.m. UTC | #2
On Wed, 26 Nov 2014, Marek Polacek wrote:

> Joseph, is that C FE part ok?

The C changes are OK once Jakub's middle-end/expander issue is resolved 
(possibly by adding execution tests of shifts by in-range 64-bit and 
128-bit integers, both constant and non-constant, and compilation tests of 
shifts by out-of-range 64-bit and 128-bit integers).
Marek Polacek Nov. 26, 2014, 6:20 p.m. UTC | #3
On Wed, Nov 26, 2014 at 06:50:29PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
> > Both C11 and C++14 standards specify that integral promotions are
> > performed on both operands of a shift-expression.  This we do just
> > fine.  But then we convert the right operand to integer_type_node.
> > Not only is this unnecessary, it can also be harmful, because e.g. for
> > void
> > foo (unsigned int x)
> > {
> >   if (-1 >> x != -1)
> >     bar ();
> > }
> > with (int) x we lose the information that x is nonnegative, which means
> > that tree_expr_nonnegative_p cannot fold this expression.  Another problem
> > with the conversion is that we weren't able to detect e.g. a shift
> > by 0x100000000ULL, since after the conversion to int it becomes 0.
> 
> Have you checked what the expander does with it?  Say trying to
> shift something by __int128 count or similar might upset it.

I tried
int
main ()
{
  __int128 x = 1;
  __int128 y = 200;
  __int128 z;
  asm ("" : "+g" (x), "+g" (y));
  z = x << y;
  asm ("" : "+g" (z));
  return z;
}
and that works (even with ubsan) as expected.  I haven't checked the
.expand dump (I don't think I'd be able to judge the difference anyway),
but on the testcase above the assembly is the same with -O; with -O0 I see:

 	movq	%rbx, -24(%rbp)
 	movq	%rax, -48(%rbp)
 	movq	%rdx, -40(%rbp)
-	movq	-48(%rbp), %rax
-	movl	%eax, %ecx
 	movq	-32(%rbp), %rax
 	movq	-24(%rbp), %rdx
+	movzbl	-48(%rbp), %ecx
 	shldq	%rax, %rdx
 	salq	%cl, %rax
 	testb	$64, %cl

> I also wonder whether we make similar assumptions in the middle-end.
> It might make a difference (sometimes positive, sometimes negative)
> for vectorization too.
 
I don't really know; I assume the testsuite would have caught it if
something had gone awry.

> > 2014-11-26  Marek Polacek  <polacek@redhat.com>
> > 
> > 	PR c/63862
> > c-family/
> > 	* c-ubsan.c (ubsan_instrument_shift): Change the type of a MINUS_EXPR
> > 	to op1_utype.
> 
> This part is ok regardless of the rest.

Thanks,

	Marek
Jason Merrill Nov. 26, 2014, 6:21 p.m. UTC | #4
On 11/26/2014 01:18 PM, Joseph Myers wrote:
> On Wed, 26 Nov 2014, Marek Polacek wrote:
>
>> Joseph, is that C FE part ok?
>
> The C changes are OK once Jakub's middle-end/expander issue is resolved
> (possibly by adding execution tests of shifts by in-range 64-bit and
> 128-bit integers, both constant and non-constant, and compilation tests of
> shifts by out-of-range 64-bit and 128-bit integers).

Likewise for C++.

Jason
Jakub Jelinek Nov. 26, 2014, 6:49 p.m. UTC | #5
On Wed, Nov 26, 2014 at 07:20:04PM +0100, Marek Polacek wrote:
> On Wed, Nov 26, 2014 at 06:50:29PM +0100, Jakub Jelinek wrote:
> > On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
> > > Both C11 and C++14 standards specify that integral promotions are
> > > performed on both operands of a shift-expression.  This we do just
> > > fine.  But then we convert the right operand to integer_type_node.
> > > Not only is this unnecessary, it can also be harmful, because e.g. for
> > > void
> > > foo (unsigned int x)
> > > {
> > >   if (-1 >> x != -1)

I'd say fold-const not simplifying this is a bug even if x is signed int.
I think negative shift counts are undefined even in the middle-end, Richard?
Treating 5 >> -5 as 5 << 5 looks just wrong to me; does any FE rely on that?
I don't think the expander will expand it that way, though.

Anyway, if the shift count has an unnecessarily wide type, then the middle-end
won't be able to optimize it as much as it could if it had a narrower
type.

So, for ubsan purposes (though shifts are instrumented in the FEs) it is of
course desirable to see the original type, but perhaps it might be a good
idea to cast the shift count to integer_type_node, say during gimplification
(perhaps in c_gimplify_expr?).
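
Something along these lines in c_gimplify_expr is what I have in mind
(a rough, untested sketch):

    case LSHIFT_EXPR:
    case RSHIFT_EXPR:
      {
        /* The FE now keeps the promoted type of the shift count for
           the sake of diagnostics and ubsan; narrow it back to int
           here so the middle-end keeps seeing what it is used to.  */
        tree *op1_p = &TREE_OPERAND (*expr_p, 1);
        if (TREE_CODE (TREE_TYPE (*op1_p)) == INTEGER_TYPE
            && TYPE_MAIN_VARIANT (TREE_TYPE (*op1_p)) != integer_type_node)
          *op1_p = convert (integer_type_node, *op1_p);
        break;
      }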

	Jakub
Richard Biener Nov. 26, 2014, 6:59 p.m. UTC | #6
On November 26, 2014 7:49:55 PM CET, Jakub Jelinek <jakub@redhat.com> wrote:
>On Wed, Nov 26, 2014 at 07:20:04PM +0100, Marek Polacek wrote:
>> On Wed, Nov 26, 2014 at 06:50:29PM +0100, Jakub Jelinek wrote:
>> > On Wed, Nov 26, 2014 at 06:39:44PM +0100, Marek Polacek wrote:
>> > > Both C11 and C++14 standards specify that integral promotions are
>> > > performed on both operands of a shift-expression.  This we do just
>> > > fine.  But then we convert the right operand to integer_type_node.
>> > > Not only is this unnecessary, it can also be harmful, because e.g. for
>> > > void
>> > > foo (unsigned int x)
>> > > {
>> > >   if (-1 >> x != -1)
>
>I'd say fold-const not simplifying this is a bug even if x is signed int.
>I think negative shift counts are undefined even in the middle-end, Richard?

Well, in the past it has been, at least for constant folding, possibly only in int_const_binop.

>Treating 5 >> -5 as 5 << 5 looks just wrong to me; does any FE rely on that?

Not sure.

>I don't think the expander will expand it that way, though.

Not sure :)

>Anyway, if the shift count has an unnecessarily wide type, then the
>middle-end won't be able to optimize it as much as it could if it had
>a narrower type.
>
>So, for ubsan purposes (though shifts are instrumented in the FEs) it
>is of course desirable to see the original type, but perhaps it might
>be a good idea to cast the shift count to integer_type_node, say
>during gimplification (perhaps in c_gimplify_expr?).

These are all separate issues, but I'd rather get rid of conversions than add new ones.

So I think the original patch is OK.

Richard.

>	Jakub
Jason Merrill Nov. 26, 2014, 7:25 p.m. UTC | #7
How about converting the rhs to unsigned int if it is already unsigned?

Jason
Jakub Jelinek Nov. 26, 2014, 7:33 p.m. UTC | #8
On Wed, Nov 26, 2014 at 02:25:54PM -0500, Jason Merrill wrote:
> How about converting the rhs to unsigned int if it is already unsigned?

That is fine.  I'm just worried about the casts to wider types.
So perhaps just promote and cast to int if the promoted type is
signed, or to unsigned if it is unsigned?

E.g. if you have
long long foo (long long x, long long y, long long z)
{
  return x << (y * z);
}
and long long multiplication requires say a libcall, while
int * int can be expanded inline, using long long for shift
count penalizes the code unnecessarily.  Ditto for
long long foo (void) __attribute__((const));
int
bar (int x)
{
  return x << (foo () & ~0xffffffffLL);
}
and similar.  I know, we are running here again into the sadly :(
still missing type demotion and promotion passes, but changing this
now while those passes are missing can only hurt code.

	Jakub
Richard Biener Nov. 26, 2014, 9:15 p.m. UTC | #9
On November 26, 2014 8:25:54 PM CET, Jason Merrill <jason@redhat.com> wrote:
>How about converting the rhs to unsigned int if it is already unsigned?

Does the standard say so?  I think not.
IMHO it's not the front end's business to do this.  And I don't see how it helps either.

Richard.

>Jason
Richard Biener Nov. 26, 2014, 9:20 p.m. UTC | #10
On November 26, 2014 8:33:52 PM CET, Jakub Jelinek <jakub@redhat.com> wrote:
>On Wed, Nov 26, 2014 at 02:25:54PM -0500, Jason Merrill wrote:
>> How about converting the rhs to unsigned int if it is already unsigned?
>
>That is fine.  I'm just worried about the casts to wider types.
>So perhaps just promote and cast to int if the promoted type is
>signed, or to unsigned if it is unsigned?
>
>E.g. if you have
>long long foo (long long x, long long y, long long z)
>{
>  return x << (y * z);
>}
>and long long multiplication requires say a libcall, while
>int * int can be expanded inline, using long long for shift
>count penalizes the code unnecessarily.  Ditto for
>long long foo (void) __attribute__((const));
>int
>bar (int x)
>{
>  return x << (foo () & ~0xffffffffLL);
>}
>and similar.  I know, we are running here again into the sadly :(
>still missing type demotion and promotion passes, but changing this
>now while those passes are missing can only hurt code.

Well, if you want to aggressively prune unused bits then you could "back-propagate" the shift count value-range.

And note that all the frontend shorten-optimizations should move to the middle-end.

That said, instead of casting to int we could cast to unsigned char...

Richard.

>
>	Jakub
Jakub Jelinek Nov. 26, 2014, 9:24 p.m. UTC | #11
On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
> Well, if you want to aggressively prune unused bits then you could "back-propagate" the shift count value-range.
> 
> And note that all the frontend shorten-optimizations should move to the middle-end.
> 
> That said, instead of casting to int we could cast to unsigned char...

As long as we don't have > 256 bitsize integers, yes, for the purposes of
GIMPLE.  Though, as with type demotion, it might be worthwhile to try
to promote it to a wider type if the shift instructions operate on wider shift
count modes than QImode.

	Jakub
Richard Biener Nov. 27, 2014, 9:30 a.m. UTC | #12
On Wed, 26 Nov 2014, Jakub Jelinek wrote:

> On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
> > Well, if you want to aggressively prune unused bits then you could "back-propagate" the shift count value-range.
> > 
> > And note that all the frontend shorten-optimizations should move to the middle-end.
> > 
> > That said, instead of casting to int we could cast to unsigned char...
> 
> As long as we don't have > 256 bitsize integers, yes, for the purposes of
> GIMPLE.  Though, as with type demotion, it might be worthwhile to try
> to promote it to a wider type if the shift instructions operate on wider shift
> count modes than QImode.

True.  I suppose the promotion pass could promote/demote the shift count
to a mode appropriate for the target.  But that's again orthogonal to
optimizing the computation of the shift count based on its value-range
which does not invoke undefined behavior (thus, shortening of those
computations).  We don't have a pass yet that allows this kind of
"back"-propagation, though of course a simple shortening pass wouldn't
be hard to implement (or, similar to what the FEs do now, we can add
some patterns to match.pd catching those opportunities).
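
For the -1 >> x case from this PR such a pattern might look roughly
like the following (an untested sketch, not a concrete proposal):

/* -1 >> X is still -1 for arithmetic right shifts when X is known
   to be nonnegative (out-of-range counts are undefined anyway).  */
(simplify
 (rshift integer_all_onesp@0 @1)
 (if (!TYPE_UNSIGNED (type)
      && tree_expr_nonnegative_p (@1))
  @0))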

Richard.
Jakub Jelinek Nov. 27, 2014, 9:45 a.m. UTC | #13
On Thu, Nov 27, 2014 at 10:30:16AM +0100, Richard Biener wrote:
> On Wed, 26 Nov 2014, Jakub Jelinek wrote:
> 
> > On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
> > > Well, if you want to aggressively prune unused bits then you could "back-propagate" the shift count value-range.
> > > 
> > > And note that all the frontend shorten-optimizations should move to the middle-end.
> > > 
> > > That said, instead of casting to int we could cast to unsigned char...
> > 
> > As long as we don't have > 256 bitsize integers, yes, for the purposes of
> > GIMPLE.  Though, as with type demotion, it might be worthwhile to try
> > to promote it to a wider type if the shift instructions operate on wider shift
> > count modes than QImode.
> 
> True.  I suppose the promotion pass could promote/demote the shift count
> to a mode appropriate for the target.  But that's again orthogonal to
> optimizing the computation of the shift count based on its value-range
> which does not invoke undefined behavior (thus, shortening of those
> computations).  We don't have a pass yet that allows this kind of
> "back"-propagation, though of course a simple shortening pass wouldn't
> be hard to implement (or, similar to what the FEs do now, we can add
> some patterns to match.pd catching those opportunities).

My point is that because we don't have the infrastructure for that in GCC 5,
I think it would be best to keep the status quo in GIMPLE for this, and only
change it when we have the infrastructure to do something better.
As Marek wants to have the original (promoted) type in the FEs (which is of
course desirable, not only for ubsan, but also for diagnostics),
transforming that into the old way in c_gimplify_expr would let the
middle-end keep seeing what it is used to.

	Jakub
Richard Biener Nov. 27, 2014, 9:56 a.m. UTC | #14
On Thu, 27 Nov 2014, Jakub Jelinek wrote:

> On Thu, Nov 27, 2014 at 10:30:16AM +0100, Richard Biener wrote:
> > On Wed, 26 Nov 2014, Jakub Jelinek wrote:
> > 
> > > On Wed, Nov 26, 2014 at 10:20:39PM +0100, Richard Biener wrote:
> > > > Well, if you want to aggressively prune unused bits then you could "back-propagate" the shift count value-range.
> > > > 
> > > > And note that all the frontend shorten-optimizations should move to the middle-end.
> > > > 
> > > > That said, instead of casting to int we could cast to unsigned char...
> > > 
> > > As long as we don't have > 256 bitsize integers, yes, for the purposes of
> > > GIMPLE.  Though, as with type demotion, it might be worthwhile to try
> > > to promote it to a wider type if the shift instructions operate on wider shift
> > > count modes than QImode.
> > 
> > True.  I suppose the promotion pass could promote/demote the shift count
> > to a mode appropriate for the target.  But that's again orthogonal to
> > optimizing the computation of the shift count based on its value-range
> > which does not invoke undefined behavior (thus, shortening of those
> > computations).  We don't have a pass yet that allows this kind of
> > "back"-propagation, though of course a simple shortening pass wouldn't
> > be hard to implement (or, similar to what the FEs do now, we can add
> > some patterns to match.pd catching those opportunities).
> 
> My point is that because we don't have the infrastructure for that in GCC 5,
> I think it would be best to keep the status quo in GIMPLE for this, and only
> change it when we have the infrastructure to do something better.
> As Marek wants to have the original (promoted) type in the FEs (which is of
> course desirable, not only for ubsan, but also for diagnostics),
> transforming that into the old way in c_gimplify_expr would let the
> middle-end keep seeing what it is used to.

But then, rather than using integer_type_node, I'd convert to
unsigned_type_node (if that doesn't work, you know we have some nasty
dependence on negative shift counts "working").
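
I.e. in a c_gimplify_expr change like the one sketched earlier in the
thread, the cast would simply become (again untested)

  *op1_p = convert (unsigned_type_node, *op1_p);

so that anything silently relying on negative shift counts breaks
visibly instead.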

Richard.

Patch

diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c
index 90b03f2..96afc67 100644
--- gcc/c-family/c-ubsan.c
+++ gcc/c-family/c-ubsan.c
@@ -151,7 +151,7 @@  ubsan_instrument_shift (location_t loc, enum tree_code code,
       && !TYPE_UNSIGNED (type0)
       && flag_isoc99)
     {
-      tree x = fold_build2 (MINUS_EXPR, unsigned_type_node, uprecm1,
+      tree x = fold_build2 (MINUS_EXPR, op1_utype, uprecm1,
 			    fold_convert (op1_utype, op1));
       tt = fold_convert_loc (loc, unsigned_type_for (type0), op0);
       tt = fold_build2 (RSHIFT_EXPR, TREE_TYPE (tt), tt, x);
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 67efb46..bf0f306 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -10513,11 +10513,6 @@  build_binary_op (location_t location, enum tree_code code,
 
 	  /* Use the type of the value to be shifted.  */
 	  result_type = type0;
-	  /* Convert the non vector shift-count to an integer, regardless
-	     of size of value being shifted.  */
-	  if (TREE_CODE (TREE_TYPE (op1)) != VECTOR_TYPE
-	      && TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-	    op1 = convert (integer_type_node, op1);
 	  /* Avoid converting op1 to result_type later.  */
 	  converted = 1;
 	}
@@ -10563,11 +10558,6 @@  build_binary_op (location_t location, enum tree_code code,
 
 	  /* Use the type of the value to be shifted.  */
 	  result_type = type0;
-	  /* Convert the non vector shift-count to an integer, regardless
-	     of size of value being shifted.  */
-	  if (TREE_CODE (TREE_TYPE (op1)) != VECTOR_TYPE
-	      && TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-	    op1 = convert (integer_type_node, op1);
 	  /* Avoid converting op1 to result_type later.  */
 	  converted = 1;
 	}
diff --git gcc/cp/typeck.c gcc/cp/typeck.c
index 8b66acc..6ca346b 100644
--- gcc/cp/typeck.c
+++ gcc/cp/typeck.c
@@ -4295,10 +4295,6 @@  cp_build_binary_op (location_t location,
 			     "right shift count >= width of type");
 		}
 	    }
-	  /* Convert the shift-count to an integer, regardless of
-	     size of value being shifted.  */
-	  if (TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-	    op1 = cp_convert (integer_type_node, op1, complain);
 	  /* Avoid converting op1 to result_type later.  */
 	  converted = 1;
 	}
@@ -4344,10 +4340,6 @@  cp_build_binary_op (location_t location,
 			     "left shift count >= width of type");
 		}
 	    }
-	  /* Convert the shift-count to an integer, regardless of
-	     size of value being shifted.  */
-	  if (TYPE_MAIN_VARIANT (TREE_TYPE (op1)) != integer_type_node)
-	    op1 = cp_convert (integer_type_node, op1, complain);
 	  /* Avoid converting op1 to result_type later.  */
 	  converted = 1;
 	}
diff --git gcc/testsuite/c-c++-common/ubsan/shift-7.c gcc/testsuite/c-c++-common/ubsan/shift-7.c
index e69de29..1e33273 100644
--- gcc/testsuite/c-c++-common/ubsan/shift-7.c
+++ gcc/testsuite/c-c++-common/ubsan/shift-7.c
@@ -0,0 +1,27 @@ 
+/* PR c/63862 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=undefined" } */
+
+unsigned long long int __attribute__ ((noinline, noclone))
+foo (unsigned long long int i, unsigned long long int j)
+{
+  asm ("");
+  return i >> j;
+}
+
+unsigned long long int __attribute__ ((noinline, noclone))
+bar (unsigned long long int i, unsigned long long int j)
+{
+  asm ("");
+  return i << j;
+}
+
+int
+main ()
+{
+  foo (1ULL, 0x100000000ULL);
+  bar (1ULL, 0x100000000ULL);
+}
+
+/* { dg-output "shift exponent 4294967296 is too large for \[^\n\r]*-bit type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*shift exponent 4294967296 is too large for \[^\n\r]*-bit type 'long long unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
diff --git gcc/testsuite/gcc.c-torture/execute/shiftopt-1.c gcc/testsuite/gcc.c-torture/execute/shiftopt-1.c
index 3ff714d..8c855b8 100644
--- gcc/testsuite/gcc.c-torture/execute/shiftopt-1.c
+++ gcc/testsuite/gcc.c-torture/execute/shiftopt-1.c
@@ -22,16 +22,11 @@  utest (unsigned int x)
   if (0 >> x != 0)
     link_error ();
 
-  /* XFAIL: the C frontend converts the shift amount to 'int'
-     thus we get -1 >> (int)x which means the shift amount may
-     be negative.  See PR63862.  */
-#if 0
   if (-1 >> x != -1)
     link_error ();
 
   if (~0 >> x != ~0)
     link_error ();
-#endif
 }
 
 void