
Fix INTEGER_CST handling for > 64 bits wide bitfields (PR tree-optimization/68835)

Message ID 20151216220751.GQ18720@tucnak.redhat.com
State New

Commit Message

Jakub Jelinek Dec. 16, 2015, 10:07 p.m. UTC
Hi!

As can be seen on the testcases below, bitfields wider than 64 bits
either ICE or are miscompiled.

get_int_cst_ext_nunits already has code that, for unsigned precisions
that are multiples of HOST_BITS_PER_WIDE_INT, forces TREE_INT_CST_EXT_NUNITS
to be bigger than TREE_INT_CST_NUNITS: the first TREE_INT_CST_NUNITS
elements hold the actual value (as negative) and are followed by 0 or
more -1 values and a final 0 value.  But for some reason this isn't done
for precisions > HOST_BITS_PER_WIDE_INT that aren't multiples of
HOST_BITS_PER_WIDE_INT, even though in those cases we also want to say
that the value is actually not negative, but very large.

The following patch attempts to do that by handling those precisions
the same: the first TREE_INT_CST_NUNITS elements again hold the negative
value, followed by 0 or more -1 values and finally one which is -1 zero
extended to precision % HOST_BITS_PER_WIDE_INT (so for the former special
case of precision % HOST_BITS_PER_WIDE_INT == 0 it is still 0, as before).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

BTW, the tree-pretty-print.c printing of such INTEGER_CSTs still looks
wrong: for the unsigned wi::neg_p values we use
print_hex (const wide_int_ref &, char *), which prints the number of
digits rounded up to BLOCKS_NEEDED (wi.get_precision ()).  I think it
would be better to print just the non-padded number of digits in that
case (and for precisions not divisible by 4 start with 0x1, 0x3 or 0x7),
but I'm not sure whether an additional parameter should be added to
print_hex for this, or whether tree-pretty-print should just call
sprintf directly in that case.  Preferences?

2015-12-16  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/68835
	* tree.c (get_int_cst_ext_nunits): Return
	cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1
	for all unsigned wi::neg_p (cst) constants.
	(build_new_int_cst): If cst.get_precision is not a multiple
	of HOST_BITS_PER_WIDE_INT, zero extend -1 to the precision
	% HOST_BITS_PER_WIDE_INT.

	* gcc.dg/pr68835-1.c: New test.
	* gcc.dg/pr68835-2.c: New test.


	Jakub

Comments

Richard Biener Dec. 17, 2015, 11:33 a.m. UTC | #1
On Wed, 16 Dec 2015, Jakub Jelinek wrote:

> Hi!
> 
> As can be seen on the testcases below, bitfields wider than 64 bits
> either ICE or are miscompiled.
> 
> get_int_cst_ext_nunits already has code that, for unsigned precisions
> that are multiples of HOST_BITS_PER_WIDE_INT, forces TREE_INT_CST_EXT_NUNITS
> to be bigger than TREE_INT_CST_NUNITS: the first TREE_INT_CST_NUNITS
> elements hold the actual value (as negative) and are followed by 0 or
> more -1 values and a final 0 value.  But for some reason this isn't done
> for precisions > HOST_BITS_PER_WIDE_INT that aren't multiples of
> HOST_BITS_PER_WIDE_INT, even though in those cases we also want to say
> that the value is actually not negative, but very large.
> 
> The following patch attempts to do that by handling those precisions
> the same: the first TREE_INT_CST_NUNITS elements again hold the negative
> value, followed by 0 or more -1 values and finally one which is -1 zero
> extended to precision % HOST_BITS_PER_WIDE_INT (so for the former special
> case of precision % HOST_BITS_PER_WIDE_INT == 0 it is still 0, as before).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> BTW, the tree-pretty-print.c printing of such INTEGER_CSTs still looks
> wrong: for the unsigned wi::neg_p values we use
> print_hex (const wide_int_ref &, char *), which prints the number of
> digits rounded up to BLOCKS_NEEDED (wi.get_precision ()).  I think it
> would be better to print just the non-padded number of digits in that
> case (and for precisions not divisible by 4 start with 0x1, 0x3 or 0x7),
> but I'm not sure whether an additional parameter should be added to
> print_hex for this, or whether tree-pretty-print should just call
> sprintf directly in that case.  Preferences?
> 
> 2015-12-16  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/68835
> 	* tree.c (get_int_cst_ext_nunits): Return
> 	cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1
> 	for all unsigned wi::neg_p (cst) constants.
> 	(build_new_int_cst): If cst.get_precision is not a multiple
> 	of HOST_BITS_PER_WIDE_INT, zero extend -1 to the precision
> 	% HOST_BITS_PER_WIDE_INT.
> 
> 	* gcc.dg/pr68835-1.c: New test.
> 	* gcc.dg/pr68835-2.c: New test.
> 
> --- gcc/tree.c.jj	2015-12-16 09:02:11.000000000 +0100
> +++ gcc/tree.c	2015-12-16 17:50:25.000000000 +0100
> @@ -1245,11 +1245,9 @@ static unsigned int
>  get_int_cst_ext_nunits (tree type, const wide_int &cst)
>  {
>    gcc_checking_assert (cst.get_precision () == TYPE_PRECISION (type));
> -  /* We need an extra zero HWI if CST is an unsigned integer with its
> -     upper bit set, and if CST occupies a whole number of HWIs.  */
> -  if (TYPE_UNSIGNED (type)
> -      && wi::neg_p (cst)
> -      && (cst.get_precision () % HOST_BITS_PER_WIDE_INT) == 0)
> +  /* We need extra HWIs if CST is an unsigned integer with its
> +     upper bit set.  */
> +  if (TYPE_UNSIGNED (type) && wi::neg_p (cst))
>      return cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1;
>    return cst.get_len ();
>  }

I don't see why this is necessary - if there are enough bits to
always have a zero MSB for unsigned types then we don't need an
extra element.

> @@ -1266,7 +1264,8 @@ build_new_int_cst (tree type, const wide
>    if (len < ext_len)
>      {
>        --ext_len;
> -      TREE_INT_CST_ELT (nt, ext_len) = 0;
> +      TREE_INT_CST_ELT (nt, ext_len)
> +	= zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);

isn't this enough?

Thanks,
Richard.

>        for (unsigned int i = len; i < ext_len; ++i)
>  	TREE_INT_CST_ELT (nt, i) = -1;
>      }
> --- gcc/testsuite/gcc.dg/pr68835-1.c.jj	2015-12-16 18:14:08.960943653 +0100
> +++ gcc/testsuite/gcc.dg/pr68835-1.c	2015-12-16 18:15:56.803447877 +0100
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/68835 */
> +/* { dg-do compile { target int128 } } */
> +/* { dg-options "-O2" } */
> +
> +unsigned __int128
> +foo (unsigned long a, unsigned long b)
> +{
> +  unsigned __int128 x = (unsigned __int128) a * b;
> +  struct { unsigned __int128 a : 96; } w;
> +  w.a = x;
> +  return w.a;
> +}
> --- gcc/testsuite/gcc.dg/pr68835-2.c.jj	2015-12-16 18:41:32.162177493 +0100
> +++ gcc/testsuite/gcc.dg/pr68835-2.c	2015-12-16 18:43:07.829853729 +0100
> @@ -0,0 +1,23 @@
> +/* PR tree-optimization/68835 */
> +/* { dg-do run { target int128 } } */
> +/* { dg-options "-O2" } */
> +
> +__attribute__((noinline, noclone)) unsigned __int128
> +foo (void)
> +{
> +  unsigned __int128 x = (unsigned __int128) 0xffffffffffffffffULL;
> +  struct { unsigned __int128 a : 65; } w;
> +  w.a = x;
> +  w.a += x;
> +  return w.a;
> +}
> +
> +int
> +main ()
> +{
> +  unsigned __int128 x = foo ();
> +  if ((unsigned long long) x != 0xfffffffffffffffeULL
> +      || (unsigned long long) (x >> 64) != 1)
> +    __builtin_abort ();
> +  return 0;
> +}
> 
> 	Jakub
> 
>
Jakub Jelinek Dec. 17, 2015, 11:55 a.m. UTC | #2
On Thu, Dec 17, 2015 at 12:33:22PM +0100, Richard Biener wrote:
> > @@ -1245,11 +1245,9 @@ static unsigned int
> >  get_int_cst_ext_nunits (tree type, const wide_int &cst)
> >  {
> >    gcc_checking_assert (cst.get_precision () == TYPE_PRECISION (type));
> > -  /* We need an extra zero HWI if CST is an unsigned integer with its
> > -     upper bit set, and if CST occupies a whole number of HWIs.  */
> > -  if (TYPE_UNSIGNED (type)
> > -      && wi::neg_p (cst)
> > -      && (cst.get_precision () % HOST_BITS_PER_WIDE_INT) == 0)
> > +  /* We need extra HWIs if CST is an unsigned integer with its
> > +     upper bit set.  */
> > +  if (TYPE_UNSIGNED (type) && wi::neg_p (cst))
> >      return cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1;
> >    return cst.get_len ();
> >  }
> 
> I don't see why this is necessary - if there are enough bits to
> always have a zero MSB for unsigned types then we don't need an
> extra element.

But there aren't enough bits in that case.
Let's consider how -2 cast to an unsigned type with precision 64, 65, 127,
128 and 129 is encoded.
cst.get_len () is 1 in each case, and the first element is -2.
So, what get_int_cst_ext_nunits does before my change is return 2 and 3
for precisions 64 and 128 respectively, and 1 for all other precisions
(talking just about precisions < 129).
So, before my patch we have:
precision	TREE_INT_CST_NUNITS	TREE_INT_CST_EXT_NUNITS		TREE_INT_CST_ELT
64		1			2				-2, 0
65		1			1				-2
127		1			1				-2
128		1			3				-2, -1, 0
129		1			1				-2
and with the patch we have:
64		1			2				-2, 0
65		1			2				-2, 1
127		1			2				-2, -1ULL / 2
128		1			3				-2, -1, 0
129		1			3				-2, -1, 1

Thus, before the patch there was nothing in the ext units that would tell
users the value is actually a large unsigned one rather than a negative
one.  I really don't see what is so special about precisions divisible by
HOST_BITS_PER_WIDE_INT.  The elements beyond TREE_INT_CST_NUNITS aren't
guaranteed to be all 0s anyway; they are all -1s, and for precisions
divisible by HOST_BITS_PER_WIDE_INT the last one is 0.  That is all that
is special about those; with my patch, for the other precisions the last
one is simply always some power of two minus 1 (other than -1).

Now, if the value is not between the maximum and
maximum - 0x8000000000000000ULL, TREE_INT_CST_NUNITS will not be 1 but,
say, 2 or more, and perhaps the len is still smaller than
precision / HOST_BITS_PER_WIDE_INT + 1; but if the value is again
negative, we want some way to add the ext elements (a series of -1s,
followed by 0 or another power of two minus 1 other than -1).

> 
> > @@ -1266,7 +1264,8 @@ build_new_int_cst (tree type, const wide
> >    if (len < ext_len)
> >      {
> >        --ext_len;
> > -      TREE_INT_CST_ELT (nt, ext_len) = 0;
> > +      TREE_INT_CST_ELT (nt, ext_len)
> > +	= zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);
> 
> isn't this enough?

No.  This is the len < ext_len case; without the first hunk, if
len < ext_len then cst.get_precision () % HOST_BITS_PER_WIDE_INT is
guaranteed to be 0, and thus zext_hwi will return 0.

	Jakub
Richard Biener Dec. 17, 2015, 12:02 p.m. UTC | #3
On Thu, 17 Dec 2015, Jakub Jelinek wrote:

> On Thu, Dec 17, 2015 at 12:33:22PM +0100, Richard Biener wrote:
> > > @@ -1245,11 +1245,9 @@ static unsigned int
> > >  get_int_cst_ext_nunits (tree type, const wide_int &cst)
> > >  {
> > >    gcc_checking_assert (cst.get_precision () == TYPE_PRECISION (type));
> > > -  /* We need an extra zero HWI if CST is an unsigned integer with its
> > > -     upper bit set, and if CST occupies a whole number of HWIs.  */
> > > -  if (TYPE_UNSIGNED (type)
> > > -      && wi::neg_p (cst)
> > > -      && (cst.get_precision () % HOST_BITS_PER_WIDE_INT) == 0)
> > > +  /* We need extra HWIs if CST is an unsigned integer with its
> > > +     upper bit set.  */
> > > +  if (TYPE_UNSIGNED (type) && wi::neg_p (cst))
> > >      return cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1;
> > >    return cst.get_len ();
> > >  }
> > 
> > I don't see why this is necessary - if there are enough bits to
> > always have a zero MSB for unsigned types then we don't need an
> > extra element.
> 
> But there aren't enough bits in that case.
> Let's consider how -2 cast to an unsigned type with precision 64, 65, 127,
> 128 and 129 is encoded.
> cst.get_len () is 1 in each case, and the first element is -2.
> So, what get_int_cst_ext_nunits does before my change is return 2 and 3
> for precisions 64 and 128 respectively, and 1 for all other precisions
> (talking just about precisions < 129).

Doh, yes.

> So, before my patch we have:
> precision	TREE_INT_CST_NUNITS	TREE_INT_CST_EXT_NUNITS		TREE_INT_CST_ELT
> 64		1			2				-2, 0
> 65		1			1				-2
> 127		1			1				-2
> 128		1			3				-2, -1, 0
> 129		1			1				-2
> and with the patch we have:
> 64		1			2				-2, 0
> 65		1			2				-2, 1
> 127		1			2				-2, -1ULL / 2
> 128		1			3				-2, -1, 0
> 129		1			3				-2, -1, 1
> 
> Thus, before the patch there was nothing in the ext units that would tell
> users the value is actually a large unsigned one rather than a negative
> one.  I really don't see what is so special about precisions divisible by
> HOST_BITS_PER_WIDE_INT.  The elements beyond TREE_INT_CST_NUNITS aren't
> guaranteed to be all 0s anyway; they are all -1s, and for precisions
> divisible by HOST_BITS_PER_WIDE_INT the last one is 0.  That is all that
> is special about those; with my patch, for the other precisions the last
> one is simply always some power of two minus 1 (other than -1).
> 
> Now, if the value is not between the maximum and
> maximum - 0x8000000000000000ULL, TREE_INT_CST_NUNITS will not be 1 but,
> say, 2 or more, and perhaps the len is still smaller than
> precision / HOST_BITS_PER_WIDE_INT + 1; but if the value is again
> negative, we want some way to add the ext elements (a series of -1s,
> followed by 0 or another power of two minus 1 other than -1).
> 
> > 
> > > @@ -1266,7 +1264,8 @@ build_new_int_cst (tree type, const wide
> > >    if (len < ext_len)
> > >      {
> > >        --ext_len;
> > > -      TREE_INT_CST_ELT (nt, ext_len) = 0;
> > > +      TREE_INT_CST_ELT (nt, ext_len)
> > > +	= zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);
> > 
> > isn't this enough?
> 
> No.  This is the len < ext_len case; without the first hunk, if
> len < ext_len then cst.get_precision () % HOST_BITS_PER_WIDE_INT is
> guaranteed to be 0, and thus zext_hwi will return 0.

Patch is ok.

Thanks,
Richard.

Patch

--- gcc/tree.c.jj	2015-12-16 09:02:11.000000000 +0100
+++ gcc/tree.c	2015-12-16 17:50:25.000000000 +0100
@@ -1245,11 +1245,9 @@  static unsigned int
 get_int_cst_ext_nunits (tree type, const wide_int &cst)
 {
   gcc_checking_assert (cst.get_precision () == TYPE_PRECISION (type));
-  /* We need an extra zero HWI if CST is an unsigned integer with its
-     upper bit set, and if CST occupies a whole number of HWIs.  */
-  if (TYPE_UNSIGNED (type)
-      && wi::neg_p (cst)
-      && (cst.get_precision () % HOST_BITS_PER_WIDE_INT) == 0)
+  /* We need extra HWIs if CST is an unsigned integer with its
+     upper bit set.  */
+  if (TYPE_UNSIGNED (type) && wi::neg_p (cst))
     return cst.get_precision () / HOST_BITS_PER_WIDE_INT + 1;
   return cst.get_len ();
 }
@@ -1266,7 +1264,8 @@  build_new_int_cst (tree type, const wide
   if (len < ext_len)
     {
       --ext_len;
-      TREE_INT_CST_ELT (nt, ext_len) = 0;
+      TREE_INT_CST_ELT (nt, ext_len)
+	= zext_hwi (-1, cst.get_precision () % HOST_BITS_PER_WIDE_INT);
       for (unsigned int i = len; i < ext_len; ++i)
 	TREE_INT_CST_ELT (nt, i) = -1;
     }
--- gcc/testsuite/gcc.dg/pr68835-1.c.jj	2015-12-16 18:14:08.960943653 +0100
+++ gcc/testsuite/gcc.dg/pr68835-1.c	2015-12-16 18:15:56.803447877 +0100
@@ -0,0 +1,12 @@ 
+/* PR tree-optimization/68835 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2" } */
+
+unsigned __int128
+foo (unsigned long a, unsigned long b)
+{
+  unsigned __int128 x = (unsigned __int128) a * b;
+  struct { unsigned __int128 a : 96; } w;
+  w.a = x;
+  return w.a;
+}
--- gcc/testsuite/gcc.dg/pr68835-2.c.jj	2015-12-16 18:41:32.162177493 +0100
+++ gcc/testsuite/gcc.dg/pr68835-2.c	2015-12-16 18:43:07.829853729 +0100
@@ -0,0 +1,23 @@ 
+/* PR tree-optimization/68835 */
+/* { dg-do run { target int128 } } */
+/* { dg-options "-O2" } */
+
+__attribute__((noinline, noclone)) unsigned __int128
+foo (void)
+{
+  unsigned __int128 x = (unsigned __int128) 0xffffffffffffffffULL;
+  struct { unsigned __int128 a : 65; } w;
+  w.a = x;
+  w.a += x;
+  return w.a;
+}
+
+int
+main ()
+{
+  unsigned __int128 x = foo ();
+  if ((unsigned long long) x != 0xfffffffffffffffeULL
+      || (unsigned long long) (x >> 64) != 1)
+    __builtin_abort ();
+  return 0;
+}