diff mbox

Fix ICE on CONST_WIDE_INT in simplify_immed_subreg (PR rtl-optimization/78526)

Message ID 20161125173502.GM3541@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Nov. 25, 2016, 5:35 p.m. UTC
Hi!

If we try to simplify a paradoxical subreg of a CONST_WIDE_INT,
we can ICE, because wi::extract_uhwi doesn't like accessing
bytes beyond the precision.
  for (i = 0; i < elem_bitsize; i += value_bit)
    *vp++ = wi::extract_uhwi (val, i, value_bit);
  for (; i < elem_bitsize; i += value_bit)
    *vp++ = extend;
looks wrong, because the second loop is useless.  But, we actually
do want the second loop, to handle the paradoxical bytes.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-11-25  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/78526
	* simplify-rtx.c (simplify_immed_subreg): Don't use wi::extract_uhwi
	beyond val's precision.

	* gcc.dg/pr78526.c: New test.


	Jakub

Comments

Richard Sandiford Nov. 25, 2016, 6:50 p.m. UTC | #1
Jakub Jelinek <jakub@redhat.com> writes:
> Hi!
>
> If we try to simplify a paradoxical subreg of a CONST_WIDE_INT,
> we can ICE, because wi::extract_uhwi doesn't like accessing
> bytes beyond the precision.
>   for (i = 0; i < elem_bitsize; i += value_bit)
>     *vp++ = wi::extract_uhwi (val, i, value_bit);
>   for (; i < elem_bitsize; i += value_bit)
>     *vp++ = extend;
> looks wrong, because the second loop is useless.  But, we actually
> do want the second loop, to handle the paradoxical bytes.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-11-25  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR rtl-optimization/78526
> 	* simplify-rtx.c (simplify_immed_subreg): Don't use wi::extract_uhwi
> 	beyond val's precision.
>
> 	* gcc.dg/pr78526.c: New test.

OK, thanks.

I think this is still wrong for modes that are wide enough to use
CONST_WIDE_INT but aren't a multiple of a HWI in size.  In that
case we'll still try to access the integer in a wider precision for the
final iteration of the first loop.  I'm not sure we have any targets
like that though, and either way, I agree this is a strict improvement.

Perhaps we should have extract_uhwi that works for arbitrary bitpos
and width and zero-extends beyond the precision, and an extract_shwi
that does the same but sign-extending beyond the precision?  Then code
like this could use a single loop and could safely access any wide-int
representation (regardless of whether it's stored in sign-extended form).
The extract_*hwi routines would become a bit more expensive though.

Does that sound like an improvement?  If so I'll file a PR.
I might have time before GCC 7 if it qualifies.

Thanks,
Richard
Jakub Jelinek Nov. 25, 2016, 6:54 p.m. UTC | #2
On Fri, Nov 25, 2016 at 06:50:17PM +0000, Richard Sandiford wrote:
> I think this is still wrong for modes that are wide enough to use
> CONST_WIDE_INT but aren't a multiple of a HWI in size.  In that

The loop iterates by value_bit, which is 8 bits.  Thus, indeed, it will
misbehave if there is a CONST_WIDE_INT with a precision not multiple of 8.

> Does that sound like an improvement?  If so I'll file a PR.
> I might have time before GCC 7 if it qualifies.

Yeah, it might be an improvement.

	Jakub
diff mbox

Patch

--- gcc/simplify-rtx.c.jj	2016-11-18 20:04:28.000000000 +0100
+++ gcc/simplify-rtx.c	2016-11-25 15:07:13.872322389 +0100
@@ -5740,8 +5740,9 @@  simplify_immed_subreg (machine_mode oute
 	  {
 	    rtx_mode_t val = rtx_mode_t (el, innermode);
 	    unsigned char extend = wi::sign_mask (val);
+	    int prec = wi::get_precision (val);
 
-	    for (i = 0; i < elem_bitsize; i += value_bit)
+	    for (i = 0; i < prec && i < elem_bitsize; i += value_bit)
 	      *vp++ = wi::extract_uhwi (val, i, value_bit);
 	    for (; i < elem_bitsize; i += value_bit)
 	      *vp++ = extend;
--- gcc/testsuite/gcc.dg/pr78526.c.jj	2016-11-25 15:15:53.266799450 +0100
+++ gcc/testsuite/gcc.dg/pr78526.c	2016-11-25 15:16:12.004564146 +0100
@@ -0,0 +1,21 @@ 
+/* PR rtl-optimization/78526 */
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O -fno-tree-ccp -fno-tree-sra -g -w" } */
+/* { dg-additional-options "-mavx512bw" { target i?86-*-* x86_64-*-* } } */
+
+typedef unsigned U __attribute__ ((vector_size (64)));
+typedef unsigned __int128 V __attribute__ ((vector_size (64)));
+
+static inline V
+bar (U u, U x, V v)
+{
+  v = (V)(U) { 0, ~0 };
+  v[x[0]] <<= u[-63];
+  return v;
+}
+
+V
+foo (U u)
+{
+  return bar (u, (U) {}, (V) {});
+}