Message ID | 54F6D239.2010902@arm.com |
---|---|
State | New |
Headers | show |
On Wed, 4 Mar 2015, Kyrill Tkachov wrote: > Hi all, > > This patch fixes PR rtl-optimization 65235. > As mentioned in bugzilla: > Combine tries to combine: > (insn 72 71 73 2 (set (reg:V2DI 117 [ D.18177 ]) > (vec_concat:V2DI (reg:DI 176 [ D.18179 ]) > (reg:DI 114 [ D.18168 ]))) > (expr_list:REG_DEAD (reg:DI 176 [ D.18179 ]) > (expr_list:REG_DEAD (reg:DI 114 [ D.18168 ]) > > and > > (insn 104 102 105 2 (set (reg:DI 193 [ D.18168 ]) > (vec_select:DI (reg:V2DI 117 [ D.18177 ]) > (parallel [ > (const_int 0 [0]) > ]))) > (expr_list:REG_DEAD (reg:V2DI 117 [ D.18177 ]) > (nil))) > > but ends up generating: > (set (reg:DI 193 [ D.18168 ]) > (reg:DI 114 [ D.18168 ])) > > > that is, it gets the wrong element of the vec_concat. > > The problem is that in simplify-rtx it tries to get the size of the > first element, compare the requested offset against it and pick the > second element if the offset is greater than that size. If the first > element is a const_int, the size is 0, so it will always pick the second > part. I am a bit surprised that the first element is a const_int and not reg 176, maybe that's why it doesn't reproduce on other platforms? Not that it really matters. > The patch fixes that by calculating the size of the first element by > taking the size of the outer mode and subtracting the size of the second > element. I wonder if we should admit that vector sizes are always a power of 2 and use half the size of the outer mode? > I've added an assert to make sure that the second element is not also a > const_int, as a vec_concat of const_ints doesn't make sense as far as I > can see.
On 04/03/15 10:41, Marc Glisse wrote: > On Wed, 4 Mar 2015, Kyrill Tkachov wrote: > >> Hi all, >> >> This patch fixes PR rtl-optimization 65235. >> As mentioned in bugzilla: >> Combine tries to combine: >> (insn 72 71 73 2 (set (reg:V2DI 117 [ D.18177 ]) >> (vec_concat:V2DI (reg:DI 176 [ D.18179 ]) >> (reg:DI 114 [ D.18168 ]))) >> (expr_list:REG_DEAD (reg:DI 176 [ D.18179 ]) >> (expr_list:REG_DEAD (reg:DI 114 [ D.18168 ]) >> >> and >> >> (insn 104 102 105 2 (set (reg:DI 193 [ D.18168 ]) >> (vec_select:DI (reg:V2DI 117 [ D.18177 ]) >> (parallel [ >> (const_int 0 [0]) >> ]))) >> (expr_list:REG_DEAD (reg:V2DI 117 [ D.18177 ]) >> (nil))) >> >> but ends up generating: >> (set (reg:DI 193 [ D.18168 ]) >> (reg:DI 114 [ D.18168 ])) >> >> >> that is, it gets the wrong element of the vec_concat. >> >> The problem is that in simplify-rtx it tries to get the size of the >> first element, compare the requested offset against it and pick the >> second element if the offset is greater than that size. If the first >> element is a const_int, the size is 0, so it will always pick the second >> part. > I am a bit surprised that the first element is a const_int and not reg > 176, maybe that's why it doesn't reproduce on other platforms? Not that it > really matters. Yeah, I guess combine tries various whacky combinations when it's trying to munge things together. > >> The patch fixes that by calculating the size of the first element by >> taking the size of the outer mode and subtracting the size of the second >> element. > I wonder if we should admit that vector sizes are always a power of 2 and > use half the size of the outer mode? Perhaps. I tried to be as defensive as possible since I'm not sure what the restrictions are on vec_select+vec_concat combinations. Kyrill > >> I've added an assert to make sure that the second element is not also a >> const_int, as a vec_concat of const_ints doesn't make sense as far as I >> can see.
> The patch fixes that by calculating the size of the first element by > taking the size of the outer mode and subtracting the size of the second > element. > > I've added an assert to make sure that the second element is not also a > const_int, as a vec_concat of const_ints doesn't make sense as far as I can > see. I'm not sure about the assert, can't we just punt in this case? > Bootstrapped and tested on aarch64-none-linux-gnu, > arm-none-linux-gnueabihf, x86_64-linux-gnu. > This bug appears on trunk, 4.9 and 4.8, so it's not a regression on the > release branches but it is a wrong-code bug. I think that the fix would be acceptable for GCC 5 without the assert.
commit 816f7592617681c48e5303aaa9e3b0ee8a858457 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Thu Feb 26 16:40:52 2015 +0000 [simplify-rtx] Calculate vector size correctly when simplifying (vec_select (vec_concat (const_int) (...)) [...]) diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index a003b41..3a6159f 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -3555,7 +3555,19 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode, while (GET_MODE (vec) != mode && GET_CODE (vec) == VEC_CONCAT) { - HOST_WIDE_INT vec_size = GET_MODE_SIZE (GET_MODE (XEXP (vec, 0))); + HOST_WIDE_INT vec_size; + + if (CONST_INT_P (XEXP (vec, 0))) + { + /* vec_concat of two const_ints doesn't make sense with + respect to modes. */ + gcc_assert (!CONST_INT_P (XEXP (vec, 1))); + vec_size = GET_MODE_SIZE (GET_MODE (trueop0)) + - GET_MODE_SIZE (GET_MODE (XEXP (vec, 1))); + } + else + vec_size = GET_MODE_SIZE (GET_MODE (XEXP (vec, 0))); + if (offset < vec_size) vec = XEXP (vec, 0); else diff --git a/gcc/testsuite/gcc.target/aarch64/pr65235_1.c b/gcc/testsuite/gcc.target/aarch64/pr65235_1.c new file mode 100644 index 0000000..ca12cd5 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr65235_1.c @@ -0,0 +1,30 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "arm_neon.h" + +int +main (int argc, char** argv) +{ + int64x1_t val1; + int64x1_t val2; + int64x1_t val3; + uint64x1_t val13; + uint64x2_t val14; + uint64_t got; + uint64_t exp; + val1 = vcreate_s64(UINT64_C(0xffffffff80008000)); + val2 = vcreate_s64(UINT64_C(0x0000f38d00000000)); + val3 = vcreate_s64(UINT64_C(0xffff7fff0000809b)); + /* Expect: "val13" = 8000000000001553. */ + val13 = vcreate_u64 (UINT64_C(0x8000000000001553)); + /* Expect: "val14" = 0010 0000 0000 0002 0000 0000 0000 0000. */ + val14 = vcombine_u64(vcgt_s64(vqrshl_s64(val1, val2), + vshr_n_s64(val3, 18)), + vshr_n_u64(val13, 11)); + /* Should be 0000000000000000. */ + got = vgetq_lane_u64(val14, 0); + exp = 0; + if(exp != got) + __builtin_abort (); +}