Message ID | 5357DA4F.7010906@arm.com |
---|---|
State | New |
Headers | show |
On 23/04/14 16:20, Alan Lawrence wrote: > This patch is a small tidy of a more-complicated expression that just flips a > single bit and can thus be a simple XOR. > > No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code > is indeed exercised by dg-torture.exp vshuf-v*.c). > > Also ok after applying TBL and testsuite patches in > http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and > http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html. > > gcc/ChangeLog: > 2014-04-23 Alan Lawrence <alan.lawrence@arm.com> > > * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression. > s/tidy/Tidy/ It's not obvious from your description (or from the code, for that matter) that for this to be valid nelt must be a power of 2. I suggest that, above the loop, you put gcc_assert (nelt == (nelt & -nelt)); OK with those changes. R. > > xor_tidy.patch > > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index a3147ee..b879754 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d) > rtx x; > > for (i = 0; i < nelt; ++i) > - d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1); > + d->perm[i] ^= nelt; /* Keep the same index, but in the other vector. */ > > x = d->op0; > d->op0 = d->op1; >
On 04/29/2014 05:42 AM, Richard Earnshaw wrote: > On 23/04/14 16:20, Alan Lawrence wrote: >> This patch is a small tidy of a more-complicated expression that just flips a >> single bit and can thus be a simple XOR. >> >> No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code >> is indeed exercised by dg-torture.exp vshuf-v*.c). >> >> Also ok after applying TBL and testsuite patches in >> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and >> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html. >> >> gcc/ChangeLog: >> 2014-04-23 Alan Lawrence <alan.lawrence@arm.com> >> >> * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression. >> > > s/tidy/Tidy/ > > It's not obvious from your description (or from the code, for that > matter) that for this to be valid nelt must be a power of 2. > > I suggest that, above the loop, you put > > gcc_assert (nelt == (nelt & -nelt)); > > OK with those changes. I think it's sort of obvious from context that we're working with a vector. And it also seems obvious that we won't have a vector without a power-of-two number of elements. r~
Whilst I agree with Richard H that it is obvious, my feeling is that the assertion does no harm, so have committed rev 210005 with Richard E's changes. --Alan Richard Henderson wrote: > On 04/29/2014 05:42 AM, Richard Earnshaw wrote: >> On 23/04/14 16:20, Alan Lawrence wrote: >>> This patch is a small tidy of a more-complicated expression that just flips a >>> single bit and can thus be a simple XOR. >>> >>> No regressions on aarch64-none-elf or aarch64_be-none-elf. (I've verified code >>> is indeed exercised by dg-torture.exp vshuf-v*.c). >>> >>> Also ok after applying TBL and testsuite patches in >>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg01309.html and >>> http://gcc.gnu.org/ml/gcc-patches/2014-04/msg00579.html. >>> >>> gcc/ChangeLog: >>> 2014-04-23 Alan Lawrence <alan.lawrence@arm.com> >>> >>> * config/aarch64/aarch64.c (aarch64_expand_vec_perm_1): tidy bit-flip expression. >>> >> s/tidy/Tidy/ >> >> It's not obvious from your description (or from the code, for that >> matter) that for this to be valid nelt must be a power of 2. >> >> I suggest that, above the loop, you put >> >> gcc_assert (nelt == (nelt & -nelt)); >> >> OK with those changes. > > I think it's sort of obvious from context that we're working with a vector. > And it also seems obvious that we won't have a vector without a power-of-two > number of elements. > > > r~ > >
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index a3147ee..b879754 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -8124,7 +8124,7 @@ aarch64_expand_vec_perm_const_1 (struct expand_vec_perm_d *d) rtx x; for (i = 0; i < nelt; ++i) - d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1); + d->perm[i] ^= nelt; /* Keep the same index, but in the other vector. */ x = d->op0; d->op0 = d->op1;