Message ID | YkgWe+Tq0SxZUvUN@tucnak |
---|---|
State | New |
Headers | show |
Series | i386: Fix up ix86_expand_vector_init_general [PR105123] | expand |
On Sat, Apr 2, 2022 at 11:25 AM Jakub Jelinek <jakub@redhat.com> wrote: > > Hi! > > The following testcase is miscompiled on ia32. > The problem is that at -O0 we end up with: > vector(4) short unsigned int _1; > short unsigned int u.0_3; > ... > _1 = {u.0_3, u.0_3, u.0_3, u.0_3}; > statement (dead) which is wrongly expanded. > elt is (subreg:HI (reg:SI 83 [ u.0_3 ]) 0), tmp_mode SImode, > so after convert_mode we start with word (reg:SI 83 [ u.0_3 ]). > The intent is to manually broadcast that value to 2 SImode parts, > but because we pass word as target to expand_simple_binop, it will > overwrite (reg:SI 83 [ u.0_3 ]) and we end up with 0: > 10: {r83:SI=r83:SI<<0x10;clobber flags:CC;} > 11: {r83:SI=r83:SI|r83:SI;clobber flags:CC;} > 12: {r83:SI=r83:SI<<0x10;clobber flags:CC;} > 13: {r83:SI=r83:SI|r83:SI;clobber flags:CC;} > 14: clobber r110:V4HI > 15: r110:V4HI#0=r83:SI > 16: r110:V4HI#4=r83:SI > as the two ors do nothing and two shifts each by 16 left shift it all > away. > The following patch fixes that by using NULL_RTX target, so we expand it as > 10: {r110:SI=r83:SI<<0x10;clobber flags:CC;} > 11: {r111:SI=r110:SI|r83:SI;clobber flags:CC;} > 12: {r112:SI=r83:SI<<0x10;clobber flags:CC;} > 13: {r113:SI=r112:SI|r83:SI;clobber flags:CC;} > 14: clobber r114:V4HI > 15: r114:V4HI#0=r111:SI > 16: r114:V4HI#4=r113:SI > instead. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > Another possibility would be to pass NULL_RTX only when word == elt > and word otherwise, where word would necessarily be a pseudo from the first > shift after passing NULL_RTX there once or pass NULL_RTX for the shift and > word for ior. > > 2022-04-02 Jakub Jelinek <jakub@redhat.com> > > PR target/105123 > * config/i386/i386-expand.cc (ix86_expand_vector_init_general): Avoid > using word as target for expand_simple_binop when doing ASHIFT and > IOR. > > * gcc.target/i386/pr105123.c: New test. OK. Thanks, Uros. > > --- gcc/config/i386/i386-expand.cc.jj 2022-03-19 13:52:53.000000000 +0100 > +++ gcc/config/i386/i386-expand.cc 2022-04-01 16:51:27.253154191 +0200 > @@ -15830,9 +15830,9 @@ quarter: > else > { > word = expand_simple_binop (tmp_mode, ASHIFT, word, shift, > - word, 1, OPTAB_LIB_WIDEN); > + NULL_RTX, 1, OPTAB_LIB_WIDEN); > word = expand_simple_binop (tmp_mode, IOR, word, elt, > - word, 1, OPTAB_LIB_WIDEN); > + NULL_RTX, 1, OPTAB_LIB_WIDEN); > } > } > > --- gcc/testsuite/gcc.target/i386/pr105123.c.jj 2022-04-01 16:56:44.549625810 +0200 > +++ gcc/testsuite/gcc.target/i386/pr105123.c 2022-04-01 16:56:33.569782511 +0200 > @@ -0,0 +1,22 @@ > +/* PR target/105123 */ > +/* { dg-do run { target sse2_runtime } } */ > +/* { dg-options "-msse2" } */ > +/* { dg-additional-options "-mtune=i686" { target ia32 } } */ > + > +typedef unsigned short __attribute__((__vector_size__ (4 * sizeof (unsigned short)))) V; > + > +V > +foo (unsigned short u, V v) > +{ > + return __builtin_shuffle (u * v, v); > +} > + > +int > +main () > +{ > + V x = foo (1, (V) { 0, 1, 2, 3 }); > + for (unsigned i = 0; i < 4; i++) > + if (x[i] != i) > + __builtin_abort (); > + return 0; > +} > > Jakub >
--- gcc/config/i386/i386-expand.cc.jj 2022-03-19 13:52:53.000000000 +0100 +++ gcc/config/i386/i386-expand.cc 2022-04-01 16:51:27.253154191 +0200 @@ -15830,9 +15830,9 @@ quarter: else { word = expand_simple_binop (tmp_mode, ASHIFT, word, shift, - word, 1, OPTAB_LIB_WIDEN); + NULL_RTX, 1, OPTAB_LIB_WIDEN); word = expand_simple_binop (tmp_mode, IOR, word, elt, - word, 1, OPTAB_LIB_WIDEN); + NULL_RTX, 1, OPTAB_LIB_WIDEN); } } --- gcc/testsuite/gcc.target/i386/pr105123.c.jj 2022-04-01 16:56:44.549625810 +0200 +++ gcc/testsuite/gcc.target/i386/pr105123.c 2022-04-01 16:56:33.569782511 +0200 @@ -0,0 +1,22 @@ +/* PR target/105123 */ +/* { dg-do run { target sse2_runtime } } */ +/* { dg-options "-msse2" } */ +/* { dg-additional-options "-mtune=i686" { target ia32 } } */ + +typedef unsigned short __attribute__((__vector_size__ (4 * sizeof (unsigned short)))) V; + +V +foo (unsigned short u, V v) +{ + return __builtin_shuffle (u * v, v); +} + +int +main () +{ + V x = foo (1, (V) { 0, 1, 2, 3 }); + for (unsigned i = 0; i < 4; i++) + if (x[i] != i) + __builtin_abort (); + return 0; +}