Message ID | 007f01daebd7$7b2f14b0$718d3e10$@nextmovesoftware.com |
---|---|
State | New |
Series | [x86] PR target/116275: Handle STV of *extenddi2_doubleword_highpart |
On Sun, Aug 11, 2024 at 12:16 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch resolves PR target/116275, a recent ICE-on-valid regression on
> -m32 caused by my recent change to enable STV of DImode arithmetic right
> shift on non-AVX512VL targets.  The oversight is that the i386 backend
> contains an *extenddi2_doubleword_highpart instruction (whose pattern
> is an arithmetic right shift of a left shift) that optimizes the case where
> sign-extension need only update the highpart word of a DImode value when
> generating 32-bit code (!TARGET_64BIT).  STV accepts this pattern as a
> candidate, as there are patterns to handle this form of extension on SSE
> using AVX512VL instructions (and previously ASHIFTRT was only allowed on
> AVX512VL).  Now that ASHIFTRT is a candidate on non-AVX512VL targets, we
> either need to check that the first operand is a register, or as done
> below provide the define_insn_and_split that provides a non-AVX512VL
> implementation of *extendv2di2_highpart_stv.
>
> The new testcase only ICEed with -m32, so this test could be limited to
> target ia32, but there's no harm also running this test on -m64 to
> provide a little extra test coverage.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2024-08-11  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/116275
>         * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New
>         define_insn_and_split to handle the STV conversion of the DImode
>         pattern *extenddi2_doubleword_highpart.
>
> gcc/testsuite/ChangeLog
>         PR target/116275
>         * g++.target/i386/pr116275.C: New test case.
+  [(set (match_dup 0)
+       (ashift:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+       (ashiftrt:V2DI (match_dup 0) (match_dup 2)))])

Since this pattern is split before reload, you can perhaps introduce a new
V2DI temporary register and use it as the output of the first RTX. This will
ease the job of RA a tiny bit.

OK with or without the above suggestion.

Thanks,
Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index db7789c..1a6188f 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -17393,6 +17393,24 @@
        (ashift:V2DI (match_dup 1) (match_dup 2)))
    (set (match_dup 0)
        (ashiftrt:V2DI (match_dup 0) (match_dup 2)))])
+
+;; Without AVX512VL, split this instruction before reload.
+(define_insn_and_split "*extendv2di2_highpart_stv_noavx512vl"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+       (ashiftrt:V2DI
+         (ashift:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "vm")
+                      (match_operand:QI 2 "const_int_operand"))
+         (match_operand:QI 3 "const_int_operand")))]
+  "!TARGET_AVX512VL
+   && INTVAL (operands[2]) == INTVAL (operands[3])
+   && UINTVAL (operands[2]) < 32
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+       (ashift:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+       (ashiftrt:V2DI (match_dup 0) (match_dup 2)))])
 
 ;; Rotate instructions
 
diff --git a/gcc/testsuite/g++.target/i386/pr116275.C b/gcc/testsuite/g++.target/i386/pr116275.C
new file mode 100644
index 0000000..69c5b5a
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr116275.C
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx -std=c++11" } */
+
+struct SymbolDesc push_back(SymbolDesc);
+struct SymbolDesc {
+  long long ELFLocalSymIdx;
+};
+struct Expected {
+  long long &operator*();
+};
+void SymbolizableObjectFileaddSymbol() {
+  Expected SymbolAddressOrErr;
+  long long SymbolAddress = *SymbolAddressOrErr << 8 >> 8;
+  push_back({SymbolAddress});
+}
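
For readers unfamiliar with the idiom, the testcase's `<< 8 >> 8` is the "arithmetic right shift of a left shift" that the patch description refers to. The following is only a minimal C++ sketch of why such a sign-extension touches just the high 32-bit word when the shift count is below 32 (the `UINTVAL (operands[2]) < 32` condition). It assumes GCC's behaviour for right-shifting negative values (arithmetic) and for out-of-range unsigned-to-signed conversions (modular). The helper names `shift_pair`, `highpart_only` and `check_equivalence` are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend the low (64 - n) bits of x using a left shift followed by an
// arithmetic right shift by the same count (assumes 0 <= n < 64).
// The left shift is performed on the unsigned type to avoid signed overflow.
static int64_t shift_pair(int64_t x, unsigned n) {
  return static_cast<int64_t>(static_cast<uint64_t>(x) << n) >> n;
}

// Hypothetical reference for n < 32: leave the low 32-bit word untouched and
// apply the same shift pair to the high 32-bit word only.
static int64_t highpart_only(int64_t x, unsigned n) {
  uint32_t lo = static_cast<uint32_t>(x);
  uint32_t hi = static_cast<uint32_t>(static_cast<uint64_t>(x) >> 32);
  int32_t hi_ext = static_cast<int32_t>(hi << n) >> n;
  uint64_t hi_bits = static_cast<uint64_t>(static_cast<uint32_t>(hi_ext)) << 32;
  return static_cast<int64_t>(hi_bits | lo);
}

// Check that both formulations agree for every shift count below 32.
static void check_equivalence() {
  const int64_t samples[] = {0, 1, -1, INT64_C(0x0080000000000000),
                             INT64_C(0x7fffffffffffffff),
                             INT64_C(0x123456789abcdef0)};
  for (int64_t x : samples)
    for (unsigned n = 0; n < 32; ++n)
      assert(shift_pair(x, n) == highpart_only(x, n));
}
```

Calling `check_equivalence()` exercises all counts from 0 to 31. For counts of 32 or more the low word would change as well, so the highpart-only shortcut would no longer apply.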