Message ID | mpteevkhmws.fsf@arm.com |
---|---|
State | New |
Headers | show |
Series | simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763] | expand |
On Mon, 2020-01-27 at 16:41 +0000, Richard Sandiford wrote: > In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: > > Failed to match this instruction: > (set (reg:SI 95) > (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > If we perform the natural simplification to: > > (set (reg:SI 95) > (ashift:SI (sign_extract:SI (reg:SI 97) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > then the pattern matches. And it turns out that we do have a > simplification like that already, but it would only kick in for > extractions from a reg, not a subreg. E.g.: Yea. I ran into similar problems with the extract/extend bits in combine. And we know it's a fairly general problem that we don't handle SUBREGs anywhere near as consistently as REGs. > > (set (reg:SI 95) > (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > would simplify to: > > (set (reg:SI 95) > (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > IMO the subreg case is even more obviously a simplification > than the bare reg case, since the net effect is to remove > either one or two subregs, rather than simply change the > position of a subreg/truncation. > > However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c > for -m32 on x86_64-linux-gnu, because we could then simplify > a :HI zero_extract to a :QI one. The associated *testqi_ext_3 > pattern did already seem to want to handle QImode extractions: > > "ix86_match_ccmode (insn, CCNOmode) > && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) > || GET_MODE (operands[2]) == SImode > || GET_MODE (operands[2]) == HImode > || GET_MODE (operands[2]) == QImode) > > but I'm not sure how often the QI case would trigger in practice, > since the zero_extract mode was restricted to HI and above. I checked > the other x86 patterns and couldn't see any other instances of this. > > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, > OK to install? > > Richard > > > 2020-01-27 Richard Sandiford <richard.sandiford@arm.com> > > gcc/ > PR rtl-optimization/87763 > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract > simplification to handle subregs as well as bare regs. > * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. Do you need to check for and reject paradoxicals here? If not, OK as- is. If you need to check, then that's pre-approved as well. jeff >
Jeff Law <law@redhat.com> writes: > On Mon, 2020-01-27 at 16:41 +0000, Richard Sandiford wrote: >> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: >> >> Failed to match this instruction: >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> If we perform the natural simplification to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (reg:SI 97) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> then the pattern matches. And it turns out that we do have a >> simplification like that already, but it would only kick in for >> extractions from a reg, not a subreg. E.g.: > Yea. I ran into similar problems with the extract/extend bits in > combine. And we know it's a fairly general problem that we don't > handle SUBREGs anywhere near as consistently as REGs. > > >> >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> would simplify to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> IMO the subreg case is even more obviously a simplification >> than the bare reg case, since the net effect is to remove >> either one or two subregs, rather than simply change the >> position of a subreg/truncation. >> >> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c >> for -m32 on x86_64-linux-gnu, because we could then simplify >> a :HI zero_extract to a :QI one. The associated *testqi_ext_3 >> pattern did already seem to want to handle QImode extractions: >> >> "ix86_match_ccmode (insn, CCNOmode) >> && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) >> || GET_MODE (operands[2]) == SImode >> || GET_MODE (operands[2]) == HImode >> || GET_MODE (operands[2]) == QImode) >> >> but I'm not sure how often the QI case would trigger in practice, >> since the zero_extract mode was restricted to HI and above. I checked >> the other x86 patterns and couldn't see any other instances of this. >> >> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, >> OK to install? >> >> Richard >> >> >> 2020-01-27 Richard Sandiford <richard.sandiford@arm.com> >> >> gcc/ >> PR rtl-optimization/87763 >> * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> simplification to handle subregs as well as bare regs. >> * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. > Do you need to check for and reject paradoxicals here? If not, OK as- > is. If you need to check, then that's pre-approved as well. Thanks, pushed. On the paradoxical thing: the outer subreg is always non-paradoxical for simplify_truncation, so we don't need to check that specifically. For the inner subreg we need to handle the paradoxical case for the PR. I wondered at one point about punting for non-paradoxical inner subregs, but there didn't seem to be a good reason to keep an expression like (truncate (op (truncate...))) over (op (truncate ...)). If it turns out there is a good reason though, changing SUBREG_P to paradoxical_subreg_p would be fine in terms of this PR. Richard
On Jan 27 2020, Richard Sandiford wrote: > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract > simplification to handle subregs as well as bare regs. That breaks gcc.target/m68k/pr39726.c @@ -67,39 +67,39 @@ Disassembly of section .text: 00000088 <a0wbs>: 88: 302f 0006 movew %sp@(6),%d0 8c: 906f 000a subw %sp@(10),%d0 - 90: 4a00 tstb %d0 - 92: 6a08 bpls 9c <a0wbs+0x14> - 94: 13fc 0001 0000 moveb #1,0 <a0bs> - 9a: 0000 - 98: R_68K_32 v - 9c: 4e75 rts + 90: 0240 0080 andiw #128,%d0 + 94: 6708 beqs 9e <a0wbs+0x16> + 96: 13fc 0001 0000 moveb #1,0 <a0bs> + 9c: 0000 + 9a: R_68K_32 v + 9e: 4e75 rts -0000009e <a1wbs>: - 9e: 302f 0006 movew %sp@(6),%d0 - a2: d06f 000a addw %sp@(10),%d0 - a6: 4a00 tstb %d0 - a8: 6a08 bpls b2 <a1wbs+0x14> - aa: 13fc 0001 0000 moveb #1,0 <a0bs> - b0: 0000 - ae: R_68K_32 v - b2: 4e75 rts +000000a0 <a1wbs>: + a0: 302f 0006 movew %sp@(6),%d0 + a4: d06f 000a addw %sp@(10),%d0 + a8: 0240 0080 andiw #128,%d0 + ac: 6708 beqs b6 <a1wbs+0x16> + ae: 13fc 0001 0000 moveb #1,0 <a0bs> + b4: 0000 + b2: R_68K_32 v + b6: 4e75 rts -000000b4 <a0w>: - b4: 302f 0006 movew %sp@(6),%d0 - b8: 906f 000a subw %sp@(10),%d0 - bc: 0240 8421 andiw #-31711,%d0 - c0: 6708 beqs ca <a0w+0x16> - c2: 13fc 0001 0000 moveb #1,0 <a0bs> - c8: 0000 - c6: R_68K_32 v - ca: 4e75 rts +000000b8 <a0w>: + b8: 302f 0006 movew %sp@(6),%d0 + bc: 906f 000a subw %sp@(10),%d0 + c0: 0240 8421 andiw #-31711,%d0 + c4: 6708 beqs ce <a0w+0x16> + c6: 13fc 0001 0000 moveb #1,0 <a0bs> + cc: 0000 + ca: R_68K_32 v + ce: 4e75 rts -000000cc <a1w>: - cc: 302f 0006 movew %sp@(6),%d0 - d0: d06f 000a addw %sp@(10),%d0 - d4: 0240 8421 andiw #-31711,%d0 - d8: 6708 beqs e2 <a1w+0x16> - da: 13fc 0001 0000 moveb #1,0 <a0bs> - e0: 0000 - de: R_68K_32 v - e2: 4e75 rts +000000d0 <a1w>: + d0: 302f 0006 movew %sp@(6),%d0 + d4: d06f 000a addw %sp@(10),%d0 + d8: 0240 8421 andiw #-31711,%d0 + dc: 6708 beqs e6 <a1w+0x16> + de: 13fc 0001 0000 moveb #1,0 <a0bs> + e4: 0000 + e2: R_68K_32 v + e6: 4e75 rts Andreas.
Andreas Schwab <schwab@suse.de> writes: > On Jan 27 2020, Richard Sandiford wrote: > >> * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> simplification to handle subregs as well as bare regs. > > That breaks gcc.target/m68k/pr39726.c Gah. Jeff pointed out off-list that it also broke gcc.target/sh/pr64345-1.c on sh3-linux-gnu. It didn't look like either of them would be fixable in an acceptably non-invasive and unhacky way, so I've reverted the patch "for now". Thanks, Richard
On Wed, 2020-01-29 at 19:18 +0000, Richard Sandiford wrote: > Andreas Schwab <schwab@suse.de> writes: > > On Jan 27 2020, Richard Sandiford wrote: > > > > > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract > > > simplification to handle subregs as well as bare regs. > > > > That breaks gcc.target/m68k/pr39726.c > > Gah. Jeff pointed out off-list that it also broke > gcc.target/sh/pr64345-1.c on sh3-linux-gnu. It didn't look like either > of them would be fixable in an acceptably non-invasive and unhacky way, > so I've reverted the patch "for now". I would have considered letting those two targets regress those tests to move forward on 87763. aarch64 is (IMHO) more important than the sh and m68k combined ;-) It also seems to me that your patch generates better RTL and that we could claim that a port that regresses on code quality needs its port maintainer to step in and fix the port. WRT the m68k issue I'd think it could be fixed by twiddling cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the zero_extract and making some minor adjustments in its output code. Something like the attached. I haven't tested it in any real way and haven't really thought about whether or not it does the right thing for a MEM operand. I'd be surprised if the SH fix wasn't similar, but I know almost nothing about the SH implementation. Jeff
On Thu, Jan 30, 2020 at 10:23:35AM -0700, Jeff Law wrote: > diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md > index 8e35357ea23..78c4cbe4753 100644 > --- a/gcc/config/m68k/m68k.md > +++ b/gcc/config/m68k/m68k.md > @@ -644,12 +644,12 @@ > return m68k_output_branch_integer (code); > }) > > -(define_insn "cbranchsi4_btst_reg_insn_1" > +(define_insn "cbranch<mode>4_btst_reg_insn_1" > [(set (pc) > (if_then_else (match_operator 0 "equality_comparison_operator" > - [(zero_extract:SI (match_operand:SI 1 "nonimmediate_operand" "do,dQ") > - (const_int 1) > - (match_operand:SI 2 "const_int_operand" "n,n")) > + [(zero_extract:I (match_operand:I 1 "nonimmediate_operand" "do,dQ") > + (const_int 1) > + (match_operand:I 2 "const_int_operand" "n,n")) > (const_int 0)]) > (label_ref (match_operand 3 "")) > (pc)))] > @@ -665,8 +665,9 @@ > } > else > { > - operands[2] = GEN_INT (31 - INTVAL (operands[2])); > - code = m68k_output_btst (operands[2], operands[1], code, 31); > + operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1])) > + - INTVAL (operands[2]) - 1); > + code = m68k_output_btst (operands[2], operands[1], code, GET_MODE_BITSIZE (GET_MODE (operands[1])) - 1); s/GET_MODE (operands[1])/<MODE>mode/g ? Also, the last line is too long. Jakub
On Thu, 2020-01-30 at 18:27 +0100, Jakub Jelinek wrote: > On Thu, Jan 30, 2020 at 10:23:35AM -0700, Jeff Law wrote: > > diff --git a/gcc/config/m68k/m68k.md b/gcc/config/m68k/m68k.md > > index 8e35357ea23..78c4cbe4753 100644 > > --- a/gcc/config/m68k/m68k.md > > +++ b/gcc/config/m68k/m68k.md > > @@ -644,12 +644,12 @@ > > return m68k_output_branch_integer (code); > > }) > > > > -(define_insn "cbranchsi4_btst_reg_insn_1" > > +(define_insn "cbranch<mode>4_btst_reg_insn_1" > > [(set (pc) > > (if_then_else (match_operator 0 "equality_comparison_operator" > > - [(zero_extract:SI (match_operand:SI 1 "nonimmediate_operand" "do,dQ") > > - (const_int 1) > > - (match_operand:SI 2 "const_int_operand" "n,n")) > > + [(zero_extract:I (match_operand:I 1 "nonimmediate_operand" "do,dQ") > > + (const_int 1) > > + (match_operand:I 2 "const_int_operand" "n,n")) > > (const_int 0)]) > > (label_ref (match_operand 3 "")) > > (pc)))] > > @@ -665,8 +665,9 @@ > > } > > else > > { > > - operands[2] = GEN_INT (31 - INTVAL (operands[2])); > > - code = m68k_output_btst (operands[2], operands[1], code, 31); > > + operands[2] = GEN_INT (GET_MODE_BITSIZE (GET_MODE (operands[1])) > > + - INTVAL (operands[2]) - 1); > > + code = m68k_output_btst (operands[2], operands[1], code, GET_MODE_BITSIZE (GET_MODE (operands[1])) - 1); > > s/GET_MODE (operands[1])/<MODE>mode/g ? > Also, the last line is too long. It's not meant for inclusion as-is, but to show how we might fix this and allow us to move forward on 87763. Jeff
Jeff Law <law@redhat.com> writes: > On Wed, 2020-01-29 at 19:18 +0000, Richard Sandiford wrote: >> Andreas Schwab <schwab@suse.de> writes: >> > On Jan 27 2020, Richard Sandiford wrote: >> > >> > > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> > > simplification to handle subregs as well as bare regs. >> > >> > That breaks gcc.target/m68k/pr39726.c >> >> Gah. Jeff pointed out off-list that it also broke >> gcc.target/sh/pr64345-1.c on sh3-linux-gnu. It didn't look like either >> of them would be fixable in an acceptably non-invasive and unhacky way, >> so I've reverted the patch "for now". > I would have considered letting those two targets regress those tests > to move forward on 87763. aarch64 is (IMHO) more important than the sh > and m68k combined ;-) It also seems to me that your patch generates > better RTL and that we could claim that a port that regresses on code > quality needs its port maintainer to step in and fix the port. > > WRT the m68k issue I'd think it could be fixed by twiddling > cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the > zero_extract and making some minor adjustments in its output code. > Something like the attached. I haven't tested it in any real way and > haven't really thought about whether or not it does the right thing for > a MEM operand. It just feels like this is breaking the contract with extv and extzv, where all *_extracts are supposed to be in the mode that those patterns accept. My i386 patch was doing that too TBH, I was just being optimistic given that QImode was already handled by that pattern. :-) So IMO my patch has too many knock-on effects for it to be suitable for stage 4. While we have make_extraction, we probably have to leave this kind of expression untouched. For AArch64 I was planning on adding a new pattern to match the (subreg:SI (zero_extract:DI ...)) -- as one of the comments in the PR suggested -- but with the subreg matched via subreg_lowpart_operator, to make it endian-safe. This is similar in spirit to the new i686 popcount patterns. Thanks, Richard > > I'd be surprised if the SH fix wasn't similar, but I know almost > nothing about the SH implementation. > > Jeff
On Thu, 2020-01-30 at 17:54 +0000, Richard Sandiford wrote: > Jeff Law <law@redhat.com> writes: > > On Wed, 2020-01-29 at 19:18 +0000, Richard Sandiford wrote: > > > Andreas Schwab <schwab@suse.de> writes: > > > > On Jan 27 2020, Richard Sandiford wrote: > > > > > > > > > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract > > > > > simplification to handle subregs as well as bare regs. > > > > > > > > That breaks gcc.target/m68k/pr39726.c > > > > > > Gah. Jeff pointed out off-list that it also broke > > > gcc.target/sh/pr64345-1.c on sh3-linux-gnu. It didn't look like either > > > of them would be fixable in an acceptably non-invasive and unhacky way, > > > so I've reverted the patch "for now". > > I would have considered letting those two targets regress those tests > > to move forward on 87763. aarch64 is (IMHO) more important than the sh > > and m68k combined ;-) It also seems to me that your patch generates > > better RTL and that we could claim that a port that regresses on code > > quality needs its port maintainer to step in and fix the port. > > > > WRT the m68k issue I'd think it could be fixed by twiddling > > cbranchsi4_btst_reg_insn_1 to accept a mode iterator on the > > zero_extract and making some minor adjustments in its output code. > > Something like the attached. I haven't tested it in any real way and > > haven't really thought about whether or not it does the right thing for > > a MEM operand. > > It just feels like this is breaking the contract with extv and extzv, > where all *_extracts are supposed to be in the mode that those patterns > accept. My i386 patch was doing that too TBH, I was just being > optimistic given that QImode was already handled by that pattern. :-) > > So IMO my patch has too many knock-on effects for it to be suitable for > stage 4. While we have make_extraction, we probably have to leave this > kind of expression untouched. > > For AArch64 I was planning on adding a new pattern to match the > (subreg:SI (zero_extract:DI ...)) -- as one of the comments in the > PR suggested -- but with the subreg matched via subreg_lowpart_operator, > to make it endian-safe. This is similar in spirit to the new i686 > popcount patterns. Fair enough on the simplify-rtx change :-) One might argue that we should be loosening the requirements, but memory operands are particularly troublesome. I think HP and I had a light discussion on that a year or so ago. WRT adding patterns to aarch64. Yea, I went that route last year, but never was able to go from "hey, this seems to work pretty well" to "it's OK for the trunk". Jeff
Hi Richard, You patch below increases code-size on aarch64-linux-gnu with -Os on SPEC2k6 400.perlbench and 453.povray -- by 1% and 2% respectively. 400.perlbench,perlbench_base.default, 101,939261,951221 453.povray,povray_base.default, 102,707807,721399 Would you please check whether these can be avoided? [Let me know if you need help reproducing this problem.] Thank you, -- Maxim Kuvyrkov https://www.linaro.org > On Jan 27, 2020, at 7:41 PM, Richard Sandiford <richard.sandiford@arm.com> wrote: > > In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: > > Failed to match this instruction: > (set (reg:SI 95) > (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > If we perform the natural simplification to: > > (set (reg:SI 95) > (ashift:SI (sign_extract:SI (reg:SI 97) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > then the pattern matches. And it turns out that we do have a > simplification like that already, but it would only kick in for > extractions from a reg, not a subreg. E.g.: > > (set (reg:SI 95) > (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > would simplify to: > > (set (reg:SI 95) > (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) > (const_int 3 [0x3]) > (const_int 0 [0])) 0) > (const_int 19 [0x13]))) > > IMO the subreg case is even more obviously a simplification > than the bare reg case, since the net effect is to remove > either one or two subregs, rather than simply change the > position of a subreg/truncation. > > However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c > for -m32 on x86_64-linux-gnu, because we could then simplify > a :HI zero_extract to a :QI one. The associated *testqi_ext_3 > pattern did already seem to want to handle QImode extractions: > > "ix86_match_ccmode (insn, CCNOmode) > && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) > || GET_MODE (operands[2]) == SImode > || GET_MODE (operands[2]) == HImode > || GET_MODE (operands[2]) == QImode) > > but I'm not sure how often the QI case would trigger in practice, > since the zero_extract mode was restricted to HI and above. I checked > the other x86 patterns and couldn't see any other instances of this. > > Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, > OK to install? > > Richard > > > 2020-01-27 Richard Sandiford <richard.sandiford@arm.com> > > gcc/ > PR rtl-optimization/87763 > * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract > simplification to handle subregs as well as bare regs. > * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. > --- > gcc/config/i386/i386.md | 2 +- > gcc/simplify-rtx.c | 4 +++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md > index 6e9c9bd2fb6..a125ab350bb 100644 > --- a/gcc/config/i386/i386.md > +++ b/gcc/config/i386/i386.md > @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2" > (define_insn_and_split "*testqi_ext_3" > [(set (match_operand 0 "flags_reg_operand") > (match_operator 1 "compare_operator" > - [(zero_extract:SWI248 > + [(zero_extract:SWI > (match_operand 2 "nonimmediate_operand" "rm") > (match_operand 3 "const_int_operand" "n") > (match_operand 4 "const_int_operand" "n")) > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c > index eff1d07a253..db4f9339c15 100644 > --- a/gcc/simplify-rtx.c > +++ b/gcc/simplify-rtx.c > @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op, > (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without > changing len. */ > if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT) > - && REG_P (XEXP (op, 0)) > + && (REG_P (XEXP (op, 0)) > + || (SUBREG_P (XEXP (op, 0)) > + && REG_P (SUBREG_REG (XEXP (op, 0))))) > && GET_MODE (XEXP (op, 0)) == GET_MODE (op) > && CONST_INT_P (XEXP (op, 1)) > && CONST_INT_P (XEXP (op, 2))) > -- > 2.17.1 >
Maxim Kuvyrkov <maxim.kuvyrkov@gmail.com> writes: > Hi Richard, > > You patch below increases code-size on aarch64-linux-gnu with -Os on SPEC2k6 400.perlbench and 453.povray -- by 1% and 2% respectively. > > 400.perlbench,perlbench_base.default, 101,939261,951221 > 453.povray,povray_base.default, 102,707807,721399 > > Would you please check whether these can be avoided? [Let me know if you need help reproducing this problem.] I reverted the patch on Wednesday, see: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01962.html Thanks, Richard > > Thank you, > > -- > Maxim Kuvyrkov > https://www.linaro.org > >> On Jan 27, 2020, at 7:41 PM, Richard Sandiford <richard.sandiford@arm.com> wrote: >> >> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have: >> >> Failed to match this instruction: >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> If we perform the natural simplification to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (reg:SI 97) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> then the pattern matches. And it turns out that we do have a >> simplification like that already, but it would only kick in for >> extractions from a reg, not a subreg. E.g.: >> >> (set (reg:SI 95) >> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> would simplify to: >> >> (set (reg:SI 95) >> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0) >> (const_int 3 [0x3]) >> (const_int 0 [0])) 0) >> (const_int 19 [0x13]))) >> >> IMO the subreg case is even more obviously a simplification >> than the bare reg case, since the net effect is to remove >> either one or two subregs, rather than simply change the >> position of a subreg/truncation. >> >> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c >> for -m32 on x86_64-linux-gnu, because we could then simplify >> a :HI zero_extract to a :QI one. The associated *testqi_ext_3 >> pattern did already seem to want to handle QImode extractions: >> >> "ix86_match_ccmode (insn, CCNOmode) >> && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode) >> || GET_MODE (operands[2]) == SImode >> || GET_MODE (operands[2]) == HImode >> || GET_MODE (operands[2]) == QImode) >> >> but I'm not sure how often the QI case would trigger in practice, >> since the zero_extract mode was restricted to HI and above. I checked >> the other x86 patterns and couldn't see any other instances of this. >> >> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu, >> OK to install? >> >> Richard >> >> >> 2020-01-27 Richard Sandiford <richard.sandiford@arm.com> >> >> gcc/ >> PR rtl-optimization/87763 >> * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract >> simplification to handle subregs as well as bare regs. >> * config/i386/i386.md (*testqi_ext_3): Match QI extracts too. >> --- >> gcc/config/i386/i386.md | 2 +- >> gcc/simplify-rtx.c | 4 +++- >> 2 files changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md >> index 6e9c9bd2fb6..a125ab350bb 100644 >> --- a/gcc/config/i386/i386.md >> +++ b/gcc/config/i386/i386.md >> @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2" >> (define_insn_and_split "*testqi_ext_3" >> [(set (match_operand 0 "flags_reg_operand") >> (match_operator 1 "compare_operator" >> - [(zero_extract:SWI248 >> + [(zero_extract:SWI >> (match_operand 2 "nonimmediate_operand" "rm") >> (match_operand 3 "const_int_operand" "n") >> (match_operand 4 "const_int_operand" "n")) >> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c >> index eff1d07a253..db4f9339c15 100644 >> --- a/gcc/simplify-rtx.c >> +++ b/gcc/simplify-rtx.c >> @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op, >> (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without >> changing len. */ >> if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT) >> - && REG_P (XEXP (op, 0)) >> + && (REG_P (XEXP (op, 0)) >> + || (SUBREG_P (XEXP (op, 0)) >> + && REG_P (SUBREG_REG (XEXP (op, 0))))) >> && GET_MODE (XEXP (op, 0)) == GET_MODE (op) >> && CONST_INT_P (XEXP (op, 1)) >> && CONST_INT_P (XEXP (op, 2))) >> -- >> 2.17.1 >>
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 6e9c9bd2fb6..a125ab350bb 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2" (define_insn_and_split "*testqi_ext_3" [(set (match_operand 0 "flags_reg_operand") (match_operator 1 "compare_operator" - [(zero_extract:SWI248 + [(zero_extract:SWI (match_operand 2 "nonimmediate_operand" "rm") (match_operand 3 "const_int_operand" "n") (match_operand 4 "const_int_operand" "n")) diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index eff1d07a253..db4f9339c15 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op, (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without changing len. */ if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT) - && REG_P (XEXP (op, 0)) + && (REG_P (XEXP (op, 0)) + || (SUBREG_P (XEXP (op, 0)) + && REG_P (SUBREG_REG (XEXP (op, 0))))) && GET_MODE (XEXP (op, 0)) == GET_MODE (op) && CONST_INT_P (XEXP (op, 1)) && CONST_INT_P (XEXP (op, 2)))