Message ID | 20220524214703.4022737-3-philipp.tomsich@vrull.eu |
---|---|
State | New |
Series | RISC-V: Improve sequences with shifted zero-extended operands |
LGTM, you can commit that without [3/3] if you like :)

On Wed, May 25, 2022 at 5:47 AM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote:
> [...]
Thanks, applied to master!
For [3/3], I'll submit a new standalone patch with the requested changes.

On Tue, 7 Jun 2022 at 12:25, Kito Cheng <kito.cheng@gmail.com> wrote:
>
> LGTM, you can commit that without [3/3] if you like :)
../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_insn*, rtx_def**)':
../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer expressions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compare]
  110 |         if ((scale + bias) != UINTVAL (operands[2]))
Hi Andreas:

Fixed via https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7, thanks!

On Fri, Jun 17, 2022 at 4:34 PM Andreas Schwab <schwab@linux-m68k.org> wrote:
>
> ../../gcc/config/riscv/bitmanip.md: In function 'rtx_insn* gen_split_44(rtx_insn*, rtx_def**)':
> ../../gcc/config/riscv/bitmanip.md:110:28: error: comparison of integer expressions of different signedness: 'int' and 'long unsigned int' [-Werror=sign-compare]
>   110 |         if ((scale + bias) != UINTVAL (operands[2]))
>
> --
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
Kito, thanks: you were a few minutes ahead of my fix there.

On Fri, 17 Jun 2022 at 16:00, Kito Cheng <kito.cheng@gmail.com> wrote:
> Hi Andreas:
>
> Fixed via
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d6b423882a05d7b4f40ae1e9d942c9c4c13761b7,
> thanks!
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 0ab9ffe3c0b..6c1ccc6f8c5 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -79,6 +79,50 @@ (define_insn "*shNadduw"
   [(set_attr "type" "bitmanip")
    (set_attr "mode" "DI")])
 
+;; During combine, we may encounter an attempt to combine
+;;   slli rtmp, rs, #imm
+;;   zext.w rtmp, rtmp
+;;   sh[123]add rd, rtmp, rs2
+;; which will lead to the immediate not satisfying the above constraints.
+;; By splitting the compound expression, we can simplify to a slli and a
+;; sh[123]add.uw.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+        (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+                                    (match_operand:QI 2 "immediate_operand"))
+                         (match_operand:DI 3 "consecutive_bits_operand"))
+                 (match_operand:DI 4 "register_operand")))
+   (clobber (match_operand:DI 5 "register_operand"))]
+  "TARGET_64BIT && TARGET_ZBA"
+  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
+   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
+                                                  (match_dup 7))
+                                       (match_dup 8))
+                               (match_dup 4)))]
+{
+  unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+  /* scale: shift within the sh[123]add.uw */
+  int scale = 32 - clz_hwi (mask);
+  /* bias:  pre-scale amount (i.e. the prior shift amount) */
+  int bias = ctz_hwi (mask) - scale;
+
+  /* If the bias + scale don't add up to operand[2], reject. */
+  if ((scale + bias) != UINTVAL (operands[2]))
+    FAIL;
+
+  /* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
+  if ((scale < 1) || (scale > 3))
+    FAIL;
+
+  /* If there's no bias, the '*shNadduw' pattern should have matched. */
+  if (bias == 0)
+    FAIL;
+
+  operands[6] = GEN_INT (bias);
+  operands[7] = GEN_INT (scale);
+  operands[8] = GEN_INT (0xffffffffULL << scale);
+})
+
 (define_insn "*add.uw"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (plus:DI (zero_extend:DI
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
new file mode 100644
index 00000000000..33da2530f3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
+
+unsigned long foo(unsigned int a, unsigned long b)
+{
+  a = a << 1;
+  unsigned long c = (unsigned long) a;
+  unsigned long d = b + (c<<2);
+  return d;
+}
+
+/* { dg-final { scan-assembler "sh2add.uw" } } */
+/* { dg-final { scan-assembler-not "zext" } } */
\ No newline at end of file
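As an illustration of what the new test expects (this is not part of the patch), the following minimal C model checks that an slli followed by sh2add.uw computes the same value as `foo` in zba-shadd.c. The helper names `slli`, `sh2add_uw`, `foo_reference` and `foo_split` are hypothetical, and the fixed-width types are an assumption standing in for the LP64 `unsigned int` and `unsigned long`.

```c
#include <stdint.h>

/* Model of Zba's sh2add.uw: rd = rs2 + (zext (rs1[31:0]) << 2).  */
uint64_t
sh2add_uw (uint64_t rs1, uint64_t rs2)
{
  return rs2 + ((uint64_t) (uint32_t) rs1 << 2);
}

/* Model of a 64-bit slli; shamt is assumed to be below 64.  */
uint64_t
slli (uint64_t rs, unsigned shamt)
{
  return rs << shamt;
}

/* The body of foo() from zba-shadd.c, rewritten with fixed-width types.  */
uint64_t
foo_reference (uint32_t a, uint64_t b)
{
  a = a << 1;                   /* 32-bit shift, upper bit is dropped.  */
  uint64_t c = (uint64_t) a;    /* zero-extension to 64 bits.  */
  return b + (c << 2);
}

/* The two-instruction sequence the splitter aims for.  a_reg models the
   full 64-bit register holding 'a'; its upper 32 bits do not matter
   because sh2add.uw only reads the low 32 bits of its first operand.  */
uint64_t
foo_split (uint64_t a_reg, uint64_t b)
{
  return sh2add_uw (slli (a_reg, 1), b);
}
```

Calling `foo_split` with arbitrary upper register bits, e.g. `foo_split (0xffffffff80000001ULL, 5)`, gives the same result as `foo_reference (0x80000001u, 5)`, which is why no separate zext.w is needed in the expected assembly.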
When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.

E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
  Trying 7, 8 -> 10:
      7: r78:SI=r81:DI#0<<0x1
        REG_DEAD r81:DI
      8: r79:DI=zero_extend(r78:SI)
        REG_DEAD r78:SI
     10: r80:DI=r79:DI<<0x2+r82:DI
        REG_DEAD r79:DI
        REG_DEAD r82:DI
  Failed to match this instruction:
  (set (reg:DI 80 [ cD.1491 ])
      (plus:DI (and:DI (ashift:DI (reg:DI 81)
                  (const_int 3 [0x3]))
              (const_int 17179869176 [0x3fffffff8]))
          (reg:DI 82)))

To address this, we introduce a splitter handling these cases.

gcc/ChangeLog:

	* config/riscv/bitmanip.md: Add split to handle opportunities
	for slli + sh[123]add.uw

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zba-shadd.c: New test.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-developed-by: Manolis Tsamis <manolis.tsamis@vrull.eu>

---

 gcc/config/riscv/bitmanip.md               | 44 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++++++
 2 files changed, 57 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c
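The arithmetic behind the mask in the example above can be sketched in standalone C. This is only an illustration under stated assumptions: `decompose_shadd_uw` is a hypothetical helper, the GCC/Clang builtins `__builtin_clzll` and `__builtin_ctzll` stand in for GCC's internal `clz_hwi` and `ctz_hwi` (assuming a 64-bit HOST_WIDE_INT), and the sketch rejects any non-positive bias, whereas the splitter as posted only tests `bias == 0`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: given the combined mask and shift amount seen by
   combine, recover the prescale (bias), the sh[123]add.uw shift (scale)
   and the mask used in the second instruction of the split.  */
bool
decompose_shadd_uw (uint64_t mask, unsigned shift,
                    int *bias, int *scale, uint64_t *new_mask)
{
  if (mask == 0)
    return false;               /* The builtins are undefined for zero.  */

  /* scale: how far the 32-bit field's top bit sits above bit 31.  */
  int s = 32 - __builtin_clzll (mask);
  /* bias: what remains of the low zero bits once scale is removed.  */
  int b = __builtin_ctzll (mask) - s;

  if (s + b != (int) shift)     /* bias + scale must equal the shift.  */
    return false;
  if (s < 1 || s > 3)           /* sh[123]add.uw only shifts by 1..3.  */
    return false;
  if (b <= 0)                   /* Assumption: also reject negative bias.  */
    return false;

  *bias = b;
  *scale = s;
  *new_mask = 0xffffffffULL << s;
  return true;
}
```

For the mask 0x3fffffff8 and shift 3 from the combine dump, this yields a bias of 1, a scale of 2 and a mask of 0x3fffffffc, matching "a prescale of 1 and a shift of 2". A mask of 0x3fffffffc with shift 2 is rejected because the bias is zero, which is the case the existing '*shNadduw' pattern is expected to handle.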