Message ID | 65719e53-d438-45db-b0cb-2829ba6ac7e2@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [PATCH-1,rs6000] Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] | expand |
Hi, Gently ping the series of patches. [PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.html [PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650218.html [PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650219.html [PATCH-4, rs6000] Optimize single cc bit reverse implementation https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650220.html [PATCH-5, rs6000] Replace explicit CC bit reverse with common format https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html [PATCH-6, rs6000] Split setcc to two insns after reload https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html Thanks Gui Haochen 在 2024/4/30 15:18, HAO CHEN GUI 写道: > Hi, > It's the first patch of a series of patches optimizing CC modes on > rs6000. > > bcd insns set all four bits of a CR field. But it has different single > bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used > to indict overflow or invalid number. It's not a bit for unordered test. > So the "le" test should be reversed to "gt" not "ungt". The "ge" test > should be reversed to "lt" not "unlt". That's the root cause of PR100736 > and PR114732. > > This patch fixes the issue by adding a new type of CC mode - CCBCD for > all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will > be merged to a uniform pattern which is for all CC modes in sequential > patch. > > The rtl code "unordered" is still used for testing overflow or > invalid number. IMHO, the "unordered" on a CC mode can be considered as > testing the forth bit of a CR field setting or not. The "eq" on a CC mode > can be considered as testing the third bit setting or not. Thus we avoid > creating lots of unspecs for the CR bit testing. > > Bootstrapped and tested on powerpc64-linux BE and LE with no > regressions. Is it OK for the trunk? > > Thanks > Gui Haochen > > > ChangeLog > rs6000: Add a new type of CC mode - CCBCD for bcd insns > > gcc/ > PR target/100736 > PR target/114732 > * config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP > with CCBCD. > (*bcd<bcd_add_sub>_test_<mode>): Likewise. > (*bcd<bcd_add_sub>_test2_<mode>): Likewise. > (bcd<bcd_add_sub>_<code>_<mode>): Likewise. > (*bcdinvalid_<mode>): Likewise. > (bcdinvalid_<mode>): Likewise. > (bcdshift_v16qi): Likewise. > (bcdmul10_v16qi): Likewise. > (bcddiv10_v16qi): Likewise. > (peephole for bcd_add/sub): Likewise. > * config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD > and its supported comparison codes. > * config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD. > * config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD > assertion. > * config/rs6000/rs6000.md (CC_any): Add CCBCD. > (ccbcd_rev): New code iterator. > (*<code><mode>_cc): New insn and split pattern for CCBCD reverse > compare. > > gcc/testsuite/ > PR target/100736 > PR target/114732 > * gcc.target/powerpc/pr100736.c: New. > * gcc.target/powerpc/pr114732.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md > index bb20441c096..9fa8cf89f61 100644 > --- a/gcc/config/rs6000/altivec.md > +++ b/gcc/config/rs6000/altivec.md > @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcd<bcd_add_sub>. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > ;; probably should be one that can go in the VMX (Altivec) registers, so we > ;; can't use DDmode or DFmode. > (define_insn "*bcd<bcd_add_sub>_test_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v") > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4566,8 +4566,8 @@ (define_insn "vclrrb" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand") > (match_operand:VBCD 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > (match_dup 4))) > (clobber (match_scratch:VBCD 5))]) > (set (match_operand:SI 0 "register_operand") > - (BCD_TEST:SI (reg:CCFP CR6_REGNO) > + (BCD_TEST:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > }) > > (define_insn "*bcdinvalid_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")] > UNSPEC_BCDSUB) > (match_operand:V2DF 2 "zero_constant" "j"))) > @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcdinvalid_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand")] > UNSPEC_BCDSUB) > (match_dup 2))) > (clobber (match_scratch:VBCD 3))]) > (set (match_operand:SI 0 "register_operand") > - (unordered:SI (reg:CCFP CR6_REGNO) > + (unordered:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi" > (match_operand:V16QI 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcds. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4662,9 +4662,9 @@ (define_peephole2 > (match_operand:V1TI 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))]) > - (parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (clobber (reg:CCBCD CR6_REGNO))]) > + (parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4677,8 +4677,8 @@ (define_peephole2 > (match_dup 2) > (match_dup 3)] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md > index d23ce9a77a3..18198add744 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator" > (if_then_else (match_test "flag_finite_math_only") > (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered")) > - (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")) > + (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode") > + (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > + (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))) > (match_test "validate_condition_mode (GET_CODE (op), > GET_MODE (XEXP (op, 0))), > 1"))) > diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def > index 094b246c834..3e2e6dfb4ff 100644 > --- a/gcc/config/rs6000/rs6000-modes.def > +++ b/gcc/config/rs6000/rs6000-modes.def > @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format); > > CC_MODE (CCUNS); > CC_MODE (CCFP); > +CC_MODE (CCBCD); /* Used for bcd insns */ > CC_MODE (CCEQ); > > /* Vector modes. */ > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 6ba9df4f02e..4068cd8b929 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode) > gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU) > || mode == CCUNSmode); > > + gcc_assert (mode == CCFPmode || mode == CCBCDmode > + || (code != ORDERED && code != UNORDERED)); > + > gcc_assert (mode == CCFPmode > - || (code != ORDERED && code != UNORDERED > - && code != UNEQ && code != LTGT > + || (code != UNEQ && code != LTGT > && code != UNGT && code != UNLT > && code != UNGE && code != UNLE)); > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index bc8bc6ab060..9b5fcdc8db0 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -8115,7 +8115,7 @@ (define_expand "movcc" > "" > "") > > -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP]) > +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD]) > > (define_insn "*movcc_<mode>" > [(set (match_operand:CC_any 0 "nonimmediate_operand" > @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>" > > (define_code_iterator fp_rev [ordered ne unle unge]) > (define_code_iterator fp_two [ltgt le ge unlt ungt uneq]) > +(define_code_iterator ccbcd_rev [ordered ne le ge]) > > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc" > } > [(set_attr "length" "12")]) > > +(define_insn_and_split "*<code><mode>_cc" > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > + (ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y") > + (const_int 0)))] > + "" > + "#" > + "&& 1" > + [(pc)] > +{ > + rtx_code revcode = reverse_condition (<CODE>); > + rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx); > + rtx tmp = gen_reg_rtx (<MODE>mode); > + emit_move_insn (tmp, eq); > + emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx)); > + DONE; > +} > + [(set_attr "length" "12")]) > + > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c > new file mode 100644 > index 00000000000..85e3ae56d55 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > + > +/* Verify there is no ICE with finite-math-only */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0); > +} > + > + > diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c > new file mode 100644 > index 00000000000..d0b878780c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > + > +/* Verify only one cr bit need to be tested. */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0) != 1; > +} > + > +/* { dg-final { scan-assembler-not "cror" } } */
Hi, Gently ping them. Thanks Gui Haochen 在 2024/5/13 9:56, HAO CHEN GUI 写道: > Hi, > Gently ping the series of patches. > [PATCH-1, rs6000]Add a new type of CC mode - CCBCD for bcd insns [PR100736, PR114732] > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650217.html > [PATCH-2, rs6000] Add a new type of CC mode - CCLTEQ > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650218.html > [PATCH-3, rs6000] Set CC mode of vector string isolate insns to CCEQ > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650219.html > [PATCH-4, rs6000] Optimize single cc bit reverse implementation > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/650220.html > [PATCH-5, rs6000] Replace explicit CC bit reverse with common format > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650766.html > [PATCH-6, rs6000] Split setcc to two insns after reload > https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650856.html > > Thanks > Gui Haochen > > 在 2024/4/30 15:18, HAO CHEN GUI 写道: >> Hi, >> It's the first patch of a series of patches optimizing CC modes on >> rs6000. >> >> bcd insns set all four bits of a CR field. But it has different single >> bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used >> to indict overflow or invalid number. It's not a bit for unordered test. >> So the "le" test should be reversed to "gt" not "ungt". The "ge" test >> should be reversed to "lt" not "unlt". That's the root cause of PR100736 >> and PR114732. >> >> This patch fixes the issue by adding a new type of CC mode - CCBCD for >> all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will >> be merged to a uniform pattern which is for all CC modes in sequential >> patch. >> >> The rtl code "unordered" is still used for testing overflow or >> invalid number. IMHO, the "unordered" on a CC mode can be considered as >> testing the forth bit of a CR field setting or not. The "eq" on a CC mode >> can be considered as testing the third bit setting or not. Thus we avoid >> creating lots of unspecs for the CR bit testing. >> >> Bootstrapped and tested on powerpc64-linux BE and LE with no >> regressions. Is it OK for the trunk? >> >> Thanks >> Gui Haochen >> >> >> ChangeLog >> rs6000: Add a new type of CC mode - CCBCD for bcd insns >> >> gcc/ >> PR target/100736 >> PR target/114732 >> * config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP >> with CCBCD. >> (*bcd<bcd_add_sub>_test_<mode>): Likewise. >> (*bcd<bcd_add_sub>_test2_<mode>): Likewise. >> (bcd<bcd_add_sub>_<code>_<mode>): Likewise. >> (*bcdinvalid_<mode>): Likewise. >> (bcdinvalid_<mode>): Likewise. >> (bcdshift_v16qi): Likewise. >> (bcdmul10_v16qi): Likewise. >> (bcddiv10_v16qi): Likewise. >> (peephole for bcd_add/sub): Likewise. >> * config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD >> and its supported comparison codes. >> * config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD. >> * config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD >> assertion. >> * config/rs6000/rs6000.md (CC_any): Add CCBCD. >> (ccbcd_rev): New code iterator. >> (*<code><mode>_cc): New insn and split pattern for CCBCD reverse >> compare. >> >> gcc/testsuite/ >> PR target/100736 >> PR target/114732 >> * gcc.target/powerpc/pr100736.c: New. >> * gcc.target/powerpc/pr114732.c: New. >> >> patch.diff >> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md >> index bb20441c096..9fa8cf89f61 100644 >> --- a/gcc/config/rs6000/altivec.md >> +++ b/gcc/config/rs6000/altivec.md >> @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>" >> (match_operand:VBCD 2 "register_operand" "v") >> (match_operand:QI 3 "const_0_to_1_operand" "n")] >> UNSPEC_BCD_ADD_SUB)) >> - (clobber (reg:CCFP CR6_REGNO))] >> + (clobber (reg:CCBCD CR6_REGNO))] >> "TARGET_P8_VECTOR" >> "bcd<bcd_add_sub>. %0,%1,%2,%3" >> [(set_attr "type" "vecsimple")]) >> @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>" >> ;; probably should be one that can go in the VMX (Altivec) registers, so we >> ;; can't use DDmode or DFmode. >> (define_insn "*bcd<bcd_add_sub>_test_<mode>" >> - [(set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + [(set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v") >> (match_operand:VBCD 2 "register_operand" "v") >> (match_operand:QI 3 "const_0_to_1_operand" "i")] >> @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>" >> (match_operand:VBCD 2 "register_operand" "v") >> (match_operand:QI 3 "const_0_to_1_operand" "i")] >> UNSPEC_BCD_ADD_SUB)) >> - (set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + (set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_dup 1) >> (match_dup 2) >> (match_dup 3)] >> @@ -4566,8 +4566,8 @@ (define_insn "vclrrb" >> [(set_attr "type" "vecsimple")]) >> >> (define_expand "bcd<bcd_add_sub>_<code>_<mode>" >> - [(parallel [(set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + [(parallel [(set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_operand:VBCD 1 "register_operand") >> (match_operand:VBCD 2 "register_operand") >> (match_operand:QI 3 "const_0_to_1_operand")] >> @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" >> (match_dup 4))) >> (clobber (match_scratch:VBCD 5))]) >> (set (match_operand:SI 0 "register_operand") >> - (BCD_TEST:SI (reg:CCFP CR6_REGNO) >> + (BCD_TEST:SI (reg:CCBCD CR6_REGNO) >> (const_int 0)))] >> "TARGET_P8_VECTOR" >> { >> @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" >> }) >> >> (define_insn "*bcdinvalid_<mode>" >> - [(set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + [(set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")] >> UNSPEC_BCDSUB) >> (match_operand:V2DF 2 "zero_constant" "j"))) >> @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>" >> [(set_attr "type" "vecsimple")]) >> >> (define_expand "bcdinvalid_<mode>" >> - [(parallel [(set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + [(parallel [(set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_operand:VBCD 1 "register_operand")] >> UNSPEC_BCDSUB) >> (match_dup 2))) >> (clobber (match_scratch:VBCD 3))]) >> (set (match_operand:SI 0 "register_operand") >> - (unordered:SI (reg:CCFP CR6_REGNO) >> + (unordered:SI (reg:CCBCD CR6_REGNO) >> (const_int 0)))] >> "TARGET_P8_VECTOR" >> { >> @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi" >> (match_operand:V16QI 2 "register_operand" "v") >> (match_operand:QI 3 "const_0_to_1_operand" "n")] >> UNSPEC_BCDSHIFT)) >> - (clobber (reg:CCFP CR6_REGNO))] >> + (clobber (reg:CCBCD CR6_REGNO))] >> "TARGET_P8_VECTOR" >> "bcds. %0,%1,%2,%3" >> [(set_attr "type" "vecsimple")]) >> @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi" >> [(set (match_operand:V16QI 0 "register_operand") >> (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] >> UNSPEC_BCDSHIFT)) >> - (clobber (reg:CCFP CR6_REGNO))] >> + (clobber (reg:CCBCD CR6_REGNO))] >> "TARGET_P9_VECTOR" >> { >> rtx one = gen_reg_rtx (V16QImode); >> @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi" >> [(set (match_operand:V16QI 0 "register_operand") >> (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] >> UNSPEC_BCDSHIFT)) >> - (clobber (reg:CCFP CR6_REGNO))] >> + (clobber (reg:CCBCD CR6_REGNO))] >> "TARGET_P9_VECTOR" >> { >> rtx one = gen_reg_rtx (V16QImode); >> @@ -4662,9 +4662,9 @@ (define_peephole2 >> (match_operand:V1TI 2 "register_operand") >> (match_operand:QI 3 "const_0_to_1_operand")] >> UNSPEC_BCD_ADD_SUB)) >> - (clobber (reg:CCFP CR6_REGNO))]) >> - (parallel [(set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + (clobber (reg:CCBCD CR6_REGNO))]) >> + (parallel [(set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_dup 1) >> (match_dup 2) >> (match_dup 3)] >> @@ -4677,8 +4677,8 @@ (define_peephole2 >> (match_dup 2) >> (match_dup 3)] >> UNSPEC_BCD_ADD_SUB)) >> - (set (reg:CCFP CR6_REGNO) >> - (compare:CCFP >> + (set (reg:CCBCD CR6_REGNO) >> + (compare:CCBCD >> (unspec:V2DF [(match_dup 1) >> (match_dup 2) >> (match_dup 3)] >> diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md >> index d23ce9a77a3..18198add744 100644 >> --- a/gcc/config/rs6000/predicates.md >> +++ b/gcc/config/rs6000/predicates.md >> @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator" >> (if_then_else (match_test "flag_finite_math_only") >> (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") >> (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered")) >> - (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")) >> + (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode") >> + (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") >> + (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))) >> (match_test "validate_condition_mode (GET_CODE (op), >> GET_MODE (XEXP (op, 0))), >> 1"))) >> diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def >> index 094b246c834..3e2e6dfb4ff 100644 >> --- a/gcc/config/rs6000/rs6000-modes.def >> +++ b/gcc/config/rs6000/rs6000-modes.def >> @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format); >> >> CC_MODE (CCUNS); >> CC_MODE (CCFP); >> +CC_MODE (CCBCD); /* Used for bcd insns */ >> CC_MODE (CCEQ); >> >> /* Vector modes. */ >> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc >> index 6ba9df4f02e..4068cd8b929 100644 >> --- a/gcc/config/rs6000/rs6000.cc >> +++ b/gcc/config/rs6000/rs6000.cc >> @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode) >> gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU) >> || mode == CCUNSmode); >> >> + gcc_assert (mode == CCFPmode || mode == CCBCDmode >> + || (code != ORDERED && code != UNORDERED)); >> + >> gcc_assert (mode == CCFPmode >> - || (code != ORDERED && code != UNORDERED >> - && code != UNEQ && code != LTGT >> + || (code != UNEQ && code != LTGT >> && code != UNGT && code != UNLT >> && code != UNGE && code != UNLE)); >> >> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >> index bc8bc6ab060..9b5fcdc8db0 100644 >> --- a/gcc/config/rs6000/rs6000.md >> +++ b/gcc/config/rs6000/rs6000.md >> @@ -8115,7 +8115,7 @@ (define_expand "movcc" >> "" >> "") >> >> -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP]) >> +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD]) >> >> (define_insn "*movcc_<mode>" >> [(set (match_operand:CC_any 0 "nonimmediate_operand" >> @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>" >> >> (define_code_iterator fp_rev [ordered ne unle unge]) >> (define_code_iterator fp_two [ltgt le ge unlt ungt uneq]) >> +(define_code_iterator ccbcd_rev [ordered ne le ge]) >> >> (define_insn_and_split "*<code><mode>_cc" >> [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") >> @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc" >> } >> [(set_attr "length" "12")]) >> >> +(define_insn_and_split "*<code><mode>_cc" >> + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") >> + (ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y") >> + (const_int 0)))] >> + "" >> + "#" >> + "&& 1" >> + [(pc)] >> +{ >> + rtx_code revcode = reverse_condition (<CODE>); >> + rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx); >> + rtx tmp = gen_reg_rtx (<MODE>mode); >> + emit_move_insn (tmp, eq); >> + emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx)); >> + DONE; >> +} >> + [(set_attr "length" "12")]) >> + >> (define_insn_and_split "*<code><mode>_cc" >> [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") >> (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y") >> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c >> new file mode 100644 >> index 00000000000..85e3ae56d55 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c >> @@ -0,0 +1,12 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */ >> +/* { dg-require-effective-target powerpc_vsx_ok } */ >> + >> +/* Verify there is no ICE with finite-math-only */ >> + >> +int foo (vector unsigned char a, vector unsigned char b) >> +{ >> + return __builtin_vec_bcdsub_ge (a, b, 0); >> +} >> + >> + >> diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c >> new file mode 100644 >> index 00000000000..d0b878780c6 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c >> @@ -0,0 +1,12 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ >> +/* { dg-require-effective-target powerpc_vsx_ok } */ >> + >> +/* Verify only one cr bit need to be tested. */ >> + >> +int foo (vector unsigned char a, vector unsigned char b) >> +{ >> + return __builtin_vec_bcdsub_ge (a, b, 0) != 1; >> +} >> + >> +/* { dg-final { scan-assembler-not "cror" } } */
Hi, on 2024/4/30 15:18, HAO CHEN GUI wrote: > Hi, > It's the first patch of a series of patches optimizing CC modes on > rs6000. > > bcd insns set all four bits of a CR field. But it has different single > bit reverse behavior than CCFP's. The forth bit of bcd cr fields is used > to indict overflow or invalid number. It's not a bit for unordered test. > So the "le" test should be reversed to "gt" not "ungt". The "ge" test > should be reversed to "lt" not "unlt". That's the root cause of PR100736 > and PR114732. > > This patch fixes the issue by adding a new type of CC mode - CCBCD for > all bcd insns. Here a new setcc_rev pattern is added for ccbcd. It will > be merged to a uniform pattern which is for all CC modes in sequential > patch. Thanks for doing this, adding one more CCmode for BCD specific looks reasonable and make code more clear. > > The rtl code "unordered" is still used for testing overflow or > invalid number. IMHO, the "unordered" on a CC mode can be considered as > testing the forth bit of a CR field setting or not. The "eq" on a CC mode > can be considered as testing the third bit setting or not. Thus we avoid > creating lots of unspecs for the CR bit testing. I can understand re-using "unordered" and "eq" will save some efforts than doing with unspecs, but they are actually RTL codes instead of bits on the specific hardware CR, a downside is that people who isn't aware of this design point can have some misunderstanding when reading/checking the code or dumping, from this perspective unspecs (with reasonable name) can be more meaningful. Normally adopting RTL code is better since they have the chance to be considered (optimized) in generic pass/code, but it isn't the case here as we just use the code itself but not be with the same semantic (meaning). Looking forward to others' opinions on this, if we want to adopt "unordered" and "eq" like what this patch does, I think we should at least emphasize such points in rs6000-modes.def. > > Bootstrapped and tested on powerpc64-linux BE and LE with no > regressions. Is it OK for the trunk? Some minor comments are inlined, Segher did a lot of work on CC, I'm looking forward to his review on this patch series. :) > > Thanks > Gui Haochen > > > ChangeLog > rs6000: Add a new type of CC mode - CCBCD for bcd insns > > gcc/ > PR target/100736 > PR target/114732 > * config/rs6000/altivec.md (bcd<bcd_add_sub>_<mode>): Replace CCFP > with CCBCD. > (*bcd<bcd_add_sub>_test_<mode>): Likewise. > (*bcd<bcd_add_sub>_test2_<mode>): Likewise. > (bcd<bcd_add_sub>_<code>_<mode>): Likewise. > (*bcdinvalid_<mode>): Likewise. > (bcdinvalid_<mode>): Likewise. > (bcdshift_v16qi): Likewise. > (bcdmul10_v16qi): Likewise. > (bcddiv10_v16qi): Likewise. > (peephole for bcd_add/sub): Likewise. > * config/rs6000/predicates.md (branch_comparison_operator): Add CCBCD > and its supported comparison codes. > * config/rs6000/rs6000-modes.def (CC_MODE): Add CCBCD. > * config/rs6000/rs6000.cc (validate_condition_mode): Add CCBCD > assertion. > * config/rs6000/rs6000.md (CC_any): Add CCBCD. > (ccbcd_rev): New code iterator. > (*<code><mode>_cc): New insn and split pattern for CCBCD reverse > compare. > > gcc/testsuite/ > PR target/100736 > PR target/114732 > * gcc.target/powerpc/pr100736.c: New. > * gcc.target/powerpc/pr114732.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md > index bb20441c096..9fa8cf89f61 100644 > --- a/gcc/config/rs6000/altivec.md > +++ b/gcc/config/rs6000/altivec.md > @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcd<bcd_add_sub>. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>" > ;; probably should be one that can go in the VMX (Altivec) registers, so we > ;; can't use DDmode or DFmode. Here is a paragraph of comments above: ;; Use a floating point type (V2DFmode) for the compare to set CR6 so that we ;; can use the unordered test for BCD nans and add/subtracts that overflow. An ;; UNORDERED test on an integer type (like V1TImode) is not defined. The type ;; probably should be one that can go in the VMX (Altivec) registers, so we ;; can't use DDmode or DFmode. Is it still hold? It's not obvious where is the code checking unordered test should be on fp type (modes), if it still takes effect, "unspec" would help to get rid of this restriction. Otherwise, this comment should be updated and we can drop this workaround with V2DF here. > (define_insn "*bcd<bcd_add_sub>_test_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v") > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>" > (match_operand:VBCD 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "i")] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4566,8 +4566,8 @@ (define_insn "vclrrb" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand") > (match_operand:VBCD 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > (match_dup 4))) > (clobber (match_scratch:VBCD 5))]) > (set (match_operand:SI 0 "register_operand") > - (BCD_TEST:SI (reg:CCFP CR6_REGNO) > + (BCD_TEST:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" > }) > > (define_insn "*bcdinvalid_<mode>" > - [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")] > UNSPEC_BCDSUB) > (match_operand:V2DF 2 "zero_constant" "j"))) > @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>" > [(set_attr "type" "vecsimple")]) > > (define_expand "bcdinvalid_<mode>" > - [(parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + [(parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_operand:VBCD 1 "register_operand")] > UNSPEC_BCDSUB) > (match_dup 2))) > (clobber (match_scratch:VBCD 3))]) > (set (match_operand:SI 0 "register_operand") > - (unordered:SI (reg:CCFP CR6_REGNO) > + (unordered:SI (reg:CCBCD CR6_REGNO) > (const_int 0)))] > "TARGET_P8_VECTOR" > { > @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi" > (match_operand:V16QI 2 "register_operand" "v") > (match_operand:QI 3 "const_0_to_1_operand" "n")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P8_VECTOR" > "bcds. %0,%1,%2,%3" > [(set_attr "type" "vecsimple")]) > @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi" > [(set (match_operand:V16QI 0 "register_operand") > (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] > UNSPEC_BCDSHIFT)) > - (clobber (reg:CCFP CR6_REGNO))] > + (clobber (reg:CCBCD CR6_REGNO))] > "TARGET_P9_VECTOR" > { > rtx one = gen_reg_rtx (V16QImode); > @@ -4662,9 +4662,9 @@ (define_peephole2 > (match_operand:V1TI 2 "register_operand") > (match_operand:QI 3 "const_0_to_1_operand")] > UNSPEC_BCD_ADD_SUB)) > - (clobber (reg:CCFP CR6_REGNO))]) > - (parallel [(set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (clobber (reg:CCBCD CR6_REGNO))]) > + (parallel [(set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > @@ -4677,8 +4677,8 @@ (define_peephole2 > (match_dup 2) > (match_dup 3)] > UNSPEC_BCD_ADD_SUB)) > - (set (reg:CCFP CR6_REGNO) > - (compare:CCFP > + (set (reg:CCBCD CR6_REGNO) > + (compare:CCBCD > (unspec:V2DF [(match_dup 1) > (match_dup 2) > (match_dup 3)] > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md > index d23ce9a77a3..18198add744 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator" > (if_then_else (match_test "flag_finite_math_only") > (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered")) > - (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")) > + (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode") > + (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") > + (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))) > (match_test "validate_condition_mode (GET_CODE (op), > GET_MODE (XEXP (op, 0))), > 1"))) > diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def > index 094b246c834..3e2e6dfb4ff 100644 > --- a/gcc/config/rs6000/rs6000-modes.def > +++ b/gcc/config/rs6000/rs6000-modes.def > @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format); > > CC_MODE (CCUNS); > CC_MODE (CCFP); > +CC_MODE (CCBCD); /* Used for bcd insns */ > CC_MODE (CCEQ); > > /* Vector modes. */ > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 6ba9df4f02e..4068cd8b929 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode) > gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU) > || mode == CCUNSmode); > > + gcc_assert (mode == CCFPmode || mode == CCBCDmode > + || (code != ORDERED && code != UNORDERED)); > + An example for the above concern, if people read this code without knowing the design point above, one question would come up like how can CCBCDmode have {UN,}ORDERED available, shouldn't it only have overflow/invalid. > gcc_assert (mode == CCFPmode > - || (code != ORDERED && code != UNORDERED > - && code != UNEQ && code != LTGT > + || (code != UNEQ && code != LTGT > && code != UNGT && code != UNLT > && code != UNGE && code != UNLE)); > > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index bc8bc6ab060..9b5fcdc8db0 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -8115,7 +8115,7 @@ (define_expand "movcc" > "" > "") > > -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP]) > +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD]) > > (define_insn "*movcc_<mode>" > [(set (match_operand:CC_any 0 "nonimmediate_operand" > @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>" > > (define_code_iterator fp_rev [ordered ne unle unge]) > (define_code_iterator fp_two [ltgt le ge unlt ungt uneq]) > +(define_code_iterator ccbcd_rev [ordered ne le ge]) Nit: As "fp_rev" uses "fp", s/ccbcd/bcd/ > > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc" > } > [(set_attr "length" "12")]) > > +(define_insn_and_split "*<code><mode>_cc" > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > + (ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y") > + (const_int 0)))] > + "" > + "#" > + "&& 1" > + [(pc)] > +{ > + rtx_code revcode = reverse_condition (<CODE>); > + rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx); > + rtx tmp = gen_reg_rtx (<MODE>mode); > + emit_move_insn (tmp, eq); > + emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx)); > + DONE; > +} > + [(set_attr "length" "12")]) > + > (define_insn_and_split "*<code><mode>_cc" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") > (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c > new file mode 100644 > index 00000000000..85e3ae56d55 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ Nit: s/powerpc_vsx_ok/powerpc_vsx/ > + > +/* Verify there is no ICE with finite-math-only */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0); > +} > + > + > diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c > new file mode 100644 > index 00000000000..d0b878780c6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ Ditto. BR, Kewen > + > +/* Verify only one cr bit need to be tested. */ > + > +int foo (vector unsigned char a, vector unsigned char b) > +{ > + return __builtin_vec_bcdsub_ge (a, b, 0) != 1; > +} > + > +/* { dg-final { scan-assembler-not "cror" } } */
Hi Kewen, 在 2024/5/29 13:26, Kewen.Lin 写道: > I can understand re-using "unordered" and "eq" will save some efforts than > doing with unspecs, but they are actually RTL codes instead of bits on the > specific hardware CR, a downside is that people who isn't aware of this > design point can have some misunderstanding when reading/checking the code > or dumping, from this perspective unspecs (with reasonable name) can be > more meaningful. Normally adopting RTL code is better since they have the > chance to be considered (optimized) in generic pass/code, but it isn't the > case here as we just use the code itself but not be with the same semantic > (meaning). Looking forward to others' opinions on this, if we want to adopt > "unordered" and "eq" like what this patch does, I think we should at least > emphasize such points in rs6000-modes.def. Thanks so much for your comments. IMHO, the core is if we can re-define "unordered" or "eq" for certain CC mode on a specific target. If we can't or it's unsafe, we have to use the unspecs. In this case, I just want to define the code "unordered" on CCBCD as testing if the bit 3 is set on this CR field. Actually rs6000 already use "lt" code to test if bit 0 is set for vector compare instructions. The following expand is an example. (define_expand "vector_ae_<mode>_p" [(parallel [(set (reg:CC CR6_REGNO) (unspec:CC [(ne:CC (match_operand:VI 1 "vlogical_operand") (match_operand:VI 2 "vlogical_operand"))] UNSPEC_PREDICATE)) (set (match_dup 3) (ne:VI (match_dup 1) (match_dup 2)))]) (set (match_operand:SI 0 "register_operand" "=r") (lt:SI (reg:CC CR6_REGNO) (const_int 0))) (set (match_dup 0) (xor:SI (match_dup 0) (const_int 1)))] I think the "lt" on CC just doesn't mean it compares if CC value is less than an integer. It just tests the "lt" bit (bit 0) is set or not on this CC. Looking forward to your and Segher's further invaluable comments. Thanks Gui Haochen
Hi Haochen, on 2024/5/30 11:14, HAO CHEN GUI wrote: > Hi Kewen, > > 在 2024/5/29 13:26, Kewen.Lin 写道: >> I can understand re-using "unordered" and "eq" will save some efforts than >> doing with unspecs, but they are actually RTL codes instead of bits on the >> specific hardware CR, a downside is that people who isn't aware of this >> design point can have some misunderstanding when reading/checking the code >> or dumping, from this perspective unspecs (with reasonable name) can be >> more meaningful. Normally adopting RTL code is better since they have the >> chance to be considered (optimized) in generic pass/code, but it isn't the >> case here as we just use the code itself but not be with the same semantic >> (meaning). Looking forward to others' opinions on this, if we want to adopt >> "unordered" and "eq" like what this patch does, I think we should at least >> emphasize such points in rs6000-modes.def. > > Thanks so much for your comments. IMHO, the core is if we can re-define > "unordered" or "eq" for certain CC mode on a specific target. If we can't or > it's unsafe, we have to use the unspecs. In this case, I just want to define > the code "unordered" on CCBCD as testing if the bit 3 is set on this CR field. But my point is that "unordered" has its semantic, it looks a bit tricky to adopt it on the result from BCD comparison when which only has "invalid" and "overflow" other than normal ones, though I can understand that this patch wants to use it to test bit 3 since for float comparison bit 3 is for "unordered". However, IMHO it would be more clear to have one unspec to test bit 3 when bit 3 doesn't actually mean unordered result. > Actually rs6000 already use "lt" code to test if bit 0 is set for vector > compare instructions. The following expand is an example. Yeah, but it doesn't mean it's the most sensible way to do this, IMHO it suffers from the similar issue and can be improved as well. > > (define_expand "vector_ae_<mode>_p" > [(parallel > [(set (reg:CC CR6_REGNO) > (unspec:CC [(ne:CC (match_operand:VI 1 "vlogical_operand") > (match_operand:VI 2 "vlogical_operand"))] > UNSPEC_PREDICATE)) > (set (match_dup 3) > (ne:VI (match_dup 1) > (match_dup 2)))]) > (set (match_operand:SI 0 "register_operand" "=r") > (lt:SI (reg:CC CR6_REGNO) > (const_int 0))) > (set (match_dup 0) > (xor:SI (match_dup 0) > (const_int 1)))] > > I think the "lt" on CC just doesn't mean it compares if CC value is less than an > integer. It just tests the "lt" bit (bit 0) is set or not on this CC. But bit 0 doesn't mean for "lt" comparison result in this context any more. BR, Kewen > > Looking forward to your and Segher's further invaluable comments. > > Thanks > Gui Haochen
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md index bb20441c096..9fa8cf89f61 100644 --- a/gcc/config/rs6000/altivec.md +++ b/gcc/config/rs6000/altivec.md @@ -4443,7 +4443,7 @@ (define_insn "bcd<bcd_add_sub>_<mode>" (match_operand:VBCD 2 "register_operand" "v") (match_operand:QI 3 "const_0_to_1_operand" "n")] UNSPEC_BCD_ADD_SUB)) - (clobber (reg:CCFP CR6_REGNO))] + (clobber (reg:CCBCD CR6_REGNO))] "TARGET_P8_VECTOR" "bcd<bcd_add_sub>. %0,%1,%2,%3" [(set_attr "type" "vecsimple")]) @@ -4454,8 +4454,8 @@ (define_insn "bcd<bcd_add_sub>_<mode>" ;; probably should be one that can go in the VMX (Altivec) registers, so we ;; can't use DDmode or DFmode. (define_insn "*bcd<bcd_add_sub>_test_<mode>" - [(set (reg:CCFP CR6_REGNO) - (compare:CCFP + [(set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v") (match_operand:VBCD 2 "register_operand" "v") (match_operand:QI 3 "const_0_to_1_operand" "i")] @@ -4472,8 +4472,8 @@ (define_insn "*bcd<bcd_add_sub>_test2_<mode>" (match_operand:VBCD 2 "register_operand" "v") (match_operand:QI 3 "const_0_to_1_operand" "i")] UNSPEC_BCD_ADD_SUB)) - (set (reg:CCFP CR6_REGNO) - (compare:CCFP + (set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_dup 1) (match_dup 2) (match_dup 3)] @@ -4566,8 +4566,8 @@ (define_insn "vclrrb" [(set_attr "type" "vecsimple")]) (define_expand "bcd<bcd_add_sub>_<code>_<mode>" - [(parallel [(set (reg:CCFP CR6_REGNO) - (compare:CCFP + [(parallel [(set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_operand:VBCD 1 "register_operand") (match_operand:VBCD 2 "register_operand") (match_operand:QI 3 "const_0_to_1_operand")] @@ -4575,7 +4575,7 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" (match_dup 4))) (clobber (match_scratch:VBCD 5))]) (set (match_operand:SI 0 "register_operand") - (BCD_TEST:SI (reg:CCFP CR6_REGNO) + (BCD_TEST:SI (reg:CCBCD CR6_REGNO) (const_int 0)))] "TARGET_P8_VECTOR" { @@ -4583,8 +4583,8 @@ (define_expand "bcd<bcd_add_sub>_<code>_<mode>" }) (define_insn "*bcdinvalid_<mode>" - [(set (reg:CCFP CR6_REGNO) - (compare:CCFP + [(set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_operand:VBCD 1 "register_operand" "v")] UNSPEC_BCDSUB) (match_operand:V2DF 2 "zero_constant" "j"))) @@ -4594,14 +4594,14 @@ (define_insn "*bcdinvalid_<mode>" [(set_attr "type" "vecsimple")]) (define_expand "bcdinvalid_<mode>" - [(parallel [(set (reg:CCFP CR6_REGNO) - (compare:CCFP + [(parallel [(set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_operand:VBCD 1 "register_operand")] UNSPEC_BCDSUB) (match_dup 2))) (clobber (match_scratch:VBCD 3))]) (set (match_operand:SI 0 "register_operand") - (unordered:SI (reg:CCFP CR6_REGNO) + (unordered:SI (reg:CCBCD CR6_REGNO) (const_int 0)))] "TARGET_P8_VECTOR" { @@ -4614,7 +4614,7 @@ (define_insn "bcdshift_v16qi" (match_operand:V16QI 2 "register_operand" "v") (match_operand:QI 3 "const_0_to_1_operand" "n")] UNSPEC_BCDSHIFT)) - (clobber (reg:CCFP CR6_REGNO))] + (clobber (reg:CCBCD CR6_REGNO))] "TARGET_P8_VECTOR" "bcds. %0,%1,%2,%3" [(set_attr "type" "vecsimple")]) @@ -4623,7 +4623,7 @@ (define_expand "bcdmul10_v16qi" [(set (match_operand:V16QI 0 "register_operand") (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] UNSPEC_BCDSHIFT)) - (clobber (reg:CCFP CR6_REGNO))] + (clobber (reg:CCBCD CR6_REGNO))] "TARGET_P9_VECTOR" { rtx one = gen_reg_rtx (V16QImode); @@ -4638,7 +4638,7 @@ (define_expand "bcddiv10_v16qi" [(set (match_operand:V16QI 0 "register_operand") (unspec:V16QI [(match_operand:V16QI 1 "register_operand")] UNSPEC_BCDSHIFT)) - (clobber (reg:CCFP CR6_REGNO))] + (clobber (reg:CCBCD CR6_REGNO))] "TARGET_P9_VECTOR" { rtx one = gen_reg_rtx (V16QImode); @@ -4662,9 +4662,9 @@ (define_peephole2 (match_operand:V1TI 2 "register_operand") (match_operand:QI 3 "const_0_to_1_operand")] UNSPEC_BCD_ADD_SUB)) - (clobber (reg:CCFP CR6_REGNO))]) - (parallel [(set (reg:CCFP CR6_REGNO) - (compare:CCFP + (clobber (reg:CCBCD CR6_REGNO))]) + (parallel [(set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_dup 1) (match_dup 2) (match_dup 3)] @@ -4677,8 +4677,8 @@ (define_peephole2 (match_dup 2) (match_dup 3)] UNSPEC_BCD_ADD_SUB)) - (set (reg:CCFP CR6_REGNO) - (compare:CCFP + (set (reg:CCBCD CR6_REGNO) + (compare:CCBCD (unspec:V2DF [(match_dup 1) (match_dup 2) (match_dup 3)] diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index d23ce9a77a3..18198add744 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1350,7 +1350,9 @@ (define_predicate "branch_comparison_operator" (if_then_else (match_test "flag_finite_math_only") (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") (match_code "lt,gt,eq,unordered,unge,unle,ne,ordered")) - (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne")) + (if_then_else (match_test "GET_MODE (XEXP (op, 0)) == CCBCDmode") + (match_code "lt,le,gt,ge,eq,ne,unordered,ordered") + (match_code "lt,ltu,le,leu,gt,gtu,ge,geu,eq,ne"))) (match_test "validate_condition_mode (GET_CODE (op), GET_MODE (XEXP (op, 0))), 1"))) diff --git a/gcc/config/rs6000/rs6000-modes.def b/gcc/config/rs6000/rs6000-modes.def index 094b246c834..3e2e6dfb4ff 100644 --- a/gcc/config/rs6000/rs6000-modes.def +++ b/gcc/config/rs6000/rs6000-modes.def @@ -61,6 +61,7 @@ FRACTIONAL_FLOAT_MODE (TF, FLOAT_PRECISION_TFmode, 16, ieee_quad_format); CC_MODE (CCUNS); CC_MODE (CCFP); +CC_MODE (CCBCD); /* Used for bcd insns */ CC_MODE (CCEQ); /* Vector modes. */ diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 6ba9df4f02e..4068cd8b929 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -11597,9 +11597,11 @@ validate_condition_mode (enum rtx_code code, machine_mode mode) gcc_assert ((code != GTU && code != LTU && code != GEU && code != LEU) || mode == CCUNSmode); + gcc_assert (mode == CCFPmode || mode == CCBCDmode + || (code != ORDERED && code != UNORDERED)); + gcc_assert (mode == CCFPmode - || (code != ORDERED && code != UNORDERED - && code != UNEQ && code != LTGT + || (code != UNEQ && code != LTGT && code != UNGT && code != UNLT && code != UNGE && code != UNLE)); diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index bc8bc6ab060..9b5fcdc8db0 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -8115,7 +8115,7 @@ (define_expand "movcc" "" "") -(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP]) +(define_mode_iterator CC_any [CC CCUNS CCEQ CCFP CCBCD]) (define_insn "*movcc_<mode>" [(set (match_operand:CC_any 0 "nonimmediate_operand" @@ -13245,6 +13245,7 @@ (define_insn_and_split "*nesi3_ext<mode>" (define_code_iterator fp_rev [ordered ne unle unge]) (define_code_iterator fp_two [ltgt le ge unlt ungt uneq]) +(define_code_iterator ccbcd_rev [ordered ne le ge]) (define_insn_and_split "*<code><mode>_cc" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") @@ -13264,6 +13265,24 @@ (define_insn_and_split "*<code><mode>_cc" } [(set_attr "length" "12")]) +(define_insn_and_split "*<code><mode>_cc" + [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") + (ccbcd_rev:GPR (match_operand:CCBCD 1 "cc_reg_operand" "y") + (const_int 0)))] + "" + "#" + "&& 1" + [(pc)] +{ + rtx_code revcode = reverse_condition (<CODE>); + rtx eq = gen_rtx_fmt_ee (revcode, <MODE>mode, operands[1], const0_rtx); + rtx tmp = gen_reg_rtx (<MODE>mode); + emit_move_insn (tmp, eq); + emit_insn (gen_xor<mode>3 (operands[0], tmp, const1_rtx)); + DONE; +} + [(set_attr "length" "12")]) + (define_insn_and_split "*<code><mode>_cc" [(set (match_operand:GPR 0 "gpc_reg_operand" "=r") (fp_two:GPR (match_operand:CCFP 1 "cc_reg_operand" "y") diff --git a/gcc/testsuite/gcc.target/powerpc/pr100736.c b/gcc/testsuite/gcc.target/powerpc/pr100736.c new file mode 100644 index 00000000000..85e3ae56d55 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr100736.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mdejagnu-cpu=power8 -O2 -ffinite-math-only" } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ + +/* Verify there is no ICE with finite-math-only */ + +int foo (vector unsigned char a, vector unsigned char b) +{ + return __builtin_vec_bcdsub_ge (a, b, 0); +} + + diff --git a/gcc/testsuite/gcc.target/powerpc/pr114732.c b/gcc/testsuite/gcc.target/powerpc/pr114732.c new file mode 100644 index 00000000000..d0b878780c6 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr114732.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ + +/* Verify only one cr bit need to be tested. */ + +int foo (vector unsigned char a, vector unsigned char b) +{ + return __builtin_vec_bcdsub_ge (a, b, 0) != 1; +} + +/* { dg-final { scan-assembler-not "cror" } } */