[v2,00/35] tcg: Introduce TCG_COND_TST{EQ,NE}

Message ID 20231028194522.245170-1-richard.henderson@linaro.org

Message

Richard Henderson Oct. 28, 2023, 7:44 p.m. UTC
Expose a pair of comparison operators that map to the "test"
comparison that is available on many architectures.
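
For reference, a minimal C sketch of the intended semantics (the names
are abbreviations of the real TCG_COND_* values, not QEMU code):

    #include <stdint.h>
    #include <stdbool.h>

    /* Sketch only: the test conditions compare the AND of the two
       operands against zero, mirroring the "test" instruction. */
    typedef enum { COND_EQ, COND_NE, COND_TSTEQ, COND_TSTNE } Cond;

    static bool do_cond(Cond c, uint64_t a, uint64_t b)
    {
        switch (c) {
        case COND_EQ:    return a == b;
        case COND_NE:    return a != b;
        case COND_TSTEQ: return (a & b) == 0;
        case COND_TSTNE: return (a & b) != 0;
        }
        return false;
    }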

Changes for v2:
  * Add TCGCond to tcg_target_const_match.
    This fixes a long-standing issue with the ppc and s390x backends:
    CMPI for signed comparisons takes a signed immediate, while
    CMPLI for unsigned comparisons takes an unsigned immediate.
    It also allows different immediates for the TST comparisons
    (see the const-match sketch after this list).
  * tcg/i386: Generate TEST x,x when the mask is a power of two at
    bit 7, 15, 31, or 63 (the sign bit of a byte/word/dword/qword).
  * tcg/i386: Generate BT n,x for other powers of two
    (see the pseudo-assembly after this list).
  * tcg/ppc: tcg_target_const_match improvements
  * tcg/s390x: tcg_target_const_match improvements
  * target/m68k: Use TST{EQ,NE} for gen_fcc_cond.
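
As a rough illustration of the first item above (a sketch with our own
simplified names, reusing Cond from the earlier sketch, not the actual
ppc/s390x code): the matcher can now pick the immediate range from the
comparison kind.

    /* Sketch: a ppc-like backend has a signed 16-bit field for CMPI,
       an unsigned 16-bit field for CMPLI, and an unsigned 16-bit mask
       field for ANDI.-style test instructions. */
    static bool const_ok_for_cmp(int64_t val, Cond c, bool is_unsigned)
    {
        if (c == COND_TSTEQ || c == COND_TSTNE) {
            return val == (uint16_t)val;   /* mask for the test insn */
        }
        if (is_unsigned) {
            return val == (uint16_t)val;   /* CMPLI: unsigned immediate */
        }
        return val == (int16_t)val;        /* CMPI: signed immediate */
    }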
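
For the two tcg/i386 items, hypothetical instruction selections (the
label name is ours):

    # TSTNE x, 0x80: bit 7 is the sign bit of the low byte, so test
    # the byte against itself and branch on the sign flag.
        test   %al, %al
        js     taken

    # TSTNE x, 0x1000: other single-bit masks use BT, which copies
    # the selected bit into the carry flag.
        bt     $12, %eax
        jc     taken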

I wanted more testing, so I went looking for low-hanging fruit.
I didn't see any within target/arm, probably due to how we represent NZCV,
and I didn't want to overlap anything that Paolo might be doing with
target/i386, so I landed on target/m68k.  Tested via our float_convd test.


r~


Richard Henderson (35):
  tcg: Introduce TCG_COND_TST{EQ,NE}
  tcg/optimize: Split out arg_is_const_val
  tcg/optimize: Split out do_constant_folding_cond1
  tcg/optimize: Do swap_commutative2 in do_constant_folding_cond2
  tcg/optimize: Split out arg_new_constant
  tcg/optimize: Handle TCG_COND_TST{EQ,NE}
  tcg: Add TCGCond argument to tcg_target_const_match
  tcg/aarch64: Support TCG_COND_TST{EQ,NE}
  tcg/aarch64: Generate TBZ, TBNZ
  tcg/aarch64: Generate CBNZ for TSTNE of UINT32_MAX
  tcg/arm: Support TCG_COND_TST{EQ,NE}
  tcg/i386: Pass x86 condition codes to tcg_out_cmov
  tcg/i386: Move tcg_cond_to_jcc[] into tcg_out_cmp
  tcg/i386: Support TCG_COND_TST{EQ,NE}
  tcg/i386: Improve TSTNE/TSTEQ vs powers of two
  tcg/loongarch64: Support TCG_COND_TST{EQ,NE}
  tcg/mips: Support TCG_COND_TST{EQ,NE}
  tcg/riscv: Support TCG_COND_TST{EQ,NE}
  tcg/sparc64: Implement tcg_out_extrl_i64_i32
  tcg/sparc64: Hoist read of tcg_cond_to_rcond
  tcg/sparc64: Pass TCGCond to tcg_out_cmp
  tcg/sparc64: Support TCG_COND_TST{EQ,NE}
  tcg/ppc: Sink tcg_to_bc usage into tcg_out_bc
  tcg/ppc: Use cr0 in tcg_to_bc and tcg_to_isel
  tcg/ppc: Tidy up tcg_target_const_match
  tcg/ppc: Add TCG_CT_CONST_CMP
  tcg/ppc: Support TCG_COND_TST{EQ,NE}
  tcg/s390x: Split constraint A into J+U
  tcg/s390x: Add TCG_CT_CONST_CMP
  tcg/s390x: Support TCG_COND_TST{EQ,NE}
  tcg/tci: Support TCG_COND_TST{EQ,NE}
  target/alpha: Use TCG_COND_TST{EQ,NE} for BLB{C,S}
  target/alpha: Use TCG_COND_TST{EQ,NE} for CMOVLB{C,S}
  target/alpha: Use TCG_COND_TSTNE for gen_fold_mzero
  target/m68k: Use TCG_COND_TST{EQ,NE} in gen_fcc_cond

 docs/devel/tcg-ops.rst           |   2 +
 include/tcg/tcg-cond.h           |  49 +++-
 tcg/aarch64/tcg-target-con-set.h |   5 +-
 tcg/aarch64/tcg-target-con-str.h |   1 +
 tcg/i386/tcg-target-con-set.h    |   6 +-
 tcg/i386/tcg-target-con-str.h    |   1 +
 tcg/ppc/tcg-target-con-set.h     |   5 +-
 tcg/ppc/tcg-target-con-str.h     |   1 +
 tcg/s390x/tcg-target-con-set.h   |   8 +-
 tcg/s390x/tcg-target-con-str.h   |   3 +-
 target/alpha/translate.c         |  94 +++----
 target/m68k/translate.c          |  74 +++--
 tcg/optimize.c                   | 464 +++++++++++++++++++++++--------
 tcg/tcg.c                        |  38 ++-
 tcg/tci.c                        |  14 +
 tcg/aarch64/tcg-target.c.inc     | 165 ++++++++---
 tcg/arm/tcg-target.c.inc         |  62 +++--
 tcg/i386/tcg-target.c.inc        | 178 ++++++++----
 tcg/loongarch64/tcg-target.c.inc |  59 ++--
 tcg/mips/tcg-target.c.inc        |  44 ++-
 tcg/ppc/tcg-target.c.inc         | 277 +++++++++++++-----
 tcg/riscv/tcg-target.c.inc       |  23 +-
 tcg/s390x/tcg-target.c.inc       | 246 ++++++++++------
 tcg/sparc64/tcg-target.c.inc     |  70 +++--
 tcg/tci/tcg-target.c.inc         |   3 +-
 25 files changed, 1342 insertions(+), 550 deletions(-)

Comments

Richard Henderson Nov. 2, 2023, 10:17 p.m. UTC | #1
Ping.  Running out of days before soft-freeze.  :-)

r~

Paolo Bonzini Jan. 6, 2024, 5:43 p.m. UTC | #2
On Sat, Oct 28, 2023 at 9:45 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> Expose a pair of comparison operators that map to the "test"
> comparison that is available on many architectures.

I updated the MIPS backend (untested though) and pushed the result to
branch i386 of https://gitlab.com/bonzini/qemu/.

However I was thinking: a lot of RISC targets simply do AND/ANDI
followed by the sequence used for TCG_COND_NE.  Would it make sense to
have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
to AND+EQ/NE directly in the optimizer?  And for brcond2/setcond2,
always using AND/AND/OR may work just as well as any backend-specific
trick, and will give more freedom to the register allocator.
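
(A minimal sketch of that lowering; TCG_TARGET_SUPPORTS_TST is the bit
proposed above, the other names are illustrative abbreviations:)

    /* Sketch: without native test support, rewrite TSTEQ/TSTNE as an
       explicit AND followed by the same op with EQ/NE against zero. */
    if (!TCG_TARGET_SUPPORTS_TST && (c == COND_TSTEQ || c == COND_TSTNE)) {
        a = a & b;                 /* emit: and tmp, a, b */
        b = 0;                     /* then compare tmp against zero */
        c = (c == COND_TSTEQ) ? COND_EQ : COND_NE;
    }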

This would also allow reordering the target patches before the TCG
patches, thus 1) providing more exposure to the lowering code and
2) making the "minimum viable series" smaller.  (2) is not a huge deal
since you have already written the backend code, but (1) is
interesting.

Paolo
Richard Henderson Jan. 8, 2024, 9:45 p.m. UTC | #3
On 1/7/24 04:43, Paolo Bonzini wrote:
> I updated the MIPS backend (untested though) and pushed the result to
> branch i386 of https://gitlab.com/bonzini/qemu/.

Thanks.

> 
> However I was thinking: a lot of RISC targets simply do AND/ANDI
> followed by the sequence used for TCG_COND_NE.  Would it make sense to
> have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> to AND+EQ/NE directly in the optimizer?

Probably best, yes.

> And for brcond2/setcond2,
> always using AND/AND/OR may work just as well as any backend-specific
> trick, and will give more freedom to the register allocator.

   tst    a, b          @ sets Z iff (a & b) == 0
   tsteq  c, e          @ runs only if Z; clears Z if (c & e) != 0

for Arm32, using conditional execution.  So I'll leave it to the backends.
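
(For comparison, a sketch of the generic AND/AND/OR lowering of a
double-word test that Paolo describes, with illustrative names:)

    /* Sketch: TSTNE on a 64-bit pair via AND, AND, OR, compare-zero. */
    static bool tstne_pair(uint32_t al, uint32_t ah,
                           uint32_t bl, uint32_t bh)
    {
        return ((al & bl) | (ah & bh)) != 0;
    }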


r~
Paolo Bonzini Jan. 8, 2024, 10:55 p.m. UTC | #4
On Mon, Jan 8, 2024, at 22:45, Richard Henderson <richard.henderson@linaro.org>
wrote:

> > I was thinking: a lot of RISC targets simply do AND/ANDI
> > followed by the sequence used for TCG_COND_NE.  Would it make sense to
> > have a TCG_TARGET_SUPPORTS_TST bit and, if absent, lower TSTEQ/TSTNE
> > to AND+EQ/NE directly in the optimizer?
>
> Probably best, yes.
>

Ok, I will give it a shot.

> > And for brcond2/setcond2,
> > always using AND/AND/OR may work just as well as any backend-specific
> > trick, and will give more freedom to the register allocator.
>
>    tst    a, b
>    tsteq  c, e
>
> for Arm32.  So I'll leave it to the backends.
>

Nice. :)

Paolo


Richard Henderson Jan. 9, 2024, 8:36 a.m. UTC | #5
On 1/9/24 09:55, Paolo Bonzini wrote:
> On Mon, Jan 8, 2024, at 22:45, Richard Henderson
> <richard.henderson@linaro.org> wrote:
>>> Would it make sense to have a TCG_TARGET_SUPPORTS_TST bit and, if
>>> absent, lower TSTEQ/TSTNE to AND+EQ/NE directly in the optimizer?
>> Probably best, yes.
>
> Ok, I will give it a shot.

I'm in the process now...


r~