[00/11] aarch64: Rework ldp/stp patterns, add new ldp/stp pass

Message ID ZVZaCBDJ8GG9YoR+@arm.com

Alex Coplan Nov. 16, 2023, 6:06 p.m. UTC
Hi,

This patch series reworks the load/store pair representation in the aarch64
backend and adds a new load/store pair fusion pass.
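For readers less familiar with ldp/stp, here is a minimal C sketch of the kind
of source that gives a pair fusion pass something to work with.  The function
names are hypothetical and the exact code generated depends on compiler version
and options; the point is only that each function accesses two adjacent 8-byte
slots off the same base pointer, which is the shape a single ldp or stp can
cover on aarch64.

```c
#include <stdint.h>

/* Two loads from adjacent 8-byte slots at p[0] and p[1].  On aarch64
   these are candidates for a single load-pair (ldp) instruction.  */
int64_t sum_pair(const int64_t *p)
{
    int64_t a = p[0];
    int64_t b = p[1];
    return a + b;
}

/* Two stores to adjacent 8-byte slots.  On aarch64 these are
   candidates for a single store-pair (stp) instruction.  */
void store_pair(int64_t *p, int64_t a, int64_t b)
{
    p[0] = a;
    p[1] = b;
}
```

Compiling this with optimization for aarch64 and inspecting the assembly is one
way to see whether the two accesses in each function were paired.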

Patch 1/11 is just a rebased version of the patch from the previous version of
the series to add support to RTL-SSA for inserting new insns.

Patch 2/11 adds some RTL-SSA helpers for removing accesses.  Patches 3-5 fix up
the testsuite in light of the codegen changes.  Patches 6-7 make small tweaks to
operand printing in the aarch64 backend.  Patch 8/11 generalizes and reworks the
existing load/store pair writeback patterns (in preparation for use by the pass).
Patch 9/11 reworks the non-writeback pair patterns, both to fix a correctness
issue and increase their generality (while reducing the number of patterns).
Patch 10/11 is a revised version of the load/store pair fusion pass including
writeback support (among other changes).

Finally, patch 11/11 adjusts the mem{cpy,set} expansion to avoid creating
ldp/stp at expand time; instead, we rely on the new pass to do it.
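As a rough illustration of the code affected by that last change, small
fixed-size copies and fills like the ones below are typical candidates for
inline expansion.  This is a sketch under assumptions: the struct and function
names are hypothetical, and whether a given call is expanded inline (and how)
depends on the size, the options and the compiler version.

```c
#include <string.h>

/* Hypothetical 32-byte object; small enough that memcpy/memset on it
   may be expanded inline rather than emitted as a library call.  */
struct block {
    unsigned char bytes[32];
};

/* Fixed-size copy: if expanded inline as individual loads and stores,
   adjacent accesses are left for a later pass to pair up.  */
void copy_block(struct block *dst, const struct block *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Fixed-size fill with zero, the memset counterpart of the above.  */
void zero_block(struct block *dst)
{
    memset(dst, 0, sizeof *dst);
}
```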

Many thanks to Richard Sandiford for his help in patiently answering my
many questions during the development of the series.

Bootstrapped/regtested as a series on aarch64-linux-gnu.

Thanks,
Alex

Alex Coplan (11):
  rtl-ssa: Support for inserting new insns
  rtl-ssa: Add some helpers for removing accesses
  aarch64, testsuite: Fix up auto-init-padding tests
  aarch64, testsuite: Allow ldp/stp on SVE regs with -msve-vector-bits=128
  aarch64, testsuite: Fix up pr103147-10 tests
  aarch64: Fix up aarch64_print_operand xzr/wzr case
  aarch64: Fix up printing of ldp/stp with -msve-vector-bits=128
  aarch64: Generalize writeback ldp/stp patterns
  aarch64: Rewrite non-writeback ldp/stp patterns
  aarch64: Add new load/store pair fusion pass.
  aarch64: Use individual loads/stores for mem{cpy,set} expansion

 gcc/config.gcc                                |    4 +-
 gcc/config/aarch64/aarch64-ldp-fusion.cc      | 2727 +++++++++++++++++
 gcc/config/aarch64/aarch64-ldpstp.md          |   66 +-
 gcc/config/aarch64/aarch64-modes.def          |    6 +-
 gcc/config/aarch64/aarch64-passes.def         |    2 +
 gcc/config/aarch64/aarch64-protos.h           |    7 +-
 gcc/config/aarch64/aarch64-simd.md            |   60 -
 gcc/config/aarch64/aarch64.cc                 |  338 +-
 gcc/config/aarch64/aarch64.md                 |  472 +--
 gcc/config/aarch64/aarch64.opt                |   23 +
 gcc/config/aarch64/iterators.md               |    3 +
 gcc/config/aarch64/predicates.md              |   48 +-
 gcc/config/aarch64/t-aarch64                  |    7 +
 gcc/rtl-ssa/access-utils.h                    |   42 +
 gcc/rtl-ssa/accesses.cc                       |   10 +
 gcc/rtl-ssa/accesses.h                        |    4 +
 gcc/rtl-ssa/changes.cc                        |   74 +-
 gcc/rtl-ssa/changes.h                         |    2 +
 gcc/rtl-ssa/functions.h                       |   14 +
 gcc/rtl-ssa/insns.cc                          |    5 +
 gcc/rtl-ssa/insns.h                           |    7 +-
 gcc/rtl-ssa/internals.inl                     |    1 +
 gcc/rtl-ssa/member-fns.inl                    |   12 +
 gcc/rtl-ssa/movement.h                        |    8 +-
 .../g++.target/aarch64/pr103147-10.C          |    5 +
 .../gcc.target/aarch64/auto-init-padding-1.c  |    8 +-
 .../gcc.target/aarch64/auto-init-padding-2.c  |    2 +-
 .../gcc.target/aarch64/auto-init-padding-3.c  |    7 +-
 .../gcc.target/aarch64/auto-init-padding-4.c  |    4 +-
 .../gcc.target/aarch64/auto-init-padding-9.c  |    7 +-
 .../gcc.target/aarch64/pr103147-10.c          |    5 +
 .../aarch64/sve/pcs/stack_clash_1_128.c       |   32 +
 .../gcc.target/aarch64/sve/pcs/struct_3_128.c |   29 +
 33 files changed, 3573 insertions(+), 468 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-ldp-fusion.cc