mbox series

[00/10,RISC-V] Atomics improvements [PR100265/PR100266]

Message ID 20210426124552.3316789-1-cmuellner@gcc.gnu.org
Headers show
Series Atomics improvements [PR100265/PR100266] | expand


Christoph Müllner April 26, 2021, 12:45 p.m. UTC
This series provides a cleanup of the current atomics implementation
of RISC-V:

* PR100265: Use proper fences for atomic load/store
* PR100266: Provide programmatic implementation of CAS

As both are very related, I merged the patches into one series
(to avoid merge issues if one overtake the other).

The first patch could be squashed into the following patches,
but I found it easier to understand the changes with it in place.

The series has been tested as follows:
* Building and testing a multilib RV32/64 toolchain
  (bootstrapped with riscv-gnu-toolchain repo)
* Manual review of generated sequences for GCC's atomic builtins API

The second part of the series (the re-implementation of CAS) benefits
from a REE improvement (see PR100264):
If this patch is not in place, then an additional s.ext instruction
is emitted after the SC.W (in case of RV64 and CAS for uint32_t).

Christoph Muellner (10):
  RISC-V: Simplify memory model code [PR 100265]
  RISC-V: Emit proper memory ordering suffixes for AMOs [PR 100265]
  RISC-V: Eliminate %F specifier from riscv_print_operand() [PR 100265]
  RISC-V: Don't use amoswap for atomic stores [PR 100265]
  RISC-V: Emit fences according to chosen memory model [PR 100265]
  RISC-V: Implement atomic_{load,store} [PR 100265]
  RISC-V: Model INSNs for LR and SC [PR 100266]
  RISC-V: Add s.ext-consuming INSNs for LR and SC [PR 100266]
  RISC-V: Generate helpers for cbranch4 [PR 100266]
  RISC-V: Provide programmatic implementation of CAS [PR 100266]

 gcc/config/riscv/riscv-protos.h |   1 +
 gcc/config/riscv/riscv.c        | 134 +++++++++++++---------
 gcc/config/riscv/riscv.md       |   2 +-
 gcc/config/riscv/sync.md        | 190 ++++++++++++++++++++++----------
 4 files changed, 215 insertions(+), 112 deletions(-)


Jim Wilson April 28, 2021, 10:40 p.m. UTC | #1
On Mon, Apr 26, 2021 at 5:45 AM Christoph Muellner <cmuellner@gcc.gnu.org>

> This series provides a cleanup of the current atomics implementation
> of RISC-V:

This looks OK to me, other than the issue with address instructions emitted
inside the lr/sc loop.  That could be fixed with a follow up patch though.

amoswap is probably faster than a fence followed by a store, but they
aren't exactly the same operation, since I think the amoswap only provides
consistency for the target address whereas the fence would provide
consistency for all addresses.  It does look odd that we were using amoswap
here.  The inconsistency with atomic_load might be a problem.  I'm OK with
dropping this.

I'm not sure if we are implementing the __sync_* builtins properly.  Most
of them are defined as full barriers.  With our weak memory model RVWMO,
that implies that we need a fence along with an amo instruction.  However,
this is broken without your patch, and not made any worse by your patch, so
it isn't an issue with your patch.  Just a FYI.  See for instance the
discussion in PR 65697 and the patch to fix this for aarch64.