[v1] RISC-V: Add support for inlining subword atomic operations

Message ID 20220208004841.3126082-1-patrick@rivosinc.com
State New
Series [v1] RISC-V: Add support for inlining subword atomic operations

Commit Message

Patrick O'Neill Feb. 8, 2022, 12:48 a.m. UTC
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.

This patch changes the default behavior to inline subword atomic
operations, using the same logic as the existing library calls.
The behavior can be selected with the -minline-atomics and
-mno-inline-atomics command-line flags.

libgcc/config/riscv/atomic.c has the same logic implemented in asm.
That code needs to stay for backwards compatibility and for the
-mno-inline-atomics flag.
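
For background, both implementations use the same approach: perform the
operation on the naturally aligned 32-bit word containing the subword,
with a shift and mask so that only the target byte/halfword changes.  A
minimal C model of the 1-byte fetch-and-add case (names illustrative;
the real code uses LR/SC or AMO instructions rather than a
compare-and-swap builtin):

#include <stdint.h>

unsigned char
fetch_and_add_1_model (unsigned char *p, unsigned char v)
{
  /* Align down to the containing 32-bit word (RISC-V is little-endian).  */
  unsigned int *wp = (unsigned int *) ((uintptr_t) p & ~(uintptr_t) 3);
  int shift = ((uintptr_t) p & 3) * 8;	/* Bit offset of the byte.  */
  unsigned int mask = 0xffu << shift;

  unsigned int old = *wp, newval;
  do
    {
      /* Add into the selected byte, then splice the result back into
	 the unchanged bytes of the word.  */
      newval = (old & ~mask) | ((old + ((unsigned int) v << shift)) & mask);
    }
  while (!__atomic_compare_exchange_n (wp, &old, newval, 1,
				       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
  return (unsigned char) (old >> shift);	/* Old value of the byte.  */
}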

2022-02-07 Patrick O'Neill <patrick@rivosinc.com>

	PR target/104338
	* riscv.opt: Add -minline-atomics command-line flag.
	* sync.md (atomic_fetch_<atomic_optab><mode>): Add logic for
	expanding subword atomic operations.
	* sync.md (subword_atomic_fetch_strong_<atomic_optab>): Add LR/SC
	block for performing the atomic operation.
	* atomic.c: Add reference to duplicate logic.
	* inline-atomics-1.c: New test.
	* inline-atomics-2.c: Likewise.
	* inline-atomics-3.c: Likewise.
	* inline-atomics-4.c: Likewise.
	* inline-atomics-5.c: Likewise.
	* inline-atomics-6.c: Likewise.
	* inline-atomics-7.c: Likewise.
	* inline-atomics-8.c: Likewise.
	* inline-atomics-9.c: Likewise.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
There may be further concerns about the memory consistency of these 
operations, but this patch focuses on simply moving the logic inline.
Those concerns can be addressed in a future patch.
---
 gcc/config/riscv/riscv.opt                    |   4 +
 gcc/config/riscv/sync.md                      |  96 +++
 .../gcc.target/riscv/inline-atomics-1.c       |  11 +
 .../gcc.target/riscv/inline-atomics-2.c       |  12 +
 .../gcc.target/riscv/inline-atomics-3.c       | 569 ++++++++++++++++++
 .../gcc.target/riscv/inline-atomics-4.c       | 566 +++++++++++++++++
 .../gcc.target/riscv/inline-atomics-5.c       |  13 +
 .../gcc.target/riscv/inline-atomics-6.c       |  12 +
 .../gcc.target/riscv/inline-atomics-7.c       |  12 +
 .../gcc.target/riscv/inline-atomics-8.c       |  17 +
 .../gcc.target/riscv/inline-atomics-9.c       |  17 +
 libgcc/config/riscv/atomic.c                  |   2 +
 12 files changed, 1331 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-9.c

Comments

Palmer Dabbelt Feb. 23, 2022, 9:49 p.m. UTC | #1
On Mon, 07 Feb 2022 16:48:41 PST (-0800), patrick@rivosinc.com wrote:
> RISC-V has no support for subword atomic operations; code currently
> generates libatomic library calls.
>
> This patch changes the default behavior to inline subword atomic
> operations, using the same logic as the existing library calls.
> The behavior can be selected with the -minline-atomics and
> -mno-inline-atomics command-line flags.
>
> libgcc/config/riscv/atomic.c has the same logic implemented in asm.
> That code needs to stay for backwards compatibility and for the
> -mno-inline-atomics flag.
>
> 2022-02-07 Patrick O'Neill <patrick@rivosinc.com>
>
> 	PR target/104338
> 	* riscv.opt: Add -minline-atomics command-line flag.
> 	* sync.md (atomic_fetch_<atomic_optab><mode>): Add logic for
> 	expanding subword atomic operations.
> 	* sync.md (subword_atomic_fetch_strong_<atomic_optab>): Add LR/SC
> 	block for performing the atomic operation.
> 	* atomic.c: Add reference to duplicate logic.
> 	* inline-atomics-1.c: New test.
> 	* inline-atomics-2.c: Likewise.
> 	* inline-atomics-3.c: Likewise.
> 	* inline-atomics-4.c: Likewise.
> 	* inline-atomics-5.c: Likewise.
> 	* inline-atomics-6.c: Likewise.
> 	* inline-atomics-7.c: Likewise.
> 	* inline-atomics-8.c: Likewise.
> 	* inline-atomics-9.c: Likewise.
>
> Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
> ---
> There may be further concerns about the memory consistency of these
> operations, but this patch focuses on simply moving the logic inline.
> Those concerns can be addressed in a future patch.
> ---
>  gcc/config/riscv/riscv.opt                    |   4 +
>  gcc/config/riscv/sync.md                      |  96 +++
>  .../gcc.target/riscv/inline-atomics-1.c       |  11 +
>  .../gcc.target/riscv/inline-atomics-2.c       |  12 +
>  .../gcc.target/riscv/inline-atomics-3.c       | 569 ++++++++++++++++++
>  .../gcc.target/riscv/inline-atomics-4.c       | 566 +++++++++++++++++
>  .../gcc.target/riscv/inline-atomics-5.c       |  13 +
>  .../gcc.target/riscv/inline-atomics-6.c       |  12 +
>  .../gcc.target/riscv/inline-atomics-7.c       |  12 +
>  .../gcc.target/riscv/inline-atomics-8.c       |  17 +
>  .../gcc.target/riscv/inline-atomics-9.c       |  17 +
>  libgcc/config/riscv/atomic.c                  |   2 +
>  12 files changed, 1331 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
>
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index e294e223151..fb702317233 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -211,3 +211,7 @@ Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213)
>  misa-spec=
>  Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC)
>  Set the version of RISC-V ISA spec.
> +
> +minline-atomics
> +Target Bool Var(ALWAYS_INLINE_SUBWORD_ATOMIC) Init(-1)
> +Always inline subword atomic operations.

We usually have lower-case names for variables, but I think you can get 
away with a target flag here (which makes things slightly easier).  The 
-1 initializer is also a bit odd, but that'd go away with a target flag.

At a bare minimum this needs a doc/invoke.texi blurb, but IMO this
should really be called out as a news entry as well -- we're already 
finding some ABI-related fallout in libstdc++, so we should make this as 
visible as possible to users.  I think it's OK to default to enabling 
the inline atomics, as we're not directly breaking the ABI from GCC.
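
For illustration, the kind of target-flag entry being suggested would
look something like this in riscv.opt (the mask name is hypothetical;
with a Mask() the default comes from TARGET_DEFAULT rather than an
Init() value):

minline-atomics
Target Mask(INLINE_ATOMICS)
Always inline subword atomic operations.

The patterns' condition would then read
"TARGET_ATOMIC && TARGET_INLINE_ATOMICS".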

> diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
> index 747a799e237..e19b4157d3c 100644
> --- a/gcc/config/riscv/sync.md
> +++ b/gcc/config/riscv/sync.md
> @@ -92,6 +92,102 @@
>    "%F3amo<insn>.<amo>%A3 %0,%z2,%1"
>    [(set (attr "length") (const_int 8))])
>
> +(define_expand "atomic_fetch_<atomic_optab><mode>"
> +  [(set (match_operand:SHORT 0 "register_operand" "=&r")	      ;; old value at mem
> +	(match_operand:SHORT 1 "memory_operand" "+A"))		      ;; mem location
> +   (set (match_dup 1)
> +	(unspec_volatile:SHORT
> +	  [(any_atomic:SHORT (match_dup 1)
> +		     (match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; value for op
> +	   (match_operand:SI 3 "const_int_operand")]		      ;; model
> +	 UNSPEC_SYNC_OLD_OP))]
> +  "TARGET_ATOMIC && ALWAYS_INLINE_SUBWORD_ATOMIC"
> +{
> +  /* We have no QImode/HImode atomics, so form a mask, then use
> +     subword_atomic_fetch_strong_<mode> to implement a LR/SC version of the
> +     operation. */
> +
> +  /* Logic duplicated in gcc/libgcc/config/riscv/atomic.c for use when inlining
> +     is disabled */
> +
> +  rtx old = gen_reg_rtx (SImode);
> +  rtx mem = operands[1];
> +  rtx value = operands[2];
> +  rtx mask = gen_reg_rtx (SImode);
> +  rtx notmask = gen_reg_rtx (SImode);
> +
> +  rtx addr = force_reg (Pmode, XEXP (mem, 0));
> +
> +  rtx aligned_addr = gen_reg_rtx (Pmode);
> +  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
> +					      gen_int_mode (-4, Pmode)));
> +
> +  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
> +
> +  rtx shift = gen_reg_rtx (SImode);
> +  emit_move_insn (shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
> +				      gen_int_mode (3, SImode)));
> +  emit_move_insn (shift, gen_rtx_ASHIFT (SImode, shift,
> +					 gen_int_mode(3, SImode)));
> +
> +  rtx value_reg = gen_reg_rtx (SImode);
> +  emit_move_insn (value_reg, simplify_gen_subreg (SImode, value, <MODE>mode, 0));
> +
> +  rtx shifted_value = gen_reg_rtx (SImode);
> +  emit_move_insn(shifted_value, gen_rtx_ASHIFT(SImode, value_reg,
> +					       gen_lowpart (QImode, shift)));
> +
> +  int unshifted_mask;
> +  if (<MODE>mode == QImode)
> +    unshifted_mask = 0xFF;
> +  else
> +    unshifted_mask = 0xFFFF;
> +
> +  rtx mask_reg = gen_reg_rtx (SImode);
> +  emit_move_insn (mask_reg, gen_int_mode(unshifted_mask, SImode));
> +
> +  emit_move_insn (mask, gen_rtx_ASHIFT(SImode, mask_reg,
> +				       gen_lowpart (QImode, shift)));
> +
> +  emit_move_insn (notmask, gen_rtx_NOT(SImode, mask));
> +
> +  emit_insn (gen_subword_atomic_fetch_strong_<atomic_optab> (old, aligned_mem,
> +							     shifted_value,
> +							     mask, notmask));
> +
> +  emit_move_insn (old, gen_rtx_ASHIFTRT(SImode, old,
> +					gen_lowpart(QImode, shift)));
> +
> +  emit_move_insn (operands[0], gen_lowpart(<MODE>mode, old));
> +
> +  DONE;
> +})
> +
> +(define_insn "subword_atomic_fetch_strong_<atomic_optab>"
> +  [(set (match_operand:SI 0 "register_operand" "=&r")		   ;; old value at mem
> +	(match_operand:SI 1 "memory_operand" "+A"))		   ;; mem location
> +   (set (match_dup 1)
> +	(unspec_volatile:SI
> +	  [(any_atomic:SI (match_dup 1)
> +		     (match_operand:SI 2 "register_operand" "rI")) ;; value for op
> +	   (match_operand:SI 3 "register_operand" "rI")]	   ;; mask
> +	 UNSPEC_SYNC_OLD_OP))

IIUC nothing's looking at UNSPEC_SYNC_OLD_OP, so it's not technically a 
bug, but this isn't really computing the same thing the other patterns 
do (it's a shifted version of the value, not the actual value).  That's 
likely to trip someone up at some point, so I'd just make a new unspec 
for it.
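
Concretely, that would just be a new enum value in sync.md, along these
lines (name illustrative):

(define_c_enum "unspec" [
  UNSPEC_SYNC_OLD_OP_SUBWORD	;; shifted old value from a subword LR/SC op
])

with subword_atomic_fetch_strong_<atomic_optab> using it in place of
UNSPEC_SYNC_OLD_OP, so nothing can mistake the shifted result for the
usual old value.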

> +    (match_operand:SI 4 "register_operand" "rI")		   ;; not_mask
> +    (clobber (match_scratch:SI 5 "=&r"))			   ;; tmp_1
> +    (clobber (match_scratch:SI 6 "=&r"))]			   ;; tmp_2
> +  "TARGET_ATOMIC && ALWAYS_INLINE_SUBWORD_ATOMIC"
> +  {
> +    return
> +    "1:\;"
> +    "lr.w.aq\t%0, %1\;"
> +    "<insn>\t%5, %0, %2\;"
> +    "and\t%5, %5, %3\;"
> +    "and\t%6, %0, %4\;"
> +    "or\t%6, %6, %5\;"
> +    "sc.w.rl\t%5, %6, %1\;"
> +    "bnez\t%5, 1b\;";}
> +  )
> +
>  (define_insn "atomic_exchange<mode>"
>    [(set (match_operand:GPR 0 "register_operand" "=&r")
>  	(unspec_volatile:GPR
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
> new file mode 100644
> index 00000000000..110fdabd313
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mno-inline-atomics" } */
> +/* { dg-final { scan-assembler "\tcall\t__sync_fetch_and_add_1" } } */
> +
> +char bar;
> +
> +int
> +main ()
> +{
> +  __sync_fetch_and_add(&bar, 1);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
> new file mode 100644
> index 00000000000..8d5c31d8b79
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* Verify that subword atomics do not generate calls.  */
> +/* { dg-options "-minline-atomics" } */
> +/* { dg-final { scan-assembler-not "\tcall\t__sync_fetch_and_add_1" } } */
> +
> +char bar;
> +
> +int
> +main ()
> +{
> +  __sync_fetch_and_add(&bar, 1);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
> new file mode 100644
> index 00000000000..19b382d45b0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
> @@ -0,0 +1,569 @@
> +/* Check all char alignments.  */
> +/* Duplicates logic from libatomic/testsuite/libatomic.c/atomic-op-1.c.  */
> +/* Test __atomic routines for existence and proper execution on 1 byte
> +   values with each valid memory model.  */
> +/* { dg-do run } */
> +/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
> +
> +/* Test the execution of the __atomic_*OP builtin routines for a char.  */
> +
> +extern void abort(void);
> +
> +char count, res;
> +const char init = ~0;
> +
> +struct A
> +{
> +   char a;
> +   char b;
> +   char c;
> +   char d;
> +} __attribute__ ((packed)) A;
> +
> +/* The fetch_op routines return the original value before the operation.  */
> +
> +void
> +test_fetch_add (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
> +    abort ();
> +}
> +
> +
> +void
> +test_fetch_sub (char* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
> +    abort ();
> +}
> +
> +void
> +test_fetch_and (char* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_fetch_nand (char* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0 )
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +void
> +test_fetch_xor (char* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +void
> +test_fetch_or (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
> +    abort ();
> +}
> +
> +/* The OP_fetch routines return the new value after the operation.  */
> +
> +void
> +test_add_fetch (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
> +    abort ();
> +}
> +
> +
> +void
> +test_sub_fetch (char* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
> +    abort ();
> +}
> +
> +void
> +test_and_fetch (char* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
> +    abort ();
> +
> +  *v = init;
> +  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_nand_fetch (char* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +
> +
> +void
> +test_xor_fetch (char* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_or_fetch (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
> +    abort ();
> +}
> +
> +
> +/* Test the OP routines with a result which isn't used. Use both variations
> +   within each function.  */
> +
> +void
> +test_add (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != 1)
> +    abort ();
> +
> +  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
> +  if (*v != 2)
> +    abort ();
> +
> +  __atomic_add_fetch (v, 1 , __ATOMIC_ACQUIRE);
> +  if (*v != 3)
> +    abort ();
> +
> +  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
> +  if (*v != 4)
> +    abort ();
> +
> +  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
> +  if (*v != 5)
> +    abort ();
> +
> +  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
> +  if (*v != 6)
> +    abort ();
> +}
> +
> +
> +void
> +test_sub (char* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
> +  if (*v != --res)
> +    abort ();
> +}
> +
> +void
> +test_and (char* v)
> +{
> +  *v = init;
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = init;
> +  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = ~*v;
> +  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = ~*v;
> +  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
> +  if (*v != 0)
> +    abort ();
> +}
> +
> +void
> +test_nand (char* v)
> +{
> +  *v = init;
> +
> +  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
> +  if (*v != init)
> +    abort ();
> +}
> +
> +
> +
> +void
> +test_xor (char* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
> +  if (*v != 0)
> +    abort ();
> +}
> +
> +void
> +test_or (char* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != 1)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
> +  if (*v != 3)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
> +  if (*v != 7)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
> +  if (*v != 15)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
> +  if (*v != 31)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
> +  if (*v != 63)
> +    abort ();
> +}
> +
> +int
> +main ()
> +{
> +  char* V[] = {&A.a, &A.b, &A.c, &A.d};
> +
> +  for (int i = 0; i < 4; i++) {
> +    test_fetch_add (V[i]);
> +    test_fetch_sub (V[i]);
> +    test_fetch_and (V[i]);
> +    test_fetch_nand (V[i]);
> +    test_fetch_xor (V[i]);
> +    test_fetch_or (V[i]);
> +
> +    test_add_fetch (V[i]);
> +    test_sub_fetch (V[i]);
> +    test_and_fetch (V[i]);
> +    test_nand_fetch (V[i]);
> +    test_xor_fetch (V[i]);
> +    test_or_fetch (V[i]);
> +
> +    test_add (V[i]);
> +    test_sub (V[i]);
> +    test_and (V[i]);
> +    test_nand (V[i]);
> +    test_xor (V[i]);
> +    test_or (V[i]);
> +  }
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
> new file mode 100644
> index 00000000000..619cf1f86ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
> @@ -0,0 +1,566 @@
> +/* Check all short alignments.  */
> +/* Duplicates logic from libatomic/testsuite/libatomic.c/atomic-op-2.c.  */
> +/* Test __atomic routines for existence and proper execution on 2 byte
> +   values with each valid memory model.  */
> +/* { dg-do run } */
> +/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
> +
> +/* Test the execution of the __atomic_*OP builtin routines for a short.  */
> +
> +extern void abort(void);
> +
> +short count, res;
> +const short init = ~0;
> +
> +struct A
> +{
> +   short a;
> +   short b;
> +} __attribute__ ((packed)) A;
> +
> +/* The fetch_op routines return the original value before the operation.  */
> +
> +void
> +test_fetch_add (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
> +    abort ();
> +
> +  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
> +    abort ();
> +}
> +
> +
> +void
> +test_fetch_sub (short* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
> +    abort ();
> +
> +  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
> +    abort ();
> +}
> +
> +void
> +test_fetch_and (short* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_fetch_nand (short* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0 )
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +void
> +test_fetch_xor (short* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +void
> +test_fetch_or (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
> +    abort ();
> +}
> +
> +/* The OP_fetch routines return the new value after the operation.  */
> +
> +void
> +test_add_fetch (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
> +    abort ();
> +
> +  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
> +    abort ();
> +}
> +
> +
> +void
> +test_sub_fetch (short* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
> +    abort ();
> +
> +  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
> +    abort ();
> +}
> +
> +void
> +test_and_fetch (short* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
> +    abort ();
> +
> +  *v = init;
> +  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
> +    abort ();
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
> +    abort ();
> +
> +  *v = ~*v;
> +  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_nand_fetch (short* v)
> +{
> +  *v = init;
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
> +    abort ();
> +}
> +
> +
> +
> +void
> +test_xor_fetch (short* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
> +    abort ();
> +
> +  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
> +    abort ();
> +}
> +
> +void
> +test_or_fetch (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
> +    abort ();
> +
> +  count *= 2;
> +  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
> +    abort ();
> +}
> +
> +
> +/* Test the OP routines with a result which isn't used. Use both variations
> +   within each function.  */
> +
> +void
> +test_add (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != 1)
> +    abort ();
> +
> +  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
> +  if (*v != 2)
> +    abort ();
> +
> +  __atomic_add_fetch (v, 1 , __ATOMIC_ACQUIRE);
> +  if (*v != 3)
> +    abort ();
> +
> +  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
> +  if (*v != 4)
> +    abort ();
> +
> +  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
> +  if (*v != 5)
> +    abort ();
> +
> +  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
> +  if (*v != 6)
> +    abort ();
> +}
> +
> +
> +void
> +test_sub (short* v)
> +{
> +  *v = res = 20;
> +  count = 0;
> +
> +  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
> +  if (*v != --res)
> +    abort ();
> +
> +  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
> +  if (*v != --res)
> +    abort ();
> +}
> +
> +void
> +test_and (short* v)
> +{
> +  *v = init;
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = init;
> +  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = ~*v;
> +  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
> +  if (*v != 0)
> +    abort ();
> +
> +  *v = ~*v;
> +  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
> +  if (*v != 0)
> +    abort ();
> +}
> +
> +void
> +test_nand (short* v)
> +{
> +  *v = init;
> +
> +  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
> +  if (*v != init)
> +    abort ();
> +}
> +
> +
> +
> +void
> +test_xor (short* v)
> +{
> +  *v = init;
> +  count = 0;
> +
> +  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
> +  if (*v != 0)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
> +  if (*v != init)
> +    abort ();
> +
> +  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
> +  if (*v != 0)
> +    abort ();
> +}
> +
> +void
> +test_or (short* v)
> +{
> +  *v = 0;
> +  count = 1;
> +
> +  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
> +  if (*v != 1)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
> +  if (*v != 3)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
> +  if (*v != 7)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
> +  if (*v != 15)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
> +  if (*v != 31)
> +    abort ();
> +
> +  count *= 2;
> +  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
> +  if (*v != 63)
> +    abort ();
> +}
> +
> +int
> +main () {
> +  short* V[] = {&A.a, &A.b};
> +
> +  for (int i = 0; i < 2; i++) {
> +    test_fetch_add (V[i]);
> +    test_fetch_sub (V[i]);
> +    test_fetch_and (V[i]);
> +    test_fetch_nand (V[i]);
> +    test_fetch_xor (V[i]);
> +    test_fetch_or (V[i]);
> +
> +    test_add_fetch (V[i]);
> +    test_sub_fetch (V[i]);
> +    test_and_fetch (V[i]);
> +    test_nand_fetch (V[i]);
> +    test_xor_fetch (V[i]);
> +    test_or_fetch (V[i]);
> +
> +    test_add (V[i]);
> +    test_sub (V[i]);
> +    test_and (V[i]);
> +    test_nand (V[i]);
> +    test_xor (V[i]);
> +    test_or (V[i]);
> +  }
> +
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
> new file mode 100644
> index 00000000000..c2751235dbf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* Verify that constant propagation is functioning.  */
> +/* The -4 should be propagated into an ANDI statement.  */
> +/* { dg-options "-minline-atomics" } */
> +/* { dg-final { scan-assembler-not "\tli\t[at]\d,-4" } } */
> +
> +char bar;
> +
> +int
> +main ()
> +{
> +  __sync_fetch_and_add(&bar, 1);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
> new file mode 100644
> index 00000000000..18249bae7d1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* Verify that masks are generated to the correct size.  */
> +/* { dg-options "-O3 -minline-atomics" } */
> +/* Check for mask */
> +/* { dg-final { scan-assembler "\tli\t[at]\d,255" } } */
> +
> +int
> +main ()
> +{
> +  char bar __attribute__((aligned (32)));
> +  __sync_fetch_and_add(&bar, 0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
> new file mode 100644
> index 00000000000..81bbf4badce
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* Verify that masks are generated to the correct size.  */
> +/* { dg-options "-O3 -minline-atomics" } */
> +/* Check for mask */
> +/* { dg-final { scan-assembler "\tli\t[at]\d,65535" } } */
> +
> +int
> +main ()
> +{
> +  short bar __attribute__((aligned (32)));
> +  __sync_fetch_and_add(&bar, 0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
> new file mode 100644
> index 00000000000..d27562ed981
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* Verify that masks are aligned properly.  */
> +/* { dg-options "-O3 -minline-atomics" } */
> +/* Check for mask */
> +/* { dg-final { scan-assembler "\tli\t[at]\d,16711680" } } */
> +
> +int
> +main ()
> +{
> +  struct A {
> +    char a;
> +    char b;
> +    char c;
> +    char d;
> +  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
> +  __sync_fetch_and_add(&A.c, 0);
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
> new file mode 100644
> index 00000000000..382849702ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* Verify that masks are aligned properly.  */
> +/* { dg-options "-O3 -minline-atomics" } */
> +/* Check for mask */
> +/* { dg-final { scan-assembler "\tli\t[at]\d,-16777216" } } */
> +
> +int
> +main ()
> +{
> +  struct A {
> +    char a;
> +    char b;
> +    char c;
> +    char d;
> +  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
> +  __sync_fetch_and_add(&A.d, 0);
> +}
> diff --git a/libgcc/config/riscv/atomic.c b/libgcc/config/riscv/atomic.c
> index 904d8c59cf0..9583027b757 100644
> --- a/libgcc/config/riscv/atomic.c
> +++ b/libgcc/config/riscv/atomic.c
> @@ -30,6 +30,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  #define INVERT		"not %[tmp1], %[tmp1]\n\t"
>  #define DONT_INVERT	""
>
> +/* Logic duplicated in gcc/gcc/config/riscv/sync.md for use when inlining is enabled */
> +
>  #define GENERATE_FETCH_AND_OP(type, size, opname, insn, invert, cop)	\
>    type __sync_fetch_and_ ## opname ## _ ## size (type *p, type v)	\
>    {									\
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index e294e223151..fb702317233 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -211,3 +211,7 @@  Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213)
 misa-spec=
 Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC)
 Set the version of RISC-V ISA spec.
+
+minline-atomics
+Target Bool Var(ALWAYS_INLINE_SUBWORD_ATOMIC) Init(-1)
+Always inline subword atomic operations.
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 747a799e237..e19b4157d3c 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -92,6 +92,102 @@ 
   "%F3amo<insn>.<amo>%A3 %0,%z2,%1"
   [(set (attr "length") (const_int 8))])
 
+(define_expand "atomic_fetch_<atomic_optab><mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")	      ;; old value at mem
+	(match_operand:SHORT 1 "memory_operand" "+A"))		      ;; mem location
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(any_atomic:SHORT (match_dup 1)
+		     (match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; value for op
+	   (match_operand:SI 3 "const_int_operand")]		      ;; model
+	 UNSPEC_SYNC_OLD_OP))]
+  "TARGET_ATOMIC && ALWAYS_INLINE_SUBWORD_ATOMIC"
+{
+  /* We have no QImode/HImode atomics, so form a mask, then use
+     subword_atomic_fetch_strong_<mode> to implement a LR/SC version of the
+     operation. */
+
+  /* Logic duplicated in gcc/libgcc/config/riscv/atomic.c for use when inlining
+     is disabled */
+
+  rtx old = gen_reg_rtx (SImode);
+  rtx mem = operands[1];
+  rtx value = operands[2];
+  rtx mask = gen_reg_rtx (SImode);
+  rtx notmask = gen_reg_rtx (SImode);
+
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
+					      gen_int_mode (-4, Pmode)));
+
+  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
+
+  rtx shift = gen_reg_rtx (SImode);
+  emit_move_insn (shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
+				      gen_int_mode (3, SImode)));
+  emit_move_insn (shift, gen_rtx_ASHIFT (SImode, shift,
+					 gen_int_mode(3, SImode)));
+
+  rtx value_reg = gen_reg_rtx (SImode);
+  emit_move_insn (value_reg, simplify_gen_subreg (SImode, value, <MODE>mode, 0));
+
+  rtx shifted_value = gen_reg_rtx (SImode);
+  emit_move_insn(shifted_value, gen_rtx_ASHIFT(SImode, value_reg,
+					       gen_lowpart (QImode, shift)));
+
+  int unshifted_mask;
+  if (<MODE>mode == QImode)
+    unshifted_mask = 0xFF;
+  else
+    unshifted_mask = 0xFFFF;
+
+  rtx mask_reg = gen_reg_rtx (SImode);
+  emit_move_insn (mask_reg, gen_int_mode(unshifted_mask, SImode));
+
+  emit_move_insn (mask, gen_rtx_ASHIFT(SImode, mask_reg,
+				       gen_lowpart (QImode, shift)));
+
+  emit_move_insn (notmask, gen_rtx_NOT(SImode, mask));
+
+  emit_insn (gen_subword_atomic_fetch_strong_<atomic_optab> (old, aligned_mem,
+							     shifted_value,
+							     mask, notmask));
+
+  emit_move_insn (old, gen_rtx_ASHIFTRT(SImode, old,
+					gen_lowpart(QImode, shift)));
+
+  emit_move_insn (operands[0], gen_lowpart(<MODE>mode, old));
+
+  DONE;
+})
+
+(define_insn "subword_atomic_fetch_strong_<atomic_optab>"
+  [(set (match_operand:SI 0 "register_operand" "=&r")		   ;; old value at mem
+	(match_operand:SI 1 "memory_operand" "+A"))		   ;; mem location
+   (set (match_dup 1)
+	(unspec_volatile:SI
+	  [(any_atomic:SI (match_dup 1)
+		     (match_operand:SI 2 "register_operand" "rI")) ;; value for op
+	   (match_operand:SI 3 "register_operand" "rI")]	   ;; mask
+	 UNSPEC_SYNC_OLD_OP))
+    (match_operand:SI 4 "register_operand" "rI")		   ;; not_mask
+    (clobber (match_scratch:SI 5 "=&r"))			   ;; tmp_1
+    (clobber (match_scratch:SI 6 "=&r"))]			   ;; tmp_2
+  "TARGET_ATOMIC && ALWAYS_INLINE_SUBWORD_ATOMIC"
+  {
+    return
+    "1:\;"
+    "lr.w.aq\t%0, %1\;"
+    "<insn>\t%5, %0, %2\;"
+    "and\t%5, %5, %3\;"
+    "and\t%6, %0, %4\;"
+    "or\t%6, %6, %5\;"
+    "sc.w.rl\t%5, %6, %1\;"
+    "bnez\t%5, 1b\;";}
+  )
+
 (define_insn "atomic_exchange<mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
 	(unspec_volatile:GPR
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
new file mode 100644
index 00000000000..110fdabd313
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mno-inline-atomics" } */
+/* { dg-final { scan-assembler "\tcall\t__sync_fetch_and_add_1" } } */
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add(&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
new file mode 100644
index 00000000000..8d5c31d8b79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that subword atomics do not generate calls.  */
+/* { dg-options "-minline-atomics" } */
+/* { dg-final { scan-assembler-not "\tcall\t__sync_fetch_and_add_1" } } */
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add(&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
new file mode 100644
index 00000000000..19b382d45b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
@@ -0,0 +1,569 @@ 
+/* Check all char alignments.  */
+/* Duplicate logic as libatomic/testsuite/libatomic.c/atomic-op-1.c */
+/* Test __atomic routines for existence and proper execution on 1 byte
+   values with each valid memory model.  */
+/* { dg-do run } */
+/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
+
+/* Test the execution of the __atomic_*OP builtin routines for a char.  */
+
+extern void abort(void);
+
+char count, res;
+const char init = ~0;
+
+struct A
+{
+   char a;
+   char b;
+   char c;
+   char d;
+} __attribute__ ((packed)) A;
+
+/* The fetch_op routines return the original value before the operation.  */
+
+void
+test_fetch_add (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
+    abort ();
+}
+
+
+void
+test_fetch_sub (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
+    abort ();
+}
+
+void
+test_fetch_and (char* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_fetch_nand (char* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0 )
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_xor (char* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_or (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
+    abort ();
+}
+
+/* The OP_fetch routines return the new value after the operation.  */
+
+void
+test_add_fetch (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
+    abort ();
+}
+
+
+void
+test_sub_fetch (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
+    abort ();
+}
+
+void
+test_and_fetch (char* v)
+{
+  *v = init;
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  *v = init;
+  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_nand_fetch (char* v)
+{
+  *v = init;
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+
+
+void
+test_xor_fetch (char* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_or_fetch (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
+    abort ();
+}
+
+
+/* Test the OP routines with a result which isn't used. Use both variations
+   within each function.  */
+
+void
+test_add (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
+  if (*v != 2)
+    abort ();
+
+  __atomic_add_fetch (v, 1 , __ATOMIC_ACQUIRE);
+  if (*v != 3)
+    abort ();
+
+  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
+  if (*v != 4)
+    abort ();
+
+  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 5)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 6)
+    abort ();
+}
+
+
+void
+test_sub (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
+  if (*v != --res)
+    abort ();
+}
+
+void
+test_and (char* v)
+{
+  *v = init;
+
+  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
+  if (*v != 0)
+    abort ();
+
+  *v = init;
+  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_nand (char* v)
+{
+  *v = init;
+
+  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != init)
+    abort ();
+}
+
+
+
+void
+test_xor (char* v)
+{
+  *v = init;
+  count = 0;
+
+  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_or (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
+  if (*v != 3)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
+  if (*v != 7)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
+  if (*v != 15)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 31)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 63)
+    abort ();
+}
+
+int
+main ()
+{
+  char* V[] = {&A.a, &A.b, &A.c, &A.d};
+
+  for (int i = 0; i < 4; i++) {
+    test_fetch_add (V[i]);
+    test_fetch_sub (V[i]);
+    test_fetch_and (V[i]);
+    test_fetch_nand (V[i]);
+    test_fetch_xor (V[i]);
+    test_fetch_or (V[i]);
+
+    test_add_fetch (V[i]);
+    test_sub_fetch (V[i]);
+    test_and_fetch (V[i]);
+    test_nand_fetch (V[i]);
+    test_xor_fetch (V[i]);
+    test_or_fetch (V[i]);
+
+    test_add (V[i]);
+    test_sub (V[i]);
+    test_and (V[i]);
+    test_nand (V[i]);
+    test_xor (V[i]);
+    test_or (V[i]);
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
new file mode 100644
index 00000000000..619cf1f86ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
@@ -0,0 +1,566 @@ 
+/* Check all short alignments.  */
+/* Duplicates the logic of libatomic/testsuite/libatomic.c/atomic-op-2.c.  */
+/* Test __atomic routines for existence and proper execution on 2-byte
+   values with each valid memory model.  */
+/* { dg-do run } */
+/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
+
+/* Test the execution of the __atomic_*OP builtin routines for a short.  */
+
+extern void abort(void);
+
+short count, res;
+const short init = ~0;
+
+struct A
+{
+   short a;
+   short b;
+} __attribute__ ((packed)) A;
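+/* a and b sit at byte offsets 0 and 2 of the packed struct, so the
+   tests below run on halfwords at two different placements.  */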
+
+/* The fetch_op routines return the original value before the operation.  */
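+/* For example, if *v is 0, __atomic_fetch_add (v, 1, model) returns 0
+   and leaves *v equal to 1.  */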
+
+void
+test_fetch_add (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
+    abort ();
+}
+
+
+void
+test_fetch_sub (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
+    abort ();
+}
+
+void
+test_fetch_and (short* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_fetch_nand (short* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_xor (short* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_or (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
+    abort ();
+}
+
+/* The OP_fetch routines return the new value after the operation.  */
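+/* For example, if *v is 0, __atomic_add_fetch (v, 1, model) returns 1
+   and leaves *v equal to 1.  */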
+
+void
+test_add_fetch (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
+    abort ();
+}
+
+
+void
+test_sub_fetch (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
+    abort ();
+}
+
+void
+test_and_fetch (short* v)
+{
+  *v = init;
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  *v = init;
+  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_nand_fetch (short* v)
+{
+  *v = init;
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+
+
+void
+test_xor_fetch (short* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_or_fetch (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
+    abort ();
+}
+
+
+/* Test the OP routines with a result which isn't used. Use both variations
+   within each function.  */
+
+void
+test_add (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
+  if (*v != 2)
+    abort ();
+
+  __atomic_add_fetch (v, 1, __ATOMIC_ACQUIRE);
+  if (*v != 3)
+    abort ();
+
+  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
+  if (*v != 4)
+    abort ();
+
+  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 5)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 6)
+    abort ();
+}
+
+
+void
+test_sub (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
+  if (*v != --res)
+    abort ();
+}
+
+void
+test_and (short* v)
+{
+  *v = init;
+
+  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
+  if (*v != 0)
+    abort ();
+
+  *v = init;
+  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_nand (short* v)
+{
+  *v = init;
+
+  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != init)
+    abort ();
+}
+
+
+
+void
+test_xor (short* v)
+{
+  *v = init;
+  count = 0;
+
+  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_or (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
+  if (*v != 3)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
+  if (*v != 7)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
+  if (*v != 15)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 31)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 63)
+    abort ();
+}
+
+int
+main ()
+{
+  short* V[] = {&A.a, &A.b};
+
+  for (int i = 0; i < 2; i++) {
+    test_fetch_add (V[i]);
+    test_fetch_sub (V[i]);
+    test_fetch_and (V[i]);
+    test_fetch_nand (V[i]);
+    test_fetch_xor (V[i]);
+    test_fetch_or (V[i]);
+
+    test_add_fetch (V[i]);
+    test_sub_fetch (V[i]);
+    test_and_fetch (V[i]);
+    test_nand_fetch (V[i]);
+    test_xor_fetch (V[i]);
+    test_or_fetch (V[i]);
+
+    test_add (V[i]);
+    test_sub (V[i]);
+    test_and (V[i]);
+    test_nand (V[i]);
+    test_xor (V[i]);
+    test_or (V[i]);
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
new file mode 100644
index 00000000000..c2751235dbf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* Verify that constant propagation is functioning.  */
+/* The -4 should be propagated into an ANDI instruction.  */
+/* { dg-options "-minline-atomics" } */
+/* { dg-final { scan-assembler-not "\tli\t[at]\d,-4" } } */
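+/* The inline sequence computes the word-aligned address as addr & ~3;
+   with the constant propagated, that becomes a single andi with
+   immediate -4 instead of a li/and pair.  */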
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add (&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
new file mode 100644
index 00000000000..18249bae7d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that masks are generated to the correct size.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask.  */
+/* { dg-final { scan-assembler "\tli\t[at]\d,255" } } */
+
+int
+main ()
+{
+  char bar __attribute__((aligned (32)));
+  __sync_fetch_and_add (&bar, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
new file mode 100644
index 00000000000..81bbf4badce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that masks are generated to the correct size.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask.  */
+/* { dg-final { scan-assembler "\tli\t[at]\d,65535" } } */
+
+int
+main ()
+{
+  short bar __attribute__((aligned (32)));
+  __sync_fetch_and_add (&bar, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
new file mode 100644
index 00000000000..d27562ed981
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* Verify that masks are aligned properly.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask.  */
+/* { dg-final { scan-assembler "\tli\t[at]\d,16711680" } } */
+
+int
+main ()
+{
+  struct A {
+    char a;
+    char b;
+    char c;
+    char d;
+  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
+  __sync_fetch_and_add (&A.c, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
new file mode 100644
index 00000000000..382849702ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* Verify that masks are aligned properly.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask.  */
+/* { dg-final { scan-assembler "\tli\t[at]\d,-16777216" } } */
+
+int
+main ()
+{
+  struct A {
+    char a;
+    char b;
+    char c;
+    char d;
+  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
+  __sync_fetch_and_add (&A.d, 0);
+}
diff --git a/libgcc/config/riscv/atomic.c b/libgcc/config/riscv/atomic.c
index 904d8c59cf0..9583027b757 100644
--- a/libgcc/config/riscv/atomic.c
+++ b/libgcc/config/riscv/atomic.c
@@ -30,6 +30,8 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define INVERT		"not %[tmp1], %[tmp1]\n\t"
 #define DONT_INVERT	""
 
+/* Logic duplicated in gcc/gcc/config/riscv/sync.md for use when inlining is
+   enabled.  */
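+/* Both copies follow the same word-based scheme: align the pointer down
+   to a 4-byte boundary, build a mask for the subword's position within
+   the word, and retry a word-wide atomic sequence until it succeeds.
+   Roughly, for a little-endian target (a sketch, not the exact code):
+
+     uint32_t *wp = (uint32_t *) ((uintptr_t) p & ~3);
+     unsigned shift = ((uintptr_t) p & 3) * 8;
+     uint32_t mask = ((1u << (size * 8)) - 1) << shift;  */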
+
 #define GENERATE_FETCH_AND_OP(type, size, opname, insn, invert, cop)	\
   type __sync_fetch_and_ ## opname ## _ ## size (type *p, type v)	\
   {									\