diff mbox series

[v2] RISCV: Add support for inlining subword atomics

Message ID 20220310223647.3553551-1-patrick@rivosinc.com
State New
Headers show
Series [v2] RISCV: Add support for inlining subword atomics | expand

Commit Message

Patrick O'Neill March 10, 2022, 10:36 p.m. UTC
RISC-V has no support for subword atomic operations; code currently
generates libatomic library calls.

This patch changes the default behavior to inline subword atomic calls 
(using the same logic as the existing library call).
Behavior can be specified using the -minline-atomics and
-mno-inline-atomics command line flags.

gcc/libgcc/config/riscv/atomic.c has the same logic implemented in asm.
This will need to stay for backwards compatibility and the
-mno-inline-atomics flag.

2022-02-15 Patrick O'Neill <patrick@rivosinc.com>

	PR target/104338
	* riscv.opt: Add command-line flag.
	* invoke.texi: Add blurb regarding command-line flag.
	* sync.md (atomic_fetch_<atomic_optab><mode>): logic for 
	expanding subword atomic operations.
	* sync.md (subword_atomic_fetch_strong_<atomic_optab>): LR/SC
	block for performing atomic operation
	* atomic.c: Add reference to duplicate logic.
	* inline-atomics-1.c: New test.
	* inline-atomics-2.c: Likewise.
	* inline-atomics-3.c: Likewise.
	* inline-atomics-4.c: Likewise.
	* inline-atomics-5.c: Likewise.
	* inline-atomics-6.c: Likewise.
	* inline-atomics-7.c: Likewise.
	* inline-atomics-8.c: Likewise.
	* inline-atomics-9.c: Likewise.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
---
There may be further concerns about the memory consistency of these 
operations, but this patch focuses on simply moving the logic inline.
Those concerns can be addressed in a future patch.
---
v2 Changelog:
 - Add texti blurb
 - Update target flag
 - add 'UNSPEC_SYNC_OLD_OP_SUBWORD' for subword ops
---
 gcc/config/riscv/riscv.opt                    |   4 +
 gcc/config/riscv/sync.md                      |  98 +++
 gcc/doc/invoke.texi                           |   7 +
 .../gcc.target/riscv/inline-atomics-1.c       |  11 +
 .../gcc.target/riscv/inline-atomics-2.c       |  12 +
 .../gcc.target/riscv/inline-atomics-3.c       | 569 ++++++++++++++++++
 .../gcc.target/riscv/inline-atomics-4.c       | 566 +++++++++++++++++
 .../gcc.target/riscv/inline-atomics-5.c       |  13 +
 .../gcc.target/riscv/inline-atomics-6.c       |  12 +
 .../gcc.target/riscv/inline-atomics-7.c       |  12 +
 .../gcc.target/riscv/inline-atomics-8.c       |  17 +
 .../gcc.target/riscv/inline-atomics-9.c       |  17 +
 libgcc/config/riscv/atomic.c                  |   2 +
 13 files changed, 1340 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/inline-atomics-9.c

Comments

Pan RZ April 8, 2022, 2:48 a.m. UTC | #1
Hi Patrick,

Glad to know that efforts have been made to add inlining subword atomic supports into gcc. The patch looks great so far, yet as Andreas Schwab has pointed out (at riscv-collab/riscv-gcc#337), looks like it only contains atomic fetch stuff. Just wondering do you have further plans to implement support for atomic store / exchange as well? Also, as a reminder, note that after adding store / exchange supports, some related macros like ATOMIC_BOOL_LOCK_FREE and ATOMIC_CHAR_LOCK_FREE's values (defined in <atomic>) may need to be set to true. Currently in RISC-V gcc, they are all defined as false. This may also need to be done for std::atomic<bool>::is_always_lock_free and std::atomic_is_lock_free(bool_var).

See:https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free  and
https://github.com/riscv-collab/riscv-gcc/issues/337#issuecomment-1086664815  

What's your opinion on this?

Best regards, RZ Pan (XieJiSS)
Patrick O'Neill April 8, 2022, 4:48 p.m. UTC | #2
Hi RZ Pan,

I'll start working on the atomic store/exchange stuff. It shouldn't be
too difficult to add since it will have similar masking logic to
atomic fetch.

Also - I briefly looked and couldn't find the place where those macro's
values for RISC-V are defined in GCC. If anyone can point me in the
right direction for that, I would appreciate it :)

Thank you,
Patrick

On 4/7/22 19:48, Pan RZ wrote:
> Hi Patrick,
>
> Glad to know that efforts have been made to add inlining subword 
> atomic supports into gcc. The patch looks great so far, yet as Andreas 
> Schwab has pointed out (at riscv-collab/riscv-gcc#337), looks like it 
> only contains atomic fetch stuff. Just wondering do you have further 
> plans to implement support for atomic store / exchange as well? Also, 
> as a reminder, note that after adding store / exchange supports, some 
> related macros like ATOMIC_BOOL_LOCK_FREE and ATOMIC_CHAR_LOCK_FREE's 
> values (defined in <atomic>) may need to be set to true. Currently in 
> RISC-V gcc, they are all defined as false. This may also need to be 
> done for std::atomic<bool>::is_always_lock_free and 
> std::atomic_is_lock_free(bool_var).
>
> See:https://en.cppreference.com/w/cpp/atomic/atomic_is_lock_free and
> https://github.com/riscv-collab/riscv-gcc/issues/337#issuecomment-1086664815 
>
> What's your opinion on this?
>
> Best regards, RZ Pan (XieJiSS)
>
Pan RZ April 8, 2022, 6:12 p.m. UTC | #3
Hi Patrick,


We are more than delighted to hear that you'd like to implement inlining 
subword atomic load/store and exchange as well!


I searched for these macros in the gcc codebase, and it seems like the 
internal logic that defines ATOMIC_* builtin macros can be found at 
gcc/c-family/c-cppbuiltin.cc#L665-L796 (inside static void 
cpp_atomic_builtins), while is_always_lock_free is at 
libstdc++-v3/include/std/atomic#L99. I'm not 100% sure, though, due to 
my limited understanding of the codebase. Hope this information helps :-)


Yours sincerely,

Pan RZ (XieJiSS) @ PLCT Lab
Patrick O'Neill April 8, 2022, 6:39 p.m. UTC | #4
Hi Pan RZ,

I appreciate the help - that's a good starting point for the macros.

It looks like the file:
gcc/config/nds32/linux.h
interacts with the macro:
#define HAVE_sync_compare_and_swaphi 1

I'm not sure if that's the correct way to do it/if this is defined in a
different way for targets like x86/ARM/etc.

I'll circle back around to it once the other inline ops are done. In the
meantime if anyone else wants to dig in and figure out how to update the
macros (for RISC-V) without impacting other targets, that would make
this patch ready a little faster ;)

Thanks,
Patrick
Andreas Schwab April 8, 2022, 8:29 p.m. UTC | #5
On Apr 08 2022, Patrick O'Neill wrote:

> It looks like the file:
> gcc/config/nds32/linux.h
> interacts with the macro:
> #define HAVE_sync_compare_and_swaphi 1
>
> I'm not sure if that's the correct way to do it/if this is defined in a
> different way for targets like x86/ARM/etc.

They are normally autogenerated, see insn-flags.h.
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 9fffc08220d..8378e41aa85 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -225,3 +225,7 @@  Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213)
 misa-spec=
 Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC)
 Set the version of RISC-V ISA spec.
+
+minline-atomics
+Target Mask(INLINE_SUBWORD_ATOMIC)
+Always inline subword atomic operations.
diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 86b41e6b00a..05cbdfd5db3 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -22,6 +22,7 @@ 
 (define_c_enum "unspec" [
   UNSPEC_COMPARE_AND_SWAP
   UNSPEC_SYNC_OLD_OP
+  UNSPEC_SYNC_OLD_OP_SUBWORD
   UNSPEC_SYNC_EXCHANGE
   UNSPEC_ATOMIC_STORE
   UNSPEC_MEMORY_BARRIER
@@ -92,6 +93,103 @@ 
   "%F3amo<insn>.<amo>%A3 %0,%z2,%1"
   [(set (attr "length") (const_int 8))])
 
+(define_expand "atomic_fetch_<atomic_optab><mode>"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")	      ;; old value at mem
+	(match_operand:SHORT 1 "memory_operand" "+A"))		      ;; mem location
+   (set (match_dup 1)
+	(unspec_volatile:SHORT
+	  [(any_atomic:SHORT (match_dup 1)
+		     (match_operand:SHORT 2 "reg_or_0_operand" "rJ")) ;; value for op
+	   (match_operand:SI 3 "const_int_operand")]		      ;; model
+	 UNSPEC_SYNC_OLD_OP_SUBWORD))]
+  "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC"
+{
+  /* We have no QImode/HImode atomics, so form a mask, then use
+     subword_atomic_fetch_strong_<mode> to implement a LR/SC version of the
+     operation. */
+
+  /* Logic duplicated in gcc/libgcc/config/riscv/atomic.c for use when inlining
+     is disabled */
+
+  rtx old = gen_reg_rtx (SImode);
+  rtx mem = operands[1];
+  rtx value = operands[2];
+  rtx mask = gen_reg_rtx (SImode);
+  rtx notmask = gen_reg_rtx (SImode);
+
+  rtx addr = force_reg (Pmode, XEXP (mem, 0));
+
+  rtx aligned_addr = gen_reg_rtx (Pmode);
+  emit_move_insn (aligned_addr,  gen_rtx_AND (Pmode, addr,
+					      gen_int_mode (-4, Pmode)));
+
+  rtx aligned_mem = change_address (mem, SImode, aligned_addr);
+
+  rtx shift = gen_reg_rtx (SImode);
+  emit_move_insn (shift, gen_rtx_AND (SImode, gen_lowpart (SImode, addr),
+				      gen_int_mode (3, SImode)));
+  emit_move_insn (shift, gen_rtx_ASHIFT (SImode, shift,
+					 gen_int_mode(3, SImode)));
+
+  rtx value_reg = gen_reg_rtx (SImode);
+  emit_move_insn (value_reg, simplify_gen_subreg (SImode, value, <MODE>mode, 0));
+
+  rtx shifted_value = gen_reg_rtx (SImode);
+  emit_move_insn(shifted_value, gen_rtx_ASHIFT(SImode, value_reg,
+					       gen_lowpart (QImode, shift)));
+
+  int unshifted_mask;
+  if (<MODE>mode == QImode)
+    unshifted_mask = 0xFF;
+  else
+    unshifted_mask = 0xFFFF;
+
+  rtx mask_reg = gen_reg_rtx (SImode);
+  emit_move_insn (mask_reg, gen_int_mode(unshifted_mask, SImode));
+
+  emit_move_insn (mask, gen_rtx_ASHIFT(SImode, mask_reg,
+				       gen_lowpart (QImode, shift)));
+
+  emit_move_insn (notmask, gen_rtx_NOT(SImode, mask));
+
+  emit_insn (gen_subword_atomic_fetch_strong_<atomic_optab> (old, aligned_mem,
+							     shifted_value,
+							     mask, notmask));
+
+  emit_move_insn (old, gen_rtx_ASHIFTRT(SImode, old,
+					gen_lowpart(QImode, shift)));
+
+  emit_move_insn (operands[0], gen_lowpart(<MODE>mode, old));
+
+  DONE;
+})
+
+(define_insn "subword_atomic_fetch_strong_<atomic_optab>"
+  [(set (match_operand:SI 0 "register_operand" "=&r")		   ;; old value at mem
+	(match_operand:SI 1 "memory_operand" "+A"))		   ;; mem location
+   (set (match_dup 1)
+	(unspec_volatile:SI
+	  [(any_atomic:SI (match_dup 1)
+		     (match_operand:SI 2 "register_operand" "rI")) ;; value for op
+	   (match_operand:SI 3 "register_operand" "rI")]	   ;; mask
+	 UNSPEC_SYNC_OLD_OP_SUBWORD))
+    (match_operand:SI 4 "register_operand" "rI")		   ;; not_mask
+    (clobber (match_scratch:SI 5 "=&r"))			   ;; tmp_1
+    (clobber (match_scratch:SI 6 "=&r"))]			   ;; tmp_2
+  "TARGET_ATOMIC && TARGET_INLINE_SUBWORD_ATOMIC"
+  {
+    return
+    "1:\;"
+    "lr.w.aq\t%0, %1\;"
+    "<insn>\t%5, %0, %2\;"
+    "and\t%5, %5, %3\;"
+    "and\t%6, %0, %4\;"
+    "or\t%6, %6, %5\;"
+    "sc.w.rl\t%5, %6, %1\;"
+    "bnez\t%5, 1b";
+  }
+  [(set (attr "length") (const_int 28))])
+
 (define_insn "atomic_exchange<mode>"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
 	(unspec_volatile:GPR
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e1a00c80307..1007110aafb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1200,6 +1200,7 @@  See RS/6000 and PowerPC Options.
 -mbig-endian  -mlittle-endian @gol
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg} @gol
 -mstack-protector-guard-offset=@var{offset}}
+-minline-atomics  -mno-inline-atomics @gol
 
 @emph{RL78 Options}
 @gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs @gol
@@ -27712,6 +27713,12 @@  Do or don't use smaller but slower prologue and epilogue code that uses
 library function calls.  The default is to use fast inline prologues and
 epilogues.
 
+@item -minline-atomics
+@itemx -mno-inline-atomics
+@opindex minline-atomics
+Do or don't use smaller but slower subword atomic emulation code that uses
+library function calls.  The default is to use fast inline subword atomics.
+
 @item -mshorten-memrefs
 @itemx -mno-shorten-memrefs
 @opindex mshorten-memrefs
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
new file mode 100644
index 00000000000..110fdabd313
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-1.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mno-inline-atomics" } */
+/* { dg-final { scan-assembler "\tcall\t__sync_fetch_and_add_1" } } */
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add(&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
new file mode 100644
index 00000000000..8d5c31d8b79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-2.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that subword atomics do not generate calls.  */
+/* { dg-options "-minline-atomics" } */
+/* { dg-final { scan-assembler-not "\tcall\t__sync_fetch_and_add_1" } } */
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add(&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
new file mode 100644
index 00000000000..19b382d45b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-3.c
@@ -0,0 +1,569 @@ 
+/* Check all char alignments.  */
+/* Duplicate logic as libatomic/testsuite/libatomic.c/atomic-op-1.c */
+/* Test __atomic routines for existence and proper execution on 1 byte
+   values with each valid memory model.  */
+/* { dg-do run } */
+/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
+
+/* Test the execution of the __atomic_*OP builtin routines for a char.  */
+
+extern void abort(void);
+
+char count, res;
+const char init = ~0;
+
+struct A
+{
+   char a;
+   char b;
+   char c;
+   char d;
+} __attribute__ ((packed)) A;
+
+/* The fetch_op routines return the original value before the operation.  */
+
+void
+test_fetch_add (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
+    abort ();
+}
+
+
+void
+test_fetch_sub (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
+    abort ();
+}
+
+void
+test_fetch_and (char* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_fetch_nand (char* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0 )
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_xor (char* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_or (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
+    abort ();
+}
+
+/* The OP_fetch routines return the new value after the operation.  */
+
+void
+test_add_fetch (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
+    abort ();
+}
+
+
+void
+test_sub_fetch (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
+    abort ();
+}
+
+void
+test_and_fetch (char* v)
+{
+  *v = init;
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  *v = init;
+  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_nand_fetch (char* v)
+{
+  *v = init;
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+
+
+void
+test_xor_fetch (char* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_or_fetch (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
+    abort ();
+}
+
+
+/* Test the OP routines with a result which isn't used. Use both variations
+   within each function.  */
+
+void
+test_add (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
+  if (*v != 2)
+    abort ();
+
+  __atomic_add_fetch (v, 1 , __ATOMIC_ACQUIRE);
+  if (*v != 3)
+    abort ();
+
+  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
+  if (*v != 4)
+    abort ();
+
+  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 5)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 6)
+    abort ();
+}
+
+
+void
+test_sub (char* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
+  if (*v != --res)
+    abort ();
+}
+
+void
+test_and (char* v)
+{
+  *v = init;
+
+  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
+  if (*v != 0)
+    abort ();
+
+  *v = init;
+  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_nand (char* v)
+{
+  *v = init;
+
+  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != init)
+    abort ();
+}
+
+
+
+void
+test_xor (char* v)
+{
+  *v = init;
+  count = 0;
+
+  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_or (char* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
+  if (*v != 3)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
+  if (*v != 7)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
+  if (*v != 15)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 31)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 63)
+    abort ();
+}
+
+int
+main ()
+{
+  char* V[] = {&A.a, &A.b, &A.c, &A.d};
+
+  for (int i = 0; i < 4; i++) {
+    test_fetch_add (V[i]);
+    test_fetch_sub (V[i]);
+    test_fetch_and (V[i]);
+    test_fetch_nand (V[i]);
+    test_fetch_xor (V[i]);
+    test_fetch_or (V[i]);
+
+    test_add_fetch (V[i]);
+    test_sub_fetch (V[i]);
+    test_and_fetch (V[i]);
+    test_nand_fetch (V[i]);
+    test_xor_fetch (V[i]);
+    test_or_fetch (V[i]);
+
+    test_add (V[i]);
+    test_sub (V[i]);
+    test_and (V[i]);
+    test_nand (V[i]);
+    test_xor (V[i]);
+    test_or (V[i]);
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
new file mode 100644
index 00000000000..619cf1f86ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-4.c
@@ -0,0 +1,566 @@ 
+/* Check all short alignments.  */
+/* Duplicate logic as libatomic/testsuite/libatomic.c/atomic-op-2.c */
+/* Test __atomic routines for existence and proper execution on 2 byte
+   values with each valid memory model.  */
+/* { dg-do run } */
+/* { dg-options "-latomic -minline-atomics -Wno-address-of-packed-member" } */
+
+/* Test the execution of the __atomic_*OP builtin routines for a short.  */
+
+extern void abort(void);
+
+short count, res;
+const short init = ~0;
+
+struct A
+{
+   short a;
+   short b;
+} __attribute__ ((packed)) A;
+
+/* The fetch_op routines return the original value before the operation.  */
+
+void
+test_fetch_add (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_RELAXED) != 0)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_CONSUME) != 1)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQUIRE) != 2)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_RELEASE) != 3)
+    abort ();
+
+  if (__atomic_fetch_add (v, count, __ATOMIC_ACQ_REL) != 4)
+    abort ();
+
+  if (__atomic_fetch_add (v, 1, __ATOMIC_SEQ_CST) != 5)
+    abort ();
+}
+
+
+void
+test_fetch_sub (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_RELAXED) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_CONSUME) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQUIRE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_RELEASE) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, count + 1, __ATOMIC_ACQ_REL) !=  res--)
+    abort ();
+
+  if (__atomic_fetch_sub (v, 1, __ATOMIC_SEQ_CST) !=  res--)
+    abort ();
+}
+
+void
+test_fetch_and (short* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_fetch_and (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_fetch_nand (short* v)
+{
+  *v = init;
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_ACQUIRE) !=  0 )
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  if (__atomic_fetch_nand (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_xor (short* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_fetch_xor (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_fetch_xor (v, ~count, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+void
+test_fetch_or (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_fetch_or (v, count, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 2, __ATOMIC_CONSUME) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQUIRE) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, 8, __ATOMIC_RELEASE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_ACQ_REL) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_fetch_or (v, count, __ATOMIC_SEQ_CST) !=  31)
+    abort ();
+}
+
+/* The OP_fetch routines return the new value after the operation.  */
+
+void
+test_add_fetch (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_RELAXED) != 1)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_CONSUME) != 2)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQUIRE) != 3)
+    abort ();
+
+  if (__atomic_add_fetch (v, 1, __ATOMIC_RELEASE) != 4)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_ACQ_REL) != 5)
+    abort ();
+
+  if (__atomic_add_fetch (v, count, __ATOMIC_SEQ_CST) != 6)
+    abort ();
+}
+
+
+void
+test_sub_fetch (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_CONSUME) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQUIRE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, 1, __ATOMIC_RELEASE) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL) !=  --res)
+    abort ();
+
+  if (__atomic_sub_fetch (v, count + 1, __ATOMIC_SEQ_CST) !=  --res)
+    abort ();
+}
+
+void
+test_and_fetch (short* v)
+{
+  *v = init;
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_RELAXED) !=  0)
+    abort ();
+
+  *v = init;
+  if (__atomic_and_fetch (v, init, __ATOMIC_CONSUME) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, init, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL) !=  0)
+    abort ();
+
+  *v = ~*v;
+  if (__atomic_and_fetch (v, 0, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_nand_fetch (short* v)
+{
+  *v = init;
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_RELEASE) !=  0)
+    abort ();
+
+  if (__atomic_nand_fetch (v, init, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST) !=  init)
+    abort ();
+}
+
+
+
+void
+test_xor_fetch (short* v)
+{
+  *v = init;
+  count = 0;
+
+  if (__atomic_xor_fetch (v, count, __ATOMIC_RELAXED) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_CONSUME) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE) !=  0)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_RELEASE) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, 0, __ATOMIC_ACQ_REL) !=  init)
+    abort ();
+
+  if (__atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST) !=  0)
+    abort ();
+}
+
+void
+test_or_fetch (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  if (__atomic_or_fetch (v, count, __ATOMIC_RELAXED) !=  1)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 2, __ATOMIC_CONSUME) !=  3)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQUIRE) !=  7)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, 8, __ATOMIC_RELEASE) !=  15)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_ACQ_REL) !=  31)
+    abort ();
+
+  count *= 2;
+  if (__atomic_or_fetch (v, count, __ATOMIC_SEQ_CST) !=  63)
+    abort ();
+}
+
+
+/* Test the OP routines with a result which isn't used. Use both variations
+   within each function.  */
+
+void
+test_add (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_add_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_CONSUME);
+  if (*v != 2)
+    abort ();
+
+  __atomic_add_fetch (v, 1 , __ATOMIC_ACQUIRE);
+  if (*v != 3)
+    abort ();
+
+  __atomic_fetch_add (v, 1, __ATOMIC_RELEASE);
+  if (*v != 4)
+    abort ();
+
+  __atomic_add_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 5)
+    abort ();
+
+  __atomic_fetch_add (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 6)
+    abort ();
+}
+
+
+void
+test_sub (short* v)
+{
+  *v = res = 20;
+  count = 0;
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_RELAXED);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_CONSUME);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, 1, __ATOMIC_ACQUIRE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, 1, __ATOMIC_RELEASE);
+  if (*v != --res)
+    abort ();
+
+  __atomic_sub_fetch (v, count + 1, __ATOMIC_ACQ_REL);
+  if (*v != --res)
+    abort ();
+
+  __atomic_fetch_sub (v, count + 1, __ATOMIC_SEQ_CST);
+  if (*v != --res)
+    abort ();
+}
+
+void
+test_and (short* v)
+{
+  *v = init;
+
+  __atomic_and_fetch (v, 0, __ATOMIC_RELAXED);
+  if (*v != 0)
+    abort ();
+
+  *v = init;
+  __atomic_fetch_and (v, init, __ATOMIC_CONSUME);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, init, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_and_fetch (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != 0)
+    abort ();
+
+  *v = ~*v;
+  __atomic_fetch_and (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_nand (short* v)
+{
+  *v = init;
+
+  __atomic_fetch_nand (v, 0, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, init, __ATOMIC_RELEASE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_nand (v, init, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_nand_fetch (v, 0, __ATOMIC_SEQ_CST);
+  if (*v != init)
+    abort ();
+}
+
+
+
+void
+test_xor (short* v)
+{
+  *v = init;
+  count = 0;
+
+  __atomic_xor_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_CONSUME);
+  if (*v != 0)
+    abort ();
+
+  __atomic_xor_fetch (v, 0, __ATOMIC_ACQUIRE);
+  if (*v != 0)
+    abort ();
+
+  __atomic_fetch_xor (v, ~count, __ATOMIC_RELEASE);
+  if (*v != init)
+    abort ();
+
+  __atomic_fetch_xor (v, 0, __ATOMIC_ACQ_REL);
+  if (*v != init)
+    abort ();
+
+  __atomic_xor_fetch (v, ~count, __ATOMIC_SEQ_CST);
+  if (*v != 0)
+    abort ();
+}
+
+void
+test_or (short* v)
+{
+  *v = 0;
+  count = 1;
+
+  __atomic_or_fetch (v, count, __ATOMIC_RELAXED);
+  if (*v != 1)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_CONSUME);
+  if (*v != 3)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, 4, __ATOMIC_ACQUIRE);
+  if (*v != 7)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, 8, __ATOMIC_RELEASE);
+  if (*v != 15)
+    abort ();
+
+  count *= 2;
+  __atomic_or_fetch (v, count, __ATOMIC_ACQ_REL);
+  if (*v != 31)
+    abort ();
+
+  count *= 2;
+  __atomic_fetch_or (v, count, __ATOMIC_SEQ_CST);
+  if (*v != 63)
+    abort ();
+}
+
+int
+main () {
+  short* V[] = {&A.a, &A.b};
+
+  for (int i = 0; i < 2; i++) {
+    test_fetch_add (V[i]);
+    test_fetch_sub (V[i]);
+    test_fetch_and (V[i]);
+    test_fetch_nand (V[i]);
+    test_fetch_xor (V[i]);
+    test_fetch_or (V[i]);
+
+    test_add_fetch (V[i]);
+    test_sub_fetch (V[i]);
+    test_and_fetch (V[i]);
+    test_nand_fetch (V[i]);
+    test_xor_fetch (V[i]);
+    test_or_fetch (V[i]);
+
+    test_add (V[i]);
+    test_sub (V[i]);
+    test_and (V[i]);
+    test_nand (V[i]);
+    test_xor (V[i]);
+    test_or (V[i]);
+  }
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
new file mode 100644
index 00000000000..c2751235dbf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-5.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* Verify that constant propogation is functioning.  */
+/* The -4 should be propogated into an ANDI statement. */
+/* { dg-options "-minline-atomics" } */
+/* { dg-final { scan-assembler-not "\tli\t[at]\d,-4" } } */
+
+char bar;
+
+int
+main ()
+{
+  __sync_fetch_and_add(&bar, 1);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
new file mode 100644
index 00000000000..18249bae7d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-6.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that masks are generated to the correct size.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask */
+/* { dg-final { scan-assembler "\tli\t[at]\d,255" } } */
+
+int
+main ()
+{
+  char bar __attribute__((aligned (32)));
+  __sync_fetch_and_add(&bar, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
new file mode 100644
index 00000000000..81bbf4badce
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-7.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* Verify that masks are generated to the correct size.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask */
+/* { dg-final { scan-assembler "\tli\t[at]\d,65535" } } */
+
+int
+main ()
+{
+  short bar __attribute__((aligned (32)));
+  __sync_fetch_and_add(&bar, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
new file mode 100644
index 00000000000..d27562ed981
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-8.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* Verify that masks are aligned properly.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask */
+/* { dg-final { scan-assembler "\tli\t[at]\d,16711680" } } */
+
+int
+main ()
+{
+  struct A {
+    char a;
+    char b;
+    char c;
+    char d;
+  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
+  __sync_fetch_and_add(&A.c, 0);
+}
diff --git a/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
new file mode 100644
index 00000000000..382849702ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/inline-atomics-9.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* Verify that masks are aligned properly.  */
+/* { dg-options "-O3 -minline-atomics" } */
+/* Check for mask */
+/* { dg-final { scan-assembler "\tli\t[at]\d,-16777216" } } */
+
+int
+main ()
+{
+  struct A {
+    char a;
+    char b;
+    char c;
+    char d;
+  } __attribute__ ((packed)) __attribute__((aligned (32))) A;
+  __sync_fetch_and_add(&A.d, 0);
+}
diff --git a/libgcc/config/riscv/atomic.c b/libgcc/config/riscv/atomic.c
index 7007e7a20e4..a29909b97b5 100644
--- a/libgcc/config/riscv/atomic.c
+++ b/libgcc/config/riscv/atomic.c
@@ -30,6 +30,8 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define INVERT		"not %[tmp1], %[tmp1]\n\t"
 #define DONT_INVERT	""
 
+/* Logic duplicated in gcc/gcc/config/riscv/sync.md for use when inlining is enabled */
+
 #define GENERATE_FETCH_AND_OP(type, size, opname, insn, invert, cop)	\
   type __sync_fetch_and_ ## opname ## _ ## size (type *p, type v)	\
   {									\