diff mbox

[RFC,20/38] tcg/i386: implement fences

Message ID 1440375847-17603-21-git-send-email-cota@braap.org
State New
Headers show

Commit Message

Emilio Cota Aug. 24, 2015, 12:23 a.m. UTC
Signed-off-by: Emilio G. Cota <cota@braap.org>
---
 tcg/i386/tcg-target.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Paolo Bonzini Aug. 24, 2015, 1:32 a.m. UTC | #1
On 23/08/2015 17:23, Emilio G. Cota wrote:
> +    case INDEX_op_fence_load:
> +        tcg_out_fence(s, 0xe8);
> +        break;
> +    case INDEX_op_fence_full:
> +        tcg_out_fence(s, 0xf0);
> +        break;
> +    case INDEX_op_fence_store:
> +        tcg_out_fence(s, 0xf8);
> +        break;
> +

lfence and sfence are not needed in generated code; all loads are
acquires and all stores are release on x86.

Also, on targets that do not have MFENCE you want to generate something
like "lock addl $0, (%esp)".

Paolo
Emilio Cota Aug. 25, 2015, 3:02 a.m. UTC | #2
On Sun, Aug 23, 2015 at 18:32:51 -0700, Paolo Bonzini wrote:
> 
> 
> On 23/08/2015 17:23, Emilio G. Cota wrote:
> > +    case INDEX_op_fence_load:
> > +        tcg_out_fence(s, 0xe8);
> > +        break;
> > +    case INDEX_op_fence_full:
> > +        tcg_out_fence(s, 0xf0);
> > +        break;
> > +    case INDEX_op_fence_store:
> > +        tcg_out_fence(s, 0xf8);
> > +        break;
> > +
> 
> lfence and sfence are not needed in generated code; all loads are
> acquires and all stores are release on x86.

lfence and sfence here serve two purposes:

1) Template for other architectures
2) x86 code does sometimes have lfence/sfence (e.g. movntq+sfence),
   so I guessed they should remain in the translated code.
   If on x86 we always ignore the Write-Combining from the
   guest, maybe we could claim the l/sfence pair here is really unnecessary.
   I'm no x86 expert so for a first stab I decided to put them there.
   
   I didn't intend to translate say *all* PPC/ARM load barriers
   into lfences when generating x86, which is I think your point.

> Also, on targets that do not have MFENCE you want to generate something
> like "lock addl $0, (%esp)".

Good point, will fix.

Thanks,

		Emilio
Paolo Bonzini Aug. 25, 2015, 10:55 p.m. UTC | #3
> lfence and sfence here serve two purposes:
> 
> 1) Template for other architectures

Ok, this makes sense.

> 2) x86 code does sometimes have lfence/sfence (e.g. movntq+sfence),
>    so I guessed they should remain in the translated code.
>    If on x86 we always ignore the Write-Combining from the
>    guest, maybe we could claim the l/sfence pair here is really unnecessary.

Yeah, I think it's fair enough to ignore WC and nontemporal stores.

>    I didn't intend to translate say *all* PPC/ARM load barriers
>    into lfences when generating x86, which is I think your point.

Yeah, it's just that the only gen_op_smp_rmb() you had in the RFC
also did not need an lfence.  But it seems like we're on the same
page.

Paolo
diff mbox

Patch

diff --git a/tcg/i386/tcg-target.c b/tcg/i386/tcg-target.c
index 887f22f..6600c45 100644
--- a/tcg/i386/tcg-target.c
+++ b/tcg/i386/tcg-target.c
@@ -1123,6 +1123,13 @@  static void tcg_out_jmp(TCGContext *s, tcg_insn_unit *dest)
     tcg_out_branch(s, 0, dest);
 }
 
+static inline void tcg_out_fence(TCGContext *s, uint8_t op)
+{
+        tcg_out8(s, 0x0f);
+        tcg_out8(s, 0xae);
+        tcg_out8(s, op);
+}
+
 #if defined(CONFIG_SOFTMMU)
 /* helper signature: helper_ret_ld_mmu(CPUState *env, target_ulong addr,
  *                                     int mmu_idx, uintptr_t ra)
@@ -2088,6 +2095,16 @@  static inline void tcg_out_op(TCGContext *s, TCGOpcode opc,
         }
         break;
 
+    case INDEX_op_fence_load:
+        tcg_out_fence(s, 0xe8);
+        break;
+    case INDEX_op_fence_full:
+        tcg_out_fence(s, 0xf0);
+        break;
+    case INDEX_op_fence_store:
+        tcg_out_fence(s, 0xf8);
+        break;
+
     case INDEX_op_mov_i32:  /* Always emitted via tcg_out_mov.  */
     case INDEX_op_mov_i64:
     case INDEX_op_movi_i32: /* Always emitted via tcg_out_movi.  */
@@ -2226,6 +2243,9 @@  static const TCGTargetOpDef x86_op_defs[] = {
     { INDEX_op_qemu_ld_i64, { "r", "r", "L", "L" } },
     { INDEX_op_qemu_st_i64, { "L", "L", "L", "L" } },
 #endif
+    { INDEX_op_fence_load, { } },
+    { INDEX_op_fence_store, { } },
+    { INDEX_op_fence_full, { } },
     { -1 },
 };