[12/12] trace: [all] Add "guest_vmem" event

Message ID 20140131161008.32741.5791.stgit@fimbulvetr.bsc.es
State New
Commit Message

Lluís Vilanova Jan. 31, 2014, 4:10 p.m. UTC
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
---
 include/exec/cpu-all.h        |   58 +++++++++++++++++++++--------------------
 include/exec/exec-all.h       |    3 ++
 include/exec/softmmu_header.h |   17 ++++++++++++
 tcg/tcg-op.h                  |    8 ++++++
 tcg/tcg.c                     |    1 +
 trace-events                  |   15 +++++++++++
 trace/tcg-op-internal.h       |   55 +++++++++++++++++++++++++++++++++++++++
 7 files changed, 129 insertions(+), 28 deletions(-)
 create mode 100644 trace/tcg-op-internal.h

Comments

Richard Henderson Feb. 4, 2014, 3:08 p.m. UTC | #1
On 01/31/2014 08:10 AM, Lluís Vilanova wrote:
> +#define ldub(p)    ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })

Are you sure you want to log these here?  Uses of these macros are
not restricted to the guest.  Therefore you could wind up with e.g.
PCI device accesses being attributed to the target cpu.

> --- a/include/exec/softmmu_header.h
> +++ b/include/exec/softmmu_header.h
> @@ -25,6 +25,11 @@
>   * You should have received a copy of the GNU Lesser General Public
>   * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>   */
> +
> +#if !defined(TRACE_TCG_CODE_ACCESSOR)
> +#include "trace.h"
> +#endif
> +
>  #if DATA_SIZE == 8
>  #define SUFFIX q
>  #define USUFFIX q
> @@ -88,6 +93,10 @@ glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
>      target_ulong addr;
>      int mmu_idx;
>  
> +#if !defined(TRACE_TCG_CODE_ACCESSOR)
> +    trace_guest_vmem(ptr, DATA_SIZE, 0);
> +#endif

These are going to result in double-logging the same access with

> +#define tcg_gen_qemu_ld_i32(val, addr, idx, memop)      \
> +    do {                                                \
> +        uint8_t _memop_size = _tcg_memop_size(memop);   \
> +        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
> +        tcg_gen_qemu_ld_i32(val, addr, idx, memop);     \
> +    } while (0)

... these.

Of course, those softmmu functions are also used by the system emulators in the
same way the ldub macro above is used for userland emulation.  So again you
have non-target accesses being attributed to the target.

Also, doing this action with macros, here, seems truly backward.  Why not
simply modify the real tcg_gen_qemu_ld_i32 in tcg.c?


r~
Lluís Vilanova Feb. 4, 2014, 8:01 p.m. UTC | #2
Richard Henderson writes:

> On 01/31/2014 08:10 AM, Lluís Vilanova wrote:
>> +#define ldub(p)    ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })

> Are you sure you want to log these here?  Uses of these macros are
> not restricted to the guest.  Therefore you could wind up with e.g.
> PCI device accesses being attributed to the target cpu.

These defines are only enabled in user-level mode.

But I wrote them really long ago, and I just realized they are not
up-to-date. The changes should also cover the 'cpu_*_data' and 'cpu_*_kernel'
variants. These macros are mostly used in helpers (e.g., helper_boundl).


>> --- a/include/exec/softmmu_header.h
>> +++ b/include/exec/softmmu_header.h
>> @@ -25,6 +25,11 @@
>> * You should have received a copy of the GNU Lesser General Public
>> * License along with this library; if not, see <http://www.gnu.org/licenses/>.
>> */
>> +
>> +#if !defined(TRACE_TCG_CODE_ACCESSOR)
>> +#include "trace.h"
>> +#endif
>> +
>> #if DATA_SIZE == 8
>> #define SUFFIX q
>> #define USUFFIX q
>> @@ -88,6 +93,10 @@ glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
>> target_ulong addr;
>> int mmu_idx;
>> 
>> +#if !defined(TRACE_TCG_CODE_ACCESSOR)
>> +    trace_guest_vmem(ptr, DATA_SIZE, 0);
>> +#endif

> These are going to result in double-logging the same access with

>> +#define tcg_gen_qemu_ld_i32(val, addr, idx, memop)      \
>> +    do {                                                \
>> +        uint8_t _memop_size = _tcg_memop_size(memop);   \
>> +        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
>> +        tcg_gen_qemu_ld_i32(val, addr, idx, memop);     \
>> +    } while (0)

> ... these.

I don't see how 'tcg_gen_qemu_ld_i32' gets to call 'cpu_ldl_data' (for
example). I also did this long ago, so maybe it changed, but a quick look at the
code shows that in softmmu-mode, a TLB miss performs a slow path access with the
functions in "softmmu_template.h", not "softmmu_exec.h" (which includes
"softmmu_header.h").

Maybe you're referring to some case I've missed.


> Of course, those softmmu functions are also used by the system emulators in the
> same way the ldub macro above is used for userland emulation.  So again you
> have non-target accesses being attributed to the target.

What do you mean by non-target accesses? Accesses not directly "encoded" in the
semantics of a guest instruction? If so, I did this on purpose (like the helper
example at the top).


> Also, doing this action with macros, here, seems truly backward.  Why not
> simply modify the real tcg_gen_qemu_ld_i32 in tcg.c?

The 'trace_guest_vmem_tcg' function just calls a helper generator function in
"helper.h", which cannot be included in "tcg.c". Another possibility is to just
forget about using "helper.h", and instead "manually" generate the call to the
helper; but using macros seems to me it's easier to maintain.


Thanks,
  Lluis
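
The macro-overloading approach described above relies on a C preprocessor rule: a macro's own name is not re-expanded inside its replacement list, so a function-like macro can wrap the function it shadows. A minimal sketch of that technique follows; `trace_access`, `load_word`, `trace_count` and `traced_sum` are hypothetical names chosen for illustration and are not part of QEMU. It uses a comma expression rather than the GNU `({ ... })` form seen in the patch, purely to stay within standard C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not QEMU code. */
int trace_count;   /* number of trace events emitted so far */

void trace_access(uintptr_t addr, size_t size, int is_write)
{
    (void)addr;
    (void)size;
    (void)is_write;
    trace_count++;
}

/* The "real" operation; it must be defined before the macro below. */
int load_word(const int *p)
{
    return *p;
}

/*
 * Function-like macro with the same name as the function it wraps.
 * The preprocessor does not re-expand load_word inside its own
 * replacement list, so the inner call resolves to the function above.
 * Note that the argument p is evaluated twice.
 */
#define load_word(p) \
    (trace_access((uintptr_t)(p), sizeof(int), 0), load_word(p))

/* Every use after the #define goes through the tracing wrapper. */
int traced_sum(const int *a, const int *b)
{
    return load_word(a) + load_word(b);
}
```

Calling `traced_sum(&x, &y)` emits two trace events and returns `x + y`. The cost of this design is that any translation unit needing the untraced function must opt out before the header is included, which is what the `TCG_OP_NOTRACE_GUEST_MEM` define does for "tcg.c" in the patch.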
Richard Henderson Feb. 6, 2014, 4:12 p.m. UTC | #3
On 02/04/2014 12:01 PM, Lluís Vilanova wrote:
> Richard Henderson writes:
> 
>> On 01/31/2014 08:10 AM, Lluís Vilanova wrote:
>>> +#define ldub(p)    ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })
> 
>> Are you sure you want to log these here?  Uses of these macros are
>> not restricted to the guest.  Therefore you could wind up with e.g.
>> PCI device accesses being attributed to the target cpu.
> 
> These defines are only enabled in user-level mode.
> 
> But I wrote them really long ago, and I just realized they are not
> up-to-date. The changes should also cover the 'cpu_*_data' and 'cpu_*_kernel'
> variants. These macros are mostly used in helpers (e.g., helper_boundl).

Yes, I know.  But they're also used in non-cpu contexts such as __get_user
(i.e. kernel accesses) or virtio.c (i.e. device accesses).

>> These are going to result in double-logging the same access with
> 
>>> +#define tcg_gen_qemu_ld_i32(val, addr, idx, memop)      \
>>> +    do {                                                \
>>> +        uint8_t _memop_size = _tcg_memop_size(memop);   \
>>> +        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
>>> +        tcg_gen_qemu_ld_i32(val, addr, idx, memop);     \
>>> +    } while (0)
> 
>> ... these.
> 
> I don't see how 'tcg_gen_qemu_ld_i32' gets to call 'cpu_ldl_data'

tcg_gen_qemu_ld_i32 triggers some inline code, with an out of line fallback in
softmmu-template.h.

You're logging both on the main path (before qemu_ld_i32 opcode) and in the
fallback path.  It would be one thing if you logged something different in the
fallback path, but you're not.  You're using the exact same logging routine.

> What do you mean by non-target accesses? Accesses not directly "encoded" in the
> semantics of a guest instruction? If so, I did this on purpose (like the helper
> example at the top).

No, I mean accesses initiated by a device.  Such things are rare, I admit,
since most devices use physical addresses not virtual.  But e.g. old Sparc
hardware used a single mmu to handle both cpu and bus accesses.

> The 'trace_guest_vmem_tcg' function just calls a helper generator function in
> "helper.h", which cannot be included in "tcg.c". Another possibility is to just
> forget about using "helper.h", and instead "manually" generate the call to the
> helper; but using macros seems to me it's easier to maintain.

You simply need to move the declarations somewhere else.  See e.g.
tcg-runtime.h for a set of helpers shared across all translators.


r~
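
The double-logging concern raised above can be modelled in isolation. The sketch below is a toy software TLB, not QEMU code: `guest_load`, `slow_path`, `trace_vmem`, `tlb_flush` and the `events` counter are all hypothetical names. It only shows what would happen if the same event were raised both before the access and again in the out-of-line fallback; whether the patch actually traces the fallback path is disputed in this thread.

```c
#include <stdint.h>

/* Hypothetical model of the control flow only; not QEMU code. */
enum { TLB_SIZE = 4, PAGE_BITS = 12 };

static uint64_t tlb_tag[TLB_SIZE];  /* page number + 1; 0 means invalid */
int events;                         /* trace events emitted so far */

static void trace_vmem(uint64_t vaddr)
{
    (void)vaddr;
    events++;
}

void tlb_flush(void)
{
    for (int i = 0; i < TLB_SIZE; i++) {
        tlb_tag[i] = 0;
    }
}

/* Out-of-line fallback: refills the TLB entry, optionally tracing again. */
static void slow_path(uint64_t vaddr, int trace_here)
{
    uint64_t page = vaddr >> PAGE_BITS;

    if (trace_here) {
        trace_vmem(vaddr);          /* second event for the same access */
    }
    tlb_tag[page % TLB_SIZE] = page + 1;
}

/* Main path: traced once up front, then falls back on a TLB miss. */
void guest_load(uint64_t vaddr, int trace_in_slow_path)
{
    uint64_t page = vaddr >> PAGE_BITS;

    trace_vmem(vaddr);
    if (tlb_tag[page % TLB_SIZE] != page + 1) {
        slow_path(vaddr, trace_in_slow_path);
    }
}
```

With tracing enabled in the fallback, a load that misses the TLB produces two events and a load that hits produces one, so the event count depends on TLB state rather than on what the guest did. Tracing at a single point gives one event per access in both cases.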
Lluís Vilanova Feb. 10, 2014, 1:29 p.m. UTC | #4
Richard Henderson writes:

> On 02/04/2014 12:01 PM, Lluís Vilanova wrote:
>> Richard Henderson writes:
>> 
>>> On 01/31/2014 08:10 AM, Lluís Vilanova wrote:
>>>> +#define ldub(p)    ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })
>> 
>>> Are you sure you want to log these here?  Uses of these macros are
>>> not restricted to the guest.  Therefore you could wind up with e.g.
>>> PCI device accesses being attributed to the target cpu.
>> 
>> These defines are only enabled in user-level mode.
>> 
>> But I wrote them really long ago, and I just realized they are not
>> up-to-date. The changes should also cover the 'cpu_*_data' and 'cpu_*_kernel'
>> variants. These macros are mostly used in helpers (e.g., helper_boundl).

> Yes, I know.  But they're also used in non-cpu contexts such as __get_user
> (i.e. kernel accesses) or virtio.c (i.e. device accesses).

The macro "__get_user" accesses memory using physical addresses, so it is
not traced.

Grepping "virtio.c" shows that only "ld*_phys" and "ld*_p" are used (which
are not traced). The same holds for stores.

I looked at references to the ld/st macros (without the "_raw" and "_p" suffixes),
and no non-target code came up (I'm counting accesses initiated directly or
indirectly from helper functions as target code).

Am I missing something?


>>> These are going to result in double-logging the same access with
>> 
>>>> +#define tcg_gen_qemu_ld_i32(val, addr, idx, memop)      \
>>>> +    do {                                                \
>>>> +        uint8_t _memop_size = _tcg_memop_size(memop);   \
>>>> +        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
>>>> +        tcg_gen_qemu_ld_i32(val, addr, idx, memop);     \
>>>> +    } while (0)
>> 
>>> ... these.
>> 
>> I don't see how 'tcg_gen_qemu_ld_i32' gets to call 'cpu_ldl_data'

> tcg_gen_qemu_ld_i32 triggers some inline code, with an out of line fallback in
> softmmu-template.h.

> You're logging both on the main path (before qemu_ld_i32 opcode) and in the
> fallback path.  It would be one thing if you logged something different in the
> fallback path, but you're not.  You're using the exact same logging routine.

Aha, but I'm only tracing functions in "softmmu_header.h" (e.g.,
"cpu_ldub_kernel"). I've looked at the result of preprocessing
"softmmu_template.h" (from "target-i386/mem_helper.c"), and the functions there
(e.g., "helper_ret_ldub_mmu") perform translation themselves and then access
physical memory (e.g., "ldub_p"). AFAIK, I did not add any tracing to that path.


>> What do you mean by non-target accesses? Accesses not directly "encoded" in the
>> semantics of a guest instruction? If so, I did this on purpose (like the helper
>> example at the top).

> No, I mean accesses initiated by a device.  Such things are rare, I admit,
> since most devices use physical addresses not virtual.  But e.g. old Sparc
> hardware used a single mmu to handle both cpu and bus accesses.

Hmmm, I'll have to take a look at that. Do you have any function or file for the
Sparc case off the top of your head? Admittedly, the whole memory access flow
is quite convoluted in QEMU, especially due to the large number of macros
involved. I wonder whether something could be done to simplify all this a little
bit.


>> The 'trace_guest_vmem_tcg' function just calls a helper generator function in
>> "helper.h", which cannot be included in "tcg.c". Another possibility is to just
>> forget about using "helper.h", and instead "manually" generate the call to the
>> helper; but using macros seems to me it's easier to maintain.

> You simply need to move the declarations somewhere else.  See e.g.
> tcg-runtime.h for a set of helpers shared across all translators.

Nice. I'll move them there.


Thanks,
  Lluis
Patch

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 4cb4b4a..4ecb486 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -250,21 +250,23 @@  extern unsigned long reserved_va;
 
 #if defined(CONFIG_USER_ONLY)
 
+#include "trace.h"
+
 /* if user mode, no other memory access functions */
-#define ldub(p) ldub_raw(p)
-#define ldsb(p) ldsb_raw(p)
-#define lduw(p) lduw_raw(p)
-#define ldsw(p) ldsw_raw(p)
-#define ldl(p) ldl_raw(p)
-#define ldq(p) ldq_raw(p)
-#define ldfl(p) ldfl_raw(p)
-#define ldfq(p) ldfq_raw(p)
-#define stb(p, v) stb_raw(p, v)
-#define stw(p, v) stw_raw(p, v)
-#define stl(p, v) stl_raw(p, v)
-#define stq(p, v) stq_raw(p, v)
-#define stfl(p, v) stfl_raw(p, v)
-#define stfq(p, v) stfq_raw(p, v)
+#define ldub(p)    ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })
+#define ldsb(p)    ({ trace_guest_vmem(p, 1, 0); ldsb_raw(p);    })
+#define lduw(p)    ({ trace_guest_vmem(p, 2, 0); lduw_raw(p);    })
+#define ldsw(p)    ({ trace_guest_vmem(p, 2, 0); ldsw_raw(p);    })
+#define ldl(p)     ({ trace_guest_vmem(p, 4, 0); ldl_raw(p);     })
+#define ldq(p)     ({ trace_guest_vmem(p, 8, 0); ldq_raw(p);     })
+#define ldfl(p)    ({ trace_guest_vmem(p, 4, 0); ldfl_raw(p);    })
+#define ldfq(p)    ({ trace_guest_vmem(p, 8, 0); ldfq_raw(p);    })
+#define stb(p, v)  ({ trace_guest_vmem(p, 1, 1); stb_raw(p, v);  })
+#define stw(p, v)  ({ trace_guest_vmem(p, 2, 1); stw_raw(p, v);  })
+#define stl(p, v)  ({ trace_guest_vmem(p, 4, 1); stl_raw(p, v);  })
+#define stq(p, v)  ({ trace_guest_vmem(p, 8, 1); stq_raw(p, v);  })
+#define stfl(p, v) ({ trace_guest_vmem(p, 4, 1); stfl_raw(p, v); })
+#define stfq(p, v) ({ trace_guest_vmem(p, 8, 1); stfq_raw(p, v); })
 
 #define cpu_ldub_code(env1, p) ldub_raw(p)
 #define cpu_ldsb_code(env1, p) ldsb_raw(p)
@@ -295,20 +297,20 @@  extern unsigned long reserved_va;
 #define cpu_stl_kernel(env, addr, data) stl_raw(addr, data)
 #define cpu_stq_kernel(env, addr, data) stq_raw(addr, data)
 
-#define ldub_kernel(p) ldub_raw(p)
-#define ldsb_kernel(p) ldsb_raw(p)
-#define lduw_kernel(p) lduw_raw(p)
-#define ldsw_kernel(p) ldsw_raw(p)
-#define ldl_kernel(p) ldl_raw(p)
-#define ldq_kernel(p) ldq_raw(p)
-#define ldfl_kernel(p) ldfl_raw(p)
-#define ldfq_kernel(p) ldfq_raw(p)
-#define stb_kernel(p, v) stb_raw(p, v)
-#define stw_kernel(p, v) stw_raw(p, v)
-#define stl_kernel(p, v) stl_raw(p, v)
-#define stq_kernel(p, v) stq_raw(p, v)
-#define stfl_kernel(p, v) stfl_raw(p, v)
-#define stfq_kernel(p, vt) stfq_raw(p, v)
+#define ldub_kernel(p)     ({ trace_guest_vmem(p, 1, 0); ldub_raw(p);    })
+#define ldsb_kernel(p)     ({ trace_guest_vmem(p, 1, 0); ldsb_raw(p);    })
+#define lduw_kernel(p)     ({ trace_guest_vmem(p, 2, 0); lduw_raw(p);    })
+#define ldsw_kernel(p)     ({ trace_guest_vmem(p, 2, 0); ldsw_raw(p);    })
+#define ldl_kernel(p)      ({ trace_guest_vmem(p, 4, 0); ldl_raw(p);     })
+#define ldq_kernel(p)      ({ trace_guest_vmem(p, 8, 0); ldq_raw(p);     })
+#define ldfl_kernel(p)     ({ trace_guest_vmem(p, 4, 0); ldfl_raw(p);    })
+#define ldfq_kernel(p)     ({ trace_guest_vmem(p, 8, 0); ldfq_raw(p);    })
+#define stb_kernel(p, v)   ({ trace_guest_vmem(p, 1, 1); stb_raw(p, v);  })
+#define stw_kernel(p, v)   ({ trace_guest_vmem(p, 2, 1); stw_raw(p, v);  })
+#define stl_kernel(p, v)   ({ trace_guest_vmem(p, 4, 1); stl_raw(p, v);  })
+#define stq_kernel(p, v)   ({ trace_guest_vmem(p, 8, 1); stq_raw(p, v);  })
+#define stfl_kernel(p, v)  ({ trace_guest_vmem(p, 4, 1); stfl_raw(p, v); })
+#define stfq_kernel(p, v)  ({ trace_guest_vmem(p, 8, 1); stfq_raw(p, v); })
 
 #define cpu_ldub_data(env, addr) ldub_raw(addr)
 #define cpu_lduw_data(env, addr) lduw_raw(addr)
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index ea90b64..f30cc4e 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -339,6 +339,8 @@  uint32_t helper_ldl_cmmu(CPUArchState *env, target_ulong addr, int mmu_idx);
 uint64_t helper_ldq_cmmu(CPUArchState *env, target_ulong addr, int mmu_idx);
 
 #define ACCESS_TYPE (NB_MMU_MODES + 1)
+/* do not trace '*_code' accesses during instruction disassembly */
+#define TRACE_TCG_CODE_ACCESSOR 1
 #define MEMSUFFIX _code
 
 #define DATA_SIZE 1
@@ -354,6 +356,7 @@  uint64_t helper_ldq_cmmu(CPUArchState *env, target_ulong addr, int mmu_idx);
 #include "exec/softmmu_header.h"
 
 #undef ACCESS_TYPE
+#undef TRACE_TCG_CODE_ACCESSOR
 #undef MEMSUFFIX
 
 #endif
diff --git a/include/exec/softmmu_header.h b/include/exec/softmmu_header.h
index d8d9c81..ccd9cb1 100644
--- a/include/exec/softmmu_header.h
+++ b/include/exec/softmmu_header.h
@@ -25,6 +25,11 @@ 
  * You should have received a copy of the GNU Lesser General Public
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
+
+#if !defined(TRACE_TCG_CODE_ACCESSOR)
+#include "trace.h"
+#endif
+
 #if DATA_SIZE == 8
 #define SUFFIX q
 #define USUFFIX q
@@ -88,6 +93,10 @@  glue(glue(cpu_ld, USUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
     target_ulong addr;
     int mmu_idx;
 
+#if !defined(TRACE_TCG_CODE_ACCESSOR)
+    trace_guest_vmem(ptr, DATA_SIZE, 0);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
@@ -109,6 +118,10 @@  glue(glue(cpu_lds, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr)
     target_ulong addr;
     int mmu_idx;
 
+#if !defined(TRACE_TCG_CODE_ACCESSOR)
+    trace_guest_vmem(ptr, DATA_SIZE, 0);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
@@ -136,6 +149,10 @@  glue(glue(cpu_st, SUFFIX), MEMSUFFIX)(CPUArchState *env, target_ulong ptr,
     target_ulong addr;
     int mmu_idx;
 
+#if !defined(TRACE_TCG_CODE_ACCESSOR)
+    trace_guest_vmem(ptr, DATA_SIZE, 1);
+#endif
+
     addr = ptr;
     page_index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
     mmu_idx = CPU_MMU_INDEX;
diff --git a/tcg/tcg-op.h b/tcg/tcg-op.h
index 7eabf22..0ce2f81 100644
--- a/tcg/tcg-op.h
+++ b/tcg/tcg-op.h
@@ -2888,3 +2888,11 @@  static inline void tcg_gen_qemu_st64(TCGv_i64 arg, TCGv addr, int mem_index)
 # define tcg_gen_ext_i32_ptr(R, A) \
     tcg_gen_ext_i32_i64(TCGV_PTR_TO_NAT(R), (A))
 #endif /* TCG_TARGET_REG_BITS == 32 */
+
+#if !defined(TCG_OP_NOTRACE_GUEST_MEM)
+/* To avoid a circular dependency with helper.h, overload tcg_gen_qemu_*
+ * routines with preprocessor macros to insert TCG virtual memory access
+ * tracing.
+ */
+#include "trace/tcg-op-internal.h"
+#endif
diff --git a/tcg/tcg.c b/tcg/tcg.c
index acd02b9..7847277 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -47,6 +47,7 @@ 
 #define NO_CPU_IO_DEFS
 #include "cpu.h"
 
+#define TCG_OP_NOTRACE_GUEST_MEM
 #include "tcg-op.h"
 
 #if UINTPTR_MAX == UINT32_MAX
diff --git a/trace-events b/trace-events
index 1b668d1..3f5a55c 100644
--- a/trace-events
+++ b/trace-events
@@ -1186,3 +1186,18 @@  xen_pv_mmio_write(uint64_t addr) "WARNING: write to Xen PV Device MMIO space (ad
 # hw/pci/pci_host.c
 pci_cfg_read(const char *dev, unsigned devid, unsigned fnid, unsigned offs, unsigned val) "%s %02u:%u @0x%x -> 0x%x"
 pci_cfg_write(const char *dev, unsigned devid, unsigned fnid, unsigned offs, unsigned val) "%s %02u:%u @0x%x <- 0x%x"
+
+
+
+## Guest events, keep at bottom
+
+# @vaddr: Access' virtual address.
+# @size : Access' size (bytes).
+# @write: Whether the access is a write.
+#
+# Start virtual memory access (before any potential access violation).
+#
+# This event can be raised at execution time when running in 'user' mode.
+#
+# Targets: TCG(all)
+disable tcg guest_vmem(TCGv vaddr, uint8_t size, uint8_t write) "vaddr=0x%016"PRIx64" size=%d write=%d"
diff --git a/trace/tcg-op-internal.h b/trace/tcg-op-internal.h
new file mode 100644
index 0000000..fea46fa
--- /dev/null
+++ b/trace/tcg-op-internal.h
@@ -0,0 +1,55 @@ 
+/* -*- mode: c -*-
+ * Copyright (c) 2012-2014 Lluís Vilanova <vilanova@ac.upc.edu>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+/**
+ * @file Capture TCG code generation for virtual memory accesses.
+ *
+ * Assumes that no other lower-level call will be performed by target
+ * architecture disassembly code on TCG instructions for accessing memory.
+ *
+ * Capturing calls to higher-level functions like @tcg_gen_qemu_ld8u would allow
+ * using constants for the access size (instead of computing it from the memory
+ * operand argument), but is harder to maintain.
+ */
+
+#ifndef TRACE__TCG_OP_INTERNAL_H
+#define TRACE__TCG_OP_INTERNAL_H
+
+static inline uint8_t _tcg_memop_size(TCGMemOp op)
+{
+    return 1 << (op & MO_SIZE);
+}
+
+#define tcg_gen_qemu_ld_i32(val, addr, idx, memop)      \
+    do {                                                \
+        uint8_t _memop_size = _tcg_memop_size(memop);   \
+        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
+        tcg_gen_qemu_ld_i32(val, addr, idx, memop);     \
+    } while (0)
+
+#define tcg_gen_qemu_st_i32(val, addr, idx, memop)      \
+    do {                                                \
+        uint8_t _memop_size = _tcg_memop_size(memop);   \
+        trace_guest_vmem_tcg(addr, _memop_size, 1);     \
+        tcg_gen_qemu_st_i32(val, addr, idx, memop);     \
+    } while (0)
+
+#define tcg_gen_qemu_ld_i64(val, addr, idx, memop)      \
+    do {                                                \
+        uint8_t _memop_size = _tcg_memop_size(memop);   \
+        trace_guest_vmem_tcg(addr, _memop_size, 0);     \
+        tcg_gen_qemu_ld_i64(val, addr, idx, memop);     \
+    } while (0)
+
+#define tcg_gen_qemu_st_i64(val, addr, idx, memop)      \
+    do {                                                \
+        uint8_t _memop_size = _tcg_memop_size(memop);   \
+        trace_guest_vmem_tcg(addr, _memop_size, 1);     \
+        tcg_gen_qemu_st_i64(val, addr, idx, memop);     \
+    } while (0)
+
+#endif  /* TRACE__TCG_OP_INTERNAL_H */
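
The `_tcg_memop_size` helper in the new header derives the access size in bytes from the low bits of the memory operation, which encode log2 of the size. A minimal standalone sketch of that computation follows. `MemOpSketch` and `memop_size` are hypothetical names, and the enum values are a local copy written here as an assumption about the `TCGMemOp` encoding, so that the example compiles on its own.

```c
#include <stdint.h>

/*
 * Local stand-in for the size and sign bits of a TCG memory operation.
 * The values are an assumption made for this sketch.
 */
typedef enum {
    MO_8    = 0,
    MO_16   = 1,
    MO_32   = 2,
    MO_64   = 3,
    MO_SIZE = 3,   /* mask selecting the log2(size) field */
    MO_SIGN = 4    /* sign-extension flag; does not affect the size */
} MemOpSketch;

/* Access size in bytes: 1 shifted left by the log2(size) field. */
uint8_t memop_size(MemOpSketch op)
{
    return (uint8_t)(1u << (op & MO_SIZE));
}
```

Masking with `MO_SIZE` first means flags such as the sign bit do not change the result, so a signed 16-bit load still reports a size of 2 bytes.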