Patchwork [2/3] Tracepoint, buffer & monitor framework

login
register
mail settings
Submitter Prerna Saxena
Date May 24, 2010, 4:16 p.m.
Message ID <4BFAA673.4050809@linux.vnet.ibm.com>
Download mbox | patch
Permalink /patch/53436/
State New
Headers show

Comments

Prerna Saxena - May 24, 2010, 4:16 p.m.
Patch that implements tracepoint framework + trace buffer + monitor support.

tracing can be enabled at compile time using --enable-trace switch, 
which compiles tracing support(all of these).

Monitor commands introduced :
1. info trace : to see contents of trace buffer.
2. info tracepoints : to see available tracepoints and their status.
3. trace [on|off] : to enable / disable trace data collection.
4. tracepoint ABC [on|off] : to enable / disable traces from a specific 
tracepoint, eg ABC
Stefan Hajnoczi - May 25, 2010, 11:40 a.m.
+#define DEFINE_TRACE(name, tproto, tassign, tprint)	\
+	void trace_##name(tproto)				\
+	{ 							\
+	  unsigned int hash;					\
+	  char tpname[] = __stringify(name);			\
+	  struct tracepoint *tp;				\
+	  struct __trace_struct_##name var, *entry;		\
+								\
+	  hash = tdb_hash(tpname);		\
+	  tp = find_tracepoint_by_name(tpname);			\
+	  if (tp == NULL || !tp->state)				\
+		return;						\
+								\
+	  entry = &var;						\
+	  tassign						\
+								\
+	  write_trace_to_buffer(&qemu_buf, hash, tp->trace_id,	\
+		entry, sizeof(struct __trace_struct_##name));	\
+	}							\

I think this is too much work.  Let each tracepoint have its own global struct
tracepoint so it can directly reference it using tracepoint_##name - no hash
lookup needed.  Add the QLIST_ENTRY directly to struct tracepoint so the
tracepoint register/unregister code can assign ids and look up tracepoints by
name.  No critical path code needs to do name lookups and the hash table can
disappear.

+#define DECLARE_TRACE(name, tproto, tstruct)			\
+	struct __trace_struct_##name {				\
+		tstruct						\
+	};							\

Should this struct be packed so more fields can fit?

+    trace_queue->trace_buffer[tmp].metadata.write_complete = 0;

This is not guaranteed to work without memory barriers.  There is no way for
the trace consumer to block until there is more data available.  The
synchronization needs to consider writing traces to a file, which has different
constraints than dumping the current contents of the trace buffer.

We're missing a way to trace to a file.  That could be done in binary or text.
It would be easier in text because we already have the format strings and don't
need a unique ID mapping in an external binary parsing tool.

Making data available after crash is also useful.  The easiest way is to dump
the trace buffer from the core dump using gdb.  However, we'd need some way of
making sense of the bytes.  That could be done by reading the tracepoint_lib
structures from the core dump.

(The way I do trace recovery from a core dump in my simple tracer is to binary
dump the trace buffer from the core dump.  Since the trace buffer contents are
normally written out to file unchanged anyway, the simpletrace.py script can
read the dumped trace buffer like a normal trace file.)

Nitpicks:

Some added lines of code use tabs for indentation, 4 space indentation should
be used.

+    {
+        .name       = "tracepoint",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show contents of trace buffer",

Copy-pasted, .help not updated.

@@ -145,6 +147,10 @@ struct Monitor {
 #ifdef CONFIG_DEBUG_MONITOR
     int print_calls_nr;
 #endif
+#ifdef CONFIG_QEMU_TRACE
+    struct DebugBuffer *qemu_buf_ptr;
+#endif
+
     QError *error;
     QLIST_HEAD(,mon_fd_t) fds;
     QLIST_ENTRY(Monitor) entry;

Would TraceBuffer be a more appropriate name for DebugBuffer?  qemu_buf_ptr is
vague, perhaps trace_buf is more clear?

I'm not sure I understand the reason for qemu_buf_ptr.  There is already a
global qemu_buf and qemu_buf_ptr is a pointer to that?

+    if(!strncmp(tp_state, "on", 3))
[...]
+	    if(!strncmp(tp_state, "off", 4))

"on" with 3 and "off" with 4 are equivalent to strcmp().  "on" with 2 and
"off" with 3 would allow for any suffix after the matched string.

+#else /* CONFIG_QEMU_TRACE */
+static void do_tracepoint_status(Monitor *mon, const QDict *qdict)
+{
+    monitor_printf(mon, "Internal tracing not compiled\n");
+}
+#endif

"tracepoint" has this !CONFIG_QEMU_TRACE function but "trace" doesn't.

+#define INCREMENT_INDEX(HEAD,IDX) (HEAD->IDX++) % HEAD->buf_size
[...]
+    if ((trace_queue->last + 1) % trace_queue->buf_size
+					 == trace_queue->first)
+	trace_queue->first = INCREMENT_INDEX(trace_queue, first);
+    trace_queue->last  = INCREMENT_INDEX(trace_queue, last);

Slightly safer macro:
#define NEXT_INDEX(HEAD,IDX) (((HEAD)->IDX + 1) % (HEAD)->buf_size)
[...]
if (NEXT_INDEX(trace_queue, last) == trace_queue->first)
    trace_queue->first = NEXT_INDEX(trace_queue, first);
trace_queue->last  = NEXT_INDEX(trace_queue, last);

+    tmp = trace_queue->last;
Instead of using tmp:
DebugEntry *entry = &trace_queue->trace_buffer[trace_queue->last];

This will make the code to fill out the trace entry easier to read.
"trace_queue->trace_buffer[tmp]." can be replaced with "entry->".

+#ifndef __TRACE_BUFFER_H__
+#define __TRACE_BUFFER_H__

QEMU doesn't use __ for include guards, just TRACE_BUFFER_H.

+#define EFFECTIVE_SLOT_SIZE     SLOT_SIZE-9
+
+struct DebugEntry{
+    char data[EFFECTIVE_SLOT_SIZE];
+    uint64_t timestamp;
+    struct {
+	unsigned write_complete :1;
+	unsigned reserved :7;
+	} metadata;
+};

There will be padding after data[] and before timestamp.  Perhaps data[]
should be the last field of the struct.  Also, QEMU coding style typedefs
structs and doesn't using the explicit "struct DebugEntry" tag.

I think this could be called a TraceEntry.

+struct DebugBuffer {
[...]
+	struct DebugEntry trace_buffer[BUFFER_SIZE];
+};

I think the struct could be called TraceBuffer and the array would be called
entries[].

@@ -158,7 +162,7 @@ ETEXI

 STEXI
 @item change @var{device} @var{setting}
-@findex change
+@findex :change

 Change the configuration of a device.

Not sure what this hunk is doing :).

+struct tracepoint {
+	char *name;			/* Tracepoint name */
+	uint8_t  trace_id;		/* numerical ID */

Maximum 256 tracepoints in QEMU?  A limit of 65536 is less likely to
be an issue in the future.

+	void __trace_print_##name(Monitor *mon, void *data)	\
+	{							\
+	  struct __trace_struct_##name *entry;			\
+								\
+	  if(!entry)						\
+		return;						\

This does not work, entry is not initialized.

+#define DO_TRACE(name, args...)					\
+	    trace_##name(args);

This macro is unused?

+/* In essence, the structure becomes :
+ * struct tracepoint_lib {

This comment will get out of sync easily.

+       qemu_malloc(sizeof(struct tracepoint_lib));
+
+    if (!new_entry)
+       return NULL;    /* No memory */

qemu_malloc() does not return NULL on out-of-memory, it aborts the program.
Same for allocating new_entry->entry.name.

+    new_entry->entry.name = (char*)qemu_malloc(strlen(name)+1);
+    if(!new_entry->entry.name)
+	return NULL;
+
+    strncpy(new_entry->entry.name, name, strlen(name)+1);

Perhaps just strdup() instead of manual qemu_malloc()/strncpy().

Stefan
Prerna Saxena - May 25, 2010, 6:20 p.m.
Hi Stefan,
Thanks for having a look.
As I'd mentioned, this patchset is *work in progress*, which explains 
the dummy comments and coding style violations at places :) I was merely 
sharing a draft of what my approach is -- so that we can work together 
on how much of it can add to the trace framework you've proposed.

On 05/25/2010 05:10 PM, Stefan Hajnoczi wrote:

> I think this is too much work.  Let each tracepoint have its own global struct
> tracepoint so it can directly reference it using tracepoint_##name - no hash
> lookup needed.  Add the QLIST_ENTRY directly to struct tracepoint so the
> tracepoint register/unregister code can assign ids and look up tracepoints by
> name.  No critical path code needs to do name lookups and the hash table can
> disappear.

I had employed a combination of hash (derived from name) and an ID 
(which is the offset within a hash bucket where the tracepoint details 
are stored) to determine tracepoint information for a given name. Your 
suggestion to eliminate name queries is good, let me see how much of 
this can be scaled down.

>
> +#define DECLARE_TRACE(name, tproto, tstruct)			\
> +	struct __trace_struct_##name {				\
> +		tstruct						\
> +	};							\
>
> Should this struct be packed so more fields can fit?
>

Yes, indeed. Thanks for reminding !

> +    trace_queue->trace_buffer[tmp].metadata.write_complete = 0;
>
> This is not guaranteed to work without memory barriers.  There is no way for
> the trace consumer to block until there is more data available.  The
> synchronization needs to consider writing traces to a file, which has different
> constraints than dumping the current contents of the trace buffer.
>
> We're missing a way to trace to a file.  That could be done in binary or text.
> It would be easier in text because we already have the format strings and don't
> need a unique ID mapping in an external binary parsing tool.
>

OK, at the time of working on this I hadnt really thought of dumping 
traces in a file. It meant too much of IO latency that such tracing 
would bring in. My idea of a tracer entailed buffer based logging with a 
simple reader(see last)

> Making data available after crash is also useful.  The easiest way is to dump
> the trace buffer from the core dump using gdb.  However, we'd need some way of
> making sense of the bytes.  That could be done by reading the tracepoint_lib
> structures from the core dump.
>

Agree.

> (The way I do trace recovery from a core dump in my simple tracer is to binary
> dump the trace buffer from the core dump.  Since the trace buffer contents are
> normally written out to file unchanged anyway, the simpletrace.py script can
> read the dumped trace buffer like a normal trace file.)
>
> Nitpicks:
>

As I mentioned, this is work in progress so you'd have seen quite a lot 
of violations. Thanks for pointing those out, I'll clean those up for 
whatever approach we choose to use :)

> Some added lines of code use tabs for indentation, 4 space indentation should
> be used.
>
> +struct tracepoint {
> +	char *name;			/* Tracepoint name */
> +	uint8_t  trace_id;		/* numerical ID */
>
> Maximum 256 tracepoints in QEMU?  A limit of 65536 is less likely to
> be an issue in the future.
>

No, this field describes the maximum tracepoints for a given hash queue.

> +	void __trace_print_##name(Monitor *mon, void *data)	\
> +	{							\
> +	  struct __trace_struct_##name *entry;			\
> +								\
> +	  if(!entry)						\
> +		return;						\
>
> This does not work, entry is not initialized.

Typo ! should've been : if(!data)

>
> +#define DO_TRACE(name, args...)					\
> +	    trace_##name(args);
>
> This macro is unused?

A relic that needs to be cleaned :)

>
> +/* In essence, the structure becomes :
> + * struct tracepoint_lib {
>
> This comment will get out of sync easily.
>
> +       qemu_malloc(sizeof(struct tracepoint_lib));
> +
> +    if (!new_entry)
> +       return NULL;    /* No memory */
>
> qemu_malloc() does not return NULL on out-of-memory, it aborts the program.
> Same for allocating new_entry->entry.name.
>

Wondering how I forgot that ! thanks for reminding.

> +    new_entry->entry.name = (char*)qemu_malloc(strlen(name)+1);
> +    if(!new_entry->entry.name)
> +	return NULL;
> +
> +    strncpy(new_entry->entry.name, name, strlen(name)+1);
>
> Perhaps just strdup() instead of manual qemu_malloc()/strncpy().
>
> Stefan
>

I'll work on merging this circular buffer + monitor-based reader as a 
backend for your proposed tracer. Would it be a good idea to have two 
trace buffers -- when one is full, it gets written to disk ; while the 
second is used to log traces.
I think the monitor interface for reading traces can be retained as is.
Also, I'd implemented the monitor interface for enabling/disabling data 
logging for a given tracepoint (for a running guest) Not sure if this is 
supported in the set of patches you've posted ? It might be a good to 
have feature.
Stefan Hajnoczi - May 25, 2010, 8:07 p.m.
On Tue, May 25, 2010 at 7:20 PM, Prerna Saxena
<prerna@linux.vnet.ibm.com> wrote:
>> Some added lines of code use tabs for indentation, 4 space indentation
>> should
>> be used.
>>
>> +struct tracepoint {
>> +       char *name;                     /* Tracepoint name */
>> +       uint8_t  trace_id;              /* numerical ID */
>>
>> Maximum 256 tracepoints in QEMU?  A limit of 65536 is less likely to
>> be an issue in the future.
>>
>
> No, this field describes the maximum tracepoints for a given hash queue.

I see now, thanks.

> I'll work on merging this circular buffer + monitor-based reader as a
> backend for your proposed tracer. Would it be a good idea to have two trace
> buffers -- when one is full, it gets written to disk ; while the second is
> used to log traces.

In a double-buffering approach there are finite resources.  There
needs to be a case for when the write-out buffer hasn't been written
yet and the active trace buffer becomes full.  I think in that case
the active buffer should start overwriting the oldest entry.

> I think the monitor interface for reading traces can be retained as is.
> Also, I'd implemented the monitor interface for enabling/disabling data
> logging for a given tracepoint (for a running guest) Not sure if this is
> supported in the set of patches you've posted ? It might be a good to have
> feature.

The "disable" trace event feature in my tracing backends patchset
allows statically disabling a trace event.  It doesn't support
enabling/disabling trace events at runtime, which is left up to the
backend.

The motivation for the "disable" attribute in the trace-events file is
to allow completely disabling a trace event without having to remove
it from trace-events *and* removing trace_*() calls in QEMU source
code.  It's a handy way of completely knocking out a trace event.

Thanks for your patches,
Stefan

Patch

Signed-off by : Prerna (prerna@linux.vnet.ibm.com)

Index: qemu/monitor.c
===================================================================
--- qemu.orig/monitor.c
+++ qemu/monitor.c
@@ -55,7 +55,9 @@ 
 #include "json-streamer.h"
 #include "json-parser.h"
 #include "osdep.h"
-
+#ifdef CONFIG_QEMU_TRACE
+#include "tracepoint.h"
+#endif
 //#define DEBUG
 //#define DEBUG_COMPLETION
 
@@ -145,6 +147,10 @@  struct Monitor {
 #ifdef CONFIG_DEBUG_MONITOR
     int print_calls_nr;
 #endif
+#ifdef CONFIG_QEMU_TRACE
+    struct DebugBuffer *qemu_buf_ptr;
+#endif
+
     QError *error;
     QLIST_HEAD(,mon_fd_t) fds;
     QLIST_ENTRY(Monitor) entry;
@@ -1198,6 +1204,63 @@  static void do_log(Monitor *mon, const Q
     cpu_set_log(mask);
 }
 
+#ifdef CONFIG_QEMU_TRACE
+static void do_trace(Monitor *mon, const QDict *qdict)
+{
+    const char *option = qdict_get_try_str(qdict, "option");
+    if (!option || !strncmp(option, "on", 3)) {
+	mon->qemu_buf_ptr = &qemu_buf;
+        initialize_buffer(mon->qemu_buf_ptr);
+    } else if (!strncmp(option, "off", 4)) {
+       mon->qemu_buf_ptr->enabled = 0;
+    } else {
+        monitor_printf(mon, "Invalid option %s\n", option);
+    }
+}
+
+static void do_tracepoint_status(Monitor *mon, const QDict *qdict)
+{
+    uint8_t status_num;
+    struct tracepoint *t;
+    const char *tp_name = qdict_get_str(qdict, "name");
+    const char *tp_state = qdict_get_str(qdict, "option");
+
+    if(!mon->qemu_buf_ptr || !mon->qemu_buf_ptr->enabled) {
+        monitor_printf(mon, "Tracing disabled. Use \"trace on\" before this\n");
+	return;
+    }
+    if(!strncmp(tp_state, "on", 3))
+	status_num = 1;
+    else {
+	    if(!strncmp(tp_state, "off", 4))
+		status_num = 0;
+	    else {
+        	monitor_printf(mon, "Invalid option : %s. Use [on|off]\n",
+								tp_state);
+		return;
+		 }
+	 }
+    t = find_tracepoint_by_name(tp_name);
+    if(t == NULL) {
+        monitor_printf(mon, "Tracepoint %s does not exist\n", tp_name);
+	return;
+	}
+    if(t->state == status_num) {
+        monitor_printf(mon, "State of Tracepoint %s is already %s\n",
+						 tp_name, tp_state);
+	return;
+	}
+    /* Do I need a spin_lock here ? */
+    t->state = status_num;
+    return;
+}
+#else /* CONFIG_QEMU_TRACE */
+static void do_tracepoint_status(Monitor *mon, const QDict *qdict)
+{
+    monitor_printf(mon, "Internal tracing not compiled\n");
+}
+#endif
+
 static void do_singlestep(Monitor *mon, const QDict *qdict)
 {
     const char *option = qdict_get_try_str(qdict, "option");
@@ -2170,6 +2233,42 @@  static void do_info_profile(Monitor *mon
 }
 #endif
 
+#ifdef CONFIG_QEMU_TRACE
+
+static void qemu_dump_trace(Monitor *mon, struct DebugEntry *data)
+{
+    int i;
+
+    char *iter = (char *)data;
+    for (i=0; i<SLOT_SIZE/sizeof(int); i++)
+	 monitor_printf(mon, "%c",iter[i]);
+    monitor_printf(mon, "\n");
+    return;
+}
+
+static void do_info_trace(Monitor *mon)
+{
+    struct DebugEntry data;
+    /* TODO : change def of read to have null return*/
+    if(!mon->qemu_buf_ptr || !mon->qemu_buf_ptr->enabled)
+	return;
+
+    while(1) {
+	if(!read_trace_from_buffer(mon->qemu_buf_ptr, &data))
+	    break ;
+	/* qemu_dump_trace() is a simple hex dump */
+//	 qemu_dump_trace(mon, &data);
+	print_trace(&data, mon);
+    };
+    return;
+}
+#else /* CONFIG_QEMU_TRACE */
+static void do_info_trace(Monitor *mon)
+{
+    monitor_printf(mon, "Internal tracing not compiled\n");
+}
+#endif
+
 /* Capture support */
 static QLIST_HEAD (capture_list_head, CaptureState) capture_head;
 
@@ -2779,6 +2878,20 @@  static const mon_cmd_t info_cmds[] = {
         .mhandler.info = do_info_roms,
     },
     {
+        .name       = "trace",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show contents of trace buffer",
+        .mhandler.info = do_info_trace,
+    },
+    {
+        .name       = "tracepoint",
+        .args_type  = "",
+        .params     = "",
+        .help       = "show contents of trace buffer",
+        .mhandler.info = do_info_tracepoint,
+    },
+    {
         .name       = NULL,
     },
 };
@@ -4633,6 +4746,9 @@  void monitor_init(CharDriverState *chr, 
     QLIST_INSERT_HEAD(&mon_list, mon, entry);
     if (!default_mon || (flags & MONITOR_IS_DEFAULT))
         default_mon = mon;
+
+    /* Disable trace buffer unless enabled */
+    mon->qemu_buf_ptr = NULL;
 }
 
 static void bdrv_password_cb(Monitor *mon, const char *password, void *opaque)
Index: qemu/trace-buffer.c
===================================================================
--- /dev/null
+++ qemu/trace-buffer.c
@@ -0,0 +1,111 @@ 
+/*
+ * Trace buffer implementation for QEMU.
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Author:
+ *  Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ *  This work is licensed under the terms of the GNU GPL, version 2.  See
+ *  the COPYING file in the top-level directory.
+ *
+ */
+
+#include<string.h>
+
+#include "trace-buffer.h"
+
+#define INCREMENT_INDEX(HEAD,IDX) (HEAD->IDX++) % HEAD->buf_size
+/* TODOS :
+ * functions for computing timestamps, etc
+ */
+
+/* Trace Buffer Manipulation Functions */
+
+/* Initialise the buffer.
+ * @head	: pointer to the trace queue
+ */
+void initialize_buffer(struct DebugBuffer *head)
+{
+    if (!head)         /* Buffer doesnt exist, do nothing */
+       return ;
+
+    head->buf_size = BUFFER_SIZE;
+    head->first = head->last = 0;
+    head->enabled = 1;
+    return;
+}
+
+/* Write trace data to buffer.
+ * @trace_queue	: pointer to the trace queue.
+ * @data	: data to be written.
+ * @len         : Length of data to be written.
+ * @type       	: class of events.
+ * return      	: 0 for successful write.
+ *             	: -1 if size of data exceeds slot size.
+ */
+int write_trace_to_buffer(struct DebugBuffer *trace_queue,
+			uint32_t hash, uint8_t id, const void *data,
+				size_t len/*, EVENT_TYPE etype*/)
+{
+    int tmp;
+
+    /* Exit early if tracing is disabled */
+    if (!trace_queue || !trace_queue->enabled)
+	return -1;
+
+    /* Data not available or too large data */
+    if (!data || !len || len > EFFECTIVE_SLOT_SIZE - 5)
+       return -1;
+
+    /* NOTE : the buffer operates in overwrite mode */
+    tmp = trace_queue->last;
+    if ((trace_queue->last + 1) % trace_queue->buf_size
+					 == trace_queue->first)
+	trace_queue->first = INCREMENT_INDEX(trace_queue, first);
+    trace_queue->last  = INCREMENT_INDEX(trace_queue, last);
+
+    /* TODO: Placeholder for timestamp computation */
+
+    trace_queue->trace_buffer[tmp].metadata.write_complete = 0;
+    memcpy(trace_queue->trace_buffer[tmp].data, &hash, sizeof(hash));
+    memcpy(&trace_queue->trace_buffer[tmp].data[sizeof(hash)], &id,
+								sizeof(id));
+    memcpy(&trace_queue->trace_buffer[tmp].data[sizeof(hash)+sizeof(id)],
+								 data, len);
+    memset(&trace_queue->trace_buffer[tmp].data[len+sizeof(hash)+ sizeof(id)],
+		0, EFFECTIVE_SLOT_SIZE-len-(sizeof(hash)+sizeof(id)));
+    trace_queue->trace_buffer[tmp].metadata.write_complete = 1;
+    return 0;
+}
+
+/* Read trace data from buffer.
+ * @trace_queue	: pointer to the trace queue.
+ * @data      	: pointer to DebugEntry variable
+ *                into which data gets read.
+ * return      	: pointer to address of DebugEntry variable
+ *                into which data gets read.
+ *                NULL in case of error.
+ */
+struct DebugEntry* read_trace_from_buffer(struct DebugBuffer *trace_queue,
+						struct DebugEntry *data)
+{
+    /* Exit early if tracing is disabled */
+    if (!trace_queue->enabled)
+	return NULL;
+
+    if (!trace_queue || !data )
+	return NULL;
+
+    /* check for empty buffer OR incomplete writes*/
+    if (trace_queue->first == trace_queue->last ||
+	!trace_queue->trace_buffer[trace_queue->first].
+					metadata.write_complete)
+	return NULL;
+
+    memcpy(data, &trace_queue->trace_buffer[trace_queue->first],
+					 sizeof(struct DebugEntry));
+    trace_queue->first = INCREMENT_INDEX(trace_queue, first);
+
+    return data;
+}
Index: qemu/trace-buffer.h
===================================================================
--- /dev/null
+++ qemu/trace-buffer.h
@@ -0,0 +1,56 @@ 
+/*
+ * Trace buffer implementation for QEMU.
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Author:
+ *   Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+
+#ifndef __TRACE_BUFFER_H__
+#define __TRACE_BUFFER_H__
+#include<stdint.h>
+
+#include"qemu-common.h"
+
+#define BUFFER_SIZE		128 /* Number of debug slots */
+#define SLOT_SIZE       	40 /* size of buffer slot in bytes */
+#define EFFECTIVE_SLOT_SIZE     SLOT_SIZE-9
+
+struct DebugEntry{
+    char data[EFFECTIVE_SLOT_SIZE];
+    uint64_t timestamp;
+    struct {
+	unsigned write_complete :1;
+	unsigned reserved :7;
+	} metadata;
+};
+
+struct DebugBuffer {
+	uint8_t enabled;	/* flag to test if buffer is enabled */
+				/* 1 for enabled, 0 for disabled */
+	uint16_t buf_size;	/* Size of buffer */
+	uint16_t first;		/* Index to first entry of queue
+				 * from where first 'read' occurs.
+				 */
+	uint16_t last;		/* Index to last entry of queue
+				 * at which next 'write' must occur.
+				 */
+	struct DebugEntry trace_buffer[BUFFER_SIZE];
+};
+
+struct DebugBuffer qemu_buf;
+
+/* Trace Buffer manipulation functions */
+void initialize_buffer (struct DebugBuffer *head);
+int write_trace_to_buffer(struct DebugBuffer *trace_queue,
+			uint32_t hash, uint8_t id, const void *data,
+					size_t len/*, EVENT_TYPE etype*/);
+struct DebugEntry * read_trace_from_buffer(struct DebugBuffer *trace_queue,
+						struct DebugEntry *data);
+#endif /* __TRACE_BUFFER_H__ */
Index: qemu/qemu-monitor.hx
===================================================================
--- qemu.orig/qemu-monitor.hx
+++ qemu/qemu-monitor.hx
@@ -114,6 +114,10 @@  show migration status
 show balloon information
 @item info qtree
 show device tree
+@item info trace
+show contents of trace buffer
+@item info tracepoint
+show status of all tracepoints
 @end table
 ETEXI
 
@@ -158,7 +162,7 @@  ETEXI
 
 STEXI
 @item change @var{device} @var{setting}
-@findex change
+@findex :change
 
 Change the configuration of a device.
 
@@ -235,9 +239,41 @@  STEXI
 @item log @var{item1}[,...]
 @findex log
 Activate logging of the specified items to @file{/tmp/qemu.log}.
+#ifdef CONFIG_QEMU_TRACE
 ETEXI
 
     {
+        .name       = "trace",
+        .args_type  = "option:s?",
+        .params     = "[on|off]",
+        .help       = "activate logging of the specified items ",
+        .mhandler.cmd = do_trace,
+    },
+
+STEXI
+@item trace [on|off]
+@findex trace
+If called with option on, it enables tracing.
+If called with option off, tracing is disabled.
+#endif
+ETEXI
+
+    {
+        .name       = "tracepoint",
+        .args_type  = "name:s,option:s",
+        .params     = "name on|off",
+        .help       = "activate / deactivate tracepoint ",
+        .mhandler.cmd = do_tracepoint_status,
+    },
+
+STEXI
+@item tracepoint @var [on|off]
+@findex tracepoint
+If called with option 'on', it enables tracepoint @var.
+If called with option 'off', tracepoint @var is disabled.
+ETEXI
+
+     {
         .name       = "savevm",
         .args_type  = "name:s?",
         .params     = "[tag|id]",
Index: qemu/vl.c
===================================================================
--- qemu.orig/vl.c
+++ qemu/vl.c
@@ -166,6 +166,10 @@  int main(int argc, char **argv)
 //#define DEBUG_NET
 //#define DEBUG_SLIRP
 
+#ifdef CONFIG_QEMU_TRACE
+#include "trace-entries.h"
+#endif
+
 #define DEFAULT_RAM_SIZE 128
 
 #define MAX_VIRTIO_CONSOLES 1
@@ -3653,6 +3657,11 @@  int main(int argc, char **argv, char **e
         drive_add(NULL, SD_ALIAS);
     }
 
+#ifdef CONFIG_QEMU_TRACE
+    init_tracepoint_tbl();
+    init_tracepoints();
+#endif
+
     /* open the virtual block devices */
     if (snapshot)
         qemu_opts_foreach(&qemu_drive_opts, drive_enable_snapshot, NULL, 0);
Index: qemu/configure
===================================================================
--- qemu.orig/configure
+++ qemu/configure
@@ -283,6 +283,7 @@  bsd="no"
 linux="no"
 solaris="no"
 profiler="no"
+trace="no"
 cocoa="no"
 softmmu="yes"
 linux_user="no"
@@ -575,6 +576,10 @@  for opt do
   ;;
   --enable-profiler) profiler="yes"
   ;;
+  --enable-trace) trace="yes"
+  ;;
+  --disable-trace) trace="no"
+  ;;
   --enable-cocoa)
       cocoa="yes" ;
       sdl="no" ;
@@ -2003,6 +2008,7 @@  echo "gprof enabled     $gprof"
 echo "sparse enabled    $sparse"
 echo "strip binaries    $strip_opt"
 echo "profiler          $profiler"
+echo "trace             $trace"
 echo "static build      $static"
 echo "-Werror enabled   $werror"
 if test "$darwin" = "yes" ; then
@@ -2121,6 +2127,9 @@  fi
 if test $profiler = "yes" ; then
   echo "CONFIG_PROFILER=y" >> $config_host_mak
 fi
+if test $trace = "yes" ; then
+  echo "CONFIG_QEMU_TRACE=y" >> $config_host_mak
+fi
 if test "$slirp" = "yes" ; then
   echo "CONFIG_SLIRP=y" >> $config_host_mak
   QEMU_CFLAGS="-I\$(SRC_PATH)/slirp $QEMU_CFLAGS"
Index: qemu/Makefile.target
===================================================================
--- qemu.orig/Makefile.target
+++ qemu/Makefile.target
@@ -162,6 +162,10 @@  endif #CONFIG_BSD_USER
 ifdef CONFIG_SOFTMMU
 
 obj-y = arch_init.o cpus.o monitor.o machine.o gdbstub.o balloon.o
+#add option for trace-buffer to be integrated
+ifdef CONFIG_QEMU_TRACE
+obj-y += $(trace-obj-y)
+endif
 # virtio has to be here due to weird dependency between PCI and virtio-net.
 # need to fix this properly
 obj-y += virtio-blk.o virtio-balloon.o virtio-net.o virtio-serial-bus.o
Index: qemu/tracepoint.h
===================================================================
--- /dev/null
+++ qemu/tracepoint.h
@@ -0,0 +1,95 @@ 
+/*
+ * Tracepoint implementation for QEMU.
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Author:
+ *  Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef __TRACEPOINT_H__
+#define __TRACEPOINT_H__
+#include<stdarg.h>
+
+#include "qemu-queue.h"
+#include "trace-buffer.h"
+#include "monitor.h"
+#include "qdict.h"
+
+struct tracepoint {
+	char *name;			/* Tracepoint name */
+	uint8_t  trace_id;		/* numerical ID */
+	uint8_t state;			/* State. */
+	void (*print_fn)(Monitor *mon, void *ptr);
+					/* print function */
+} __attribute__((aligned(32)));
+
+#define DECLARE_TRACE(name, tproto, tstruct)			\
+	struct __trace_struct_##name {				\
+		tstruct						\
+	};							\
+	extern void trace_##name(tproto);			\
+	extern void __trace_print_##name(Monitor *mon, void *data);
+
+#define DEFINE_TRACE(name, tproto, tassign, tprint)	\
+	void trace_##name(tproto)				\
+	{ 							\
+	  unsigned int hash;					\
+	  char tpname[] = __stringify(name);			\
+	  struct tracepoint *tp;				\
+	  struct __trace_struct_##name var, *entry;		\
+								\
+	  hash = tdb_hash(tpname);		\
+	  tp = find_tracepoint_by_name(tpname);			\
+	  if (tp == NULL || !tp->state)				\
+		return;						\
+								\
+	  entry = &var;						\
+	  tassign						\
+								\
+	  write_trace_to_buffer(&qemu_buf, hash, tp->trace_id,	\
+		entry, sizeof(struct __trace_struct_##name));	\
+	}							\
+								\
+	void __trace_print_##name(Monitor *mon, void *data)	\
+	{							\
+	  struct __trace_struct_##name *entry;			\
+								\
+	  if(!entry)						\
+		return;						\
+								\
+	  entry = (struct __trace_struct_##name *)data;		\
+	  monitor_printf(mon,tprint);				\
+	  return;						\
+	}
+
+#define INIT_TRACE(name) 					\
+	register_tracepoint(__stringify_1(name), __trace_print_##name)
+
+#define DO_TRACE(name, args...)					\
+	    trace_##name(args);
+
+#define TP_PROTO(proto...)		proto
+#define TP_ARGS(args...)		args
+#define __field(type, item)            	type    item
+#define TP_STRUCT__entry(args...) 	args
+#define TP_fast_assign(args...) 	args
+
+#define __entry		 		entry
+#define __stringify_1(x...)		#x
+#define __stringify(x...)		__stringify_1(x)
+#define TP_printk(fmt, args...) 	 fmt, args
+
+
+void init_tracepoint_tbl(void);
+extern struct tracepoint* register_tracepoint(const char *name,
+			void *print_func);
+extern struct tracepoint* find_tracepoint_by_name(const char *name);
+extern struct tracepoint* find_tracepoint_by_id(uint32_t hash, uint8_t id);
+extern void print_trace(struct DebugEntry *ptr, Monitor *mon);
+void do_info_tracepoint(Monitor *mon);
+#endif /* __TRACEPOINT_H__ */
Index: qemu/tracepoint.c
===================================================================
--- /dev/null
+++ qemu/tracepoint.c
@@ -0,0 +1,216 @@ 
+/*
+ * Tracepoint implementation for QEMU.
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Author:
+ *  Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "tracepoint.h"
+
+#define TRACEPOINT_HASH_BITS 6
+#define TRACEPOINT_TABLE_SIZE (1 << TRACEPOINT_HASH_BITS)
+static QLIST_HEAD(tracepoint_head, tracepoint_lib)
+			tracepoint_table[TRACEPOINT_TABLE_SIZE];
+
+/* In essence, the structure becomes :
+ * struct tracepoint_lib {
+ * 	struct tracepoint {
+ * 		char *name;
+ * 		uint8_t trace_id;
+ * 		uint8_t state;
+ * 		void (*print_fn)(Monitor *mon, void *ptr);
+ * 	}entry;
+ * 	struct {
+ * 		struct tracepoint_lib *le_next;
+ * 		struct tracepoint_lib **le_prev;
+ * 	}list_ptr;
+ *
+ */
+struct tracepoint_lib {
+	struct tracepoint entry;
+	QLIST_ENTRY(tracepoint_lib) list_ptr;
+};
+
+/*
+ * Initialize tracepoint table
+ *
+ */
+void init_tracepoint_tbl(void)
+{
+    int i=0;
+    for (i=0; i< TRACEPOINT_TABLE_SIZE; i++)
+	QLIST_INIT(&tracepoint_table[i]);
+}
+
+/*
+ * Register a given tracepoint in the global library
+ * @name : tracepoint name string
+ * @print_func : ptr to function that eventually prints logged traces.
+ * return : struct tracepoint * , NULL for error.
+ *
+ */
+struct tracepoint* register_tracepoint(const char *name, void *print_func)
+{
+    unsigned int hash;
+    uint8_t idx=0;
+    struct tracepoint_head *head;
+    struct tracepoint_lib *e, *new_entry, *prev;
+
+    if (!name || !print_func)
+	return NULL;
+
+    hash = tdb_hash(name);
+    head = &tracepoint_table[hash % TRACEPOINT_TABLE_SIZE];
+
+    e = find_tracepoint_by_name(name);
+    if (e)
+	return e;
+
+    new_entry = (struct tracepoint_lib*)
+	qemu_malloc(sizeof(struct tracepoint_lib));
+
+    if (!new_entry)
+	return NULL;	/* No memory */
+
+    new_entry->entry.name = (char*)qemu_malloc(strlen(name)+1);
+    if(!new_entry->entry.name)
+	return NULL;
+
+    strncpy(new_entry->entry.name, name, strlen(name)+1);
+    new_entry->entry.trace_id = 0; 	/* Trace ID */
+    new_entry->entry.state = 0; 	/* inactive */
+    new_entry->entry.print_fn = print_func;
+    new_entry->list_ptr.le_next = NULL;
+    new_entry->list_ptr.le_prev = NULL;
+
+    /* Add entry to tracepoint lib */
+    if (!QLIST_EMPTY(head)) {
+	prev = QLIST_FIRST(head);
+	QLIST_FOREACH(e, head, list_ptr) {
+	    idx++;
+	    prev = e;
+	    if (!strcmp(name, e->entry.name)) {
+		qemu_free(new_entry->entry.name);
+		qemu_free(new_entry);
+	        return NULL;	/* Already exists */
+	        }
+	    }
+	new_entry->entry.trace_id = idx; 	/* Trace ID */
+	QLIST_INSERT_AFTER(prev, new_entry, list_ptr);
+	}
+    else
+	QLIST_INSERT_HEAD(head, new_entry, list_ptr);
+
+    return &(new_entry->entry);
+}
+
+/*
+ * Return tracepoint* for a given name
+ * @name : name string
+ * return : struct tracepoint* in case of a successful match
+ * 	NULL otherwise.
+ *
+ */
+struct tracepoint* find_tracepoint_by_name(const char *name)
+{
+    unsigned int hash;
+    struct tracepoint_head *head;
+    struct tracepoint_lib *e;
+
+    if(name == NULL)
+	return NULL;
+
+    hash = tdb_hash(name);
+    head = &tracepoint_table[hash % TRACEPOINT_TABLE_SIZE];
+
+    if(QLIST_EMPTY(head))
+	return NULL;
+
+    QLIST_FOREACH(e, head, list_ptr) {
+	if(!strcmp(name, e->entry.name))
+	     return &(e->entry);
+	}
+
+    return NULL;
+}
+
+/*
+ * Return struct tracepoint entry corresponding to
+ * a given hash and id.
+ *
+ * @hash: hash value of tracepoint name
+ * @id	: trace_id of the tracepoint
+ *
+ */
+struct tracepoint* find_tracepoint_by_id(unsigned int hash, uint8_t id)
+{
+    struct tracepoint_head *head;
+    struct tracepoint_lib *e;
+
+    head = &tracepoint_table[hash % TRACEPOINT_TABLE_SIZE];
+
+    if(QLIST_EMPTY(head)) /* List empty */
+	return NULL;
+
+    QLIST_FOREACH(e, head, list_ptr) {
+	if(e->entry.trace_id == id)
+	     return &(e->entry);
+	}
+
+    return NULL; /* Not found */
+}
+
+/*
+ * Display data logged
+ *
+ * @ptr	: pointer to the DebugEntry in the trace buffer
+ * @mon	: pointer to Monitor
+ *
+ */
+void print_trace(struct DebugEntry *ptr, Monitor *mon)
+{
+    unsigned int hash;
+    uint8_t id;
+    struct tracepoint *tp;
+
+    if(ptr == NULL)
+	return;
+
+    memcpy(&hash, ptr->data, sizeof(hash));
+    id = ptr->data[sizeof(hash)];
+
+    tp = find_tracepoint_by_id(hash, id);
+    if (tp == NULL) {
+	monitor_printf(mon, "Not found : hash %u, id %u\n",hash, id);
+	return;
+	}
+
+//     monitor_printf(mon, "Found %s : hash %u, id %u\n",tp->name, hash, id);
+    (tp->print_fn)(mon, (void*)&ptr->data[sizeof(hash)+sizeof(id)]);
+    return;
+}
+
+/*
+ * Print all available tracepoints with their status.
+ * (Called from monitor)
+ */
+void do_info_tracepoint(Monitor *mon)
+{
+    int i;
+    struct tracepoint_head *head;
+    struct tracepoint_lib *e;
+
+    for (i=0; i< TRACEPOINT_TABLE_SIZE; i++) {
+	head = &tracepoint_table[i];
+	if(!QLIST_EMPTY(head)) {
+	    QLIST_FOREACH(e, head, list_ptr)
+		monitor_printf(mon, "%s %d\n", e->entry.name, e->entry.state);
+	}
+    }
+}
Index: qemu/trace-entries.h
===================================================================
--- /dev/null
+++ qemu/trace-entries.h
@@ -0,0 +1,33 @@ 
+/*
+ * Tracepoint declarations for QEMU.
+ *
+ * Copyright IBM, Corp. 2010
+ *
+ * Author:
+ *  Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef __TRACE_ENTRIES_H__
+#define __TRACE_ENTRIES_H__
+
+#include "tracepoint.h"
+
+void init_tracepoints(void);
+
+/*
+ * Add DECLARE_TRACE macros here.
+ *
+ * DECLARE_TRACE( name,
+ * 	TP_PROTO( type1 var1, type2 var2..),
+ * 	TP_STRUCT__entry(
+ * 			__field(type1, var1);
+ * 			__field(type2, var2);...
+ * 		),
+ * )
+ */
+
+#endif /*__TRACE_ENTRIES_H__ */
Index: qemu/trace-entries.c
===================================================================
--- /dev/null
+++ qemu/trace-entries.c
@@ -0,0 +1,40 @@ 
+/*
+ *  Tracepoint declaration(s) for QEMU.
+ *
+ *  Copyright IBM, Corp. 2010
+ *
+ *  Author:
+ *   Prerna   <prerna@linux.vnet.ibm.com>
+ *
+ *  This work is licensed under the terms of the GNU GPL, version 2.  See
+ *  the COPYING file in the top-level directory.
+ *
+ */
+#include "trace-entries.h"
+/*
+ * Initialize tracepoints
+ * Add INIT_TRACE() macro for each tracepoint you declare,
+ * inside this function.
+ *
+ */
+void init_tracepoints(void)
+{
+// INIT_TRACE(foo);
+
+   return;
+}
+
+/*
+ * Add a DEFINE_TRACE() macro for each tracepoint you declare.
+ *
+ * DEFINE_TRACE( name,
+ *     TP_PROTO( type1 var1, type2 var2..),
+ *     TP_fast_assign(
+ *                     __entry->var1 = arg1;
+ *                     __entry->var2 = arg2;...
+ *             ),
+ *     TP_printk("....", __entry->var1, __entry->var2) // No semicolon at end
+ * )
+ *
+ */
+
Index: qemu/Makefile.objs
===================================================================
--- qemu.orig/Makefile.objs
+++ qemu/Makefile.objs
@@ -5,6 +5,10 @@  qobject-obj-y += qjson.o json-lexer.o js
 qobject-obj-y += qerror.o
 
 #######################################################################
+# Trace Object
+trace-obj-y = trace-buffer.o tracepoint.o trace-entries.o $(SRC_PATH)/qdict.h
+
+#######################################################################
 # block-obj-y is code used by both qemu system emulation and qemu-img
 
 block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o module.o
Index: qemu/Makefile
===================================================================
--- qemu.orig/Makefile
+++ qemu/Makefile
@@ -1,6 +1,6 @@ 
 # Makefile for QEMU.
 
-GENERATED_HEADERS = config-host.h
+GENERATED_HEADERS = config-host.h trace-entries.h
 
 ifneq ($(wildcard config-host.mak),)
 # Put the all: rule here so that config-host.mak can contain dependencies.
@@ -130,16 +130,18 @@  bt-host.o: QEMU_CFLAGS += $(BLUEZ_CFLAGS
 
 iov.o: iov.c iov.h
 
+#tracepoint.o: tracepoint.h tracepoint.c $(SRC_PATH)/qdict.h
+
 ######################################################################
 
 qemu-img.o: qemu-img-cmds.h
 qemu-img.o qemu-tool.o qemu-nbd.o qemu-io.o: $(GENERATED_HEADERS)
 
-qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+qemu-img$(EXESUF): qemu-img.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(trace-obj-y)
 
-qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+qemu-nbd$(EXESUF): qemu-nbd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(trace-obj-y)
 
-qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y)
+qemu-io$(EXESUF): qemu-io.o cmd.o qemu-tool.o qemu-error.o $(block-obj-y) $(qobject-obj-y) $(trace-obj-y)
 
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
 	$(call quiet-command,sh $(SRC_PATH)/hxtool -h < $< > $@,"  GEN   $@")