diff mbox

why unlikely(rsv) in ext3_clear_inode()?

Message ID alpine.DEB.1.10.0810271918390.19731@gandalf.stny.rr.com
State Not Applicable, archived
Headers show

Commit Message

Steven Rostedt Oct. 27, 2008, 11:32 p.m. UTC
On Mon, 27 Oct 2008, Mike Snitzer wrote:

> Please see: e6022603b9aa7d61d20b392e69edcdbbc1789969
> 
> Having a look at the LKML archives this was raised back in 2006:
> http://lkml.org/lkml/2006/6/23/337
> 
> I'm not interested in whether unlikely() actually helps here.
> 
> I'm still missing _why_ rsv is mostly NULL at this callsite, as Andrew
> asserted here:
> http://lkml.org/lkml/2006/6/23/400
> 
> And then Steve here: http://lkml.org/lkml/2006/6/24/76
> Where he said:
> "The problem is that in these cases the pointer is NULL several thousands
> of times for every time it is not NULL (if ever).  The non-NULL case is
> where an error occurred or something very special.  So I don't see how
> the if here is a problem?"
> 
> I'm missing which error or what "something very special" is the
> unlikely() reason for having rsv be NULL.
> 
> Looking at the code; ext3_clear_inode() is _the_ place where the
> i_block_alloc_info is cleaned up.  In my testing the rsv is _never_
> NULL if the file was open for writing.  Are we saying that reads are
> much more common than writes?  May be a reasonable assumption but
> saying as much is very different than what Steve seemed to be eluding
> to...
> 
> Anyway, I'd appreciate some clarification here.

Attached is a patch that I used for counting.

Here's my results:
# cat /debug/tracing/ftrace_null 
45
# cat /debug/tracing/ftrace_nonnull 
7

Ah, seems that there is cases where it is nonnull more often. Anyway, it 
obviously is not a fast path (total of 52). Even if it was all null, it is 
not big enough to call for the confusion.

I'd suggest removing the if conditional, and just calling kfree.

-- Steve

Comments

Andrew Morton Oct. 27, 2008, 11:48 p.m. UTC | #1
On Mon, 27 Oct 2008 19:32:11 -0400 (EDT)
Steven Rostedt <rostedt@goodmis.org> wrote:

> Attached is a patch that I used for counting.

If we could get something like that into mainline then I can
stop maintaining profile-likely-unlikely-macros.patch, which would
be welcome.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Theodore Ts'o Oct. 28, 2008, 12:13 a.m. UTC | #2
On Mon, Oct 27, 2008 at 07:32:11PM -0400, Steven Rostedt wrote:
> Attached is a patch that I used for counting.
> 
> Here's my results:
> # cat /debug/tracing/ftrace_null 
> 45
> # cat /debug/tracing/ftrace_nonnull 
> 7
> 
> Ah, seems that there is cases where it is nonnull more often. Anyway, it 
> obviously is not a fast path (total of 52). Even if it was all null, it is 
> not big enough to call for the confusion.

Silly question --- what was your test workload?

							- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Snitzer Oct. 28, 2008, 12:14 a.m. UTC | #3
On Mon, Oct 27, 2008 at 7:32 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Mon, 27 Oct 2008, Mike Snitzer wrote:
>
>> Please see: e6022603b9aa7d61d20b392e69edcdbbc1789969
>>
>> Having a look at the LKML archives this was raised back in 2006:
>> http://lkml.org/lkml/2006/6/23/337
>>
>> I'm not interested in whether unlikely() actually helps here.
>>
>> I'm still missing _why_ rsv is mostly NULL at this callsite, as Andrew
>> asserted here:
>> http://lkml.org/lkml/2006/6/23/400
>>
>> And then Steve here: http://lkml.org/lkml/2006/6/24/76
>> Where he said:
>> "The problem is that in these cases the pointer is NULL several thousands
>> of times for every time it is not NULL (if ever).  The non-NULL case is
>> where an error occurred or something very special.  So I don't see how
>> the if here is a problem?"
>>
>> I'm missing which error or what "something very special" is the
>> unlikely() reason for having rsv be NULL.
>>
>> Looking at the code; ext3_clear_inode() is _the_ place where the
>> i_block_alloc_info is cleaned up.  In my testing the rsv is _never_
>> NULL if the file was open for writing.  Are we saying that reads are
>> much more common than writes?  May be a reasonable assumption but
>> saying as much is very different than what Steve seemed to be eluding
>> to...
>>
>> Anyway, I'd appreciate some clarification here.
>
> Attached is a patch that I used for counting.
>
> Here's my results:
> # cat /debug/tracing/ftrace_null
> 45
> # cat /debug/tracing/ftrace_nonnull
> 7
>
> Ah, seems that there is cases where it is nonnull more often. Anyway, it
> obviously is not a fast path (total of 52). Even if it was all null, it is
> not big enough to call for the confusion.

What was your workload that resulted in this break-down?  AFAIK you'd
have 100% in ftrace_nonnull if you simply opened new files and wrote
to them.

> I'd suggest removing the if conditional, and just calling kfree.

Yes, probably.

thanks,
Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Steven Rostedt Oct. 28, 2008, 12:21 a.m. UTC | #4
On Mon, 27 Oct 2008, Theodore Tso wrote:

> On Mon, Oct 27, 2008 at 07:32:11PM -0400, Steven Rostedt wrote:
> > Attached is a patch that I used for counting.
> > 
> > Here's my results:
> > # cat /debug/tracing/ftrace_null 
> > 45
> > # cat /debug/tracing/ftrace_nonnull 
> > 7
> > 
> > Ah, seems that there is cases where it is nonnull more often. Anyway, it 
> > obviously is not a fast path (total of 52). Even if it was all null, it is 
> > not big enough to call for the confusion.
> 
> Silly question --- what was your test workload?


Hehe, I just booted the box. The counting started right away, so I just 
looked at the work load.

Anyway, I'm writing a generic "unlikely" profiler that should make Andrew 
happy. And this will also catch this case as well.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

---
 fs/ext3/super.c           |    8 ++++
 kernel/trace/Makefile     |    1 
 kernel/trace/trace_null.c |   76 ++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 84 insertions(+), 1 deletion(-)

Index: linux-tip.git/fs/ext3/super.c
===================================================================
--- linux-tip.git.orig/fs/ext3/super.c	2008-10-24 10:08:43.000000000 -0400
+++ linux-tip.git/fs/ext3/super.c	2008-10-27 19:01:22.000000000 -0400
@@ -502,6 +502,9 @@  static void destroy_inodecache(void)
 	kmem_cache_destroy(ext3_inode_cachep);
 }
 
+extern void ftrace_null(void);
+extern void ftrace_nonnull(void);
+
 static void ext3_clear_inode(struct inode *inode)
 {
 	struct ext3_block_alloc_info *rsv = EXT3_I(inode)->i_block_alloc_info;
@@ -519,8 +522,11 @@  static void ext3_clear_inode(struct inod
 #endif
 	ext3_discard_reservation(inode);
 	EXT3_I(inode)->i_block_alloc_info = NULL;
-	if (unlikely(rsv))
+	if (unlikely(rsv)) {
+		ftrace_nonnull();
 		kfree(rsv);
+	} else
+		ftrace_null();
 }
 
 static inline void ext3_show_quota_options(struct seq_file *seq, struct super_block *sb)
Index: linux-tip.git/kernel/trace/Makefile
===================================================================
--- linux-tip.git.orig/kernel/trace/Makefile	2008-10-27 19:00:03.000000000 -0400
+++ linux-tip.git/kernel/trace/Makefile	2008-10-27 19:08:25.000000000 -0400
@@ -13,6 +13,7 @@  endif
 obj-$(CONFIG_FUNCTION_TRACER) += libftrace.o
 obj-$(CONFIG_RING_BUFFER) += ring_buffer.o
 
+obj-$(CONFIG_TRACING) += trace_null.o
 obj-$(CONFIG_TRACING) += trace.o
 obj-$(CONFIG_CONTEXT_SWITCH_TRACER) += trace_sched_switch.o
 obj-$(CONFIG_SYSPROF_TRACER) += trace_sysprof.o
Index: linux-tip.git/kernel/trace/trace_null.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-tip.git/kernel/trace/trace_null.c	2008-10-27 19:23:15.000000000 -0400
@@ -0,0 +1,76 @@ 
+/*
+ * Function profiler
+ *
+ * Copyright (C) 2008 Steven Rostedt <srostedt@redhat.com>
+ */
+#include <linux/kallsyms.h>
+#include <linux/seq_file.h>
+#include <linux/spinlock.h>
+#include <linux/debugfs.h>
+#include <linux/uaccess.h>
+#include <linux/module.h>
+#include <linux/ftrace.h>
+#include <linux/hash.h>
+#include <linux/fs.h>
+#include <asm/local.h>
+#include "trace.h"
+
+
+static atomic_t ftrace_null_count;
+static atomic_t ftrace_nonnull_count;
+
+void ftrace_null(void)
+{
+	atomic_inc(&ftrace_null_count);
+}
+EXPORT_SYMBOL(ftrace_null);
+
+void ftrace_nonnull(void)
+{
+	atomic_inc(&ftrace_nonnull_count);
+}
+EXPORT_SYMBOL(ftrace_nonnull);
+
+static ssize_t
+tracing_read_long(struct file *filp, char __user *ubuf,
+		  size_t cnt, loff_t *ppos)
+{
+	atomic_t *p = filp->private_data;
+	char buf[64];
+	int r;
+
+	r = sprintf(buf, "%d\n", atomic_read(p));
+
+	return simple_read_from_buffer(ubuf, cnt, ppos, buf, r);
+}
+
+static struct file_operations tracing_read_long_fops = {
+	.open		= tracing_open_generic,
+	.read		= tracing_read_long,
+};
+
+
+static __init int ftrace_null_init(void)
+{
+	struct dentry *d_tracer;
+	struct dentry *entry;
+
+	d_tracer = tracing_init_dentry();
+
+	entry = debugfs_create_file("ftrace_null", 0444, d_tracer,
+				    &ftrace_null_count,
+				    &tracing_read_long_fops);
+	if (!entry)
+		pr_warning("Could not create debugfs 'ftrace_null' entry\n");
+
+	entry = debugfs_create_file("ftrace_nonnull", 0444, d_tracer,
+				    &ftrace_nonnull_count,
+				    &tracing_read_long_fops);
+	if (!entry)
+		pr_warning("Could not create debugfs 'ftrace_nonnull' entry\n");
+
+
+	return 0;
+}
+
+device_initcall(ftrace_null_init);