[RFC,v3,0/18] DAX page fault locking

Message ID 20160511091930.GE14744@quack2.suse.cz
State Accepted, archived
Commit Message

Jan Kara May 11, 2016, 9:19 a.m. UTC
On Tue 10-05-16 16:39:37, Ross Zwisler wrote:
> On Tue, May 10, 2016 at 02:30:03PM -0600, Ross Zwisler wrote:
> > On Tue, May 10, 2016 at 05:28:14PM +0200, Jan Kara wrote:
> > > On Mon 09-05-16 11:38:28, Jan Kara wrote:
> > > Somehow, I'm not able to reproduce the warnings... Anyway, I think I see
> > > what's going on. Can you check whether the warning goes away when you
> > > change the condition at the end of page_cache_tree_delete() to:
> > > 
> > >         if (!dax_mapping(mapping) && !workingset_node_pages(node) &&
> > >             list_empty(&node->private_list)) {
> > 
> > Yep, this took care of both of the issues that I reported.  I'll restart my
> > testing with this in my baseline, but as of this fix I don't have any more
> > open testing issues. :)
> 
> Well, looks like I spoke too soon.  The two tests that were failing for me are
> now passing, but I can still create what looks like a related failure using
> XFS, DAX, and the two xfstests generic/231 and generic/232 run back-to-back.

Hum, full xfstests run completes for me just fine. Can you reproduce the
issue with the attached debug patch? Thanks!

								Honza

Comments

Ross Zwisler May 11, 2016, 3:52 p.m. UTC | #1
On Wed, May 11, 2016 at 11:19:30AM +0200, Jan Kara wrote:
> On Tue 10-05-16 16:39:37, Ross Zwisler wrote:
> > On Tue, May 10, 2016 at 02:30:03PM -0600, Ross Zwisler wrote:
> > > On Tue, May 10, 2016 at 05:28:14PM +0200, Jan Kara wrote:
> > > > On Mon 09-05-16 11:38:28, Jan Kara wrote:
> > > > Somehow, I'm not able to reproduce the warnings... Anyway, I think I see
> > > > what's going on. Can you check whether the warning goes away when you
> > > > change the condition at the end of page_cache_tree_delete() to:
> > > > 
> > > >         if (!dax_mapping(mapping) && !workingset_node_pages(node) &&
> > > >             list_empty(&node->private_list)) {
> > > 
> > > Yep, this took care of both of the issues that I reported.  I'll restart my
> > > testing with this in my baseline, but as of this fix I don't have any more
> > > open testing issues. :)
> > 
> > Well, looks like I spoke too soon.  The two tests that were failing for me are
> > now passing, but I can still create what looks like a related failure using
> > XFS, DAX, and the two xfstests generic/231 and generic/232 run back-to-back.
> 
> Hum, full xfstests run completes for me just fine. Can you reproduce the
> issue with the attached debug patch? Thanks!

Here's the resulting debug:

[  212.541923] Wrong node->count 244.
[  212.542316] Host sb pmem0p2 ino 2097257
[  212.542696] Node dump: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
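
For readers following the debug output, here is a minimal userspace sketch of how the reported node->count of 244 splits into its two fields. It assumes the layout used by kernels of this era (RADIX_TREE_MAP_SHIFT of 6, i.e. 64 slots per node, which matches the 64 entries in the dump, and RADIX_TREE_COUNT_SHIFT one bit above that). The helper names count_pages(), count_shadows() and count_is_wrong() are hypothetical; they only mirror what workingset_node_pages(), workingset_node_shadows() and the check in the debug patch compute, and are not kernel functions.

```c
#include <stdio.h>

/*
 * Assumed constants, mirroring include/linux/radix-tree.h with
 * CONFIG_BASE_SMALL=0 (64 slots per node, as in the dump above).
 */
#define RADIX_TREE_MAP_SHIFT	6
#define RADIX_TREE_COUNT_SHIFT	(RADIX_TREE_MAP_SHIFT + 1)
#define RADIX_TREE_COUNT_MASK	((1UL << RADIX_TREE_COUNT_SHIFT) - 1)

/* Hypothetical helper: low bits of node->count (page/slot count). */
unsigned int count_pages(unsigned int count)
{
	return count & RADIX_TREE_COUNT_MASK;
}

/* Hypothetical helper: high bits of node->count (shadow entry count). */
unsigned int count_shadows(unsigned int count)
{
	return count >> RADIX_TREE_COUNT_SHIFT;
}

/* Same predicate the debug patch tests before dumping the node. */
int count_is_wrong(unsigned int count)
{
	return !count || (count & RADIX_TREE_COUNT_MASK);
}

/* Print the decoded fields for one raw count value. */
void describe_count(unsigned int count)
{
	printf("count %u: pages %u, shadows %u, wrong %d\n",
	       count, count_pages(count), count_shadows(count),
	       count_is_wrong(count));
}
```

Under these assumptions, describe_count(244) reports a non-zero value in the low field (244 & 127 = 116) and 1 in the high field, even though every slot in the dump is empty, so the counter and the slot contents disagree for this node.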

Patch

From 8f01c60e8be5d5671b09b07a5fb647177e33293d Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 11 May 2016 11:14:11 +0200
Subject: [PATCH] Debugging workingset

Signed-off-by: Jan Kara <jack@suse.cz>
---
 mm/workingset.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index 8a75f8d2916a..b692fc756fda 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -12,6 +12,7 @@ 
 #include <linux/swap.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
+#include <linux/dax.h>
 
 /*
  *		Double CLOCK lists
@@ -402,6 +403,7 @@  static enum lru_status shadow_lru_isolate(struct list_head *item,
 
 	node = container_of(item, struct radix_tree_node, private_list);
 	mapping = node->private_data;
+	WARN_ON(dax_mapping(mapping));
 
 	/* Coming from the list, invert the lock order */
 	if (!spin_trylock(&mapping->tree_lock)) {
@@ -418,7 +420,18 @@  static enum lru_status shadow_lru_isolate(struct list_head *item,
 	 * no pages, so we expect to be able to remove them all and
 	 * delete and free the empty node afterwards.
 	 */
-
+	if (!node->count || (node->count & RADIX_TREE_COUNT_MASK)) {
+		printk(KERN_ERR "Wrong node->count %u.\n", node->count);
+		if (!mapping->host) {
+			printk(KERN_ERR "Node mapping has no host!\n");
+		} else {
+			printk(KERN_ERR "Host sb %s ino %lu\n", mapping->host->i_sb->s_id, mapping->host->i_ino);
+		}
+		printk(KERN_ERR "Node dump:");
+		for (i = 0; i < RADIX_TREE_MAP_SIZE; i++)
+			printk(KERN_CONT " %lx", (unsigned long)node->slots[i]);
+		printk(KERN_CONT "\n");
+	}
 	BUG_ON(!node->count);
 	BUG_ON(node->count & RADIX_TREE_COUNT_MASK);
 
-- 
2.6.6