Message ID | 1227285646-16263-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com |
---|---|
State | Superseded, archived |
Hi Ted,

Along with this change you can drop the patch aneesh-8-fix-double-free-of-blocks from the patch queue; those changes are not needed. We were seeing double frees because of a race in the uninit bg code, which I am fixing in the series sent after this mail.

-aneesh
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Nov 21, 2008 at 10:10:46PM +0530, Aneesh Kumar K.V wrote:
> Indicate that the group locks can be taken in loop.

I've been looking at this patch more closely, and I think there's a major problem here. You've statically declared alloc_sem_key with NR_BG_LOCKS entries:

> +#ifdef CONFIG_LOCKDEP
> +static struct lock_class_key alloc_sem_key[NR_BG_LOCKS];
> +#endif

NR_BG_LOCKS is defined in include/linux/blockgroup_lock.h, and is 4 if NR_CPUS is 1 or 2, 8 if NR_CPUS is 3, 16 if NR_CPUS is between 4 and 7, 32 if NR_CPUS is between 8 and 15, and so on. It gets used this way:

> +#ifdef CONFIG_LOCKDEP
> +	__init_rwsem(&meta_group_info[i]->alloc_sem,
> +			"&meta_group_info[i]->alloc_sem",
> +			&alloc_sem_key[i]);

But i is set thusly:

	i = group & (EXT4_DESC_PER_BLOCK(sb) - 1);

which means i is between 0 and 127 if the filesystem uses 4k blocks, so it can index past the end of alloc_sem_key.

It's also not clear to me that this will do the right thing if there are multiple ext4 filesystems mounted. Since we are using a static array for the lockdep class keys, sb->s_group_info[x] for one filesystem is considered to be in the same lockdep class as sb->s_group_info[x] for another filesystem. This could cause false positives if there are multiple ext4 filesystems mounted and two CPUs are simultaneously accessing the filesystems and then access the two s_group_info structures in different orders.

Am I missing something?

- Ted
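The index arithmetic in the message above can be checked with a small userspace sketch. It is not kernel code: the `sketch_*` names are hypothetical, and it assumes 32-byte group descriptors (ext4 without the 64bit feature), which is what gives 128 descriptors per 4k block. The key counts passed in are just the NR_BG_LOCKS values quoted in the message.

```c
#include <assert.h>

/* Assumption: 32-byte group descriptors, i.e. ext4 without the 64bit feature. */
#define SKETCH_DESC_SIZE 32u

/* Hypothetical stand-in for EXT4_DESC_PER_BLOCK(sb). */
static unsigned int sketch_desc_per_block(unsigned int block_size)
{
	/* The mask used below is only valid for power-of-two block sizes. */
	assert(block_size >= SKETCH_DESC_SIZE);
	assert((block_size & (block_size - 1)) == 0);
	return block_size / SKETCH_DESC_SIZE;
}

/* Same masking as in the patch: i = group & (EXT4_DESC_PER_BLOCK(sb) - 1). */
static unsigned int sketch_key_index(unsigned int group, unsigned int block_size)
{
	return group & (sketch_desc_per_block(block_size) - 1);
}

/* Largest index the masking can produce for a given block size. */
static unsigned int sketch_max_key_index(unsigned int block_size)
{
	return sketch_desc_per_block(block_size) - 1;
}

/* Returns 1 if some group would index past an array of nr_keys entries. */
static int sketch_can_overflow(unsigned int block_size, unsigned int nr_keys)
{
	return sketch_max_key_index(block_size) >= nr_keys;
}
```

Under these assumptions, `sketch_can_overflow(4096, 32)` returns 1: with 4k blocks the index reaches 127, while an array sized for 8 to 15 CPUs has only 32 entries. With 1k blocks the index tops out at 31, which is why the overrun may not show up on every configuration.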
On Sat, Nov 22, 2008 at 03:46:25PM -0500, Theodore Tso wrote:
> On Fri, Nov 21, 2008 at 10:10:46PM +0530, Aneesh Kumar K.V wrote:
> > Indicate that the group locks can be taken in loop.
>
> I've been looking at this patch more closely, and I think there's a
> major problem here.

OK, after looking at this in yet more detail (and having changed planes in Dallas :-), I am more than ever convinced this patch is not right.

We have an rw_sem for each block group, grp->alloc_sem, which is allocated in groups of meta blockgroups. The reason to keep them all in the same lockdep class is that if, for some reason, the multiblock allocator takes two block groups' alloc_sem, and one code path takes them in the opposite order from another (say, bg 4, then bg 2, while another does bg 2, then bg 4), we would get a deadlock.

I'm guessing that what caused the problem for you was ext4_mb_init_group(), which, if you are using 1k filesystems, tries to grab multiple grp->alloc_sem's. In each place where we find those, we need to use down_write_nested; see Documentation/lockdep-design.txt. If there are any other places in mballoc.c which grab multiple alloc_sem's at the same time, we'll have to define new subclasses.

- Ted
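To illustrate why the message above points at down_write_nested, here is a toy userspace model of a lock-class checker. It is not the kernel's lockdep and not the ext4 code: every `model_*` name and `MODEL_ALLOC_SEM_CLASS` is hypothetical. The only behaviour it borrows is that taking a second lock of a class already held is flagged unless the second acquisition carries a different subclass. Subclass 1 here plays the role that SINGLE_DEPTH_NESTING plays in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_HELD 8
#define MODEL_ALLOC_SEM_CLASS 1	/* hypothetical id shared by all alloc_sem locks */

/* One entry per lock the modelled task currently holds. */
struct model_held {
	int class_id;
	int subclass;
};

struct model_task {
	struct model_held held[MODEL_MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns -1 if a lock with the same class and
 * subclass is already held (what a checker would report as possible
 * recursive locking), 0 otherwise.
 */
static int model_acquire(struct model_task *t, int class_id, int subclass)
{
	size_t i;

	assert(t->depth < MODEL_MAX_HELD);
	for (i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == class_id &&
		    t->held[i].subclass == subclass)
			return -1;
	}
	t->held[t->depth].class_id = class_id;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return 0;
}

static void model_release_all(struct model_task *t)
{
	t->depth = 0;
}

/*
 * Take the locks of two block groups. The groups are always ordered
 * ascending (reported in order[]), so no two callers can take them in
 * opposite orders. The second acquisition uses subclass 1 only when
 * 'nested' is set, mirroring a down_write_nested-style annotation.
 */
static int model_take_two_groups(struct model_task *t, unsigned int a,
				 unsigned int b, int nested,
				 unsigned int order[2])
{
	order[0] = a < b ? a : b;
	order[1] = a < b ? b : a;

	if (model_acquire(t, MODEL_ALLOC_SEM_CLASS, 0) != 0)
		return -1;
	return model_acquire(t, MODEL_ALLOC_SEM_CLASS, nested ? 1 : 0);
}
```

In this model, taking groups 4 and 2 without the nested flag is rejected on the second acquisition, while the same call with the flag set succeeds and locks group 2 before group 4. The annotation only tells the checker that the nesting is intentional; the ascending order is what actually rules out the bg 4 / bg 2 deadlock described above.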
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 7293209..1fa311c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2413,6 +2413,9 @@ ext4_mb_store_history(struct ext4_allocation_context *ac)
 #define ext4_mb_history_init(sb)
 #endif
 
+#ifdef CONFIG_LOCKDEP
+static struct lock_class_key alloc_sem_key[NR_BG_LOCKS];
+#endif
 
 /* Create and initialize ext4_group_info data for the given group. */
 int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
@@ -2473,8 +2476,14 @@ int ext4_mb_add_groupinfo(struct super_block *sb, ext4_group_t group,
 	}
 
 	INIT_LIST_HEAD(&meta_group_info[i]->bb_prealloc_list);
-	init_rwsem(&meta_group_info[i]->alloc_sem);
+#ifdef CONFIG_LOCKDEP
+	__init_rwsem(&meta_group_info[i]->alloc_sem,
+			"&meta_group_info[i]->alloc_sem",
+			&alloc_sem_key[i]);
 	meta_group_info[i]->bb_free_root.rb_node = NULL;;
+#else
+	init_rwsem(&meta_group_info[i]->alloc_sem);
+#endif
 
 #ifdef DOUBLE_CHECK
 	{