Message ID | 1311672994.2355.17.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
On Tue, Jul 26, 2011 at 11:36:34AM +0200, Eric Dumazet wrote: > [PATCH v2] vfs: dont chain pipe/anon/socket on superblock s_inodes list > > Workloads using pipes and sockets hit inode_sb_list_lock contention. > > superblock s_inodes list is needed for quota, dirty, pagecache and > fsnotify management. pipe/anon/socket fs are clearly not candidates for > these. > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Looks good to me, Reviewed-by: Christoph Hellwig <hch@lst.de> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mardi 26 juillet 2011 à 05:42 -0400, Christoph Hellwig a écrit : > On Tue, Jul 26, 2011 at 11:36:34AM +0200, Eric Dumazet wrote: > > [PATCH v2] vfs: dont chain pipe/anon/socket on superblock s_inodes list > > > > Workloads using pipes and sockets hit inode_sb_list_lock contention. > > > > superblock s_inodes list is needed for quota, dirty, pagecache and > > fsnotify management. pipe/anon/socket fs are clearly not candidates for > > these. > > > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > > Looks good to me, > > Reviewed-by: Christoph Hellwig <hch@lst.de> > Thanks ! BTW, we have one atomic op that could be avoided in new_inode() spin_lock(&inode->i_lock); inode->i_state = 0; spin_unlock(&inode->i_lock); can probably be changed to something less expensive... inode->i_state = 0; smp_wmb(); Not clear if we really need a memory barrier either.... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Jul 26, 2011 at 12:43:33PM +0200, Eric Dumazet wrote: > BTW, we have one atomic op that could be avoided in new_inode() > > spin_lock(&inode->i_lock); > inode->i_state = 0; > spin_unlock(&inode->i_lock); > > can probably be changed to something less expensive... > > inode->i_state = 0; > smp_wmb(); > > Not clear if we really need a memory barrier either.... I think we already had this in some of the earlier vfs/inode scale series, but it got lost when Al asked to just put the fundamental changes in. For plain new_inode() the barrier shouldn't be needed as we take the sb list lock just a little later. I'm not sure about your new variant, so I'll rather lave that to you. There's a few other things missing from earlier iterations, most notable the non-atomic i_count, and the bucket locks for the inode hash, if you're eager enough to look into that area. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c index 4d433d3..f11e43e 100644 --- a/fs/anon_inodes.c +++ b/fs/anon_inodes.c @@ -187,7 +187,7 @@ EXPORT_SYMBOL_GPL(anon_inode_getfd); */ static struct inode *anon_inode_mkinode(void) { - struct inode *inode = new_inode(anon_inode_mnt->mnt_sb); + struct inode *inode = new_inode_pseudo(anon_inode_mnt->mnt_sb); if (!inode) return ERR_PTR(-ENOMEM); diff --git a/fs/inode.c b/fs/inode.c index 96c77b8..319b93b 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -362,9 +362,11 @@ EXPORT_SYMBOL_GPL(inode_sb_list_add); static inline void inode_sb_list_del(struct inode *inode) { - spin_lock(&inode_sb_list_lock); - list_del_init(&inode->i_sb_list); - spin_unlock(&inode_sb_list_lock); + if (!list_empty(&inode->i_sb_list)) { + spin_lock(&inode_sb_list_lock); + list_del_init(&inode->i_sb_list); + spin_unlock(&inode_sb_list_lock); + } } static unsigned long hash(struct super_block *sb, unsigned long hashval) @@ -797,6 +799,29 @@ unsigned int get_next_ino(void) EXPORT_SYMBOL(get_next_ino); /** + * new_inode_pseudo - obtain an inode + * @sb: superblock + * + * Allocates a new inode for given superblock. + * Inode wont be chained in superblock s_inodes list + * This means : + * - fs can't be unmount + * - quotas, fsnotify, writeback can't work + */ +struct inode *new_inode_pseudo(struct super_block *sb) +{ + struct inode *inode = alloc_inode(sb); + + if (inode) { + spin_lock(&inode->i_lock); + inode->i_state = 0; + spin_unlock(&inode->i_lock); + INIT_LIST_HEAD(&inode->i_sb_list); + } + return inode; +} + +/** * new_inode - obtain an inode * @sb: superblock * @@ -814,13 +839,9 @@ struct inode *new_inode(struct super_block *sb) spin_lock_prefetch(&inode_sb_list_lock); - inode = alloc_inode(sb); - if (inode) { - spin_lock(&inode->i_lock); - inode->i_state = 0; - spin_unlock(&inode->i_lock); + inode = new_inode_pseudo(sb); + if (inode) inode_sb_list_add(inode); - } return inode; } EXPORT_SYMBOL(new_inode); diff --git a/fs/pipe.c b/fs/pipe.c index 1b7f9af..0e0be1d 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -948,7 +948,7 @@ static const struct dentry_operations pipefs_dentry_operations = { static struct inode * get_pipe_inode(void) { - struct inode *inode = new_inode(pipe_mnt->mnt_sb); + struct inode *inode = new_inode_pseudo(pipe_mnt->mnt_sb); struct pipe_inode_info *pipe; if (!inode) diff --git a/include/linux/fs.h b/include/linux/fs.h index a665804..cc363fa 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2310,7 +2310,8 @@ extern void __iget(struct inode * inode); extern void iget_failed(struct inode *); extern void end_writeback(struct inode *); extern void __destroy_inode(struct inode *); -extern struct inode *new_inode(struct super_block *); +extern struct inode *new_inode_pseudo(struct super_block *sb); +extern struct inode *new_inode(struct super_block *sb); extern void free_inode_nonrcu(struct inode *inode); extern int should_remove_suid(struct dentry *); extern int file_remove_suid(struct file *); diff --git a/net/socket.c b/net/socket.c index 02dc82d..26ed35c 100644 --- a/net/socket.c +++ b/net/socket.c @@ -467,7 +467,7 @@ static struct socket *sock_alloc(void) struct inode *inode; struct socket *sock; - inode = new_inode(sock_mnt->mnt_sb); + inode = new_inode_pseudo(sock_mnt->mnt_sb); if (!inode) return NULL;