From patchwork Tue Feb 5 16:14:27 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: NBD: Move from a global spinlock to one lock per NBD device Date: Tue, 05 Feb 2013 06:14:27 -0000 From: Nicholas Thomas X-Patchwork-Id: 218287 Message-Id: <1360080867.18869.30.camel@eboracum.office.bytemark.co.uk> To: Paul.Clements@steeleye.com, jaxboe@fusionio.com Cc: netdev@vger.kernel.org, hkchu@google.com, davem@davemloft.net, eric.dumazet@gmail.com This patch is entirely based on a submission by Jerry Chu, incorporating suggestions from Eric Dumazet. Modern, faster NICs make the original comment on why a single lock is preferable incorrect; moving to one lock per NBD device removes a performance bottleneck. Original patch: http://permalink.gmane.org/gmane.linux.network/207233 Signed-off-by: Nick Thomas --- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 043ddcc..d285b64 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -57,20 +57,9 @@ static unsigned int debugflags; static unsigned int nbds_max = 16; static struct nbd_device *nbd_dev; +static spinlock_t *nbd_locks __read_mostly; static int max_part; -/* - * Use just one lock (or at most 1 per NIC). Two arguments for this: - * 1. Each NIC is essentially a synchronization point for all servers - * accessed through that NIC so there's no need to have more locks - * than NICs anyway. - * 2. More locks lead to more "Dirty cache line bouncing" which will slow - * down each lock to the point where they're actually slower than just - * a single lock. - * Thanks go to Jens Axboe and Al Viro for their LKML emails explaining this! - */ -static DEFINE_SPINLOCK(nbd_lock); - #ifndef NDEBUG static const char *ioctl_cmd_to_ascii(int cmd) { @@ -781,6 +770,15 @@ static int __init nbd_init(void) if (!nbd_dev) return -ENOMEM; + nbd_locks = kcalloc(nbds_max, sizeof(*nbd_locks), GFP_KERNEL); + if (!nbd_locks) { + kfree(nbd_dev); + return -ENOMEM; + } + + for (i = 0; i < nbds_max; i++) + spin_lock_init(&nbd_locks[i]); + part_shift = 0; if (max_part > 0) { part_shift = fls(max_part); @@ -812,7 +810,7 @@ static int __init nbd_init(void) * every gendisk to have its very own request_queue struct. * These structs are big so we dynamically allocate them. */ - disk->queue = blk_init_queue(do_nbd_request, &nbd_lock); + disk->queue = blk_init_queue(do_nbd_request, &nbd_locks[i]); if (!disk->queue) { put_disk(disk); goto out; @@ -863,6 +861,7 @@ out: put_disk(nbd_dev[i].disk); } kfree(nbd_dev); + kfree(nbd_locks); return err; } @@ -880,6 +879,7 @@ static void __exit nbd_cleanup(void) } unregister_blkdev(NBD_MAJOR, "nbd"); kfree(nbd_dev); + kfree(nbd_locks); printk(KERN_INFO "nbd: unregistered device at major %d\n", NBD_MAJOR); }