From patchwork Thu Jul 22 15:58:40 2010
From: "Michael S. Tsirkin"
To: Tejun Heo
Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml, "kvm@vger.kernel.org",
	Andrew Morton, Dmitri Vorobiev, Jiri Kosina, Thomas Gleixner,
	Ingo Molnar, Andi Kleen
Subject: Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
Date: Thu, 22 Jul 2010 18:58:40 +0300
Message-ID: <20100722155840.GA1743@redhat.com>
In-Reply-To: <4C06A580.9060300@kernel.org>

On Wed, Jun 02, 2010 at 08:40:00PM +0200, Tejun Heo wrote:
> Replace vhost_workqueue with per-vhost kthread.  Other than the callback
> argument change from struct work_struct * to struct vhost_work *,
> there's no visible change to the vhost_poll_*() interface.
>
> This conversion is to make each vhost use a dedicated kthread so that
> resource control via cgroup can be applied.
>
> Partially based on Sridhar Samudrala's patch.
>
> * Updated to use sub structure vhost_work instead of directly using
>   vhost_poll at Michael's suggestion.
>
> * Added flusher wake_up() optimization at Michael's suggestion.
>
> Signed-off-by: Tejun Heo
> Cc: Michael S. Tsirkin
> Cc: Sridhar Samudrala

All the tricky barrier pairing made me uncomfortable.  So I came up with
this on top (untested): if we do all operations under the spinlock, we
can get by without barriers and atomics.  And since we need the lock for
list operations anyway, this should have no performance impact.

What do you think?
---
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 0c6b533..7730a30 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -73,7 +73,7 @@ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn,
 	INIT_LIST_HEAD(&work->node);
 	work->fn = fn;
 	init_waitqueue_head(&work->done);
-	atomic_set(&work->flushing, 0);
+	work->flushing = 0;
 	work->queue_seq = work->done_seq = 0;
 }
 
@@ -99,13 +99,23 @@ void vhost_poll_stop(struct vhost_poll *poll)
 void vhost_poll_flush(struct vhost_poll *poll)
 {
 	struct vhost_work *work = &poll->work;
-	int seq = work->queue_seq;
+	unsigned seq, left;
+	int flushing;
 
-	atomic_inc(&work->flushing);
-	smp_mb__after_atomic_inc();	/* mb flush-b0 paired with worker-b1 */
-	wait_event(work->done, seq - work->done_seq <= 0);
-	atomic_dec(&work->flushing);
-	smp_mb__after_atomic_dec();	/* rmb flush-b1 paired with worker-b0 */
+	spin_lock_irq(&dev->work_lock);
+	seq = work->queue_seq;
+	work->flushing++;
+	spin_unlock_irq(&dev->work_lock);
+	wait_event(work->done, {
+		   spin_lock_irq(&dev->work_lock);
+		   left = work->done_seq - seq;
+		   spin_unlock_irq(&dev->work_lock);
+		   left < UINT_MAX / 2;
+	});
+	spin_lock_irq(&dev->work_lock);
+	flushing = --work->flushing;
+	spin_unlock_irq(&dev->work_lock);
+	BUG_ON(flushing < 0);
 }
 
 void vhost_poll_queue(struct vhost_poll *poll)
@@ -151,37 +161,37 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 static int vhost_worker(void *data)
 {
 	struct vhost_dev *dev = data;
-	struct vhost_work *work;
+	struct vhost_work *work = NULL;
 
-repeat:
-	set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);	/* mb paired w/ kthread_stop */
 
-	if (kthread_should_stop()) {
-		__set_current_state(TASK_RUNNING);
-		return 0;
-	}
+		if (kthread_should_stop()) {
+			__set_current_state(TASK_RUNNING);
+			return 0;
+		}
 
-	work = NULL;
-	spin_lock_irq(&dev->work_lock);
-	if (!list_empty(&dev->work_list)) {
-		work = list_first_entry(&dev->work_list,
-					struct vhost_work, node);
-		list_del_init(&work->node);
-	}
-	spin_unlock_irq(&dev->work_lock);
+		spin_lock_irq(&dev->work_lock);
+		if (work) {
+			work->done_seq = work->queue_seq;
+			if (work->flushing)
+				wake_up_all(&work->done);
+		}
+		if (!list_empty(&dev->work_list)) {
+			work = list_first_entry(&dev->work_list,
+						struct vhost_work, node);
+			list_del_init(&work->node);
+		} else
+			work = NULL;
+		spin_unlock_irq(&dev->work_lock);
+
+		if (work) {
+			__set_current_state(TASK_RUNNING);
+			work->fn(work);
+		} else
+			schedule();
 
-	if (work) {
-		__set_current_state(TASK_RUNNING);
-		work->fn(work);
-		smp_wmb();	/* wmb worker-b0 paired with flush-b1 */
-		work->done_seq = work->queue_seq;
-		smp_mb();	/* mb worker-b1 paired with flush-b0 */
-		if (atomic_read(&work->flushing))
-			wake_up_all(&work->done);
-	} else
-		schedule();
-
-	goto repeat;
+	}
 }
 
 long vhost_dev_init(struct vhost_dev *dev,
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 0e63091..3693327 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -27,9 +27,9 @@ struct vhost_work {
 	struct list_head  node;
 	vhost_work_fn_t	  fn;
 	wait_queue_head_t done;
-	atomic_t	  flushing;
-	int		  queue_seq;
-	int		  done_seq;
+	int		  flushing;
+	unsigned	  queue_seq;
+	unsigned	  done_seq;
 };
 
 /* Poll a file (eventfd or socket) */