Patchwork [2/5] virtio_blk: implement REQ_FLUSH/FUA support

login
register
mail settings
Submitter Tejun Heo
Date Aug. 16, 2010, 4:52 p.m.
Message ID <1281977523-19335-3-git-send-email-tj@kernel.org>
Download mbox | patch
Permalink /patch/61821/
State Not Applicable
Delegated to: David Miller
Headers show

Comments

Tejun Heo - Aug. 16, 2010, 4:52 p.m.
From: Tejun Heo <tj@kernle.org>

Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
indicate the support for FUA.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
---
 drivers/block/virtio_blk.c |   26 ++++++++++++--------------
 include/linux/virtio_blk.h |    6 +++++-
 2 files changed, 17 insertions(+), 15 deletions(-)
Christoph Hellwig - Aug. 16, 2010, 6:33 p.m.
On Mon, Aug 16, 2010 at 06:52:00PM +0200, Tejun Heo wrote:
> From: Tejun Heo <tj@kernle.org>
> 
> Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
> support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
> indicate the support for FUA.

I'm not sure it's worth it.  The pure REQ_FLUSH path works not and is
well tested with kvm/qemu.   We can still easily add a FUA bit, and
even a pre-flush bit if the protocol roundtrips matter in real life
benchmarking.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rusty Russell - Aug. 17, 2010, 1:16 a.m.
On Tue, 17 Aug 2010 02:22:00 am Tejun Heo wrote:
> From: Tejun Heo <tj@kernle.org>
> 
> Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
> support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
> indicate the support for FUA.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Michael S. Tsirkin <mst@redhat.com>

Acked-by: Rusty Russell <rusty@rustcorp.com.au>

And also for the lguest-specific patch.

Thanks!
Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - Aug. 17, 2010, 8:17 a.m.
Hello,

On 08/16/2010 08:33 PM, Christoph Hellwig wrote:
> On Mon, Aug 16, 2010 at 06:52:00PM +0200, Tejun Heo wrote:
>> From: Tejun Heo <tj@kernle.org>
>>
>> Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
>> support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
>> indicate the support for FUA.
> 
> I'm not sure it's worth it.  The pure REQ_FLUSH path works not and is
> well tested with kvm/qemu.   We can still easily add a FUA bit, and
> even a pre-flush bit if the protocol roundtrips matter in real life
> benchmarking.

Hmmm... the underlying storage could be md/dm RAIDs in which case FUA
should be cheaper than FLUSH.

Thanks.
Tejun Heo - Aug. 17, 2010, 8:18 a.m.
On 08/17/2010 03:16 AM, Rusty Russell wrote:
> On Tue, 17 Aug 2010 02:22:00 am Tejun Heo wrote:
>> From: Tejun Heo <tj@kernle.org>
>>
>> Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
>> support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
>> indicate the support for FUA.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
> 
> Acked-by: Rusty Russell <rusty@rustcorp.com.au>
> 
> And also for the lguest-specific patch.

Thanks.  Acked-by's added.
Christoph Hellwig - Aug. 17, 2010, 1:23 p.m.
On Tue, Aug 17, 2010 at 10:17:15AM +0200, Tejun Heo wrote:
> >> Remove now unused REQ_HARDBARRIER support and implement REQ_FLUSH/FUA
> >> support instead.  A new feature flag VIRTIO_BLK_F_FUA is added to
> >> indicate the support for FUA.
> > 
> > I'm not sure it's worth it.  The pure REQ_FLUSH path works not and is
> > well tested with kvm/qemu.   We can still easily add a FUA bit, and
> > even a pre-flush bit if the protocol roundtrips matter in real life
> > benchmarking.
> 
> Hmmm... the underlying storage could be md/dm RAIDs in which case FUA
> should be cheaper than FLUSH.

If someone ever wrote a virtio-blk backend that sits directly ontop
of the Linux block layer that would be true.  Of the five known
virtio-blk backends all operate on normal files using the Posix I/O
APIs, or the Linux aio API (optionally in qemu) or in-kernel
vfs_read/vfs_write (vhost-blk).

Given how little testing lguest gets compared to qemu I really don't
want a protocol addition for it unless it really buys us something.
Once we're done with this barrier conversion I plan into benchmarking
FUA and a pre-flush tag on the command for virtio in real life setups,
and see if it actually buys us anything.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo - Aug. 17, 2010, 4:22 p.m.
Hello,

On 08/17/2010 03:23 PM, Christoph Hellwig wrote:
>> Hmmm... the underlying storage could be md/dm RAIDs in which case FUA
>> should be cheaper than FLUSH.
> 
> If someone ever wrote a virtio-blk backend that sits directly ontop
> of the Linux block layer that would be true.  Of the five known
> virtio-blk backends all operate on normal files using the Posix I/O
> APIs, or the Linux aio API (optionally in qemu) or in-kernel
> vfs_read/vfs_write (vhost-blk).

Right.

> Given how little testing lguest gets compared to qemu I really don't
> want a protocol addition for it unless it really buys us something.
> Once we're done with this barrier conversion I plan into benchmarking
> FUA and a pre-flush tag on the command for virtio in real life setups,
> and see if it actually buys us anything.

Hmmm... yeah, we can drop it.  Michael, what do you think?

Thanks.
Rusty Russell - Aug. 18, 2010, 10:22 a.m.
On Tue, 17 Aug 2010 10:53:27 pm Christoph Hellwig wrote:
> Given how little testing lguest gets compared to qemu I really don't
> want a protocol addition for it unless it really buys us something.
> Once we're done with this barrier conversion I plan into benchmarking
> FUA and a pre-flush tag on the command for virtio in real life setups,
> and see if it actually buys us anything.

Absolutely.  Lguest should follow, not lead!

Rusty.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index d10b635..ed0fb7d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -128,9 +128,6 @@  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		}
 	}
 
-	if (vbr->req->cmd_flags & REQ_HARDBARRIER)
-		vbr->out_hdr.type |= VIRTIO_BLK_T_BARRIER;
-
 	sg_set_buf(&vblk->sg[out++], &vbr->out_hdr, sizeof(vbr->out_hdr));
 
 	/*
@@ -157,6 +154,8 @@  static bool do_req(struct request_queue *q, struct virtio_blk *vblk,
 		if (rq_data_dir(vbr->req) == WRITE) {
 			vbr->out_hdr.type |= VIRTIO_BLK_T_OUT;
 			out += num;
+			if (req->cmd_flags & REQ_FUA)
+				vbr->out_hdr.type |= VIRTIO_BLK_T_FUA;
 		} else {
 			vbr->out_hdr.type |= VIRTIO_BLK_T_IN;
 			in += num;
@@ -307,6 +306,7 @@  static int __devinit virtblk_probe(struct virtio_device *vdev)
 {
 	struct virtio_blk *vblk;
 	struct request_queue *q;
+	unsigned int flush;
 	int err;
 	u64 cap;
 	u32 v, blk_size, sg_elems, opt_io_size;
@@ -388,15 +388,13 @@  static int __devinit virtblk_probe(struct virtio_device *vdev)
 	vblk->disk->driverfs_dev = &vdev->dev;
 	index++;
 
-	/*
-	 * If the FLUSH feature is supported we do have support for
-	 * flushing a volatile write cache on the host.  Use that to
-	 * implement write barrier support; otherwise, we must assume
-	 * that the host does not perform any kind of volatile write
-	 * caching.
-	 */
+	/* configure queue flush support */
+	flush = 0;
 	if (virtio_has_feature(vdev, VIRTIO_BLK_F_FLUSH))
-		blk_queue_flush(q, REQ_FLUSH);
+		flush |= REQ_FLUSH;
+	if (virtio_has_feature(vdev, VIRTIO_BLK_F_FUA))
+		flush |= REQ_FUA;
+	blk_queue_flush(q, flush);
 
 	/* If disk is read-only in the host, the guest should obey */
 	if (virtio_has_feature(vdev, VIRTIO_BLK_F_RO))
@@ -515,9 +513,9 @@  static const struct virtio_device_id id_table[] = {
 };
 
 static unsigned int features[] = {
-	VIRTIO_BLK_F_BARRIER, VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX,
-	VIRTIO_BLK_F_GEOMETRY, VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE,
-	VIRTIO_BLK_F_SCSI, VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY
+	VIRTIO_BLK_F_SEG_MAX, VIRTIO_BLK_F_SIZE_MAX, VIRTIO_BLK_F_GEOMETRY,
+	VIRTIO_BLK_F_RO, VIRTIO_BLK_F_BLK_SIZE, VIRTIO_BLK_F_SCSI,
+	VIRTIO_BLK_F_FLUSH, VIRTIO_BLK_F_TOPOLOGY, VIRTIO_BLK_F_FUA,
 };
 
 /*
diff --git a/include/linux/virtio_blk.h b/include/linux/virtio_blk.h
index 167720d..f453f3c 100644
--- a/include/linux/virtio_blk.h
+++ b/include/linux/virtio_blk.h
@@ -16,6 +16,7 @@ 
 #define VIRTIO_BLK_F_SCSI	7	/* Supports scsi command passthru */
 #define VIRTIO_BLK_F_FLUSH	9	/* Cache flush command support */
 #define VIRTIO_BLK_F_TOPOLOGY	10	/* Topology information is available */
+#define VIRTIO_BLK_F_FUA	11	/* Forced Unit Access write support */
 
 #define VIRTIO_BLK_ID_BYTES	20	/* ID string length */
 
@@ -70,7 +71,10 @@  struct virtio_blk_config {
 #define VIRTIO_BLK_T_FLUSH	4
 
 /* Get device ID command */
-#define VIRTIO_BLK_T_GET_ID    8
+#define VIRTIO_BLK_T_GET_ID	8
+
+/* FUA command */
+#define VIRTIO_BLK_T_FUA	16
 
 /* Barrier before this op. */
 #define VIRTIO_BLK_T_BARRIER	0x80000000