
[1/3] xen-disk: only advertize feature-persistent if grant copy is not available

Message ID 20170620134756.9632-2-paul.durrant@citrix.com
State New

Commit Message

Paul Durrant June 20, 2017, 1:47 p.m. UTC
If grant copy is available then it will always be used in preference to
persistent maps. In this case feature-persistent should not be advertized
to the frontend, otherwise it may needlessly copy data into persistently
granted buffers.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
---
 hw/block/xen_disk.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

Comments

Stefano Stabellini June 20, 2017, 10:19 p.m. UTC | #1
On Tue, 20 Jun 2017, Paul Durrant wrote:
> If grant copy is available then it will always be used in preference to
> persistent maps. In this case feature-persistent should not be advertized
> to the frontend, otherwise it may needlessly copy data into persistently
> granted buffers.
> 
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>

CC'ing Roger.

It is true that using feature-persistent together with grant copies is a
very bad idea.

But this change establishes an explicit preference for feature_grant_copy
over feature-persistent in the xen_disk backend. It is not obvious to me
that this should be the case.

Why is feature_grant_copy (without feature-persistent) better than
feature-persistent (without feature_grant_copy)? Shouldn't we simply
avoid grant copies to copy data to persistent grants?


> ---
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Anthony Perard <anthony.perard@citrix.com>
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Max Reitz <mreitz@redhat.com>
> ---
>  hw/block/xen_disk.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 3a22805fbc..9b06e3aa81 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -1023,11 +1023,18 @@ static int blk_init(struct XenDevice *xendev)
>  
>      blkdev->file_blk  = BLOCK_SIZE;
>  
> +    blkdev->feature_grant_copy =
> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> +
> +    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> +
>      /* fill info
>       * blk_connect supplies sector-size and sectors
>       */
>      xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1);
> -    xenstore_write_be_int(&blkdev->xendev, "feature-persistent", 1);
> +    xenstore_write_be_int(&blkdev->xendev, "feature-persistent",
> +                          !blkdev->feature_grant_copy);
>      xenstore_write_be_int(&blkdev->xendev, "info", info);
>  
>      blk_parse_discard(blkdev);
> @@ -1202,12 +1209,6 @@ static int blk_connect(struct XenDevice *xendev)
>  
>      xen_be_bind_evtchn(&blkdev->xendev);
>  
> -    blkdev->feature_grant_copy =
> -                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> -
> -    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> -                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> -
>      xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>                    "remote port %d, local port %d\n",
>                    blkdev->xendev.protocol, blkdev->ring_ref,
> -- 
> 2.11.0
>
Roger Pau Monné June 21, 2017, 9:17 a.m. UTC | #2
On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> On Tue, 20 Jun 2017, Paul Durrant wrote:
> > If grant copy is available then it will always be used in preference to
> > persistent maps. In this case feature-persistent should not be advertized
> > to the frontend, otherwise it may needlessly copy data into persistently
> > granted buffers.
> > 
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> 
> CC'ing Roger.
> 
> It is true that using feature-persistent together with grant copies is a
> a very bad idea.
> 
> But this change enstablishes an explicit preference of
> feature_grant_copy over feature-persistent in the xen_disk backend. It
> is not obvious to me that it should be the case.
> 
> Why is feature_grant_copy (without feature-persistent) better than
> feature-persistent (without feature_grant_copy)? Shouldn't we simply
> avoid grant copies to copy data to persistent grants?

When using persistent grants the frontend must always copy data from
the buffer to the persistent grant; there's no way to avoid this.

Using grant_copy we move the copy from the frontend to the backend,
which means the CPU time of the copy is accounted to the backend. This
is not ideal, but IMHO it's better than persistent grants because it
avoids keeping a pool of mapped grants that consume memory and make
the code more complex.
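
For illustration, a minimal sketch of that trade-off: the backend pulls the
payload of a frontend write straight out of the guest's granted pages with a
single grant-copy operation, so nothing is ever mapped into its address
space. The helper name, argument layout and segment cap below are
hypothetical; only xengnttab_grant_copy() and its segment structure come
from libxengnttab.

#include <stdint.h>
#include <xen/grant_table.h>
#include <xengnttab.h>

#define MAX_SEGS 11 /* mirrors BLKIF_MAX_SEGMENTS_PER_REQUEST */

/*
 * Illustrative only: copy the data for one frontend write request into a
 * local buffer, checking both the overall result and the per-segment
 * status codes.
 */
static int copy_in_write_payload(xengnttab_handle *xgt, uint16_t domid,
                                 const uint32_t *refs, const uint16_t *offsets,
                                 const uint16_t *lens, unsigned int count,
                                 void *buf)
{
    xengnttab_grant_copy_segment_t segs[MAX_SEGS];
    uint8_t *dst = buf;
    unsigned int i;

    if (count > MAX_SEGS) {
        return -1;
    }

    for (i = 0; i < count; i++) {
        segs[i].flags = GNTCOPY_source_gref;   /* source is the guest's grant */
        segs[i].source.foreign.ref = refs[i];
        segs[i].source.foreign.domid = domid;
        segs[i].source.foreign.offset = offsets[i];
        segs[i].dest.virt = dst;               /* destination is local memory */
        segs[i].len = lens[i];
        dst += lens[i];
    }

    if (xengnttab_grant_copy(xgt, count, segs) != 0) {
        return -1;                             /* the operation itself failed */
    }
    for (i = 0; i < count; i++) {
        if (segs[i].status != GNTST_okay) {
            return -1;                         /* an individual segment failed */
        }
    }
    return 0;
}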

Do you have some performance data showing the difference between
persistent grants vs grant copy?

Roger.
Paul Durrant June 21, 2017, 9:35 a.m. UTC | #3
> -----Original Message-----
> From: Roger Pau Monne
> Sent: 21 June 2017 10:18
> To: Stefano Stabellini <sstabellini@kernel.org>
> Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-devel@lists.xenproject.org;
> qemu-devel@nongnu.org; qemu-block@nongnu.org; Anthony Perard
> <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; Max Reitz
> <mreitz@redhat.com>
> Subject: Re: [PATCH 1/3] xen-disk: only advertize feature-persistent if grant
> copy is not available
> 
> On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> > On Tue, 20 Jun 2017, Paul Durrant wrote:
> > > If grant copy is available then it will always be used in preference to
> > > persistent maps. In this case feature-persistent should not be advertized
> > > to the frontend, otherwise it may needlessly copy data into persistently
> > > granted buffers.
> > >
> > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> >
> > CC'ing Roger.
> >
> > It is true that using feature-persistent together with grant copies is a
> > a very bad idea.
> >
> > But this change enstablishes an explicit preference of
> > feature_grant_copy over feature-persistent in the xen_disk backend. It
> > is not obvious to me that it should be the case.
> >
> > Why is feature_grant_copy (without feature-persistent) better than
> > feature-persistent (without feature_grant_copy)? Shouldn't we simply
> > avoid grant copies to copy data to persistent grants?
> 
> When using persistent grants the frontend must always copy data from
> the buffer to the persistent grant, there's no way to avoid this.
> 
> Using grant_copy we move the copy from the frontend to the backend,
> which means the CPU time of the copy is accounted to the backend. This
> is not ideal, but IMHO it's better than persistent grants because it
> avoids keeping a pool of mapped grants that consume memory and make
> the code more complex.
> 
> Do you have some performance data showing the difference between
> persistent grants vs grant copy?
> 

No, but I can get some :-)

For a little background... I've been trying to push the throughput of fio running in a debian stretch guest on my skull canyon NUC. When I started out, I was getting ~100MB/s. When I finished, with this patch, the IOThreads one, the multi-page ring one and a bit of hackery to turn off all the aio flushes that seem to occur even if the image is opened with O_DIRECT, I was getting ~960MB/s... which is about line rate for the SSD in the NUC.

So, I'll force use of persistent grants on and see what sort of throughput I get.

Cheers,

  Paul

> Roger.
Paul Durrant June 21, 2017, 10:40 a.m. UTC | #4
> -----Original Message-----
> From: Qemu-devel [mailto:qemu-devel-
> bounces+paul.durrant=citrix.com@nongnu.org] On Behalf Of Paul Durrant
> Sent: 21 June 2017 10:36
> To: Roger Pau Monne <roger.pau@citrix.com>; Stefano Stabellini
> <sstabellini@kernel.org>
> Cc: Kevin Wolf <kwolf@redhat.com>; qemu-block@nongnu.org; qemu-
> devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-
> persistent if grant copy is not available
> 
> > -----Original Message-----
> > From: Roger Pau Monne
> > Sent: 21 June 2017 10:18
> > To: Stefano Stabellini <sstabellini@kernel.org>
> > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> devel@lists.xenproject.org;
> > qemu-devel@nongnu.org; qemu-block@nongnu.org; Anthony Perard
> > <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; Max
> Reitz
> > <mreitz@redhat.com>
> > Subject: Re: [PATCH 1/3] xen-disk: only advertize feature-persistent if
> grant
> > copy is not available
> >
> > On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> > > On Tue, 20 Jun 2017, Paul Durrant wrote:
> > > > If grant copy is available then it will always be used in preference to
> > > > persistent maps. In this case feature-persistent should not be
> advertized
> > > > to the frontend, otherwise it may needlessly copy data into persistently
> > > > granted buffers.
> > > >
> > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > >
> > > CC'ing Roger.
> > >
> > > It is true that using feature-persistent together with grant copies is a
> > > a very bad idea.
> > >
> > > But this change enstablishes an explicit preference of
> > > feature_grant_copy over feature-persistent in the xen_disk backend. It
> > > is not obvious to me that it should be the case.
> > >
> > > Why is feature_grant_copy (without feature-persistent) better than
> > > feature-persistent (without feature_grant_copy)? Shouldn't we simply
> > > avoid grant copies to copy data to persistent grants?
> >
> > When using persistent grants the frontend must always copy data from
> > the buffer to the persistent grant, there's no way to avoid this.
> >
> > Using grant_copy we move the copy from the frontend to the backend,
> > which means the CPU time of the copy is accounted to the backend. This
> > is not ideal, but IMHO it's better than persistent grants because it
> > avoids keeping a pool of mapped grants that consume memory and make
> > the code more complex.
> >
> > Do you have some performance data showing the difference between
> > persistent grants vs grant copy?
> >
> 
> No, but I can get some :-)
> 
> For a little background... I've been trying to push throughput of fio running in
> a debian stretch guest on my skull canyon NUC. When I started out, I was
> getting ~100MBbs. When I finished, with this patch, the IOThreads one, the
> multi-page ring one and a bit of hackery to turn off all the aio flushes that
> seem to occur even if the image is opened with O_DIRECT, I was getting
> ~960Mbps... which is about line rate for the SSD in the in NUC.
> 
> So, I'll force use of persistent grants on and see what sort of throughput I
> get.

A quick test with grant copy forced off (causing persistent grants to be used)... My VM is debian stretch using a 16 page shared ring from blkfront. The image backing xvdb is a fully inflated 10G qcow2.

root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 --size=10G --readwrite=randwrite --ramp_time=4
test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio, iodepth=64
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/539.4MB/0KB /s] [0/1078/0 iops] [eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=633: Wed Jun 21 06:26:06 2017
  write: io=6146.6MB, bw=795905KB/s, iops=1546, runt=  7908msec
  cpu          : usr=2.07%, sys=34.00%, ctx=4490, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=166.9%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=0/w=12230/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: io=6146.6MB, aggrb=795904KB/s, minb=795904KB/s, maxb=795904KB/s, mint=7908msec, maxt=7908msec

Disk stats (read/write):
  xvdb: ios=54/228860, merge=0/2230616, ticks=16/5403048, in_queue=5409068, util=98.26%

The dom0 cpu usage for the relevant IOThread was ~60%

The same test with grant copy...

root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 --size=10G --readwrite=randwrite --ramp_time=4
test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio, iodepth=64
fio-2.16
Starting 1 process
Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/607.7MB/0KB /s] [0/1215/0 iops] [eta 00m:05s]
test: (groupid=0, jobs=1): err= 0: pid=483: Wed Jun 21 06:35:14 2017
  write: io=6232.0MB, bw=810976KB/s, iops=1575, runt=  7869msec
  cpu          : usr=2.44%, sys=37.42%, ctx=3570, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=164.6%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
     issued    : total=r=0/w=12401/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=64

Run status group 0 (all jobs):
  WRITE: io=6232.0MB, aggrb=810975KB/s, minb=810975KB/s, maxb=810975KB/s, mint=7869msec, maxt=7869msec

Disk stats (read/write):
  xvdb: ios=54/229583, merge=0/2235879, ticks=16/5409500, in_queue=5415080, util=98.27%

So, higher throughput and iops. The dom0 cpu usage was running at ~70%, so there is definitely more dom0 overhead when using grant copy. The usage of grant copy could probably be improved, though, since the current code issues a copy ioctl per ioreq. With some batching I suspect some, if not all, of the extra overhead could be recovered.
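
As a rough illustration of that batching idea (the pending-request list and
helper below are hypothetical, not existing xen_disk.c code): gather the
segments of every ioreq taken off the ring in one pass and submit them with
a single xengnttab_grant_copy() call, amortising the ioctl cost.

#include <xen/grant_table.h>
#include <xengnttab.h>

#define MAX_BATCH_SEGS 128              /* arbitrary cap for the sketch */

/* Hypothetical pending request whose grant-copy segments are already filled. */
struct pending_ioreq {
    xengnttab_grant_copy_segment_t segs[11];
    unsigned int nr_segs;
    struct pending_ioreq *next;
};

/*
 * Illustrative only: one grant-copy ioctl for a whole batch of requests
 * rather than one per ioreq. Real code would also hand each segment's
 * status back to the request it came from.
 */
static int grant_copy_batch(xengnttab_handle *xgt, struct pending_ioreq *list)
{
    xengnttab_grant_copy_segment_t batch[MAX_BATCH_SEGS];
    struct pending_ioreq *req;
    unsigned int total = 0, i;

    for (req = list; req; req = req->next) {
        for (i = 0; i < req->nr_segs && total < MAX_BATCH_SEGS; i++) {
            batch[total++] = req->segs[i];
        }
    }

    if (xengnttab_grant_copy(xgt, total, batch) != 0) {
        return -1;
    }
    for (i = 0; i < total; i++) {
        if (batch[i].status != GNTST_okay) {
            return -1;
        }
    }
    return 0;
}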

Cheers,

  Paul

> 
> Cheers,
> 
>   Paul
> 
> > Roger.
Roger Pau Monné June 21, 2017, 10:50 a.m. UTC | #5
On Wed, Jun 21, 2017 at 11:40:00AM +0100, Paul Durrant wrote:
> > -----Original Message-----
> > From: Qemu-devel [mailto:qemu-devel-
> > bounces+paul.durrant=citrix.com@nongnu.org] On Behalf Of Paul Durrant
> > Sent: 21 June 2017 10:36
> > To: Roger Pau Monne <roger.pau@citrix.com>; Stefano Stabellini
> > <sstabellini@kernel.org>
> > Cc: Kevin Wolf <kwolf@redhat.com>; qemu-block@nongnu.org; qemu-
> > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > Subject: Re: [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-
> > persistent if grant copy is not available
> > 
> > > -----Original Message-----
> > > From: Roger Pau Monne
> > > Sent: 21 June 2017 10:18
> > > To: Stefano Stabellini <sstabellini@kernel.org>
> > > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> > devel@lists.xenproject.org;
> > > qemu-devel@nongnu.org; qemu-block@nongnu.org; Anthony Perard
> > > <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; Max
> > Reitz
> > > <mreitz@redhat.com>
> > > Subject: Re: [PATCH 1/3] xen-disk: only advertize feature-persistent if
> > grant
> > > copy is not available
> > >
> > > On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> > > > On Tue, 20 Jun 2017, Paul Durrant wrote:
> > > > > If grant copy is available then it will always be used in preference to
> > > > > persistent maps. In this case feature-persistent should not be
> > advertized
> > > > > to the frontend, otherwise it may needlessly copy data into persistently
> > > > > granted buffers.
> > > > >
> > > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > > >
> > > > CC'ing Roger.
> > > >
> > > > It is true that using feature-persistent together with grant copies is a
> > > > a very bad idea.
> > > >
> > > > But this change enstablishes an explicit preference of
> > > > feature_grant_copy over feature-persistent in the xen_disk backend. It
> > > > is not obvious to me that it should be the case.
> > > >
> > > > Why is feature_grant_copy (without feature-persistent) better than
> > > > feature-persistent (without feature_grant_copy)? Shouldn't we simply
> > > > avoid grant copies to copy data to persistent grants?
> > >
> > > When using persistent grants the frontend must always copy data from
> > > the buffer to the persistent grant, there's no way to avoid this.
> > >
> > > Using grant_copy we move the copy from the frontend to the backend,
> > > which means the CPU time of the copy is accounted to the backend. This
> > > is not ideal, but IMHO it's better than persistent grants because it
> > > avoids keeping a pool of mapped grants that consume memory and make
> > > the code more complex.
> > >
> > > Do you have some performance data showing the difference between
> > > persistent grants vs grant copy?
> > >
> > 
> > No, but I can get some :-)
> > 
> > For a little background... I've been trying to push throughput of fio running in
> > a debian stretch guest on my skull canyon NUC. When I started out, I was
> > getting ~100MBbs. When I finished, with this patch, the IOThreads one, the
> > multi-page ring one and a bit of hackery to turn off all the aio flushes that
> > seem to occur even if the image is opened with O_DIRECT, I was getting
> > ~960Mbps... which is about line rate for the SSD in the in NUC.
> > 
> > So, I'll force use of persistent grants on and see what sort of throughput I
> > get.
> 
> A quick test with grant copy forced off (causing persistent grants to be used)... My VM is debian stretch using a 16 page shared ring from blkfront. The image backing xvdb is a fully inflated 10G qcow2.
> 
> root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 --size=10G --readwrite=randwrite --ramp_time=4
> test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio, iodepth=64
> fio-2.16
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/539.4MB/0KB /s] [0/1078/0 iops] [eta 00m:05s]
> test: (groupid=0, jobs=1): err= 0: pid=633: Wed Jun 21 06:26:06 2017
>   write: io=6146.6MB, bw=795905KB/s, iops=1546, runt=  7908msec
>   cpu          : usr=2.07%, sys=34.00%, ctx=4490, majf=0, minf=1
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=166.9%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=12230/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
> 
> Run status group 0 (all jobs):
>   WRITE: io=6146.6MB, aggrb=795904KB/s, minb=795904KB/s, maxb=795904KB/s, mint=7908msec, maxt=7908msec
> 
> Disk stats (read/write):
>   xvdb: ios=54/228860, merge=0/2230616, ticks=16/5403048, in_queue=5409068, util=98.26%
> 
> The dom0 cpu usage for the relevant IOThread was ~60%
> 
> The same test with grant copy...
> 
> root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 --size=10G --readwrite=randwrite --ramp_time=4
> test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K, ioengine=libaio, iodepth=64
> fio-2.16
> Starting 1 process
> Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/607.7MB/0KB /s] [0/1215/0 iops] [eta 00m:05s]
> test: (groupid=0, jobs=1): err= 0: pid=483: Wed Jun 21 06:35:14 2017
>   write: io=6232.0MB, bw=810976KB/s, iops=1575, runt=  7869msec
>   cpu          : usr=2.44%, sys=37.42%, ctx=3570, majf=0, minf=1
>   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%, >=64=164.6%
>      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
>      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
>      issued    : total=r=0/w=12401/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
>      latency   : target=0, window=0, percentile=100.00%, depth=64
> 
> Run status group 0 (all jobs):
>   WRITE: io=6232.0MB, aggrb=810975KB/s, minb=810975KB/s, maxb=810975KB/s, mint=7869msec, maxt=7869msec
> 
> Disk stats (read/write):
>   xvdb: ios=54/229583, merge=0/2235879, ticks=16/5409500, in_queue=5415080, util=98.27%
> 
> So, higher throughput and iops. The dom0 cpu usage was running at ~70%, so there is definitely more dom0 overhead by using grant copy. The usage of grant copy could probably be improved through since the current code issues an copy ioctl per ioreq. With some batching I suspect some, if not all, of the extra overhead could be recovered.

There's almost always going to be more CPU overhead with grant-copy,
since when using persistent grants QEMU can avoid all (or almost all)
of the ioctls to the grant device.
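
To make that concrete, the saving comes from mapping each grant once and
reusing the mapping for every later request. A rough sketch of such a cache
(hypothetical helper, loosely modelled on the persistent-grant tracking in
xen_disk.c rather than copied from it):

#include <sys/mman.h>
#include <glib.h>
#include <xengnttab.h>

typedef struct PersistentGrant {
    void *page;                         /* grant mapped once, reused after */
} PersistentGrant;

static GHashTable *persistent_gnts;     /* gref -> PersistentGrant */

/*
 * Illustrative only: the first request touching gref pays for the map
 * ioctl; every subsequent request hits the cache and needs no grant-device
 * ioctl at all. The cost is that each cached grant stays mapped and
 * consumes backend memory until the device is torn down.
 */
static void *get_persistent_page(xengnttab_handle *xgt, uint32_t domid,
                                 uint32_t gref)
{
    PersistentGrant *grant;

    if (!persistent_gnts) {
        persistent_gnts = g_hash_table_new(g_direct_hash, g_direct_equal);
    }

    grant = g_hash_table_lookup(persistent_gnts, GUINT_TO_POINTER(gref));
    if (grant) {
        return grant->page;             /* cache hit */
    }

    grant = g_new0(PersistentGrant, 1);
    grant->page = xengnttab_map_grant_ref(xgt, domid, gref,
                                          PROT_READ | PROT_WRITE);
    if (!grant->page) {
        g_free(grant);
        return NULL;
    }
    g_hash_table_insert(persistent_gnts, GUINT_TO_POINTER(gref), grant);
    return grant->page;
}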

For the persistent-grants benchmark, did you warm up the grant cache
first? (ie: are those results from a first run of fio?)

In any case, I'm happy to use something different than persistent
grants as long as the performance is similar.

Roger.
Paul Durrant June 21, 2017, 11:05 a.m. UTC | #6
> -----Original Message-----
> From: Roger Pau Monne
> Sent: 21 June 2017 11:51
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>; Kevin Wolf
> <kwolf@redhat.com>; qemu-block@nongnu.org; qemu-devel@nongnu.org;
> Max Reitz <mreitz@redhat.com>; Anthony Perard
> <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> Subject: Re: [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-
> persistent if grant copy is not available
> 
> On Wed, Jun 21, 2017 at 11:40:00AM +0100, Paul Durrant wrote:
> > > -----Original Message-----
> > > From: Qemu-devel [mailto:qemu-devel-
> > > bounces+paul.durrant=citrix.com@nongnu.org] On Behalf Of Paul
> Durrant
> > > Sent: 21 June 2017 10:36
> > > To: Roger Pau Monne <roger.pau@citrix.com>; Stefano Stabellini
> > > <sstabellini@kernel.org>
> > > Cc: Kevin Wolf <kwolf@redhat.com>; qemu-block@nongnu.org; qemu-
> > > devel@nongnu.org; Max Reitz <mreitz@redhat.com>; Anthony Perard
> > > <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org
> > > Subject: Re: [Qemu-devel] [PATCH 1/3] xen-disk: only advertize feature-
> > > persistent if grant copy is not available
> > >
> > > > -----Original Message-----
> > > > From: Roger Pau Monne
> > > > Sent: 21 June 2017 10:18
> > > > To: Stefano Stabellini <sstabellini@kernel.org>
> > > > Cc: Paul Durrant <Paul.Durrant@citrix.com>; xen-
> > > devel@lists.xenproject.org;
> > > > qemu-devel@nongnu.org; qemu-block@nongnu.org; Anthony Perard
> > > > <anthony.perard@citrix.com>; Kevin Wolf <kwolf@redhat.com>; Max
> > > Reitz
> > > > <mreitz@redhat.com>
> > > > Subject: Re: [PATCH 1/3] xen-disk: only advertize feature-persistent if
> > > grant
> > > > copy is not available
> > > >
> > > > On Tue, Jun 20, 2017 at 03:19:33PM -0700, Stefano Stabellini wrote:
> > > > > On Tue, 20 Jun 2017, Paul Durrant wrote:
> > > > > > If grant copy is available then it will always be used in preference to
> > > > > > persistent maps. In this case feature-persistent should not be
> > > advertized
> > > > > > to the frontend, otherwise it may needlessly copy data into
> persistently
> > > > > > granted buffers.
> > > > > >
> > > > > > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > > > >
> > > > > CC'ing Roger.
> > > > >
> > > > > It is true that using feature-persistent together with grant copies is a
> > > > > a very bad idea.
> > > > >
> > > > > But this change enstablishes an explicit preference of
> > > > > feature_grant_copy over feature-persistent in the xen_disk backend.
> It
> > > > > is not obvious to me that it should be the case.
> > > > >
> > > > > Why is feature_grant_copy (without feature-persistent) better than
> > > > > feature-persistent (without feature_grant_copy)? Shouldn't we
> simply
> > > > > avoid grant copies to copy data to persistent grants?
> > > >
> > > > When using persistent grants the frontend must always copy data from
> > > > the buffer to the persistent grant, there's no way to avoid this.
> > > >
> > > > Using grant_copy we move the copy from the frontend to the backend,
> > > > which means the CPU time of the copy is accounted to the backend.
> This
> > > > is not ideal, but IMHO it's better than persistent grants because it
> > > > avoids keeping a pool of mapped grants that consume memory and
> make
> > > > the code more complex.
> > > >
> > > > Do you have some performance data showing the difference between
> > > > persistent grants vs grant copy?
> > > >
> > >
> > > No, but I can get some :-)
> > >
> > > For a little background... I've been trying to push throughput of fio
> running in
> > > a debian stretch guest on my skull canyon NUC. When I started out, I was
> > > getting ~100MBbs. When I finished, with this patch, the IOThreads one,
> the
> > > multi-page ring one and a bit of hackery to turn off all the aio flushes that
> > > seem to occur even if the image is opened with O_DIRECT, I was getting
> > > ~960Mbps... which is about line rate for the SSD in the in NUC.
> > >
> > > So, I'll force use of persistent grants on and see what sort of throughput I
> > > get.
> >
> > A quick test with grant copy forced off (causing persistent grants to be
> used)... My VM is debian stretch using a 16 page shared ring from blkfront.
> The image backing xvdb is a fully inflated 10G qcow2.
> >
> > root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --
> gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 -
> -size=10G --readwrite=randwrite --ramp_time=4
> > test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K,
> ioengine=libaio, iodepth=64
> > fio-2.16
> > Starting 1 process
> > Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/539.4MB/0KB /s] [0/1078/0 iops]
> [eta 00m:05s]
> > test: (groupid=0, jobs=1): err= 0: pid=633: Wed Jun 21 06:26:06 2017
> >   write: io=6146.6MB, bw=795905KB/s, iops=1546, runt=  7908msec
> >   cpu          : usr=2.07%, sys=34.00%, ctx=4490, majf=0, minf=1
> >   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%,
> >=64=166.9%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
> >      issued    : total=r=0/w=12230/d=0, short=r=0/w=0/d=0,
> drop=r=0/w=0/d=0
> >      latency   : target=0, window=0, percentile=100.00%, depth=64
> >
> > Run status group 0 (all jobs):
> >   WRITE: io=6146.6MB, aggrb=795904KB/s, minb=795904KB/s,
> maxb=795904KB/s, mint=7908msec, maxt=7908msec
> >
> > Disk stats (read/write):
> >   xvdb: ios=54/228860, merge=0/2230616, ticks=16/5403048,
> in_queue=5409068, util=98.26%
> >
> > The dom0 cpu usage for the relevant IOThread was ~60%
> >
> > The same test with grant copy...
> >
> > root@dhcp-237-70:~# fio --randrepeat=1 --ioengine=libaio --direct=0 --
> gtod_reduce=1 --name=test --filename=/dev/xvdb --bs=512k --iodepth=64 -
> -size=10G --readwrite=randwrite --ramp_time=4
> > test: (g=0): rw=randwrite, bs=512K-512K/512K-512K/512K-512K,
> ioengine=libaio, iodepth=64
> > fio-2.16
> > Starting 1 process
> > Jobs: 1 (f=1): [w(1)] [70.6% done] [0KB/607.7MB/0KB /s] [0/1215/0 iops]
> [eta 00m:05s]
> > test: (groupid=0, jobs=1): err= 0: pid=483: Wed Jun 21 06:35:14 2017
> >   write: io=6232.0MB, bw=810976KB/s, iops=1575, runt=  7869msec
> >   cpu          : usr=2.44%, sys=37.42%, ctx=3570, majf=0, minf=1
> >   IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.3%,
> >=64=164.6%
> >      submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%,
> >=64=0.0%
> >      complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%,
> >=64=0.0%
> >      issued    : total=r=0/w=12401/d=0, short=r=0/w=0/d=0,
> drop=r=0/w=0/d=0
> >      latency   : target=0, window=0, percentile=100.00%, depth=64
> >
> > Run status group 0 (all jobs):
> >   WRITE: io=6232.0MB, aggrb=810975KB/s, minb=810975KB/s,
> maxb=810975KB/s, mint=7869msec, maxt=7869msec
> >
> > Disk stats (read/write):
> >   xvdb: ios=54/229583, merge=0/2235879, ticks=16/5409500,
> in_queue=5415080, util=98.27%
> >
> > So, higher throughput and iops. The dom0 cpu usage was running at ~70%,
> so there is definitely more dom0 overhead by using grant copy. The usage of
> grant copy could probably be improved through since the current code issues
> an copy ioctl per ioreq. With some batching I suspect some, if not all, of the
> extra overhead could be recovered.
> 
> There's almost always going to be more CPU overhead with grant-copy,
> since when using persistent grants QEMU can avoid all (or almost all)
> of the ioctls to the grant device.
> 
> For the persistent-grants benchmark, did you warm up the grant cache
> first? (ie: are those results from a first run of fio?)
> 

No, that was the third run I did (and the same in the grant copy case).

> In any case, I'm happy to use something different than persistent
> grants as long as the performance is similar.
> 

Yes, I'd even suggest removing the persistent grant code from xen_disk.c in the interest of reducing the complexity of the code... but I guess that depends on how likely folks are to be using a new QEMU with an older set of Xen libraries (and hence not have grant copy available to them).

  Paul

> Roger.

Patch

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 3a22805fbc..9b06e3aa81 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -1023,11 +1023,18 @@  static int blk_init(struct XenDevice *xendev)
 
     blkdev->file_blk  = BLOCK_SIZE;
 
+    blkdev->feature_grant_copy =
+                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
+
+    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
+                  blkdev->feature_grant_copy ? "enabled" : "disabled");
+
     /* fill info
      * blk_connect supplies sector-size and sectors
      */
     xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1);
-    xenstore_write_be_int(&blkdev->xendev, "feature-persistent", 1);
+    xenstore_write_be_int(&blkdev->xendev, "feature-persistent",
+                          !blkdev->feature_grant_copy);
     xenstore_write_be_int(&blkdev->xendev, "info", info);
 
     blk_parse_discard(blkdev);
@@ -1202,12 +1209,6 @@  static int blk_connect(struct XenDevice *xendev)
 
     xen_be_bind_evtchn(&blkdev->xendev);
 
-    blkdev->feature_grant_copy =
-                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
-
-    xen_pv_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
-                  blkdev->feature_grant_copy ? "enabled" : "disabled");
-
     xen_pv_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
                   "remote port %d, local port %d\n",
                   blkdev->xendev.protocol, blkdev->ring_ref,