[PATCHv2] block/nfs: cache allocated filesize for read-only files

Message ID 1440403565-27432-1-git-send-email-pl@kamp.de
State New

Commit Message

Peter Lieven Aug. 24, 2015, 8:06 a.m. UTC
If the file is read-only, it is not expected to grow, so
skip the blocking call to nfs_fstat_async and use
the value saved at connection time. More importantly,
the monitor (and thus the main loop) will not hang
if block device info is queried while the NFS share
is unresponsive.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
v1->v2: update cache on reopen_prepare [Max]

 block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

Comments

Max Reitz Aug. 24, 2015, 6:39 p.m. UTC | #1
On 24.08.2015 10:06, Peter Lieven wrote:
> If the file is readonly its not expected to grow so
> save the blocking call to nfs_fstat_async and use
> the value saved at connection time. Also important
> the monitor (and thus the main loop) will not hang
> if block device info is queried and the NFS share
> is unresponsive.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
> v1->v2: update cache on reopen_prepare [Max]
> 
>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)

Reviewed-by: Max Reitz <mreitz@redhat.com>

I hope you're ready for the "Stale actual-size value with
cache=direct,read-only=on,format=raw files on NFS" reports. :-)
Peter Lieven Aug. 24, 2015, 7:34 p.m. UTC | #2
Am 24.08.2015 um 20:39 schrieb Max Reitz:
> On 24.08.2015 10:06, Peter Lieven wrote:
>> If the file is readonly its not expected to grow so
>> save the blocking call to nfs_fstat_async and use
>> the value saved at connection time. Also important
>> the monitor (and thus the main loop) will not hang
>> if block device info is queried and the NFS share
>> is unresponsive.
>>
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> v1->v2: update cache on reopen_prepare [Max]
>>
>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
>>  1 file changed, 35 insertions(+)
> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
> I hope you're ready for the "Stale actual-size value with
> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
Actually a good point. Maybe the cache should only be used if

!(bs->open_flags & BDRV_O_NOCACHE)

For my CD-ROM use case this is still OK.

Peter
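
The check suggested above can be sketched as a small standalone predicate. This is only an illustration under stated assumptions, not code from any later revision of the patch: `MODEL_O_NOCACHE`, `ModelState`, `model_use_size_cache` and `model_cached_allocated_size` are hypothetical stand-ins for QEMU's `BDRV_O_NOCACHE`, the driver state, and whatever helper a follow-up might add.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's BDRV_O_NOCACHE; the value is arbitrary. */
#define MODEL_O_NOCACHE 0x20

/* Hypothetical, minimal model of the state the driver would consult. */
typedef struct ModelState {
    bool read_only;        /* models the result of bdrv_is_read_only(bs) */
    int open_flags;        /* models bs->open_flags */
    int64_t cached_blocks; /* models client->st_blocks, in 512-byte units */
} ModelState;

/* Trust the cached value only for read-only files opened with caching. */
static bool model_use_size_cache(const ModelState *s)
{
    assert(s != NULL);
    return s->read_only && !(s->open_flags & MODEL_O_NOCACHE);
}

/* Returns the cached size in bytes, or -1 if the caller still has to fstat. */
static int64_t model_cached_allocated_size(const ModelState *s)
{
    if (!model_use_size_cache(s)) {
        return -1;
    }
    return s->cached_blocks * 512;
}
```

With this shape, a read-only file opened with cache=direct falls through to the regular fstat path, so it never reports a stale value from the cache.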
Max Reitz Aug. 24, 2015, 8:13 p.m. UTC | #3
On 24.08.2015 21:34, Peter Lieven wrote:
> Am 24.08.2015 um 20:39 schrieb Max Reitz:
>> On 24.08.2015 10:06, Peter Lieven wrote:
>>> If the file is readonly its not expected to grow so
>>> save the blocking call to nfs_fstat_async and use
>>> the value saved at connection time. Also important
>>> the monitor (and thus the main loop) will not hang
>>> if block device info is queried and the NFS share
>>> is unresponsive.
>>>
>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>> ---
>>> v1->v2: update cache on reopen_prepare [Max]
>>>
>>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
>>>  1 file changed, 35 insertions(+)
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>
>> I hope you're ready for the "Stale actual-size value with
>> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
> actually a good point, maybe the cache should only be used if
> 
> !(bs->open_flags & BDRV_O_NOCACHE)

Good enough a point to fix it? ;-)

Max
Jeff Cody Aug. 26, 2015, 3:31 p.m. UTC | #4
On Mon, Aug 24, 2015 at 10:13:16PM +0200, Max Reitz wrote:
> On 24.08.2015 21:34, Peter Lieven wrote:
> > Am 24.08.2015 um 20:39 schrieb Max Reitz:
> >> On 24.08.2015 10:06, Peter Lieven wrote:
> >>> If the file is readonly its not expected to grow so
> >>> save the blocking call to nfs_fstat_async and use
> >>> the value saved at connection time. Also important
> >>> the monitor (and thus the main loop) will not hang
> >>> if block device info is queried and the NFS share
> >>> is unresponsive.
> >>>
> >>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>> ---
> >>> v1->v2: update cache on reopen_prepare [Max]
> >>>
> >>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
> >>>  1 file changed, 35 insertions(+)
> >> Reviewed-by: Max Reitz <mreitz@redhat.com>
> >>
> >> I hope you're ready for the "Stale actual-size value with
> >> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
> > actually a good point, maybe the cache should only be used if
> > 
> > !(bs->open_flags & BDRV_O_NOCACHE)
> 
> Good enough a point to fix it? ;-)
> 
> Max
> 

It seems more in line with expected behavior to check the cache flag
before using the size cache.  Would you be opposed to a v3 with
this check added?

One other concern I have is similar to a concern Max raised earlier -
about an external program modifying the raw image, while QEMU has it
opened r/o.  In particular, I wonder about an NFS server making an
image either sparse / non-sparse.  If it was exported read-only, it
may be a valid assumption that this could be done safely, as it would
not change the reported file size or contents, just the allocated size
on disk.
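
The difference between the reported size and the allocated size can be seen with plain POSIX calls. The sketch below is neither QEMU nor libnfs code. It assumes a local filesystem with sparse-file support and a writable /tmp, and it assumes `st_blocks` counts 512-byte units, as the patch itself does. `SizeInfo` and `sparse_file_sizes` are hypothetical names.

```c
#define _XOPEN_SOURCE 700 /* exposes mkstemp() and ftruncate() */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result type: both notions of "size" for one file. */
typedef struct SizeInfo {
    int64_t apparent;  /* st_size: what a reader of the file sees */
    int64_t allocated; /* st_blocks * 512: space actually used on disk */
} SizeInfo;

/*
 * Create a temporary file that consists of a single hole of `length`
 * bytes and report both sizes. Returns 0 on success, -1 on failure.
 */
static int sparse_file_sizes(int64_t length, SizeInfo *out)
{
    char path[] = "/tmp/sparse-demo-XXXXXX";
    struct stat st;
    int fd;

    assert(out != NULL);

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path); /* the file disappears once fd is closed */

    /* Extending with ftruncate() normally allocates no data blocks. */
    if (ftruncate(fd, (off_t)length) != 0 || fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);

    out->apparent = (int64_t)st.st_size;
    out->allocated = (int64_t)st.st_blocks * 512;
    return 0;
}
```

Making such a file non-sparse on the server would change only `allocated`, which is exactly the value the patch caches. `apparent` and the file contents stay the same.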
Peter Lieven Aug. 26, 2015, 6:49 p.m. UTC | #5
Am 26.08.2015 um 17:31 schrieb Jeff Cody:
> On Mon, Aug 24, 2015 at 10:13:16PM +0200, Max Reitz wrote:
>> On 24.08.2015 21:34, Peter Lieven wrote:
>>> Am 24.08.2015 um 20:39 schrieb Max Reitz:
>>>> On 24.08.2015 10:06, Peter Lieven wrote:
>>>>> If the file is readonly its not expected to grow so
>>>>> save the blocking call to nfs_fstat_async and use
>>>>> the value saved at connection time. Also important
>>>>> the monitor (and thus the main loop) will not hang
>>>>> if block device info is queried and the NFS share
>>>>> is unresponsive.
>>>>>
>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
>>>>> ---
>>>>> v1->v2: update cache on reopen_prepare [Max]
>>>>>
>>>>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
>>>>>  1 file changed, 35 insertions(+)
>>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>>>
>>>> I hope you're ready for the "Stale actual-size value with
>>>> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
>>> actually a good point, maybe the cache should only be used if
>>>
>>> !(bs->open_flags & BDRV_O_NOCACHE)
>> Good enough a point to fix it? ;-)
>>
>> Max
>>
> It seems more inline with expected behavior, to add the cache checking
> in before using the size cache.  Would you be opposed to a v3 with
> this check added in?

Not at all, I will send it tomorrow.

>
> One other concern I have is similar to a concern Max raised earlier -
> about an external program modifying the raw image, while QEMU has it
> opened r/o.  In particular, I wonder about an NFS server making an
> image either sparse / non-sparse.  If it was exported read-only, it
> may be a valid assumption that this could be done safely, as it would
> not change the reported file size or contents, just the allocated size
> on disk.

This might be a use case. But if I allow caching, the allocated file size
might not always be correct. This is even the case on an NFS share mounted
through the kernel, where some attributes are cached for some time.

Anyway, would it hurt here if the actual file size was too small?
In fact, it has been incorrect ever since libnfs support was added :-)

Peter
Jeff Cody Aug. 26, 2015, 7:14 p.m. UTC | #6
On Wed, Aug 26, 2015 at 08:49:06PM +0200, Peter Lieven wrote:
> Am 26.08.2015 um 17:31 schrieb Jeff Cody:
> > On Mon, Aug 24, 2015 at 10:13:16PM +0200, Max Reitz wrote:
> >> On 24.08.2015 21:34, Peter Lieven wrote:
> >>> Am 24.08.2015 um 20:39 schrieb Max Reitz:
> >>>> On 24.08.2015 10:06, Peter Lieven wrote:
> >>>>> If the file is readonly its not expected to grow so
> >>>>> save the blocking call to nfs_fstat_async and use
> >>>>> the value saved at connection time. Also important
> >>>>> the monitor (and thus the main loop) will not hang
> >>>>> if block device info is queried and the NFS share
> >>>>> is unresponsive.
> >>>>>
> >>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>> ---
> >>>>> v1->v2: update cache on reopen_prepare [Max]
> >>>>>
> >>>>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
> >>>>>  1 file changed, 35 insertions(+)
> >>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
> >>>>
> >>>> I hope you're ready for the "Stale actual-size value with
> >>>> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
> >>> actually a good point, maybe the cache should only be used if
> >>>
> >>> !(bs->open_flags & BDRV_O_NOCACHE)
> >> Good enough a point to fix it? ;-)
> >>
> >> Max
> >>
> > It seems more inline with expected behavior, to add the cache checking
> > in before using the size cache.  Would you be opposed to a v3 with
> > this check added in?
> 
> Of course, will send it tomorrow.
> 
> >
> > One other concern I have is similar to a concern Max raised earlier -
> > about an external program modifying the raw image, while QEMU has it
> > opened r/o.  In particular, I wonder about an NFS server making an
> > image either sparse / non-sparse.  If it was exported read-only, it
> > may be a valid assumption that this could be done safely, as it would
> > not change the reported file size or contents, just the allocated size
> > on disk.
> 
> This might be a use case. But if I allow caching the allocated filesize
> might not always be correct. This is even the case on a NFS share mounted
> through the kernel where some attributes a cached for some time.
> 
> Anyway, would it hurt here if the actual filesize was too small?
> In fact it was incorrect since libnfs support was added :-)
> 

Yeah, I'm not sure what harm it would cause in practice.  It is an
edge case to begin with, and a relatively benign side effect
(especially since you added reopen() support).

With the cache flag checking, I am comfortable adding my r-b.

Jeff
Stefan Hajnoczi Sept. 3, 2015, 5:05 p.m. UTC | #7
On Wed, Aug 26, 2015 at 03:14:41PM -0400, Jeff Cody wrote:
> On Wed, Aug 26, 2015 at 08:49:06PM +0200, Peter Lieven wrote:
> > Am 26.08.2015 um 17:31 schrieb Jeff Cody:
> > > On Mon, Aug 24, 2015 at 10:13:16PM +0200, Max Reitz wrote:
> > >> On 24.08.2015 21:34, Peter Lieven wrote:
> > >>> Am 24.08.2015 um 20:39 schrieb Max Reitz:
> > >>>> On 24.08.2015 10:06, Peter Lieven wrote:
> > >>>>> If the file is readonly its not expected to grow so
> > >>>>> save the blocking call to nfs_fstat_async and use
> > >>>>> the value saved at connection time. Also important
> > >>>>> the monitor (and thus the main loop) will not hang
> > >>>>> if block device info is queried and the NFS share
> > >>>>> is unresponsive.
> > >>>>>
> > >>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> > >>>>> ---
> > >>>>> v1->v2: update cache on reopen_prepare [Max]
> > >>>>>
> > >>>>>  block/nfs.c | 35 +++++++++++++++++++++++++++++++++++
> > >>>>>  1 file changed, 35 insertions(+)
> > >>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
> > >>>>
> > >>>> I hope you're ready for the "Stale actual-size value with
> > >>>> cache=direct,read-only=on,format=raw files on NFS" reports. :-)
> > >>> actually a good point, maybe the cache should only be used if
> > >>>
> > >>> !(bs->open_flags & BDRV_O_NOCACHE)
> > >> Good enough a point to fix it? ;-)
> > >>
> > >> Max
> > >>
> > > It seems more inline with expected behavior, to add the cache checking
> > > in before using the size cache.  Would you be opposed to a v3 with
> > > this check added in?
> > 
> > Of course, will send it tomorrow.
> > 
> > >
> > > One other concern I have is similar to a concern Max raised earlier -
> > > about an external program modifying the raw image, while QEMU has it
> > > opened r/o.  In particular, I wonder about an NFS server making an
> > > image either sparse / non-sparse.  If it was exported read-only, it
> > > may be a valid assumption that this could be done safely, as it would
> > > not change the reported file size or contents, just the allocated size
> > > on disk.
> > 
> > This might be a use case. But if I allow caching the allocated filesize
> > might not always be correct. This is even the case on a NFS share mounted
> > through the kernel where some attributes a cached for some time.
> > 
> > Anyway, would it hurt here if the actual filesize was too small?
> > In fact it was incorrect since libnfs support was added :-)
> > 
> 
> Yeah, I'm not sure what harm it would cause in practice.  It is a
> fairly edge use case to begin with, and a relatively benign side
> affect (especially since you added reopen() support).
> 
> With the cache flag checking, I am comfortable adding my r-b.

I don't remember QEMU's behavior on an LVM volume (which can be resized
underneath QEMU but isn't expected to grow or shrink under normal
operation).

The same semantics should probably be used here.

Stefan

Patch

diff --git a/block/nfs.c b/block/nfs.c
index 02eb4e4..a52e9d5 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -43,6 +43,7 @@  typedef struct NFSClient {
     int events;
     bool has_zero_init;
     AioContext *aio_context;
+    blkcnt_t st_blocks;
 } NFSClient;
 
 typedef struct NFSRPC {
@@ -374,6 +375,7 @@  static int64_t nfs_client_open(NFSClient *client, const char *filename,
     }
 
     ret = DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE);
+    client->st_blocks = st.st_blocks;
     client->has_zero_init = S_ISREG(st.st_mode);
     goto out;
 fail:
@@ -464,6 +466,10 @@  static int64_t nfs_get_allocated_file_size(BlockDriverState *bs)
     NFSRPC task = {0};
     struct stat st;
 
+    if (bdrv_is_read_only(bs)) {
+        return client->st_blocks * 512;
+    }
+
     task.st = &st;
     if (nfs_fstat_async(client->context, client->fh, nfs_co_generic_cb,
                         &task) != 0) {
@@ -484,6 +490,34 @@  static int nfs_file_truncate(BlockDriverState *bs, int64_t offset)
     return nfs_ftruncate(client->context, client->fh, offset);
 }
 
+/* Note that this will not re-establish a connection with the NFS server
+ * - it is effectively a NOP.  */
+static int nfs_reopen_prepare(BDRVReopenState *state,
+                              BlockReopenQueue *queue, Error **errp)
+{
+    NFSClient *client = state->bs->opaque;
+    struct stat st;
+    int ret = 0;
+
+    if (state->flags & BDRV_O_RDWR && bdrv_is_read_only(state->bs)) {
+        error_setg(errp, "Cannot open a read-only mount as read-write");
+        return -EACCES;
+    }
+
+    /* Update cache for read-only reopens */
+    if (!(state->flags & BDRV_O_RDWR)) {
+        ret = nfs_fstat(client->context, client->fh, &st);
+        if (ret < 0) {
+            error_setg(errp, "Failed to fstat file: %s",
+                       nfs_get_error(client->context));
+            return ret;
+        }
+        client->st_blocks = st.st_blocks;
+    }
+
+    return 0;
+}
+
 static BlockDriver bdrv_nfs = {
     .format_name                    = "nfs",
     .protocol_name                  = "nfs",
@@ -499,6 +533,7 @@  static BlockDriver bdrv_nfs = {
     .bdrv_file_open                 = nfs_file_open,
     .bdrv_close                     = nfs_file_close,
     .bdrv_create                    = nfs_file_create,
+    .bdrv_reopen_prepare            = nfs_reopen_prepare,
 
     .bdrv_co_readv                  = nfs_co_readv,
     .bdrv_co_writev                 = nfs_co_writev,