
xfstest 614 and allocation size should not be 0

Message ID CAH2r5mvD4CRP_iZRNhS+809wxOj8FSc4FHbN+UVFp8+pMJcpyg@mail.gmail.com
State New
Series xfstest 614 and allocation size should not be 0

Commit Message

Steve French March 17, 2021, 6:10 a.m. UTC
Was examining why xfstest 614 failed (smb3.1.1 mount to Windows), and
noticed a couple of problems:

1) we don't requery the allocation size (number of blocks) in all
cases we should. This small fix (the patch below) should address that, and

2) we don't set the allocation size in the case where a cached file on
this client is written, and to make it worse, if we queried (post query
attributes flag) at SMB3 close for compounded operations, the Windows
server (not sure about others) apparently doesn't update the allocation
size until the next open/queryinfo, so we still end up with an
allocation size of 0 for a 64K file, which breaks the test.
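
To illustrate the kind of client-side bookkeeping 2) implies, a minimal
sketch follows; the helper name and the rounding are illustrative
assumptions, not the actual cifs change:

#include <linux/fs.h>

/* Hypothetical helper (not the real patch): after a cached write
 * extends the file, bump the cached block count so a later stat()
 * doesn't report 0 blocks while we wait for the next query to the
 * server.  i_blocks is counted in 512-byte units.
 */
static void cifs_update_cached_blocks(struct inode *inode, loff_t new_size)
{
	spin_lock(&inode->i_lock);
	if (new_size > inode->i_size)
		inode->i_blocks = (new_size + 511) >> 9;
	spin_unlock(&inode->i_lock);
}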

What the test is doing is quite simple:

xfs_io -f -c "truncate 64K" -c "mmap -w 0 64K" -c "mwrite -S 0xab 0
64K" -c "munmap" foo1 ; stat -c %b foo1

And it fails - due to seeing a number of blocks 0 rather than the
correct value (128).  With actimeo=0 we do a subsequent open/query
operation which would cause it to pass since the second open/query
does show the correct allocation size.
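
For reference, stat -c %b boils down to a statx() call requesting
STATX_BLOCKS, which is reported in 512-byte units, so a fully allocated
64K file should show 65536 / 512 = 128 blocks. A minimal userspace
sketch (not part of the patch; assumes a glibc with statx()):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	struct statx stx;

	if (argc < 2 || statx(AT_FDCWD, argv[1], 0,
			      STATX_SIZE | STATX_BLOCKS, &stx))
		return 1;
	/* stx_blocks is in 512-byte units; expect 128 for a 64K file */
	printf("size=%llu blocks=%llu\n",
	       (unsigned long long)stx.stx_size,
	       (unsigned long long)stx.stx_blocks);
	return 0;
}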

Any ideas?

Comments

Tom Talpey March 17, 2021, 11:25 a.m. UTC | #1
On 3/17/2021 2:10 AM, Steve French wrote:
> Was examining why xfstest 614 failed (smb3.1.1 mount to Windows), and
> noticed a couple of problems:
> 
> 1) we don't requery the allocation size (number of blocks) in all
> cases we should. This small fix should address that
> 
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -2390,15 +2390,16 @@ int cifs_getattr(struct user_namespace
> *mnt_userns, const struct path *path,
>          struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
>          struct inode *inode = d_inode(dentry);
>          int rc;
>          /*
>           * We need to be sure that all dirty pages are written and the server
>           * has actual ctime, mtime and file length.
>           */
> -       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> +       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> STATX_BLOCKS)) &&

Seems obviously enough correct.

> and 2) we don't set the allocation size on the case where a cached
> file on this client is written, and to make it worse if we queried

Also obviously the cache needs to be kept in sync, but is it accurate to
set the allocation size before writing? That's the server's field, so
shouldn't it be written, then queried?

> (post query attributes flag) at SMB3 close for compounded operations -
> the Windows server (not sure about others) apparently doesn't update
> the allocation size until the next open/queryinfo so we still end up
> with an allocation size of 0 for a 64K file which breaks the test.
> 
> What the test is doing is quite simple:
> 
> xfs_io -f -c "truncate 64K" -c "mmap -w 0 64K" -c "mwrite -S 0xab 0
> 64K" -c "munmap" foo1 ; stat -c %b foo1
> 
> And it fails - due to seeing a number of blocks 0 rather than the
> correct value (128).  With actimeo=0 we do a subsequent open/query
> operation which would cause it to pass since the second open/query
> does show the correct allocation size.
> 
> Any ideas?

What actually goes on the wire during the test? It looks like the
munmap step should be msync'ing - does cifs.ko not write the data?

Tom.
Steve French March 17, 2021, 12:18 p.m. UTC | #2
On Wed, Mar 17, 2021 at 6:25 AM Tom Talpey <tom@talpey.com> wrote:
>
> On 3/17/2021 2:10 AM, Steve French wrote:
> > Was examining why xfstest 614 failed (smb3.1.1 mount to Windows), and
> > noticed a couple of problems:
> >
> > 1) we don't requery the allocation size (number of blocks) in all
> > cases we should. This small fix should address that
> >
> > --- a/fs/cifs/inode.c
> > +++ b/fs/cifs/inode.c
> > @@ -2390,15 +2390,16 @@ int cifs_getattr(struct user_namespace
> > *mnt_userns, const struct path *path,
> >          struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
> >          struct inode *inode = d_inode(dentry);
> >          int rc;
> >          /*
> >           * We need to be sure that all dirty pages are written and the server
> >           * has actual ctime, mtime and file length.
> >           */
> > -       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
> > +       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
> > STATX_BLOCKS)) &&
>
> Seems obviously enough correct.
>
> > and 2) we don't set the allocation size on the case where a cached
> > file on this client is written, and to make it worse if we queried
>
> Also obviously the cache needs to be kept in sync, but is it accurate to
> set the allocation size before writing? That's the server's field, so
> shouldn't it be written, then queried?
>
> > (post query attributes flag) at SMB3 close for compounded operations -
> > the Windows server (not sure about others) apparently doesn't update
> > the allocation size until the next open/queryinfo so we still end up
> > with an allocation size of 0 for a 64K file which breaks the test.
> >
> > What the test is doing is quite simple:
> >
> > xfs_io -f -c "truncate 64K" -c "mmap -w 0 64K" -c "mwrite -S 0xab 0
> > 64K" -c "munmap" foo1 ; stat -c %b foo1
> >
> > And it fails - due to seeing a number of blocks 0 rather than the
> > correct value (128).  With actimeo=0 we do a subsequent open/query
> > operation which would cause it to pass since the second open/query
> > does show the correct allocation size.
> >
> > Any ideas?
>
> What actually goes on the wire during the test? It looks like the
> munmap step should be msync'ing - does cifs.ko not write the data?

Oddly enough this works to Azure but not Windows. What we see
on the wire is simple enough:

1) create/getinfo foo1 --> file not found
2) create foo1
3) oplock break of root's cached handle
4) close root handle
5) open of root/getinfo("FileFsFullSizeInformation")/Close
6) Flush foo1
7) ioctl set sparse foo1
8) setinfo FILE_ENDOFFILE_INFO foo1
9) create/setinfo(FILE_BASIC_INFO)/close of foo1
10) read 64K foo1 (all zeros)
11) write 64K foo1
12) close foo1   (post query attributes of close show size 64K,
allocation size 0  ---> should be allocation size 64K)

But it works to Azure ...
Tom Talpey March 17, 2021, 1:04 p.m. UTC | #3
On 3/17/2021 8:18 AM, Steve French wrote:
> On Wed, Mar 17, 2021 at 6:25 AM Tom Talpey <tom@talpey.com> wrote:
>>
>> On 3/17/2021 2:10 AM, Steve French wrote:
>>> Was examining why xfstest 614 failed (smb3.1.1 mount to Windows), and
>>> noticed a couple of problems:
>>>
>>> 1) we don't requery the allocation size (number of blocks) in all
>>> cases we should. This small fix should address that
>>>
>>> --- a/fs/cifs/inode.c
>>> +++ b/fs/cifs/inode.c
>>> @@ -2390,15 +2390,16 @@ int cifs_getattr(struct user_namespace
>>> *mnt_userns, const struct path *path,
>>>           struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
>>>           struct inode *inode = d_inode(dentry);
>>>           int rc;
>>>           /*
>>>            * We need to be sure that all dirty pages are written and the server
>>>            * has actual ctime, mtime and file length.
>>>            */
>>> -       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
>>> +       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
>>> STATX_BLOCKS)) &&
>>
>> Seems obviously enough correct.
>>
>>> and 2) we don't set the allocation size on the case where a cached
>>> file on this client is written, and to make it worse if we queried
>>
>> Also obviously the cache needs to be kept in sync, but is it accurate to
>> set the allocation size before writing? That's the server's field, so
>> shouldn't it be written, then queried?
>>
>>> (post query attributes flag) at SMB3 close for compounded operations -
>>> the Windows server (not sure about others) apparently doesn't update
>>> the allocation size until the next open/queryinfo so we still end up
>>> with an allocation size of 0 for a 64K file which breaks the test.
>>>
>>> What the test is doing is quite simple:
>>>
>>> xfs_io -f -c "truncate 64K" -c "mmap -w 0 64K" -c "mwrite -S 0xab 0
>>> 64K" -c "munmap" foo1 ; stat -c %b foo1
>>>
>>> And it fails - due to seeing a number of blocks 0 rather than the
>>> correct value (128).  With actimeo=0 we do a subsequent open/query
>>> operation which would cause it to pass since the second open/query
>>> does show the correct allocation size.
>>>
>>> Any ideas?
>>
>> What actually goes on the wire during the test? It looks like the
>> munmap step should be msync'ing - does cifs.ko not write the data?
> 
> Oddly enough this works to Azure but not Windows. What we see
> on the wire is simple enough:
> 
> 1) create/getinfo foo1 --> file not found
> 2) create foo1
> 3) oplock break of root's cached handle
> 4) close root handle
> 5) open of root/getinfo("FileFsFullSizeInformation")/Close
> 6) Flush foo1
> 7) ioctl set sparse foo1
> 8) setinfo FILE_ENDOFFILE_INFO foo1
> 9) create/setinfo(FILE_BASIC_INFO)/close of foo1
> 10) read 64K foo1 (all zeros)
> 11) write 64K foo1
> 12) close foo1   (post query attributes of close show size 64K,
> allocation size 0  ---> should be allocation size 64K)
> 
> But it works to Azure ...

Wait, let me find my Spock ears.

Faaascinating.

Obviously the backend is doing some kind of deferred allocation.
Assuming this is NTFS, have you tried an alternative? I guess I'd
also suggest asking Neal, if you can reach him easily.

If it is deferring the block decision, I'd be cautious about
implementing a workaround like launching a second query attributes.
There's little guarantee it will always be correct.

If you can establish that the client is doing the right thing,
personally I'd implement some sort of waiver. Maybe make it dependent
on the server and its FS type, even.

I looked at MS-FSA and it explicitly states the AllocationSize is
updated as part of processing every write. Section 2.1.5.3:

> If ((ByteOffset+ ByteCount) > Open.Stream.AllocationSize), the object
> store MUST increase Open.Stream.AllocationSize to
> BlockAlign(ByteOffset+ ByteCount, Open.File.Volume.ClusterSize). If
> there is not enough disk space, the operation MUST be failed with
> STATUS_DISK_FULL.
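
For what it's worth, the BlockAlign() arithmetic quoted above works out
as below (a minimal sketch; the 4K cluster size is an assumption for
illustration): a 64K write at offset 0 should leave AllocationSize at
64K, never 0.

#include <stdint.h>
#include <stdio.h>

static uint64_t block_align(uint64_t len, uint64_t cluster_size)
{
	return (len + cluster_size - 1) / cluster_size * cluster_size;
}

int main(void)
{
	uint64_t byte_offset = 0, byte_count = 64 * 1024;

	/* prints 65536, i.e. the AllocationSize MS-FSA says the server
	 * must have reached by the time the write completes */
	printf("%llu\n",
	       (unsigned long long)block_align(byte_offset + byte_count, 4096));
	return 0;
}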

So this might be worth a message to dochelp.

Tom.

Patch

--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -2390,15 +2390,16 @@  int cifs_getattr(struct user_namespace
*mnt_userns, const struct path *path,
        struct cifs_tcon *tcon = cifs_sb_master_tcon(cifs_sb);
        struct inode *inode = d_inode(dentry);
        int rc;
        /*
         * We need to be sure that all dirty pages are written and the server
         * has actual ctime, mtime and file length.
         */
-       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE)) &&
+       if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_SIZE |
STATX_BLOCKS)) &&

