[1/1] man-page: copy_file_range(2) allow for cross-device copies

Message ID 20181026201057.36899-3-olga.kornievskaia@gmail.com
State New
Headers show
Series
  • [1/1] man-page: copy_file_range(2) allow for cross-device copies
Related show

Commit Message

Olga Kornievskaia Oct. 26, 2018, 8:10 p.m.
From: Olga Kornievskaia <kolga@netapp.com>

A proposed VFS change removes the check for the files to reside
under the same file system. Instead, a file system driver implementation
is allowed to perform a cross-device copy_file_range() and if
the file system fails to support it instead fallback to doing
a do_splice copy. Therefore, EXDEV error code only applies to
kernel version prior to such support.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 man2/copy_file_range.2 | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Dave Chinner Oct. 27, 2018, 9:12 a.m. | #1
On Fri, Oct 26, 2018 at 04:10:47PM -0400, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <kolga@netapp.com>
> 
> A proposed VFS change removes the check for the files to reside
> under the same file system. Instead, a file system driver implementation
> is allowed to perform a cross-device copy_file_range() and if
> the file system fails to support it instead fallback to doing
> a do_splice copy. Therefore, EXDEV error code only applies to
> kernel version prior to such support.
> 
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  man2/copy_file_range.2 | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/man2/copy_file_range.2 b/man2/copy_file_range.2
> index 20374ab..88b40bb 100644
> --- a/man2/copy_file_range.2
> +++ b/man2/copy_file_range.2
> @@ -39,7 +39,8 @@ The
>  .BR copy_file_range ()
>  system call performs an in-kernel copy between two file descriptors
>  without the additional cost of transferring data from the kernel to user space
> -and then back into the kernel.
> +and then back into the kernel. Starting kernel version 4.21 passed in
> +file descriptors are not required to be under the same mounted file system.
>  It copies up to
>  .I len
>  bytes of data from file descriptor
> @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
>  .B EXDEV
>  The files referred to by
>  .IR file_in " and " file_out
> -are not on the same mounted filesystem.
> +are not on the same mounted filesystem when the kernel does not support
> +cross device file copy.

Kernel can support cross device file copy, the filesystem may not.

EXDEV
	One of the files specified by file_in and file_out are on a
	filesystem that does not support cross device copies.

Cheers,

Dave.
Matthew Wilcox Oct. 27, 2018, 1:23 p.m. | #2
On Sat, Oct 27, 2018 at 08:12:40PM +1100, Dave Chinner wrote:
> > @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
> >  .B EXDEV
> >  The files referred to by
> >  .IR file_in " and " file_out
> > -are not on the same mounted filesystem.
> > +are not on the same mounted filesystem when the kernel does not support
> > +cross device file copy.
> 
> Kernel can support cross device file copy, the filesystem may not.
> 
> EXDEV
> 	One of the files specified by file_in and file_out are on a
> 	filesystem that does not support cross device copies.

I mentioned this in my last review, and Olga pointed out that one of
the changes in this patch means the kernel will do the copy using
do_splice_direct if the filesystem doesn't support cross-device copying.
We should keep this error documented for those on old kernels, but the
kernel will never return -EXDEV any more.
Dave Chinner Oct. 28, 2018, 1:33 a.m. | #3
On Sat, Oct 27, 2018 at 06:23:39AM -0700, Matthew Wilcox wrote:
> On Sat, Oct 27, 2018 at 08:12:40PM +1100, Dave Chinner wrote:
> > > @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
> > >  .B EXDEV
> > >  The files referred to by
> > >  .IR file_in " and " file_out
> > > -are not on the same mounted filesystem.
> > > +are not on the same mounted filesystem when the kernel does not support
> > > +cross device file copy.
> > 
> > Kernel can support cross device file copy, the filesystem may not.
> > 
> > EXDEV
> > 	One of the files specified by file_in and file_out are on a
> > 	filesystem that does not support cross device copies.
> 
> I mentioned this in my last review, and Olga pointed out that one of
> the changes in this patch means the kernel will do the copy using
> do_splice_direct if the filesystem doesn't support cross-device copying.
> We should keep this error documented for those on old kernels, but the
> kernel will never return -EXDEV any more.

Uh, wot? Where's the patch named "VFS: enable copy_file_range() for
all filesystems"? That shoul dnot be a detail hidden inside some
other patch that multiple people completely miss during review.

If we are completely changing the kernel's behaviour, the patch
should be explicitly named to call out the change of behaviour, and
the commit message should clearly explain the change being made and
why.
 
/me goes looking.

Yup, it is only mentione din passing as a side-effect of an
implementation change. That's back to front. Describe the behaviour
change, what users will see and the reasons for making the change,
leave the code to describe exactly what the change is. Then you can
describe the actions needed to make the new functionality work. e.g.
The first patch shoul dbe described as:

VFS: generic cross-device copy_file_range() support for all filesystems

From: ....

In preparation for enabling cross-device offloading of
copy_file_range() functionality, first enable generic cross-device
support by allowing copy_file_range() to fall back to a page cache
based physical data copy. This means the copy_file_range() syscall
will never fail with EXDEV, and so in future userspace will not need
to detect or provide a fallback for failed cross-device copies
anymore.

This requires pushing the cross device support checks down into the
filesystem ->copy_file_range methods and falling back to the page
cache copy if they return EXDEV.

Further, clone_file_range() is only supported within a filesystem,
so we must also check that the files belong to the same superblock
before calling ->clone_file_range(). If they are on different
superblocks, skip the attempt to clone the file and instead try to
copy the file.

S-o-B: .....


Cheers,

Dave.
Matthew Wilcox Oct. 28, 2018, 2:39 a.m. | #4
On Sun, Oct 28, 2018 at 12:33:07PM +1100, Dave Chinner wrote:
> On Sat, Oct 27, 2018 at 06:23:39AM -0700, Matthew Wilcox wrote:
> > I mentioned this in my last review, and Olga pointed out that one of
> > the changes in this patch means the kernel will do the copy using
> > do_splice_direct if the filesystem doesn't support cross-device copying.
> > We should keep this error documented for those on old kernels, but the
> > kernel will never return -EXDEV any more.
> 
> Uh, wot? Where's the patch named "VFS: enable copy_file_range() for
> all filesystems"? That shoul dnot be a detail hidden inside some
> other patch that multiple people completely miss during review.

Yes, I completely agree.

I'd actually suggest doing the patches the other way around; first push
the -EXDEV check into the filesystems, then make an EXDEV return cause a
call to do_splice_direct().  I think that makes for a more understandable
patch series (ie it's just splitting the last hunk from the existing
patch 1 into a separate patch with an appropriate changelog).

> If we are completely changing the kernel's behaviour, the patch
> should be explicitly named to call out the change of behaviour, and
> the commit message should clearly explain the change being made and
> why.
>  
> /me goes looking.
> 
> Yup, it is only mentione din passing as a side-effect of an
> implementation change. That's back to front. Describe the behaviour
> change, what users will see and the reasons for making the change,
> leave the code to describe exactly what the change is. Then you can
> describe the actions needed to make the new functionality work. e.g.
> The first patch shoul dbe described as:
> 
> VFS: generic cross-device copy_file_range() support for all filesystems
> 
> From: ....
> 
> In preparation for enabling cross-device offloading of
> copy_file_range() functionality, first enable generic cross-device
> support by allowing copy_file_range() to fall back to a page cache
> based physical data copy. This means the copy_file_range() syscall
> will never fail with EXDEV, and so in future userspace will not need
> to detect or provide a fallback for failed cross-device copies
> anymore.
> 
> This requires pushing the cross device support checks down into the
> filesystem ->copy_file_range methods and falling back to the page
> cache copy if they return EXDEV.
> 
> Further, clone_file_range() is only supported within a filesystem,
> so we must also check that the files belong to the same superblock
> before calling ->clone_file_range(). If they are on different
> superblocks, skip the attempt to clone the file and instead try to
> copy the file.
> 
> S-o-B: .....
> 
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Olga Kornievskaia Oct. 29, 2018, 2:25 p.m. | #5
On Sat, Oct 27, 2018 at 9:33 PM Dave Chinner <david@fromorbit.com> wrote:
>
> On Sat, Oct 27, 2018 at 06:23:39AM -0700, Matthew Wilcox wrote:
> > On Sat, Oct 27, 2018 at 08:12:40PM +1100, Dave Chinner wrote:
> > > > @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
> > > >  .B EXDEV
> > > >  The files referred to by
> > > >  .IR file_in " and " file_out
> > > > -are not on the same mounted filesystem.
> > > > +are not on the same mounted filesystem when the kernel does not support
> > > > +cross device file copy.
> > >
> > > Kernel can support cross device file copy, the filesystem may not.
> > >
> > > EXDEV
> > >     One of the files specified by file_in and file_out are on a
> > >     filesystem that does not support cross device copies.
> >
> > I mentioned this in my last review, and Olga pointed out that one of
> > the changes in this patch means the kernel will do the copy using
> > do_splice_direct if the filesystem doesn't support cross-device copying.
> > We should keep this error documented for those on old kernels, but the
> > kernel will never return -EXDEV any more.
>
> Uh, wot? Where's the patch named "VFS: enable copy_file_range() for
> all filesystems"? That shoul dnot be a detail hidden inside some
> other patch that multiple people completely miss during review.

The fact that cross-device check was moved is what allowed for all
filesystems to use copy_file_range(). I think choosing the words of
"moving cross device check" was also appropriate title for the VFS
patch.

To other what you thought was the main change was a side-effect or at
least that's not where the discussion was centered. The discussion was
focused on the fact that there is cross device of same and different
file systems and such discussion deemed appropriate to be noted
clearly in the commit message. The proposed commit message below
doesn't capture it.

I'm fine with the commit message below as well. If that's acceptable
to others. I can change the commit to the wording below.

When I started reading the message I thought your comments where about
the "man-page" patch that it lacked the wording including in the VFS
patch. So to clarify, do you have any objections to the wording in the
man-page patch or was this just about the VFS patch?

> If we are completely changing the kernel's behaviour, the patch
> should be explicitly named to call out the change of behaviour, and
> the commit message should clearly explain the change being made and
> why.
>
> /me goes looking.
>
> Yup, it is only mentione din passing as a side-effect of an
> implementation change. That's back to front. Describe the behaviour
> change, what users will see and the reasons for making the change,
> leave the code to describe exactly what the change is. Then you can
> describe the actions needed to make the new functionality work. e.g.
> The first patch shoul dbe described as:
>
> VFS: generic cross-device copy_file_range() support for all filesystems

> From: ....
>
> In preparation for enabling cross-device offloading of
> copy_file_range() functionality, first enable generic cross-device
> support by allowing copy_file_range() to fall back to a page cache
> based physical data copy. This means the copy_file_range() syscall
> will never fail with EXDEV, and so in future userspace will not need
> to detect or provide a fallback for failed cross-device copies
> anymore.
>
> This requires pushing the cross device support checks down into the
> filesystem ->copy_file_range methods and falling back to the page
> cache copy if they return EXDEV.
>
> Further, clone_file_range() is only supported within a filesystem,
> so we must also check that the files belong to the same superblock
> before calling ->clone_file_range(). If they are on different
> superblocks, skip the attempt to clone the file and instead try to
> copy the file.
>
> S-o-B: .....

>
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
Olga Kornievskaia Oct. 29, 2018, 3:52 p.m. | #6
On Mon, Oct 29, 2018 at 10:25 AM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Sat, Oct 27, 2018 at 9:33 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > On Sat, Oct 27, 2018 at 06:23:39AM -0700, Matthew Wilcox wrote:
> > > On Sat, Oct 27, 2018 at 08:12:40PM +1100, Dave Chinner wrote:
> > > > > @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
> > > > >  .B EXDEV
> > > > >  The files referred to by
> > > > >  .IR file_in " and " file_out
> > > > > -are not on the same mounted filesystem.
> > > > > +are not on the same mounted filesystem when the kernel does not support
> > > > > +cross device file copy.
> > > >
> > > > Kernel can support cross device file copy, the filesystem may not.
> > > >
> > > > EXDEV
> > > >     One of the files specified by file_in and file_out are on a
> > > >     filesystem that does not support cross device copies.
> > >
> > > I mentioned this in my last review, and Olga pointed out that one of
> > > the changes in this patch means the kernel will do the copy using
> > > do_splice_direct if the filesystem doesn't support cross-device copying.
> > > We should keep this error documented for those on old kernels, but the
> > > kernel will never return -EXDEV any more.
> >
> > Uh, wot? Where's the patch named "VFS: enable copy_file_range() for
> > all filesystems"? That shoul dnot be a detail hidden inside some
> > other patch that multiple people completely miss during review.
>
> The fact that cross-device check was moved is what allowed for all
> filesystems to use copy_file_range(). I think choosing the words of
> "moving cross device check" was also appropriate title for the VFS
> patch.

I wasn't completely correct. It was the check removal and then also
allowing for -EXDEV error to keep going with the  do_splice_direct().
I could have left EXDEV being an error but thought it would be
beneficial to add such ability. One reason to keep all the changes in
one patch was said to be so that porting back all of this
functionality is included.

So one option would be keep EXDEV as an error if that's what folks
want then this side-effect would not be a problem and the focus would
be on what it was in the first place : removal (or relocation) of the
cross device check into the filesystems.

> To other what you thought was the main change was a side-effect or at
> least that's not where the discussion was centered. The discussion was
> focused on the fact that there is cross device of same and different
> file systems and such discussion deemed appropriate to be noted
> clearly in the commit message. The proposed commit message below
> doesn't capture it.
>
> I'm fine with the commit message below as well. If that's acceptable
> to others. I can change the commit to the wording below.
>
> When I started reading the message I thought your comments where about
> the "man-page" patch that it lacked the wording including in the VFS
> patch. So to clarify, do you have any objections to the wording in the
> man-page patch or was this just about the VFS patch?
>
> > If we are completely changing the kernel's behaviour, the patch
> > should be explicitly named to call out the change of behaviour, and
> > the commit message should clearly explain the change being made and
> > why.
> >
> > /me goes looking.
> >
> > Yup, it is only mentione din passing as a side-effect of an
> > implementation change. That's back to front. Describe the behaviour
> > change, what users will see and the reasons for making the change,
> > leave the code to describe exactly what the change is. Then you can
> > describe the actions needed to make the new functionality work. e.g.
> > The first patch shoul dbe described as:
> >
> > VFS: generic cross-device copy_file_range() support for all filesystems
>
> > From: ....
> >
> > In preparation for enabling cross-device offloading of
> > copy_file_range() functionality, first enable generic cross-device
> > support by allowing copy_file_range() to fall back to a page cache
> > based physical data copy. This means the copy_file_range() syscall
> > will never fail with EXDEV, and so in future userspace will not need
> > to detect or provide a fallback for failed cross-device copies
> > anymore.
> >
> > This requires pushing the cross device support checks down into the
> > filesystem ->copy_file_range methods and falling back to the page
> > cache copy if they return EXDEV.
> >
> > Further, clone_file_range() is only supported within a filesystem,
> > so we must also check that the files belong to the same superblock
> > before calling ->clone_file_range(). If they are on different
> > superblocks, skip the attempt to clone the file and instead try to
> > copy the file.
> >
> > S-o-B: .....
>
> >
> >
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@fromorbit.com
Amir Goldstein Oct. 29, 2018, 5:49 p.m. | #7
On Mon, Oct 29, 2018 at 5:52 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Mon, Oct 29, 2018 at 10:25 AM Olga Kornievskaia
> <olga.kornievskaia@gmail.com> wrote:
> >
> > On Sat, Oct 27, 2018 at 9:33 PM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > On Sat, Oct 27, 2018 at 06:23:39AM -0700, Matthew Wilcox wrote:
> > > > On Sat, Oct 27, 2018 at 08:12:40PM +1100, Dave Chinner wrote:
> > > > > > @@ -131,7 +132,8 @@ There is not enough space on the target filesystem to complete the copy.
> > > > > >  .B EXDEV
> > > > > >  The files referred to by
> > > > > >  .IR file_in " and " file_out
> > > > > > -are not on the same mounted filesystem.
> > > > > > +are not on the same mounted filesystem when the kernel does not support
> > > > > > +cross device file copy.
> > > > >
> > > > > Kernel can support cross device file copy, the filesystem may not.
> > > > >
> > > > > EXDEV
> > > > >     One of the files specified by file_in and file_out are on a
> > > > >     filesystem that does not support cross device copies.
> > > >
> > > > I mentioned this in my last review, and Olga pointed out that one of
> > > > the changes in this patch means the kernel will do the copy using
> > > > do_splice_direct if the filesystem doesn't support cross-device copying.
> > > > We should keep this error documented for those on old kernels, but the
> > > > kernel will never return -EXDEV any more.
> > >
> > > Uh, wot? Where's the patch named "VFS: enable copy_file_range() for
> > > all filesystems"? That shoul dnot be a detail hidden inside some
> > > other patch that multiple people completely miss during review.
> >
> > The fact that cross-device check was moved is what allowed for all
> > filesystems to use copy_file_range(). I think choosing the words of
> > "moving cross device check" was also appropriate title for the VFS
> > patch.
>
> I wasn't completely correct. It was the check removal and then also
> allowing for -EXDEV error to keep going with the  do_splice_direct().
> I could have left EXDEV being an error but thought it would be
> beneficial to add such ability. One reason to keep all the changes in
> one patch was said to be so that porting back all of this
> functionality is included.
>

Olga,

There are many details in this patch series and repeated miscommunications
about the details. Accuracy is important - "the reason to keep all changes" -
that was a commetn of mine referring *only* to the move of same sb check
from VFS to different filesystems.

The *extra* change of do_splice_direct cross fs was a very nice low
hanging fruit, but completely unrelated to your original  motivation.
I completely agree with Dave that it should be in a separate patch.
I also side with Dave that it would make sense as the first patch of the series.

> So one option would be keep EXDEV as an error if that's what folks
> want then this side-effect would not be a problem and the focus would
> be on what it was in the first place : removal (or relocation) of the
> cross device check into the filesystems.
>

This is up to you. It's an extra change and not related to your NFS work
and you may submit it or not. I think it will be nice if you did submit it.
There was no objection to this change, only objection was to bundle it together
with a different unrelated patch.

Thanks,
Amir.

Patch

diff --git a/man2/copy_file_range.2 b/man2/copy_file_range.2
index 20374ab..88b40bb 100644
--- a/man2/copy_file_range.2
+++ b/man2/copy_file_range.2
@@ -39,7 +39,8 @@  The
 .BR copy_file_range ()
 system call performs an in-kernel copy between two file descriptors
 without the additional cost of transferring data from the kernel to user space
-and then back into the kernel.
+and then back into the kernel. Starting kernel version 4.21 passed in
+file descriptors are not required to be under the same mounted file system.
 It copies up to
 .I len
 bytes of data from file descriptor
@@ -131,7 +132,8 @@  There is not enough space on the target filesystem to complete the copy.
 .B EXDEV
 The files referred to by
 .IR file_in " and " file_out
-are not on the same mounted filesystem.
+are not on the same mounted filesystem when the kernel does not support
+cross device file copy.
 .SH VERSIONS
 The
 .BR copy_file_range ()