[07/11] vfs: copy_file_range should update file timestamps

Message ID 20181203083416.28978-8-david@fromorbit.com
State New
Headers show
Series
  • fs: fixes for major copy_file_range() issues
Related show

Commit Message

Dave Chinner Dec. 3, 2018, 8:34 a.m.
From: Dave Chinner <dchinner@redhat.com>

Timestamps are not updated right now, so programs looking for
timestamp updates for file modifications (like rsync) will not
detect that files have changed. We are also accessing the source
data when doing a copy (but not when cloning) so we need to update
atime on the source file as well.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/read_write.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Amir Goldstein Dec. 3, 2018, 10:47 a.m. | #1
On Mon, Dec 3, 2018 at 10:34 AM Dave Chinner <david@fromorbit.com> wrote:
>
> From: Dave Chinner <dchinner@redhat.com>
>
> Timestamps are not updated right now, so programs looking for
> timestamp updates for file modifications (like rsync) will not
> detect that files have changed. We are also accessing the source
> data when doing a copy (but not when cloning) so we need to update
> atime on the source file as well.
>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/read_write.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 3b101183ea19..3288db1d5f21 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1576,6 +1576,16 @@ static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
>  {
>         ssize_t ret;
>
> +       /* Update source timestamps, because we are accessing file data */
> +       file_accessed(file_in);
> +
> +       /* Update destination timestamps, since we can alter file contents. */
> +       if (!(file_out->f_mode & FMODE_NOCMTIME)) {
> +               ret = file_update_time(file_out);
> +               if (ret)
> +                       return ret;
> +       }
> +

If there is a consistency about who is responsible of calling file_accessed()
and file_update_time() it eludes me. grep tells me that they are mostly
handled by filesystem code or generic helpers called by filesystem code
and not in the vfs helpers.

FMODE_NOCMTIME seems like an xfs specific flag (for DMAPI?), which
most generic callers of file_update_time() completely ignore.
This seems like another argument in favor of leaving the responsibility
of the timestamp updates to the filesystem.

Maybe I am missing something?

Thanks,
Amir.
Olga Kornievskaia Dec. 3, 2018, 5:33 p.m. | #2
On Mon, Dec 3, 2018 at 5:47 AM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Mon, Dec 3, 2018 at 10:34 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > Timestamps are not updated right now, so programs looking for
> > timestamp updates for file modifications (like rsync) will not
> > detect that files have changed. We are also accessing the source
> > data when doing a copy (but not when cloning) so we need to update
> > atime on the source file as well.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/read_write.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index 3b101183ea19..3288db1d5f21 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -1576,6 +1576,16 @@ static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> >  {
> >         ssize_t ret;
> >
> > +       /* Update source timestamps, because we are accessing file data */
> > +       file_accessed(file_in);
> > +
> > +       /* Update destination timestamps, since we can alter file contents. */
> > +       if (!(file_out->f_mode & FMODE_NOCMTIME)) {
> > +               ret = file_update_time(file_out);
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
>
> If there is a consistency about who is responsible of calling file_accessed()
> and file_update_time() it eludes me. grep tells me that they are mostly
> handled by filesystem code or generic helpers called by filesystem code
> and not in the vfs helpers.
>
> FMODE_NOCMTIME seems like an xfs specific flag (for DMAPI?), which
> most generic callers of file_update_time() completely ignore.
> This seems like another argument in favor of leaving the responsibility
> of the timestamp updates to the filesystem.
>
> Maybe I am missing something?
>

I had similar question before about who is responsible for doing the
checks. I agree that attributes should be updated for the case when no
filesystem support exist for copy_file_range() but this code does it
for all the cases. I also wonder if it's appropriate to update the
attributes before the copy is actually done?

> Thanks,
> Amir.
Darrick J. Wong Dec. 3, 2018, 6:22 p.m. | #3
On Mon, Dec 03, 2018 at 12:33:50PM -0500, Olga Kornievskaia wrote:
> On Mon, Dec 3, 2018 at 5:47 AM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Mon, Dec 3, 2018 at 10:34 AM Dave Chinner <david@fromorbit.com> wrote:
> > >
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > Timestamps are not updated right now, so programs looking for
> > > timestamp updates for file modifications (like rsync) will not
> > > detect that files have changed. We are also accessing the source
> > > data when doing a copy (but not when cloning) so we need to update
> > > atime on the source file as well.
> > >
> > > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > > ---
> > >  fs/read_write.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/fs/read_write.c b/fs/read_write.c
> > > index 3b101183ea19..3288db1d5f21 100644
> > > --- a/fs/read_write.c
> > > +++ b/fs/read_write.c
> > > @@ -1576,6 +1576,16 @@ static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> > >  {
> > >         ssize_t ret;
> > >
> > > +       /* Update source timestamps, because we are accessing file data */
> > > +       file_accessed(file_in);
> > > +
> > > +       /* Update destination timestamps, since we can alter file contents. */
> > > +       if (!(file_out->f_mode & FMODE_NOCMTIME)) {
> > > +               ret = file_update_time(file_out);
> > > +               if (ret)
> > > +                       return ret;
> > > +       }
> > > +
> >
> > If there is a consistency about who is responsible of calling file_accessed()
> > and file_update_time() it eludes me. grep tells me that they are mostly
> > handled by filesystem code or generic helpers called by filesystem code
> > and not in the vfs helpers.
> >
> > FMODE_NOCMTIME seems like an xfs specific flag (for DMAPI?), which
> > most generic callers of file_update_time() completely ignore.
> > This seems like another argument in favor of leaving the responsibility
> > of the timestamp updates to the filesystem.
> >
> > Maybe I am missing something?
> >
> 
> I had similar question before about who is responsible for doing the
> checks. I agree that attributes should be updated for the case when no
> filesystem support exist for copy_file_range() but this code does it
> for all the cases. I also wonder if it's appropriate to update the
> attributes before the copy is actually done?

The other functions that change file contents (write, clonerange) update
mtime and remove suid before initiating the operation.  For mtime I
think we should maintain consistent behavior, and for suid removal we
definitely need to revoke that before we change the file contents.

--D

> > Thanks,
> > Amir.
Dave Chinner Dec. 3, 2018, 11:19 p.m. | #4
On Mon, Dec 03, 2018 at 12:47:39PM +0200, Amir Goldstein wrote:
> On Mon, Dec 3, 2018 at 10:34 AM Dave Chinner <david@fromorbit.com> wrote:
> >
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > Timestamps are not updated right now, so programs looking for
> > timestamp updates for file modifications (like rsync) will not
> > detect that files have changed. We are also accessing the source
> > data when doing a copy (but not when cloning) so we need to update
> > atime on the source file as well.
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > ---
> >  fs/read_write.c | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index 3b101183ea19..3288db1d5f21 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -1576,6 +1576,16 @@ static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
> >  {
> >         ssize_t ret;
> >
> > +       /* Update source timestamps, because we are accessing file data */
> > +       file_accessed(file_in);
> > +
> > +       /* Update destination timestamps, since we can alter file contents. */
> > +       if (!(file_out->f_mode & FMODE_NOCMTIME)) {
> > +               ret = file_update_time(file_out);
> > +               if (ret)
> > +                       return ret;
> > +       }
> > +
> 
> If there is a consistency about who is responsible of calling file_accessed()
> and file_update_time() it eludes me. grep tells me that they are mostly
> handled by filesystem code or generic helpers called by filesystem code
> and not in the vfs helpers.

This isn't the "vfs helper" - this is the code that executes a data
copy. We have to do these timestamp updates regardless of the copy
mechanism used so it makes no real sense to force every
implementation to do it, and then also have to ensure the generic
fallback does it as well. Do it once for everyone, then nobody else
needs to care about it.

> FMODE_NOCMTIME seems like an xfs specific flag (for DMAPI?), which

It's a generic VFS flag that originally only XFS used. We check it
in places where data IO to XFS files might be done. Given that we
have vfs functions doing write on behalf of XFS filesystems (such as
remap_file_range() and copy_file_range() the timestamp updates need
to check this flag.

> most generic callers of file_update_time() completely ignore.

Because most cases don't get called from a context that can have
FMODE_NOCMTIME set. If more filesystems start to use FMODE_NOCMTIME
then it will have to be more widely checked.

Cheers,

Dave.
Christoph Hellwig Dec. 4, 2018, 3:24 p.m. | #5
On Mon, Dec 03, 2018 at 07:34:12PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Timestamps are not updated right now, so programs looking for
> timestamp updates for file modifications (like rsync) will not
> detect that files have changed. We are also accessing the source
> data when doing a copy (but not when cloning) so we need to update
> atime on the source file as well.

This needs to be done inside the method, as a few file systems
do odd things about timestamps (yes, even now that XFS doesn't do that
anymore :)).

Patch

diff --git a/fs/read_write.c b/fs/read_write.c
index 3b101183ea19..3288db1d5f21 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1576,6 +1576,16 @@  static ssize_t do_copy_file_range(struct file *file_in, loff_t pos_in,
 {
 	ssize_t ret;
 
+	/* Update source timestamps, because we are accessing file data */
+	file_accessed(file_in);
+
+	/* Update destination timestamps, since we can alter file contents. */
+	if (!(file_out->f_mode & FMODE_NOCMTIME)) {
+		ret = file_update_time(file_out);
+		if (ret)
+			return ret;
+	}
+
 	/*
 	 * Clear the security bits if the process is not being run by root.
 	 * This keeps people from modifying setuid and setgid binaries.