[RFC,0/5] ext4: rework delayed allocated cluster accounting

Message ID 20180513175624.12887-1-enwlinux@gmail.com

Message

Eric Whitney May 13, 2018, 5:56 p.m. UTC
The goals of this patch series are to solve the specific bugs
described in bugzilla #151491 and to arrive at a correct solution
for delayed allocated cluster accounting for bigalloc file systems
generally. Under some circumstances, ext4 currently makes gross
overestimates of the number of reserved clusters required to handle
bigalloc write requests under delayed allocation.  In addition to
premature ENOSPC and quota limit failures, these overestimates tend
to persist over time and in some cases even persist across umounts
and remounts of an affected file system.
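
As a rough illustration only, and not the ext4 code itself: on a bigalloc
file system one cluster holds several blocks, so the reservation a
delayed write needs is the number of distinct clusters it touches, not
the number of blocks.  The sketch below contrasts the two counts.  The
function names, the sorted-array input, and the per-block estimate are
all assumptions made for the example; this message does not say how the
overestimates actually arise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: map a logical block number to its cluster index. */
uint64_t block_to_cluster(uint64_t lblk, uint32_t cluster_ratio)
{
	assert(cluster_ratio > 0);
	return lblk / cluster_ratio;
}

/*
 * Assumed overestimating policy, for contrast only: reserve one
 * cluster for every delayed-allocated block.
 */
size_t reserve_per_block(const uint64_t *lblks, size_t n)
{
	(void)lblks;
	return n;
}

/*
 * Count the distinct clusters covered by an array of logical block
 * numbers.  The array must be sorted in ascending order so that
 * blocks in the same cluster are adjacent.
 */
size_t reserve_per_cluster(const uint64_t *lblks, size_t n,
			   uint32_t cluster_ratio)
{
	size_t count = 0;
	uint64_t prev = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t c = block_to_cluster(lblks[i], cluster_ratio);

		/* A new cluster starts at the first block or on a change. */
		if (i == 0 || c != prev) {
			count++;
			prev = c;
		}
	}
	return count;
}
```

With 16 blocks per cluster, delayed writes to blocks 0, 1, 2, 3, 16 and
17 touch only two clusters, while the per-block estimate would reserve
six.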

This patch series is a work in progress.  Although it does appear to
correct all known delayed cluster accounting deficiencies, and
produces no regressions for all xfstest-bld test cases, it currently
contains a significant bug that adversely affects ext4's direct I/O
code.  The next and hopefully final step is to address that bug.
These patches are being posted now to show the work in progress and
to solicit comments on its general approach and implementation.

All of the patches in the series must be applied as a unit to obtain
a correct solution to the problem.  The individual fixes in this
series have not been submitted separately because they can cause
new failures or misbehavior to appear.  The effect is to make matters
even worse for the use case described in bugzilla #151491 until all
the fixes have been applied together.

This patch series is based on 4.17-rc3.  4.17-rc4 has exhibited a
number of undiagnosed xfstest-bld test failures that appear to be
unrelated to ext4 and which make comparisons with a regression
baseline troublesome.  4.17-rc3 is free of those failures.

Eric Whitney (5):
  ext4: fix reserved cluster accounting at delayed write time
  ext4: reduce reserved cluster count by number of allocated clusters
  ext4: adjust reserved cluster count when removing extents
  ext4:  release delayed allocated clusters when removing block ranges
  ext4: don't release delalloc clusters when invalidating page

 fs/ext4/ext4.h           |  11 +
 fs/ext4/extents.c        | 661 ++++++++++++++++++++++++++++-------------------
 fs/ext4/extents_status.c | 303 ++++++++++++++++++++++
 fs/ext4/extents_status.h |  20 ++
 fs/ext4/inode.c          | 113 ++++----
 fs/ext4/mballoc.c        |  11 +-
 6 files changed, 798 insertions(+), 321 deletions(-)

Comments

Liu Bo May 16, 2018, 1 a.m. UTC | #1
FYI, I'm going thru the whole patch set, the first two patches look good to me.

thanks,
liubo

Eric Whitney June 7, 2018, 9:09 p.m. UTC | #2
* Liu Bo <obuil.liubo@gmail.com>:
> FYI, I'm going thru the whole patch set, the first two patches look good to me.
> 
> thanks,
> liubo

Hi Liu Bo:

Thanks for your review!  At this point, I'd suggest setting your
review of my final three patches aside for now.

The final three patches are more complicated than I'd like because I
was trying very hard to avoid consuming a lot of memory.  It
turns out that Ted is willing to consume some memory to solve the
truncation/hole punching problem those patches try to address,
and he's proposed a promising approach that self-limits its memory
consumption.  I'm going to try implementing his suggestion before going
further with my original patches.  I think the resulting code will be
much simpler.

Eric

