mbox series

[00/12] block/bio, fs: convert put_page() to put_user_page*()

Message ID 20190724042518.14363-1-jhubbard@nvidia.com
Headers show
Series block/bio, fs: convert put_page() to put_user_page*() | expand

Message

john.hubbard@gmail.com July 24, 2019, 4:25 a.m. UTC
From: John Hubbard <jhubbard@nvidia.com>

Hi,

This is mostly Jerome's work, converting the block/bio and related areas
to call put_user_page*() instead of put_page(). Because I've changed
Jerome's patches, in some cases significantly, I'd like to get his
feedback before we actually leave him listed as the author (he might
want to disown some or all of these).

I added a new patch, in order to make this work with Christoph Hellwig's
recent overhaul to bio_release_pages(): "block: bio_release_pages: use
flags arg instead of bool".

I've started the series with a patch that I've posted in another
series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
because I'm not sure which of these will go in first, and this allows each
to stand alone.

Testing: not much beyond build and boot testing has been done yet. And
I'm not set up to even exercise all of it (especially the IB parts) at
run time.

Anyway, changes here are:

* Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
  Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
  it is time to release the pages. That allows choosing between put_page()
  and put_user_page*().

* Pass in one more piece of information to bio_release_pages: a "from_gup"
  parameter. Similar use as above.

* Change the block layer, and several file systems, to use
  put_user_page*().

[1] https://lore.kernel.org/r/20190724012606.25844-2-jhubbard@nvidia.com
    And please note the correction email that I posted as a follow-up,
    if you're looking closely at that patch. :) The fixed version is
    included here.

John Hubbard (3):
  mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
  block: bio_release_pages: use flags arg instead of bool
  fs/ceph: fix a build warning: returning a value from void function

Jérôme Glisse (9):
  iov_iter: add helper to test if an iter would use GUP v2
  block: bio_release_pages: convert put_page() to put_user_page*()
  block_dev: convert put_page() to put_user_page*()
  fs/nfs: convert put_page() to put_user_page*()
  vhost-scsi: convert put_page() to put_user_page*()
  fs/cifs: convert put_page() to put_user_page*()
  fs/fuse: convert put_page() to put_user_page*()
  fs/ceph: convert put_page() to put_user_page*()
  9p/net: convert put_page() to put_user_page*()

 block/bio.c                                |  81 ++++++++++++---
 drivers/infiniband/core/umem.c             |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c        |   8 +-
 drivers/vhost/scsi.c                       |  13 ++-
 fs/block_dev.c                             |  22 +++-
 fs/ceph/debugfs.c                          |   2 +-
 fs/ceph/file.c                             |  62 ++++++++---
 fs/cifs/cifsglob.h                         |   3 +
 fs/cifs/file.c                             |  22 +++-
 fs/cifs/misc.c                             |  19 +++-
 fs/direct-io.c                             |   2 +-
 fs/fuse/dev.c                              |  22 +++-
 fs/fuse/file.c                             |  53 +++++++---
 fs/nfs/direct.c                            |  10 +-
 include/linux/bio.h                        |  22 +++-
 include/linux/mm.h                         |   5 +-
 include/linux/uio.h                        |  11 ++
 mm/gup.c                                   | 115 +++++++++------------
 net/9p/trans_common.c                      |  14 ++-
 net/9p/trans_common.h                      |   3 +-
 net/9p/trans_virtio.c                      |  18 +++-
 24 files changed, 357 insertions(+), 170 deletions(-)

Comments

Christoph Hellwig July 24, 2019, 6:17 a.m. UTC | #1
On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote:
> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>   it is time to release the pages. That allows choosing between put_page()
>   and put_user_page*().
> 
> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>   parameter. Similar use as above.
> 
> * Change the block layer, and several file systems, to use
>   put_user_page*().

I think we can do this in a simple and better way.  We have 5 ITER_*
types.  Of those ITER_DISCARD as the name suggests never uses pages, so
we can skip handling it.  ITER_PIPE is rejected іn the direct I/O path,
which leaves us with three.

Out of those ITER_BVEC needs a user page reference, so we want to call
put_user_page* on it.  ITER_BVEC always already has page reference,
which means in the block direct I/O path path we alread don't take
a page reference.  We should extent that handling to all other calls
of iov_iter_get_pages / iov_iter_get_pages_alloc.  I think we should
just reject ITER_KVEC for direct I/O as well as we have no users and
it is rather pointless.  Alternatively if we see a use for it the
callers should always have a life page reference anyway (or might
be on kmalloc memory), so we really should not take a reference either.

In other words:  the only time we should ever have to put a page in
this patch is when they are user pages.  We'll need to clean up
various bits of code for that, but that can be done gradually before
even getting to the actual put_user_pages conversion.
John Hubbard July 24, 2019, 11:23 p.m. UTC | #2
On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>>   it is time to release the pages. That allows choosing between put_page()
>>   and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>>   parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>>   put_user_page*().
> 
> I think we can do this in a simple and better way.  We have 5 ITER_*
> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it.  ITER_PIPE is rejected іn the direct I/O path,
> which leaves us with three.
> 
> Out of those ITER_BVEC needs a user page reference, so we want to call

               ^ ITER_IOVEC, I hope. Otherwise I'm hopeless lost. :)

> put_user_page* on it.  ITER_BVEC always already has page reference,
> which means in the block direct I/O path path we alread don't take
> a page reference.  We should extent that handling to all other calls
> of iov_iter_get_pages / iov_iter_get_pages_alloc.  I think we should
> just reject ITER_KVEC for direct I/O as well as we have no users and
> it is rather pointless.  Alternatively if we see a use for it the
> callers should always have a life page reference anyway (or might
> be on kmalloc memory), so we really should not take a reference either.
> 
> In other words:  the only time we should ever have to put a page in
> this patch is when they are user pages.  We'll need to clean up
> various bits of code for that, but that can be done gradually before
> even getting to the actual put_user_pages conversion.
> 

Sounds great. I'm part way into it and it doesn't look too bad. The main
question is where to scatter various checks and assertions, to keep
the kvecs out of direct I/0. Or at least keep the gups away from 
direct I/0.


thanks,
Bob Liu July 25, 2019, 12:41 a.m. UTC | #3
On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote:
> From: John Hubbard <jhubbard@nvidia.com>
> 
> Hi,
> 
> This is mostly Jerome's work, converting the block/bio and related areas
> to call put_user_page*() instead of put_page(). Because I've changed
> Jerome's patches, in some cases significantly, I'd like to get his
> feedback before we actually leave him listed as the author (he might
> want to disown some or all of these).
> 

Could you add some background to the commit log for people don't have the context..
Why this converting? What's the main differences?

Regards, -Bob

> I added a new patch, in order to make this work with Christoph Hellwig's
> recent overhaul to bio_release_pages(): "block: bio_release_pages: use
> flags arg instead of bool".
> 
> I've started the series with a patch that I've posted in another
> series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
> because I'm not sure which of these will go in first, and this allows each
> to stand alone.
> 
> Testing: not much beyond build and boot testing has been done yet. And
> I'm not set up to even exercise all of it (especially the IB parts) at
> run time.
> 
> Anyway, changes here are:
> 
> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>   it is time to release the pages. That allows choosing between put_page()
>   and put_user_page*().
> 
> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>   parameter. Similar use as above.
> 
> * Change the block layer, and several file systems, to use
>   put_user_page*().
> 
> [1] https://urldefense.proofpoint.com/v2/url?u=https-3A__lore.kernel.org_r_20190724012606.25844-2D2-2Djhubbard-40nvidia.com&d=DwIDaQ&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=1ktT0U2YS_I8Zz2o-MS1YcCAzWZ6hFGtyTgvVMGM7gI&m=FpFhv2rjbKCAYGmO6Hy8WJAottr1Qz_mDKDLObQ40FU&s=q-_mX3daEr22WbdZMElc_ZbD8L9oGLD7U0xLeyJ661Y&e= 
>     And please note the correction email that I posted as a follow-up,
>     if you're looking closely at that patch. :) The fixed version is
>     included here.
> 
> John Hubbard (3):
>   mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
>   block: bio_release_pages: use flags arg instead of bool
>   fs/ceph: fix a build warning: returning a value from void function
> 
> Jérôme Glisse (9):
>   iov_iter: add helper to test if an iter would use GUP v2
>   block: bio_release_pages: convert put_page() to put_user_page*()
>   block_dev: convert put_page() to put_user_page*()
>   fs/nfs: convert put_page() to put_user_page*()
>   vhost-scsi: convert put_page() to put_user_page*()
>   fs/cifs: convert put_page() to put_user_page*()
>   fs/fuse: convert put_page() to put_user_page*()
>   fs/ceph: convert put_page() to put_user_page*()
>   9p/net: convert put_page() to put_user_page*()
> 
>  block/bio.c                                |  81 ++++++++++++---
>  drivers/infiniband/core/umem.c             |   5 +-
>  drivers/infiniband/hw/hfi1/user_pages.c    |   5 +-
>  drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
>  drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
>  drivers/infiniband/sw/siw/siw_mem.c        |   8 +-
>  drivers/vhost/scsi.c                       |  13 ++-
>  fs/block_dev.c                             |  22 +++-
>  fs/ceph/debugfs.c                          |   2 +-
>  fs/ceph/file.c                             |  62 ++++++++---
>  fs/cifs/cifsglob.h                         |   3 +
>  fs/cifs/file.c                             |  22 +++-
>  fs/cifs/misc.c                             |  19 +++-
>  fs/direct-io.c                             |   2 +-
>  fs/fuse/dev.c                              |  22 +++-
>  fs/fuse/file.c                             |  53 +++++++---
>  fs/nfs/direct.c                            |  10 +-
>  include/linux/bio.h                        |  22 +++-
>  include/linux/mm.h                         |   5 +-
>  include/linux/uio.h                        |  11 ++
>  mm/gup.c                                   | 115 +++++++++------------
>  net/9p/trans_common.c                      |  14 ++-
>  net/9p/trans_common.h                      |   3 +-
>  net/9p/trans_virtio.c                      |  18 +++-
>  24 files changed, 357 insertions(+), 170 deletions(-)
>
John Hubbard July 26, 2019, 1:24 a.m. UTC | #4
On 7/24/19 5:41 PM, Bob Liu wrote:
> On 7/24/19 12:25 PM, john.hubbard@gmail.com wrote:
>> From: John Hubbard <jhubbard@nvidia.com>
>>
>> Hi,
>>
>> This is mostly Jerome's work, converting the block/bio and related areas
>> to call put_user_page*() instead of put_page(). Because I've changed
>> Jerome's patches, in some cases significantly, I'd like to get his
>> feedback before we actually leave him listed as the author (he might
>> want to disown some or all of these).
>>
> 
> Could you add some background to the commit log for people don't have the context..
> Why this converting? What's the main differences?
> 

Hi Bob,

1. Many of the patches have a blurb like this:

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

...and if you look at that commit, you'll find several pages of
information in its commit description, which should address your point.

2. This whole series has to be re-worked, as per the other feedback thread.
So I'll keep your comment in mind when I post a new series.

thanks,
John Hubbard Aug. 5, 2019, 10:54 p.m. UTC | #5
On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>>   it is time to release the pages. That allows choosing between put_page()
>>   and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>>   parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>>   put_user_page*().
> 
> I think we can do this in a simple and better way.  We have 5 ITER_*
> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it.  ITER_PIPE is rejected іn the direct I/O path,
> which leaves us with three.
> 

Hi Christoph,

Are you working on anything like this? Or on the put_user_bvec() idea?
Please let me know, otherwise I'll go in and implement something here.


thanks,
Christoph Hellwig Aug. 7, 2019, 6:34 a.m. UTC | #6
On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote:
> On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> > On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubbard@gmail.com wrote:
> >> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
> >>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
> >>   it is time to release the pages. That allows choosing between put_page()
> >>   and put_user_page*().
> >>
> >> * Pass in one more piece of information to bio_release_pages: a "from_gup"
> >>   parameter. Similar use as above.
> >>
> >> * Change the block layer, and several file systems, to use
> >>   put_user_page*().
> > 
> > I think we can do this in a simple and better way.  We have 5 ITER_*
> > types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> > we can skip handling it.  ITER_PIPE is rejected іn the direct I/O path,
> > which leaves us with three.
> > 
> 
> Hi Christoph,
> 
> Are you working on anything like this?

I was hoping I could steer you towards it.  But if you don't want to do
it yourself I'll add it to my ever growing todo list.

> Or on the put_user_bvec() idea?

I have a prototype from two month ago:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec

but that only survived the most basic testing, so it'll need more work,
which I'm not sure when I'll find time for.
John Hubbard Aug. 7, 2019, 6:38 a.m. UTC | #7
On 8/6/19 11:34 PM, Christoph Hellwig wrote:
> On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote:
>> On 7/23/19 11:17 PM, Christoph Hellwig wrote:
...
>>> I think we can do this in a simple and better way.  We have 5 ITER_*
>>> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
>>> we can skip handling it.  ITER_PIPE is rejected іn the direct I/O path,
>>> which leaves us with three.
>>>
>>
>> Hi Christoph,
>>
>> Are you working on anything like this?
> 
> I was hoping I could steer you towards it.  But if you don't want to do
> it yourself I'll add it to my ever growing todo list.
> 

Sure, I'm up for this. The bvec-related items are the next logical part
of the gup/dma conversions to work on, and I just wanted to avoid solving the
same problem if you were already in the code.


>> Or on the put_user_bvec() idea?
> 
> I have a prototype from two month ago:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec
> 
> but that only survived the most basic testing, so it'll need more work,
> which I'm not sure when I'll find time for.
> 

I'll take a peek, and probably pester you with a few questions if I get
confused. :)

thanks,