[0/6,v3] kvmalloc

Message ID 20170126103216.GG6590@dhcp22.suse.cz
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Michal Hocko Jan. 26, 2017, 10:32 a.m. UTC
On Thu 26-01-17 11:08:02, Michal Hocko wrote:
> On Thu 26-01-17 10:36:49, Daniel Borkmann wrote:
> > On 01/26/2017 08:43 AM, Michal Hocko wrote:
> > > On Wed 25-01-17 21:16:42, Daniel Borkmann wrote:
> [...]
> > > > I assume that kvzalloc() is still the same from [1], right? If so, then
> > > > it would unfortunately (partially) reintroduce the issue that was fixed.
> > > > If you look above at flags, they're also passed to __vmalloc() to not
> > > > trigger OOM in these situations I've experienced.
> > > 
> > > Pushing __GFP_NORETRY to __vmalloc doesn't have the effect you might
> > > think it would. It can still trigger the OOM killer because the flags
> > > are not propagated all the way down to all allocation requests (e.g.
> > > page tables). This is the same reason why GFP_NOFS is not supported in
> > > vmalloc.
> > 
> > Ok, good to know, is that somewhere clearly documented (like for the
> > case with kmalloc())?
> 
> I am afraid that we really suck on this front. I will add something.

So I have folded the following into patch 1. It is in line with
kvmalloc and hopefully at least tells more than the current code.
---
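
The propagation problem described above can be modelled in userspace. This is only a rough sketch under stated assumptions, not kernel code: `gfp_model_t`, the `MODEL_GFP_*` bit values, `struct alloc_trace` and `model_vmalloc()` are all hypothetical names invented for the illustration, and the fixed mask used for page tables stands in for whatever the real mapping code passes.

```c
/* Hypothetical flag bits for this model only; real kernel values differ. */
typedef unsigned int gfp_model_t;
#define MODEL_GFP_KERNEL   0x01u
#define MODEL_GFP_NORETRY  0x02u

/* Records the flags each internal allocation request actually received. */
struct alloc_trace {
	gfp_model_t data_pages;
	gfp_model_t page_tables;
};

/*
 * Model of a vmalloc-style allocator: the data pages honour the caller's
 * mask, but the page-table allocations use a fixed mask chosen by the
 * mapping code, so caller-supplied modifiers never reach them.
 */
static struct alloc_trace model_vmalloc(gfp_model_t gfp_mask)
{
	struct alloc_trace t;

	t.data_pages = gfp_mask;          /* caller's flags are used here */
	t.page_tables = MODEL_GFP_KERNEL; /* ...but not here */
	return t;
}
```

Calling `model_vmalloc(MODEL_GFP_KERNEL | MODEL_GFP_NORETRY)` leaves the NORETRY bit set in `data_pages` but clear in `page_tables`, which is the sense in which the modifier cannot be relied on for the allocation as a whole.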

Comments

Daniel Borkmann Jan. 26, 2017, 11:04 a.m. UTC | #1
On 01/26/2017 11:32 AM, Michal Hocko wrote:
> On Thu 26-01-17 11:08:02, Michal Hocko wrote:
>> On Thu 26-01-17 10:36:49, Daniel Borkmann wrote:
>>> On 01/26/2017 08:43 AM, Michal Hocko wrote:
>>>> On Wed 25-01-17 21:16:42, Daniel Borkmann wrote:
>> [...]
>>>>> I assume that kvzalloc() is still the same from [1], right? If so, then
>>>>> it would unfortunately (partially) reintroduce the issue that was fixed.
>>>>> If you look above at flags, they're also passed to __vmalloc() to not
>>>>> trigger OOM in these situations I've experienced.
>>>>
>>>> Pushing __GFP_NORETRY to __vmalloc doesn't have the effect you might
>>>> think it would. It can still trigger the OOM killer because the flags
>>>> are not propagated all the way down to all allocation requests (e.g.
>>>> page tables). This is the same reason why GFP_NOFS is not supported in
>>>> vmalloc.
>>>
>>> Ok, good to know, is that somewhere clearly documented (like for the
>>> case with kmalloc())?
>>
>> I am afraid that we really suck on this front. I will add something.
>
> So I have folded the following into patch 1. It is in line with
> kvmalloc and hopefully at least tells more than the current code.
> ---
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index d89034a393f2..6c1aa2c68887 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1741,6 +1741,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
>    *	Allocate enough pages to cover @size from the page level
>    *	allocator with @gfp_mask flags.  Map them into contiguous
>    *	kernel virtual space, using a pagetable protection of @prot.
> + *
> + *	Reclaim modifiers in @gfp_mask - __GFP_NORETRY, __GFP_REPEAT
> + *	and __GFP_NOFAIL are not supported

We could probably also mention that __GFP_ZERO in @gfp_mask is
supported, though.

> + *	Any use of gfp flags outside of GFP_KERNEL should be consulted
> + *	with mm people.

Just a question: should that read 'GFP_KERNEL | __GFP_HIGHMEM' as
that is what vmalloc() resp. vzalloc() and others pass as flags?

> + *
>    */

Sounds good otherwise, thanks Michal!

>   static void *__vmalloc_node(unsigned long size, unsigned long align,
>   			    gfp_t gfp_mask, pgprot_t prot,

Michal Hocko Jan. 26, 2017, 11:49 a.m. UTC | #2
On Thu 26-01-17 12:04:13, Daniel Borkmann wrote:
> On 01/26/2017 11:32 AM, Michal Hocko wrote:
> > On Thu 26-01-17 11:08:02, Michal Hocko wrote:
> > > On Thu 26-01-17 10:36:49, Daniel Borkmann wrote:
> > > > On 01/26/2017 08:43 AM, Michal Hocko wrote:
> > > > > On Wed 25-01-17 21:16:42, Daniel Borkmann wrote:
> > > [...]
> > > > > > I assume that kvzalloc() is still the same from [1], right? If so, then
> > > > > > it would unfortunately (partially) reintroduce the issue that was fixed.
> > > > > > If you look above at flags, they're also passed to __vmalloc() to not
> > > > > > trigger OOM in these situations I've experienced.
> > > > > 
> > > > > Pushing __GFP_NORETRY to __vmalloc doesn't have the effect you might
> > > > > think it would. It can still trigger the OOM killer because the flags
> > > > > are not propagated all the way down to all allocation requests (e.g.
> > > > > page tables). This is the same reason why GFP_NOFS is not supported in
> > > > > vmalloc.
> > > > 
> > > > Ok, good to know, is that somewhere clearly documented (like for the
> > > > case with kmalloc())?
> > > 
> > > I am afraid that we really suck on this front. I will add something.
> > 
> > So I have folded the following into patch 1. It is in line with
> > kvmalloc and hopefully at least tells more than the current code.
> > ---
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index d89034a393f2..6c1aa2c68887 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1741,6 +1741,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
> >    *	Allocate enough pages to cover @size from the page level
> >    *	allocator with @gfp_mask flags.  Map them into contiguous
> >    *	kernel virtual space, using a pagetable protection of @prot.
> > + *
> > + *	Reclaim modifiers in @gfp_mask - __GFP_NORETRY, __GFP_REPEAT
> > + *	and __GFP_NOFAIL are not supported
> 
> We could probably also mention that __GFP_ZERO in @gfp_mask is
> supported, though.

There are others which would be supported, so I would rather stick to
listing the explicitly unsupported ones.

> 
> > + *	Any use of gfp flags outside of GFP_KERNEL should be consulted
> > + *	with mm people.
> 
> Just a question: should that read 'GFP_KERNEL | __GFP_HIGHMEM' as
> that is what vmalloc() resp. vzalloc() and others pass as flags?

Yes, even though I think that specifying __GFP_HIGHMEM shouldn't really
be necessary. Are there any users who would really insist on vmalloc
pages in lowmem? Anyway, this made me recheck the kvmalloc_node
implementation, and I am not adding this flag, which would mean a
regression from the current state. Will fix it up.

Joe Perches Jan. 26, 2017, 12:14 p.m. UTC | #3
On Thu, 2017-01-26 at 11:32 +0100, Michal Hocko wrote:
> So I have folded the following into patch 1. It is in line with
> kvmalloc and hopefully at least tells more than the current code.
[]
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
[]
> @@ -1741,6 +1741,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
>   *	Allocate enough pages to cover @size from the page level
>   *	allocator with @gfp_mask flags.  Map them into contiguous
>   *	kernel virtual space, using a pagetable protection of @prot.
> + *
> + *	Reclaim modifiers in @gfp_mask - __GFP_NORETRY, __GFP_REPEAT
> + *	and __GFP_NOFAIL are not supported

Maybe add a BUILD_BUG or a WARN_ON_ONCE to catch new occurrences?

Michal Hocko Jan. 26, 2017, 12:27 p.m. UTC | #4
On Thu 26-01-17 04:14:37, Joe Perches wrote:
> On Thu, 2017-01-26 at 11:32 +0100, Michal Hocko wrote:
> > So I have folded the following into patch 1. It is in line with
> > kvmalloc and hopefully at least tells more than the current code.
> []
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> []
> > @@ -1741,6 +1741,13 @@ void *__vmalloc_node_range(unsigned long size, unsigned long align,
> >   *	Allocate enough pages to cover @size from the page level
> >   *	allocator with @gfp_mask flags.  Map them into contiguous
> >   *	kernel virtual space, using a pagetable protection of @prot.
> > + *
> > + *	Reclaim modifiers in @gfp_mask - __GFP_NORETRY, __GFP_REPEAT
> > + *	and __GFP_NOFAIL are not supported
> 
> Maybe add a BUILD_BUG or a WARN_ON_ONCE to catch new occurrences?

I would really like to not touch vmalloc in this series.
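
For reference, the kind of runtime check suggested above might look roughly like the following. This is a userspace stand-in, not a proposed kernel change: the `MODEL_GFP_*` bit values, `MODEL_UNSUPPORTED` and `model_warn_unsupported_once()` are hypothetical names made up for the sketch, and the function only imitates the warn-once behaviour of WARN_ON_ONCE() with a static flag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical flag bits for this sketch only; real kernel values differ. */
typedef unsigned int gfp_model_t;
#define MODEL_GFP_KERNEL   0x01u
#define MODEL_GFP_NORETRY  0x02u
#define MODEL_GFP_REPEAT   0x04u
#define MODEL_GFP_NOFAIL   0x08u

/* The reclaim modifiers the new comment lists as unsupported. */
#define MODEL_UNSUPPORTED \
	(MODEL_GFP_NORETRY | MODEL_GFP_REPEAT | MODEL_GFP_NOFAIL)

/*
 * Userspace stand-in for a WARN_ON_ONCE()-style check: returns true when
 * the mask carries an unsupported modifier, and prints a warning only the
 * first time that happens.
 */
static bool model_warn_unsupported_once(gfp_model_t gfp_mask)
{
	static bool warned;
	bool bad = (gfp_mask & MODEL_UNSUPPORTED) != 0;

	if (bad && !warned) {
		warned = true;
		fprintf(stderr, "unsupported reclaim modifier in mask 0x%x\n",
			gfp_mask & MODEL_UNSUPPORTED);
	}
	return bad;
}
```

A plain `MODEL_GFP_KERNEL` mask passes silently, while any mask containing one of the three modifiers returns true on every call and warns on the first one only.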

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d89034a393f2..6c1aa2c68887 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1741,6 +1741,13 @@  void *__vmalloc_node_range(unsigned long size, unsigned long align,
  *	Allocate enough pages to cover @size from the page level
  *	allocator with @gfp_mask flags.  Map them into contiguous
  *	kernel virtual space, using a pagetable protection of @prot.
+ *
+ *	Reclaim modifiers in @gfp_mask - __GFP_NORETRY, __GFP_REPEAT
+ *	and __GFP_NOFAIL are not supported
+ *
+ *	Any use of gfp flags outside of GFP_KERNEL should be consulted
+ *	with mm people.
+ *
  */
 static void *__vmalloc_node(unsigned long size, unsigned long align,
 			    gfp_t gfp_mask, pgprot_t prot,