diff mbox

net/sctp: vmalloc allocation failure in sctp_setsockopt/xt_alloc_table_info

Message ID 20161128151312.GA13172@localhost.localdomain
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Marcelo Ricardo Leitner Nov. 28, 2016, 3:13 p.m. UTC
On Mon, Nov 28, 2016 at 09:39:31AM -0500, Neil Horman wrote:
> On Mon, Nov 28, 2016 at 03:33:40PM +0100, Andrey Konovalov wrote:
> > On Mon, Nov 28, 2016 at 3:13 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> > > On Mon, Nov 28, 2016 at 02:00:19PM +0100, Andrey Konovalov wrote:
> > >> Hi!
> > >>
> > >> I've got the following error report while running the syzkaller fuzzer.
> > >>
> > >> On commit d8e435f3ab6fea2ea324dce72b51dd7761747523 (Nov 26).
> > >>
> > >> A reproducer is attached.
> > >>
> > >> a.out: vmalloc: allocation failure, allocated 823562240 of 1427091456
> > >> bytes, mode:0x24000c2(GFP_KERNEL|__GFP_HIGHMEM)
> > >>
> > > How much total ram do you have in this system?  The call appears to be
> > > attempting to allocate 1.3 Gb of data.  Even using vmalloc to allow
> > > discontiguous allocation, thats alot of memory, and if enough is in use already,
> > > I could make the argument that this might be expected behavior.
> > 
> > Hi Neail,
> > 
> > I have 2 Gb.
> > 
> That would be why.  Allocating 65% of the available system memory will almost
> certainly lead to OOM failures quickly.
> 
> > Just tested with 4 Gb, everything seems to be working fine.
> > So I guess this is not actually a bug and allocating 1.3 Gb is OK.

Still we probably should avoid the warn triggered by an userspace
application: (untested)

--8<--

Comments

Neil Horman Nov. 28, 2016, 5:46 p.m. UTC | #1
On Mon, Nov 28, 2016 at 01:13:12PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Nov 28, 2016 at 09:39:31AM -0500, Neil Horman wrote:
> > On Mon, Nov 28, 2016 at 03:33:40PM +0100, Andrey Konovalov wrote:
> > > On Mon, Nov 28, 2016 at 3:13 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > On Mon, Nov 28, 2016 at 02:00:19PM +0100, Andrey Konovalov wrote:
> > > >> Hi!
> > > >>
> > > >> I've got the following error report while running the syzkaller fuzzer.
> > > >>
> > > >> On commit d8e435f3ab6fea2ea324dce72b51dd7761747523 (Nov 26).
> > > >>
> > > >> A reproducer is attached.
> > > >>
> > > >> a.out: vmalloc: allocation failure, allocated 823562240 of 1427091456
> > > >> bytes, mode:0x24000c2(GFP_KERNEL|__GFP_HIGHMEM)
> > > >>
> > > > How much total ram do you have in this system?  The call appears to be
> > > > attempting to allocate 1.3 Gb of data.  Even using vmalloc to allow
> > > > discontiguous allocation, thats alot of memory, and if enough is in use already,
> > > > I could make the argument that this might be expected behavior.
> > > 
> > > Hi Neail,
> > > 
> > > I have 2 Gb.
> > > 
> > That would be why.  Allocating 65% of the available system memory will almost
> > certainly lead to OOM failures quickly.
> > 
> > > Just tested with 4 Gb, everything seems to be working fine.
> > > So I guess this is not actually a bug and allocating 1.3 Gb is OK.
> 
> Still we probably should avoid the warn triggered by an userspace
> application: (untested)
> 
I'm not sure I agree with that.  Generally speaking it seems like the right
thing to do, if you want to avoid filling logs with warnings, but this is the
sort of error that is going to be accompanied by severe service interruption.
I'd rather see a reason behind that in the logs, than just have it occur
silently.

Neil

> --8<--
> 
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index fc4977456c30..b56a0e128fc3 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -958,7 +958,8 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
>  	if (sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
>  		info = kmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
>  	if (!info) {
> -		info = vmalloc(sz);
> +		info = __vmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_HIGHMEM,
> +				 PAGE_KERNEL);
>  		if (!info)
>  			return NULL;
>  	}
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Florian Westphal Nov. 28, 2016, 5:47 p.m. UTC | #2
Neil Horman <nhorman@tuxdriver.com> wrote:
> I'm not sure I agree with that.  Generally speaking it seems like the right
> thing to do, if you want to avoid filling logs with warnings, but this is the
> sort of error that is going to be accompanied by severe service interruption.
> I'd rather see a reason behind that in the logs, than just have it occur
> silently.

Its not silent -- the setsockopt call will fail and userspace should
display an error.

So I agree with Marcelo, lets suppress the oom spew here.
Neil Horman Nov. 28, 2016, 5:56 p.m. UTC | #3
On Mon, Nov 28, 2016 at 06:47:10PM +0100, Florian Westphal wrote:
> Neil Horman <nhorman@tuxdriver.com> wrote:
> > I'm not sure I agree with that.  Generally speaking it seems like the right
> > thing to do, if you want to avoid filling logs with warnings, but this is the
> > sort of error that is going to be accompanied by severe service interruption.
> > I'd rather see a reason behind that in the logs, than just have it occur
> > silently.
> 
> Its not silent -- the setsockopt call will fail and userspace should
> display an error.
> 
Thats not true.  If the OOM succedes in freeing enough memory to fulfill the
request the setsockopt may complete without error, you're just left with a
killed process...somewhere.  Thats seems a bit dodgy to me

Not saying it has to be a full stack trace, but some log annotation that shows
the oom killer got invoked seems called for here

Neil

> So I agree with Marcelo, lets suppress the oom spew here.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Florian Westphal Nov. 28, 2016, 6:09 p.m. UTC | #4
Neil Horman <nhorman@tuxdriver.com> wrote:

[ trimming CCs ]

> On Mon, Nov 28, 2016 at 06:47:10PM +0100, Florian Westphal wrote:
> > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > I'm not sure I agree with that.  Generally speaking it seems like the right
> > > thing to do, if you want to avoid filling logs with warnings, but this is the
> > > sort of error that is going to be accompanied by severe service interruption.
> > > I'd rather see a reason behind that in the logs, than just have it occur
> > > silently.
> > 
> > Its not silent -- the setsockopt call will fail and userspace should
> > display an error.
> > 
> Thats not true.  If the OOM succedes in freeing enough memory to fulfill the
> request the setsockopt may complete without error, you're just left with a
> killed process...somewhere.  Thats seems a bit dodgy to me

We should prevent OOM killer from running in first place (GFP_NORETRY should work).
Marcelo Ricardo Leitner Nov. 28, 2016, 6:18 p.m. UTC | #5
On Mon, Nov 28, 2016 at 07:09:25PM +0100, Florian Westphal wrote:
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> [ trimming CCs ]
> 
> > On Mon, Nov 28, 2016 at 06:47:10PM +0100, Florian Westphal wrote:
> > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > I'm not sure I agree with that.  Generally speaking it seems like the right
> > > > thing to do, if you want to avoid filling logs with warnings, but this is the
> > > > sort of error that is going to be accompanied by severe service interruption.
> > > > I'd rather see a reason behind that in the logs, than just have it occur
> > > > silently.
> > > 
> > > Its not silent -- the setsockopt call will fail and userspace should
> > > display an error.
> > > 
> > Thats not true.  If the OOM succedes in freeing enough memory to fulfill the
> > request the setsockopt may complete without error, you're just left with a
> > killed process...somewhere.  Thats seems a bit dodgy to me
> 

__GFP_NOWARN is about allocation failures only and it won't disable OOM
kill messages.  oom_kill_process() has no idea on GFP_NOWARN when doing
the logging.

> We should prevent OOM killer from running in first place (GFP_NORETRY should work).

Oh. Really?
Eric Dumazet Nov. 28, 2016, 6:26 p.m. UTC | #6
On Mon, 2016-11-28 at 19:09 +0100, Florian Westphal wrote:

> We should prevent OOM killer from running in first place (GFP_NORETRY should work).

Make sure that a vmalloc(80000) will succeed, even if few pages need to
be swapped out.

Otherwise, some scripts using iptables will die while they where okay
before ?

I am not sure how GFP_NORETRY really works among different linux
kernels.

Maybe this flag has no effect for order-0 allocations anyway.
Marcelo Ricardo Leitner Nov. 30, 2016, 7:21 p.m. UTC | #7
On Mon, Nov 28, 2016 at 04:18:03PM -0200, Marcelo Ricardo Leitner wrote:
> On Mon, Nov 28, 2016 at 07:09:25PM +0100, Florian Westphal wrote:
> > Neil Horman <nhorman@tuxdriver.com> wrote:
> > 
> > [ trimming CCs ]
> > 
> > > On Mon, Nov 28, 2016 at 06:47:10PM +0100, Florian Westphal wrote:
> > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > > I'm not sure I agree with that.  Generally speaking it seems like the right
> > > > > thing to do, if you want to avoid filling logs with warnings, but this is the
> > > > > sort of error that is going to be accompanied by severe service interruption.
> > > > > I'd rather see a reason behind that in the logs, than just have it occur
> > > > > silently.
> > > > 
> > > > Its not silent -- the setsockopt call will fail and userspace should
> > > > display an error.
> > > > 
> > > Thats not true.  If the OOM succedes in freeing enough memory to fulfill the
> > > request the setsockopt may complete without error, you're just left with a
> > > killed process...somewhere.  Thats seems a bit dodgy to me
> > 
> 
> __GFP_NOWARN is about allocation failures only and it won't disable OOM
> kill messages.  oom_kill_process() has no idea on GFP_NOWARN when doing
> the logging.
> 
> > We should prevent OOM killer from running in first place (GFP_NORETRY should work).
> 
> Oh. Really?
> 

Now I see why. Then we're basically saying that's better to fail this
operation than to kill some random process around.
And kmalloc() is already using GFP_NORETRY in this same place.

  Marcelo
diff mbox

Patch

diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index fc4977456c30..b56a0e128fc3 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -958,7 +958,8 @@  struct xt_table_info *xt_alloc_table_info(unsigned int size)
 	if (sz <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER))
 		info = kmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
 	if (!info) {
-		info = vmalloc(sz);
+		info = __vmalloc(sz, GFP_KERNEL | __GFP_NOWARN | __GFP_HIGHMEM,
+				 PAGE_KERNEL);
 		if (!info)
 			return NULL;
 	}