Message ID | cover.1508251210.git.daniel@iogearbox.net |
---|---|
Series | Fix for BPF devmap percpu allocation splat |
From: Daniel Borkmann
> Sent: 17 October 2017 15:56
>
> The set fixes a splat in devmap percpu allocation when we alloc
> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
> patch 1 is rather small, so if this could be routed via -net, for
> example, with Tejun's Ack that would be good. Patch 3 gets rid of
> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
> internals and should not be used.

Does it make sense to allow the user program to try to allocate ever
smaller very large maps until it finds one that succeeds - thus
using up all the percpu space?

Or is this a 'root only' 'shoot self in foot' job?

David
On 10/17/2017 05:03 PM, David Laight wrote:
> From: Daniel Borkmann
>> Sent: 17 October 2017 15:56
>>
>> The set fixes a splat in devmap percpu allocation when we alloc
>> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
>> patch 1 is rather small, so if this could be routed via -net, for
>> example, with Tejun's Ack that would be good. Patch 3 gets rid of
>> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
>> internals and should not be used.
>
> Does it make sense to allow the user program to try to allocate ever
> smaller very large maps until it finds one that succeeds - thus
> using up all the percpu space?
>
> Or is this a 'root only' 'shoot self in foot' job?

It's root only although John still has a pending fix to be flushed
out for -net first in the next days to actually enforce that cap
(devmap is not in an official kernel yet at this point, so all good),
but apart from this, all map allocs in general are accounted for
as well.

Thanks,
Daniel
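The size question raised above can be made concrete with a little arithmetic. The following is a userspace sketch only, not kernel code: it assumes the flush bitmap holds one bit per map entry, rounded up to whole `unsigned long` words, and it assumes a 32 KiB lower bound on the percpu unit size. The constant `ASSUMED_PCPU_MIN_UNIT_SIZE` and both function names are hypothetical and introduced here for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: lower bound of the percpu unit size, taken as 32 KiB. */
#define ASSUMED_PCPU_MIN_UNIT_SIZE ((uint64_t)32 << 10)

/* Number of bits in one unsigned long on the build platform. */
#define BITS_PER_ULONG ((uint64_t)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Bytes needed for a bitmap with one bit per entry, rounded up to a
 * whole number of unsigned longs (hypothetical helper).
 */
static uint64_t flush_bitmap_bytes(uint64_t max_entries)
{
    /* Guard the round-up addition against wrapping. */
    assert(max_entries <= UINT64_MAX - BITS_PER_ULONG);

    uint64_t longs = (max_entries + BITS_PER_ULONG - 1) / BITS_PER_ULONG;
    return longs * sizeof(unsigned long);
}

/* Nonzero if the bitmap would fit within the assumed unit size. */
static int fits_in_min_unit(uint64_t max_entries)
{
    return flush_bitmap_bytes(max_entries) <= ASSUMED_PCPU_MIN_UNIT_SIZE;
}
```

Under these assumptions a map with 262144 entries needs exactly 32768 bytes of bitmap per CPU, so anything larger would exceed the assumed bound and has to be rejected by the allocator rather than by a size check in the caller.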
Hello, Daniel.

(cc'ing Dennis)

On Tue, Oct 17, 2017 at 04:55:51PM +0200, Daniel Borkmann wrote:
> The set fixes a splat in devmap percpu allocation when we alloc
> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
> patch 1 is rather small, so if this could be routed via -net, for
> example, with Tejun's Ack that would be good. Patch 3 gets rid of
> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
> internals and should not be used.
>
> Thanks!
>
> Daniel Borkmann (3):
>   mm, percpu: add support for __GFP_NOWARN flag

This looks fine.

>   bpf: fix splat for illegal devmap percpu allocation
>   bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations

These look okay too but if it helps percpu allocator can expose the
maximum size / alignment supported to take out the guessing game too.

Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
nobody needed anything bigger. Increasing the size doesn't really
cost much at least on 64bit archs. Is that something we want to be
considering?

Thanks.
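The idea behind the three patches, as described in the cover letter, is that the caller stops guessing at allocator limits and instead asks the allocator quietly, handling failure as an ordinary error. The sketch below is a userspace model of that pattern, not the kernel implementation: `MODEL_NOWARN` stands in for `__GFP_NOWARN`, `MODEL_MAX_ALLOC` is an assumed 32 KiB cap, and every function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MODEL_NOWARN    0x1u          /* stand-in for __GFP_NOWARN */
#define MODEL_MAX_ALLOC (32u << 10)   /* assumed allocator limit, 32 KiB */

/* Counts how many loud warnings ("splats") the model allocator emitted. */
static unsigned int model_warn_count;

/*
 * Model allocator: rejects illegal sizes. Without MODEL_NOWARN it also
 * complains loudly; with the flag it fails silently.
 */
static void *model_percpu_alloc(size_t size, unsigned int flags)
{
    if (size == 0 || size > MODEL_MAX_ALLOC) {
        if (!(flags & MODEL_NOWARN)) {
            model_warn_count++;
            fprintf(stderr, "illegal size (%zu) requested\n", size);
        }
        return NULL;
    }
    return calloc(1, size);
}

/*
 * Model map setup: no size pre-check against allocator internals.
 * The allocation is attempted quietly and failure becomes an error code.
 */
static int model_map_alloc(size_t bitmap_bytes, void **out)
{
    assert(out != NULL);

    void *p = model_percpu_alloc(bitmap_bytes, MODEL_NOWARN);
    if (!p)
        return -1; /* a kernel caller would return -ENOMEM here */
    *out = p;
    return 0;
}
```

In this model an oversized request made through `model_map_alloc()` fails with `-1` and leaves `model_warn_count` untouched, whereas the same request made without the flag increments the counter. Exposing the maximum supported size, as suggested above, would be a complementary step that this sketch does not cover.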
On 10/18/2017 03:25 PM, Tejun Heo wrote:
> Hello, Daniel.
>
> (cc'ing Dennis)
>
> On Tue, Oct 17, 2017 at 04:55:51PM +0200, Daniel Borkmann wrote:
>> The set fixes a splat in devmap percpu allocation when we alloc
>> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
>> patch 1 is rather small, so if this could be routed via -net, for
>> example, with Tejun's Ack that would be good. Patch 3 gets rid of
>> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
>> internals and should not be used.
>>
>> Thanks!
>>
>> Daniel Borkmann (3):
>>   mm, percpu: add support for __GFP_NOWARN flag
>
> This looks fine.

Great, thanks!

>>   bpf: fix splat for illegal devmap percpu allocation
>>   bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
>
> These look okay too but if it helps percpu allocator can expose the
> maximum size / alignment supported to take out the guessing game too.

At least from BPF side there's right now no infra for exposing
max possible alloc sizes for maps to e.g. user space as indication.
There are few users left in the tree, where it would make sense for
having some helpers though:

  arch/tile/kernel/setup.c:729: if (size < PCPU_MIN_UNIT_SIZE)
  arch/tile/kernel/setup.c:730: size = PCPU_MIN_UNIT_SIZE;
  drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:346: unsigned int max = (PCPU_MIN_UNIT_SIZE - sizeof(*pools)) << 3;
  drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:352: /* make sure per cpu pool fits into PCPU_MIN_UNIT_SIZE */
  drivers/scsi/libfc/fc_exch.c:2488: /* reduce range so per cpu pool fits into PCPU_MIN_UNIT_SIZE pool */
  drivers/scsi/libfc/fc_exch.c:2489: pool_exch_range = (PCPU_MIN_UNIT_SIZE - sizeof(*pool)) /

> Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
> nobody needed anything bigger. Increasing the size doesn't really
> cost much at least on 64bit archs. Is that something we want to be
> considering?

For devmap (and cpumap) itself it wouldn't make sense. For per-cpu
hashtable we could indeed consider it in the future.

Thanks,
Daniel
On 10/18/2017 04:03 PM, Daniel Borkmann wrote:
> On 10/18/2017 03:25 PM, Tejun Heo wrote:
>> Hello, Daniel.
>>
>> (cc'ing Dennis)
>>
>> On Tue, Oct 17, 2017 at 04:55:51PM +0200, Daniel Borkmann wrote:
>>> The set fixes a splat in devmap percpu allocation when we alloc
>>> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
>>> patch 1 is rather small, so if this could be routed via -net, for
>>> example, with Tejun's Ack that would be good. Patch 3 gets rid of
>>> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
>>> internals and should not be used.
>>>
>>> Thanks!
>>>
>>> Daniel Borkmann (3):
>>>   mm, percpu: add support for __GFP_NOWARN flag
>>
>> This looks fine.
>
> Great, thanks!
>
>>>   bpf: fix splat for illegal devmap percpu allocation
>>>   bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
>>
>> These look okay too but if it helps percpu allocator can expose the
>> maximum size / alignment supported to take out the guessing game too.
>
> At least from BPF side there's right now no infra for exposing
> max possible alloc sizes for maps to e.g. user space as indication.
> There are few users left in the tree, where it would make sense for
> having some helpers though:
>
>   arch/tile/kernel/setup.c:729: if (size < PCPU_MIN_UNIT_SIZE)
>   arch/tile/kernel/setup.c:730: size = PCPU_MIN_UNIT_SIZE;
>   drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:346: unsigned int max = (PCPU_MIN_UNIT_SIZE - sizeof(*pools)) << 3;
>   drivers/net/ethernet/chelsio/libcxgb/libcxgb_ppm.c:352: /* make sure per cpu pool fits into PCPU_MIN_UNIT_SIZE */
>   drivers/scsi/libfc/fc_exch.c:2488: /* reduce range so per cpu pool fits into PCPU_MIN_UNIT_SIZE pool */
>   drivers/scsi/libfc/fc_exch.c:2489: pool_exch_range = (PCPU_MIN_UNIT_SIZE - sizeof(*pool)) /
>
>> Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
>> nobody needed anything bigger. Increasing the size doesn't really
>> cost much at least on 64bit archs. Is that something we want to be
>> considering?
>
> For devmap (and cpumap) itself it wouldn't make sense. For per-cpu
> hashtable we could indeed consider it in the future.

Higher prio imo would be to make the allocation itself faster
though, I remember we talked about this back in May wrt hashtable,
but I kind of lost track whether there was an update on this in
the mean time. ;-)

Cheers,
Daniel
On Wed, Oct 18, 2017 at 7:22 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Higher prio imo would be to make the allocation itself faster
> though, I remember we talked about this back in May wrt hashtable,
> but I kind of lost track whether there was an update on this in
> the mean time. ;-)

new percpu allocator by Dennis fixed those issues. It's in 4.14
On 10/18/2017 05:28 PM, Alexei Starovoitov wrote:
> On Wed, Oct 18, 2017 at 7:22 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> Higher prio imo would be to make the allocation itself faster
>> though, I remember we talked about this back in May wrt hashtable,
>> but I kind of lost track whether there was an update on this in
>> the mean time. ;-)
>
> new percpu allocator by Dennis fixed those issues. It's in 4.14

Ah, perfect!
Hi Daniel and Tejun,

On Wed, Oct 18, 2017 at 06:25:26AM -0700, Tejun Heo wrote:
> > Daniel Borkmann (3):
> >   mm, percpu: add support for __GFP_NOWARN flag
>
> This looks fine.
>

Looks good to me too.

> >   bpf: fix splat for illegal devmap percpu allocation
> >   bpf: do not test for PCPU_MIN_UNIT_SIZE before percpu allocations
>
> These look okay too but if it helps percpu allocator can expose the
> maximum size / alignment supported to take out the guessing game too.
>

I can add this once we've addressed the below if we want to.

> Also, the reason why PCPU_MIN_UNIT_SIZE is what it is is because
> nobody needed anything bigger. Increasing the size doesn't really
> cost much at least on 64bit archs. Is that something we want to be
> considering?
>

I'm not sure I see the reason we can't match the minimum allocation size
with the unit size? It seems weird to arbitrate the maximum allocation
size given a lower bound on the unit size.

Thanks,
Dennis
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Tue, 17 Oct 2017 16:55:51 +0200

> The set fixes a splat in devmap percpu allocation when we alloc
> the flush bitmap. Patch 1 is a prerequisite for the fix in patch 2,
> patch 1 is rather small, so if this could be routed via -net, for
> example, with Tejun's Ack that would be good. Patch 3 gets rid of
> remaining PCPU_MIN_UNIT_SIZE checks, which are percpu allocator
> internals and should not be used.

Series applied.
Hello,

On Wed, Oct 18, 2017 at 04:45:08PM -0500, Dennis Zhou wrote:
> I'm not sure I see the reason we can't match the minimum allocation size
> with the unit size? It seems weird to arbitrate the maximum allocation
> size given a lower bound on the unit size.

idk, it can be weird for the maximum allowed allocation size varying
widely depending on how the machine boots up.

Thanks.