diff mbox

[net-next] bpf: return errno -ENOMEM when exceeding RLIMIT_MEMLOCK

Message ID 148760491056.17885.7344022207445355578.stgit@firesoul
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Jesper Dangaard Brouer Feb. 20, 2017, 3:35 p.m. UTC
It is confusing users of samples/bpf that exceeding the resource
limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
message.  This is due to bpf limits check return -EPERM.

Instead return -ENOMEM, like most other users of this API.

Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 kernel/bpf/syscall.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Comments

Daniel Borkmann Feb. 20, 2017, 3:57 p.m. UTC | #1
On 02/20/2017 04:35 PM, Jesper Dangaard Brouer wrote:
> It is confusing users of samples/bpf that exceeding the resource
> limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
> message.  This is due to bpf limits check return -EPERM.
>
> Instead return -ENOMEM, like most other users of this API.
>
> Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")

Btw, last one just moves the helper so fixes doesn't really apply
there, but apart from that this is already uapi exposed behavior
like this for ~1.5yrs, so unfortunately too late to change now. I
think the original intention (arguably confusing in this context)
was that user doesn't have (rlimit) permission to allocate this
resource.

Thanks,
Daniel
Jesper Dangaard Brouer Feb. 20, 2017, 4:25 p.m. UTC | #2
On Mon, 20 Feb 2017 16:57:34 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 02/20/2017 04:35 PM, Jesper Dangaard Brouer wrote:
> > It is confusing users of samples/bpf that exceeding the resource
> > limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
> > message.  This is due to bpf limits check return -EPERM.
> >
> > Instead return -ENOMEM, like most other users of this API.
> >
> > Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> > Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> > Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")  
> 
> Btw, last one just moves the helper so fixes doesn't really apply
> there, but apart from that this is already uapi exposed behavior
> like this for ~1.5yrs, so unfortunately too late to change now. I
> think the original intention (arguably confusing in this context)
> was that user doesn't have (rlimit) permission to allocate this
> resource.

This is obviously confusing end-users, thus it should be fixed IMHO.
Alexei Starovoitov Feb. 21, 2017, 8:06 a.m. UTC | #3
On Mon, Feb 20, 2017 at 05:25:58PM +0100, Jesper Dangaard Brouer wrote:
> On Mon, 20 Feb 2017 16:57:34 +0100
> Daniel Borkmann <daniel@iogearbox.net> wrote:
> 
> > On 02/20/2017 04:35 PM, Jesper Dangaard Brouer wrote:
> > > It is confusing users of samples/bpf that exceeding the resource
> > > limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
> > > message.  This is due to bpf limits check return -EPERM.
> > >
> > > Instead return -ENOMEM, like most other users of this API.
> > >
> > > Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> > > Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> > > Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")  
> > 
> > Btw, last one just moves the helper so fixes doesn't really apply
> > there, but apart from that this is already uapi exposed behavior
> > like this for ~1.5yrs, so unfortunately too late to change now. I
> > think the original intention (arguably confusing in this context)
> > was that user doesn't have (rlimit) permission to allocate this
> > resource.
> 
> This is obviously confusing end-users, thus it should be fixed IMHO.

I don't think it's confusing and I think EPERM makes
the most sense as return code in such situation.
There is also code in iovisor/bcc that specifically looking
for EPERM to adjust ulimit.
May be it's not documented properly, but that's different story.
Jesper Dangaard Brouer Feb. 21, 2017, 1 p.m. UTC | #4
On Tue, 21 Feb 2017 00:06:11 -0800
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> On Mon, Feb 20, 2017 at 05:25:58PM +0100, Jesper Dangaard Brouer wrote:
> > On Mon, 20 Feb 2017 16:57:34 +0100
> > Daniel Borkmann <daniel@iogearbox.net> wrote:
> >   
> > > On 02/20/2017 04:35 PM, Jesper Dangaard Brouer wrote:  
> > > > It is confusing users of samples/bpf that exceeding the resource
> > > > limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
> > > > message.  This is due to bpf limits check return -EPERM.
> > > >
> > > > Instead return -ENOMEM, like most other users of this API.
> > > >
> > > > Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> > > > Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> > > > Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")    
> > > 
> > > Btw, last one just moves the helper so fixes doesn't really apply
> > > there, but apart from that this is already uapi exposed behavior
> > > like this for ~1.5yrs, so unfortunately too late to change now. I
> > > think the original intention (arguably confusing in this context)
> > > was that user doesn't have (rlimit) permission to allocate this
> > > resource.  
> > 
> > This is obviously confusing end-users, thus it should be fixed IMHO.  
> 
> I don't think it's confusing and I think EPERM makes
> the most sense as return code in such situation.

Most other kernel users return ENOMEM.

> There is also code in iovisor/bcc that specifically looking
> for EPERM to adjust ulimit.

If there is already a program that depend on this, then it is ABI and
we cannot change it... drop this patch.

> May be it's not documented properly, but that's different story.

Documented it here:
 https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html
Alexei Starovoitov Feb. 22, 2017, 7:14 a.m. UTC | #5
On Tue, Feb 21, 2017 at 02:00:13PM +0100, Jesper Dangaard Brouer wrote:
> On Tue, 21 Feb 2017 00:06:11 -0800
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
> > On Mon, Feb 20, 2017 at 05:25:58PM +0100, Jesper Dangaard Brouer wrote:
> > > On Mon, 20 Feb 2017 16:57:34 +0100
> > > Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >   
> > > > On 02/20/2017 04:35 PM, Jesper Dangaard Brouer wrote:  
> > > > > It is confusing users of samples/bpf that exceeding the resource
> > > > > limits for RLIMIT_MEMLOCK result in an "Operation not permitted"
> > > > > message.  This is due to bpf limits check return -EPERM.
> > > > >
> > > > > Instead return -ENOMEM, like most other users of this API.
> > > > >
> > > > > Fixes: aaac3ba95e4c ("bpf: charge user for creation of BPF maps and programs")
> > > > > Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> > > > > Fixes: 5ccb071e97fb ("bpf: fix overflow in prog accounting")    
> > > > 
> > > > Btw, last one just moves the helper so fixes doesn't really apply
> > > > there, but apart from that this is already uapi exposed behavior
> > > > like this for ~1.5yrs, so unfortunately too late to change now. I
> > > > think the original intention (arguably confusing in this context)
> > > > was that user doesn't have (rlimit) permission to allocate this
> > > > resource.  
> > > 
> > > This is obviously confusing end-users, thus it should be fixed IMHO.  
> > 
> > I don't think it's confusing and I think EPERM makes
> > the most sense as return code in such situation.
> 
> Most other kernel users return ENOMEM.

the places in the kernel that check for memlock are not fully consistent.

perf does this:
  lock_limit = rlimit(RLIMIT_MEMLOCK);
  lock_limit >>= PAGE_SHIFT;
  locked = vma->vm_mm->pinned_vm + extra;

  if ((locked > lock_limit) && perf_paranoid_tracepoint_raw() &&
          !capable(CAP_IPC_LOCK)) {
          ret = -EPERM;
          goto unlock;
  }

and in bpf land we got an idea of using memlock limit from perf.

> Documented it here:
>  https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html

Thanks! Looks good.
diff mbox

Patch

diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 08a4d287226b..37387a9b0d46 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -85,7 +85,7 @@  int bpf_map_precharge_memlock(u32 pages)
 	cur = atomic_long_read(&user->locked_vm);
 	free_uid(user);
 	if (cur + pages > memlock_limit)
-		return -EPERM;
+		return -ENOMEM;
 	return 0;
 }
 
@@ -101,7 +101,7 @@  static int bpf_map_charge_memlock(struct bpf_map *map)
 	if (atomic_long_read(&user->locked_vm) > memlock_limit) {
 		atomic_long_sub(map->pages, &user->locked_vm);
 		free_uid(user);
-		return -EPERM;
+		return -ENOMEM;
 	}
 	map->user = user;
 	return 0;
@@ -658,7 +658,7 @@  int __bpf_prog_charge(struct user_struct *user, u32 pages)
 		user_bufs = atomic_long_add_return(pages, &user->locked_vm);
 		if (user_bufs > memlock_limit) {
 			atomic_long_sub(pages, &user->locked_vm);
-			return -EPERM;
+			return -ENOMEM;
 		}
 	}