diff mbox series

[net] bpf: don't select potentially stale ri->map from buggy xdp progs

Message ID 01d69254802986ad3a8b18a8650c45df3df95def.1504821825.git.daniel@iogearbox.net
State Accepted, archived
Delegated to: David Miller
Headers show
Series [net] bpf: don't select potentially stale ri->map from buggy xdp progs | expand

Commit Message

Daniel Borkmann Sept. 7, 2017, 10:14 p.m. UTC
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri->map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:

i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri->map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.

ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.

iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri->map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.

Fix thus needs to avoid accessing stale ri->map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri->map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.

The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.

Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri->map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.

Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/verifier.c | 16 ++++++++++++++++
 net/core/filter.c     | 21 +++++++++++++++++++--
 2 files changed, 35 insertions(+), 2 deletions(-)

Comments

Alexei Starovoitov Sept. 7, 2017, 10:37 p.m. UTC | #1
On 9/7/17 3:14 PM, Daniel Borkmann wrote:
> Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
> Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
> ---
>  kernel/bpf/verifier.c | 16 ++++++++++++++++
>  net/core/filter.c     | 21 +++++++++++++++++++--
>  2 files changed, 35 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index d690c7d..477b693 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -4203,6 +4203,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>  			continue;
>  		}
>
> +		if (insn->imm == BPF_FUNC_redirect_map) {
> +			u64 addr = (unsigned long)prog;
> +			struct bpf_insn r4_ld[] = {
> +				BPF_LD_IMM64(BPF_REG_4, addr),
> +				*insn,
> +			};
> +			cnt = ARRAY_SIZE(r4_ld);
> +
> +			new_prog = bpf_patch_insn_data(env, i + delta, r4_ld, cnt);

that's a neat trick.
I think we'll be seeing more of such pattern in the future.
Definitely less intrusive fix than asking drivers or net/core
to clear it.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Jesper Dangaard Brouer Sept. 8, 2017, 5:06 a.m. UTC | #2
On Fri,  8 Sep 2017 00:14:51 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> +	/* This is really only caused by a deliberately crappy
> +	 * BPF program, normally we would never hit that case,
> +	 * so no need to inform someone via tracepoints either,
> +	 * just bail out.
> +	 */
> +	if (unlikely(map_owner != xdp_prog))
> +		return -EINVAL;

IMHO we do need to call the tracepoint here.  It is not just crappy
BPF-progs that cause this situation, it is also drivers not implementing
XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
operates at, tracepoints are the only way users can runtime troubleshoot
their XDP programs.
Daniel Borkmann Sept. 8, 2017, 10:34 a.m. UTC | #3
On 09/08/2017 07:06 AM, Jesper Dangaard Brouer wrote:
> On Fri,  8 Sep 2017 00:14:51 +0200
> Daniel Borkmann <daniel@iogearbox.net> wrote:
>
>> +	/* This is really only caused by a deliberately crappy
>> +	 * BPF program, normally we would never hit that case,
>> +	 * so no need to inform someone via tracepoints either,
>> +	 * just bail out.
>> +	 */
>> +	if (unlikely(map_owner != xdp_prog))
>> +		return -EINVAL;
>
> IMHO we do need to call the tracepoint here.  It is not just crappy
> BPF-progs that cause this situation, it is also drivers not implementing
> XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
> operates at, tracepoints are the only way users can runtime troubleshoot
> their XDP programs.

Drivers not implementing XDP_REDIRECT don't even get there in
the first place. What they will do is to hit the 'default' case
when they check for the action code from the BPF program. Then
call into bpf_warn_invalid_xdp_action(act), and fall-through
to hit the tracepoint at trace_xdp_exception() which is also
triggered by XDP_ABORTED usually. So when that happens we do
complain loudly and call a tracepoint already. We should probably
tweak the bpf_warn_invalid_xdp_action() message a little to make
it clear that the action could also just be unsupported by the
driver instead of being illegal.
Jesper Dangaard Brouer Sept. 8, 2017, 11:52 a.m. UTC | #4
On Fri, 08 Sep 2017 12:34:28 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 09/08/2017 07:06 AM, Jesper Dangaard Brouer wrote:
> > On Fri,  8 Sep 2017 00:14:51 +0200
> > Daniel Borkmann <daniel@iogearbox.net> wrote:
> >  
> >> +	/* This is really only caused by a deliberately crappy
> >> +	 * BPF program, normally we would never hit that case,
> >> +	 * so no need to inform someone via tracepoints either,
> >> +	 * just bail out.
> >> +	 */
> >> +	if (unlikely(map_owner != xdp_prog))
> >> +		return -EINVAL;  
> >
> > IMHO we do need to call the tracepoint here.  It is not just crappy
> > BPF-progs that cause this situation, it is also drivers not implementing
> > XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
> > operates at, tracepoints are the only way users can runtime troubleshoot
> > their XDP programs.  
> 
> Drivers not implementing XDP_REDIRECT don't even get there in
> the first place. What they will do is to hit the 'default' case
> when they check for the action code from the BPF program. Then
> call into bpf_warn_invalid_xdp_action(act), and fall-through
> to hit the tracepoint at trace_xdp_exception() which is also
> triggered by XDP_ABORTED usually. So when that happens we do
> complain loudly and call a tracepoint already. We should probably
> tweak the bpf_warn_invalid_xdp_action() message a little to make
> it clear that the action could also just be unsupported by the
> driver instead of being illegal.

Yes. drivers not implementing XDP_REDIRECT will cause a tracepoint
trace_xdp_exception() to be called for its _own_ packets.

But it will still setup and leave map and map_owner pointer dangling.
Another NIC can load an xdp_prog that return XDP_REDIRECT, which will hit
above if-statement, and its packets will disappear, without getting
recorded by a tracepoint (thus hard to debug!).

The fundamental point is that tracepoints is the way we choose to
handle debugging XDP programs.  Thus, we must trigger a tracepoint when
a packet gets dropped.  Even in this unlikely case.
Daniel Borkmann Sept. 8, 2017, 12:34 p.m. UTC | #5
On 09/08/2017 01:52 PM, Jesper Dangaard Brouer wrote:
> On Fri, 08 Sep 2017 12:34:28 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 09/08/2017 07:06 AM, Jesper Dangaard Brouer wrote:
>>> On Fri,  8 Sep 2017 00:14:51 +0200
>>> Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>
>>>> +	/* This is really only caused by a deliberately crappy
>>>> +	 * BPF program, normally we would never hit that case,
>>>> +	 * so no need to inform someone via tracepoints either,
>>>> +	 * just bail out.
>>>> +	 */
>>>> +	if (unlikely(map_owner != xdp_prog))
>>>> +		return -EINVAL;
>>>
>>> IMHO we do need to call the tracepoint here.  It is not just crappy
>>> BPF-progs that cause this situation, it is also drivers not implementing
>>> XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
>>> operates at, tracepoints are the only way users can runtime troubleshoot
>>> their XDP programs.
>>
>> Drivers not implementing XDP_REDIRECT don't even get there in
>> the first place. What they will do is to hit the 'default' case
>> when they check for the action code from the BPF program. Then
>> call into bpf_warn_invalid_xdp_action(act), and fall-through
>> to hit the tracepoint at trace_xdp_exception() which is also
>> triggered by XDP_ABORTED usually. So when that happens we do
>> complain loudly and call a tracepoint already. We should probably
>> tweak the bpf_warn_invalid_xdp_action() message a little to make
>> it clear that the action could also just be unsupported by the
>> driver instead of being illegal.
>
> Yes. drivers not implementing XDP_REDIRECT will cause a tracepoint
> trace_xdp_exception() to be called for its _own_ packets.

Yep, plus a big one time warning for the case a user doesn't
look at tracepoints initially.

> But it will still setup and leave map and map_owner pointer dangling.
> Another NIC can load an xdp_prog that return XDP_REDIRECT, which will hit
> above if-statement, and its packets will disappear, without getting
> recorded by a tracepoint (thus hard to debug!).

If a user wants to reproduce this exact error, he would need
to go and reload the program on the driver not supporting the
XDP_REDIRECT in the first place, and then reload his buggy program
on the other driver supporting XDP_REDIRECT but w/o having called
bpf_xdp_redirect_map(), so exactly once on the switch from one
driver to another with this misuse, any subsequent packets will
trigger _trace_xdp_redirect_err(), same way as if the buggy
program was loaded to the 2nd driver from the beginning since
the map and ifindex etc will be zero, hence my comment on this.
Jesper Dangaard Brouer Sept. 8, 2017, 1:07 p.m. UTC | #6
On Fri, 08 Sep 2017 14:34:13 +0200
Daniel Borkmann <daniel@iogearbox.net> wrote:

> On 09/08/2017 01:52 PM, Jesper Dangaard Brouer wrote:
> > On Fri, 08 Sep 2017 12:34:28 +0200 Daniel Borkmann <daniel@iogearbox.net> wrote:  
> >> On 09/08/2017 07:06 AM, Jesper Dangaard Brouer wrote:  
> >>> On Fri,  8 Sep 2017 00:14:51 +0200
> >>> Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>  
> >>>> +	/* This is really only caused by a deliberately crappy
> >>>> +	 * BPF program, normally we would never hit that case,
> >>>> +	 * so no need to inform someone via tracepoints either,
> >>>> +	 * just bail out.
> >>>> +	 */
> >>>> +	if (unlikely(map_owner != xdp_prog))
> >>>> +		return -EINVAL;  
> >>>
> >>> IMHO we do need to call the tracepoint here.  It is not just crappy
> >>> BPF-progs that cause this situation, it is also drivers not implementing
> >>> XDP_REDIRECT yet (which is all but ixgbe).  Due to the level XDP
> >>> operates at, tracepoints are the only way users can runtime troubleshoot
> >>> their XDP programs.  
> >>
> >> Drivers not implementing XDP_REDIRECT don't even get there in
> >> the first place. What they will do is to hit the 'default' case
> >> when they check for the action code from the BPF program. Then
> >> call into bpf_warn_invalid_xdp_action(act), and fall-through
> >> to hit the tracepoint at trace_xdp_exception() which is also
> >> triggered by XDP_ABORTED usually. So when that happens we do
> >> complain loudly and call a tracepoint already. We should probably
> >> tweak the bpf_warn_invalid_xdp_action() message a little to make
> >> it clear that the action could also just be unsupported by the
> >> driver instead of being illegal.  
> >
> > Yes. drivers not implementing XDP_REDIRECT will cause a tracepoint
> > trace_xdp_exception() to be called for its _own_ packets.  
> 
> Yep, plus a big one time warning for the case a user doesn't
> look at tracepoints initially.
> 
> > But it will still setup and leave map and map_owner pointer dangling.
> > Another NIC can load an xdp_prog that return XDP_REDIRECT, which will hit
> > above if-statement, and its packets will disappear, without getting
> > recorded by a tracepoint (thus hard to debug!).  
> 
> If a user wants to reproduce this exact error, he would need
> to go and reload the program on the driver not supporting the
> XDP_REDIRECT in the first place, and then reload his buggy program
> on the other driver supporting XDP_REDIRECT but w/o having called
> bpf_xdp_redirect_map(), so exactly once on the switch from one
> driver to another with this misuse, any subsequent packets will
> trigger _trace_xdp_redirect_err(), same way as if the buggy
> program was loaded to the 2nd driver from the beginning since
> the map and ifindex etc will be zero, hence my comment on this.

We can agree that the second program that experience the side-effect is
also buggy, as just returning XDP_REDIRECT without calling
bpf_xdp_redirect_map() or bpf_xdp_redirect(), is a bug in the bpf
program.  Thus, the comment about a "deliberately crappy BPF program"
is not wrong.

You don't have to load and unload xdp programs.  My test is simply
having two XDP programs running.  1. xdp_redirect_map on mlx5 which
doesn't implement XDP_REDIRECT, and 2. a "deliberately crappy BPF
program" on ixgbe that just returns XDP_REDIRECT.

In below output I've used -EFAULT == -14 to capture this situation
happening:

     ksoftirqd/3    28 [003]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
         swapper     0 [005]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/0     7 [000]  3437.829882:        xdp:xdp_exception: prog_id=5 action=REDIRECT ifindex=7
     ksoftirqd/4    34 [004]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/2    22 [002]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/3    28 [003]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
         swapper     0 [005]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
         swapper     0 [005]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/3    28 [003]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/2    22 [002]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/4    34 [004]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/3    28 [003]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/2    22 [002]  3437.829882:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/0     7 [000]  3437.829882: xdp:xdp_redirect_map_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-14 map_id=5 map_index=0
         swapper     0 [005]  3437.829883:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22
     ksoftirqd/3    28 [003]  3437.829883:     xdp:xdp_redirect_err: prog_id=3 action=REDIRECT ifindex=2 to_ifindex=0 err=-22

And I can see I made a mistake and dereference the map_id ;-)

BTW, just to make it clear, I love the rest of the patch.  And I love
how you solved this.  Cool trick. You also closed a hole where someone
could set the map in one bpf_prog and cause the next bpf program to
forward using this map (this could be a policy violation).
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d690c7d..477b693 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4203,6 +4203,22 @@  static int fixup_bpf_calls(struct bpf_verifier_env *env)
 			continue;
 		}
 
+		if (insn->imm == BPF_FUNC_redirect_map) {
+			u64 addr = (unsigned long)prog;
+			struct bpf_insn r4_ld[] = {
+				BPF_LD_IMM64(BPF_REG_4, addr),
+				*insn,
+			};
+			cnt = ARRAY_SIZE(r4_ld);
+
+			new_prog = bpf_patch_insn_data(env, i + delta, r4_ld, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+		}
 patch_call_imm:
 		fn = prog->aux->ops->get_func_proto(insn->imm);
 		/* all functions that have prototype and verifier allowed
diff --git a/net/core/filter.c b/net/core/filter.c
index 5912c73..0848df2 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1794,6 +1794,7 @@  struct redirect_info {
 	u32 flags;
 	struct bpf_map *map;
 	struct bpf_map *map_to_flush;
+	const struct bpf_prog *map_owner;
 };
 
 static DEFINE_PER_CPU(struct redirect_info, redirect_info);
@@ -1807,7 +1808,6 @@  struct redirect_info {
 
 	ri->ifindex = ifindex;
 	ri->flags = flags;
-	ri->map = NULL;
 
 	return TC_ACT_REDIRECT;
 }
@@ -2504,6 +2504,7 @@  static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 			       struct bpf_prog *xdp_prog)
 {
 	struct redirect_info *ri = this_cpu_ptr(&redirect_info);
+	const struct bpf_prog *map_owner = ri->map_owner;
 	struct bpf_map *map = ri->map;
 	u32 index = ri->ifindex;
 	struct net_device *fwd;
@@ -2511,6 +2512,15 @@  static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 
 	ri->ifindex = 0;
 	ri->map = NULL;
+	ri->map_owner = NULL;
+
+	/* This is really only caused by a deliberately crappy
+	 * BPF program, normally we would never hit that case,
+	 * so no need to inform someone via tracepoints either,
+	 * just bail out.
+	 */
+	if (unlikely(map_owner != xdp_prog))
+		return -EINVAL;
 
 	fwd = __dev_map_lookup_elem(map, index);
 	if (!fwd) {
@@ -2607,6 +2617,8 @@  int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 
 	ri->ifindex = ifindex;
 	ri->flags = flags;
+	ri->map = NULL;
+	ri->map_owner = NULL;
 
 	return XDP_REDIRECT;
 }
@@ -2619,7 +2631,8 @@  int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 	.arg2_type      = ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, u64, flags)
+BPF_CALL_4(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, u64, flags,
+	   const struct bpf_prog *, map_owner)
 {
 	struct redirect_info *ri = this_cpu_ptr(&redirect_info);
 
@@ -2629,10 +2642,14 @@  int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,
 	ri->ifindex = ifindex;
 	ri->flags = flags;
 	ri->map = map;
+	ri->map_owner = map_owner;
 
 	return XDP_REDIRECT;
 }
 
+/* Note, arg4 is hidden from users and populated by the verifier
+ * with the right pointer.
+ */
 static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {
 	.func           = bpf_xdp_redirect_map,
 	.gpl_only       = false,