Message ID | 20181210193512.20463-1-jakub.kicinski@netronome.com |
---|---|
State | Changes Requested, archived |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [bpf] bpf: verifier: make sure callees don't prune with caller differences | expand |
On Mon, 10 Dec 2018 11:35:12 -0800, Jakub Kicinski wrote: > Currently for liveness and state pruning the register parentage > chains don't include states of the callee. This makes some sense > as the callee can't access those registers. However, this means > that READs done after the callee returns will not propagate into > the states of the callee. Callee will then perform pruning > disregarding differences in caller state. > > Example: > > 0: (85) call bpf_user_rnd_u32 > 1: (b7) r8 = 0 > 2: (55) if r0 != 0x0 goto pc+1 > 3: (b7) r8 = 1 > 4: (bf) r1 = r8 > 5: (85) call pc+4 > 6: (15) if r8 == 0x1 goto pc+1 > 7: (05) *(u64 *)(r9 - 8) = r3 > 8: (b7) r0 = 0 > 9: (95) exit > > 10: (15) if r1 == 0x0 goto pc+0 > 11: (95) exit > > Here we acquire unknown state with call to get_random() [1]. Then > we store this random state in r8 (either 0 or 1) [1 - 3], and make > a call on line 5. Callee does nothing but a trivial conditional > jump (to create a pruning point). Upon return caller checks the > state of r8 and either performs an unsafe read or not. > > Verifier will first explore the path with r8 == 1, creating a pruning > point at [11]. The parentage chain for r8 will include only callers > states so once verifier reaches [6] it will mark liveness only on states > in the caller, and not [11]. Now when verifier walks the paths with > r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not > propagated there it will prune the walk entirely (stop walking > the entire program, not just the callee). Since [6] was never walked > with r8 == 0, [7] will be considered dead and replaced with "goto -1" > causing hang at runtime. > > This patch weaves the callee's explored states onto the callers > parentage chain. > > v1: don't unnecessarily link caller saved regs (Jiong) > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > Reported-by: David Beckett <david.beckett@netronome.com> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > Reviewed-by: Jiong Wang <jiong.wang@netronome.com> > --- > resend with netdev included And CC Ed, sorry.. > kernel/bpf/verifier.c | 10 +++++++- > tools/testing/selftests/bpf/test_verifier.c | 28 +++++++++++++++++++++ > 2 files changed, 37 insertions(+), 1 deletion(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index fc760d00a38c..60d57d9d3663 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -5114,11 +5114,19 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) > for (i = 0; i < BPF_REG_FP; i++) > cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE; > > - /* all stack frames are accessible from callee, clear them all */ > + /* connect new state to parentage chain: > + * - all stack frames are accessible from callee > + * - even though other stack frames' registers are not accessible > + * liveness must propagate through the callees' states otherwise > + * not knowing the liveness callee may prune caller > + */ > for (j = 0; j <= cur->curframe; j++) { > struct bpf_func_state *frame = cur->frame[j]; > struct bpf_func_state *newframe = new->frame[j]; > > + for (i = BPF_REG_6; i < BPF_REG_FP; i++) > + frame->regs[i].parent = &newframe->regs[i]; > + > for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) { > frame->stack[i].spilled_ptr.live = REG_LIVE_NONE; > frame->stack[i].spilled_ptr.parent = > diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c > index df6f751cc1e8..373611ae9f52 100644 > --- a/tools/testing/selftests/bpf/test_verifier.c > +++ b/tools/testing/selftests/bpf/test_verifier.c > @@ -13915,6 +13915,34 @@ static struct bpf_test tests[] = { > .result_unpriv = REJECT, > .result = ACCEPT, > }, > + { > + "calls: cross frame pruning", > + .insns = { > + /* r8 = !!random(); > + * call pruner() > + * if (r8) > + * do something bad; > + */ > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, > + BPF_FUNC_get_prandom_u32), > + BPF_MOV64_IMM(BPF_REG_8, 0), > + BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), > + BPF_MOV64_IMM(BPF_REG_8, 1), > + BPF_MOV64_REG(BPF_REG_1, BPF_REG_8), > + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), > + BPF_JMP_IMM(BPF_JEQ, BPF_REG_8, 1, 1), > + BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0), > + BPF_MOV64_IMM(BPF_REG_0, 0), > + BPF_EXIT_INSN(), > + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 0), > + BPF_EXIT_INSN(), > + }, > + .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, > + .errstr_unpriv = "function calls to other bpf functions are allowed for root only", > + .result_unpriv = REJECT, > + .errstr = "!read_ok", > + .result = REJECT, > + }, > }; > > static int probe_filter_length(const struct bpf_insn *fp)
On Mon, Dec 10, 2018 at 11:35:12AM -0800, Jakub Kicinski wrote: > Currently for liveness and state pruning the register parentage > chains don't include states of the callee. This makes some sense > as the callee can't access those registers. However, this means > that READs done after the callee returns will not propagate into > the states of the callee. Callee will then perform pruning > disregarding differences in caller state. > > Example: > > 0: (85) call bpf_user_rnd_u32 > 1: (b7) r8 = 0 > 2: (55) if r0 != 0x0 goto pc+1 > 3: (b7) r8 = 1 > 4: (bf) r1 = r8 > 5: (85) call pc+4 > 6: (15) if r8 == 0x1 goto pc+1 > 7: (05) *(u64 *)(r9 - 8) = r3 > 8: (b7) r0 = 0 > 9: (95) exit > > 10: (15) if r1 == 0x0 goto pc+0 > 11: (95) exit > > Here we acquire unknown state with call to get_random() [1]. Then > we store this random state in r8 (either 0 or 1) [1 - 3], and make > a call on line 5. Callee does nothing but a trivial conditional > jump (to create a pruning point). Upon return caller checks the > state of r8 and either performs an unsafe read or not. > > Verifier will first explore the path with r8 == 1, creating a pruning > point at [11]. The parentage chain for r8 will include only callers > states so once verifier reaches [6] it will mark liveness only on states > in the caller, and not [11]. Now when verifier walks the paths with > r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not > propagated there it will prune the walk entirely (stop walking > the entire program, not just the callee). Since [6] was never walked > with r8 == 0, [7] will be considered dead and replaced with "goto -1" > causing hang at runtime. > > This patch weaves the callee's explored states onto the callers > parentage chain. > > v1: don't unnecessarily link caller saved regs (Jiong) > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > Reported-by: David Beckett <david.beckett@netronome.com> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > Reviewed-by: Jiong Wang <jiong.wang@netronome.com> > --- > resend with netdev included > --- > kernel/bpf/verifier.c | 10 +++++++- > tools/testing/selftests/bpf/test_verifier.c | 28 +++++++++++++++++++++ > 2 files changed, 37 insertions(+), 1 deletion(-) > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index fc760d00a38c..60d57d9d3663 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -5114,11 +5114,19 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) > for (i = 0; i < BPF_REG_FP; i++) > cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE; > > - /* all stack frames are accessible from callee, clear them all */ > + /* connect new state to parentage chain: > + * - all stack frames are accessible from callee > + * - even though other stack frames' registers are not accessible > + * liveness must propagate through the callees' states otherwise > + * not knowing the liveness callee may prune caller > + */ > for (j = 0; j <= cur->curframe; j++) { > struct bpf_func_state *frame = cur->frame[j]; > struct bpf_func_state *newframe = new->frame[j]; > > + for (i = BPF_REG_6; i < BPF_REG_FP; i++) > + frame->regs[i].parent = &newframe->regs[i]; > + excellent catch! thanks a lot for the analysis. Only odd bit that parent pointers are connected twice (by first loop just above this hunk and by this new one) How about something like this instead: - for (i = 0; i < BPF_REG_FP; i++) - cur_regs(env)[i].parent = &new->frame[new->curframe]->regs[i]; + for (j = 0; j <= cur->curframe; j++) + for (i = j < cur->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++) + cur->frame[j]->regs[i].parent = &new->frame[j]->regs[i]; with appropriate comments...
On 10/12/18 19:35, Jakub Kicinski wrote: > Currently for liveness and state pruning the register parentage > chains don't include states of the callee. This makes some sense > as the callee can't access those registers. However, this means > that READs done after the callee returns will not propagate into > the states of the callee. Callee will then perform pruning > disregarding differences in caller state. > > Example: > > 0: (85) call bpf_user_rnd_u32 > 1: (b7) r8 = 0 > 2: (55) if r0 != 0x0 goto pc+1 > 3: (b7) r8 = 1 > 4: (bf) r1 = r8 > 5: (85) call pc+4 > 6: (15) if r8 == 0x1 goto pc+1 > 7: (05) *(u64 *)(r9 - 8) = r3 > 8: (b7) r0 = 0 > 9: (95) exit > > 10: (15) if r1 == 0x0 goto pc+0 > 11: (95) exit > > Here we acquire unknown state with call to get_random() [1]. Then > we store this random state in r8 (either 0 or 1) [1 - 3], and make > a call on line 5. Callee does nothing but a trivial conditional > jump (to create a pruning point). Upon return caller checks the > state of r8 and either performs an unsafe read or not. > > Verifier will first explore the path with r8 == 1, creating a pruning > point at [11]. The parentage chain for r8 will include only callers > states so once verifier reaches [6] it will mark liveness only on states > in the caller, and not [11]. Now when verifier walks the paths with > r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not > propagated there it will prune the walk entirely (stop walking > the entire program, not just the callee). Since [6] was never walked > with r8 == 0, [7] will be considered dead and replaced with "goto -1" > causing hang at runtime. > > This patch weaves the callee's explored states onto the callers > parentage chain. Could you describe this in a little more detail? In particular I'm not sure why anything that knows the difference between caller- and callee-saved regs belongs in is_state_visited() rather than check_func_call(). Is that just an optimisation that we can do because we know caller's r1-r5 were scratched by the call? It looks like, without that, the change is "every reg in all frames needs to be parented to the new-state", which is somewhat easier to understand (though I did have to think for quite a long time before it made sense to me why even that was necessary ;-) So please add more to the patch description, and maybe a comment above the loop explaining why it's r6-r9 only; then I'll be happy to Ack. > v1: don't unnecessarily link caller saved regs (Jiong) > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > Reported-by: David Beckett <david.beckett@netronome.com> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
On Wed, 12 Dec 2018 20:28:07 +0000, Edward Cree wrote: > On 10/12/18 19:35, Jakub Kicinski wrote: > > Currently for liveness and state pruning the register parentage > > chains don't include states of the callee. This makes some sense > > as the callee can't access those registers. However, this means > > that READs done after the callee returns will not propagate into > > the states of the callee. Callee will then perform pruning > > disregarding differences in caller state. > > > > Example: > > > > 0: (85) call bpf_user_rnd_u32 > > 1: (b7) r8 = 0 > > 2: (55) if r0 != 0x0 goto pc+1 > > 3: (b7) r8 = 1 > > 4: (bf) r1 = r8 > > 5: (85) call pc+4 > > 6: (15) if r8 == 0x1 goto pc+1 > > 7: (05) *(u64 *)(r9 - 8) = r3 > > 8: (b7) r0 = 0 > > 9: (95) exit > > > > 10: (15) if r1 == 0x0 goto pc+0 > > 11: (95) exit > > > > Here we acquire unknown state with call to get_random() [1]. Then > > we store this random state in r8 (either 0 or 1) [1 - 3], and make > > a call on line 5. Callee does nothing but a trivial conditional > > jump (to create a pruning point). Upon return caller checks the > > state of r8 and either performs an unsafe read or not. > > > > Verifier will first explore the path with r8 == 1, creating a pruning > > point at [11]. The parentage chain for r8 will include only callers > > states so once verifier reaches [6] it will mark liveness only on states > > in the caller, and not [11]. Now when verifier walks the paths with > > r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not > > propagated there it will prune the walk entirely (stop walking > > the entire program, not just the callee). Since [6] was never walked > > with r8 == 0, [7] will be considered dead and replaced with "goto -1" > > causing hang at runtime. > > > > This patch weaves the callee's explored states onto the callers > > parentage chain. > Could you describe this in a little more detail? > > In particular I'm not sure why anything that knows the difference between > caller- and callee-saved regs belongs in is_state_visited() rather than > check_func_call(). I will put those two "diagrams" into the next revision, I think they help (these are rough, placement of SLs on 1 and 4 is not precise but otherwise it's hard to draw): Parentage before: [0] [1] [2] [3] [4] [5] [10] [11] [6] [7] | | ,---|----. | | | sl0: sl0: / sl0: \ sl0: sl0: sl0: fr0: r8 <-- fr0: r8<+--fr0: r8 `fr0: r8 ,fr0: r8<-fr0: r8 \ fr1: r8 <- fr1: r8 / \__________________/ Parentage after: [0] [1] [2] [3] [4] [5] [10] [11] [6] [7] | | | | | | sl0: sl0: sl0: sl0: sl0: sl0: fr0: r8 <-- fr0: r8 <- fr0: r8 <- fr0: r8 <-fr0: r8<-fr0: r8 fr1: r8 <- fr1: r8 The simplest explanation I could come up with is that is_state_visited() connects into parentage chains all live things (regs, and stack). The caller's r6 - r9 are actually alive, they have just been pushed onto the stack by the JIT. And therefore, since they are just live things, it's is_state_visited()'s responsibility to connect them. > Is that just an optimisation that we can do because we know caller's > r1-r5 were scratched by the call? r0 - r5 of the caller will not be pushed to the stack by JITs and therefore those are really dead. It is an optimization. We could've connected them but they are already marked as WRITTEN in check_func_call(). > It looks like, without that, the change is "every reg in all frames > needs to be parented to the new-state", which is somewhat easier to > understand (though I did have to think for quite a long time before > it made sense to me why even that was necessary ;-) Ack, that was my initial patch actually, but Jiong pointed out it's unnecessary and I didn't argue :) r1 - r5 are marked as WRITTEN and UNINIT anyway, and r0 gets its parent from the callees frame (prepare_func_exit() links callers r0 parentage chain to previous states in the callee). > So please add more to the patch description, and maybe a comment above > the loop explaining why it's r6-r9 only; then I'll be happy to Ack. Good point, I will beef up the comment. > > v1: don't unnecessarily link caller saved regs (Jiong) > > > > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)") > > Reported-by: David Beckett <david.beckett@netronome.com> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> > > Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Jakub Kicinski writes: <snip> >> Is that just an optimisation that we can do because we know caller's >> r1-r5 were scratched by the call? (resent, got a bounce back from netdev mailing list for previous email) I had been pondering whether it is really necessary to mark caller saved registers as scratched. I kind of feel the following code inside check_func_call might cause valid program rejected: /* after the call registers r0 - r5 were scratched */ for (i = 0; i < CALLER_SAVED_REGS; i++) { mark_reg_not_init(env, caller->regs, caller_saved[i]); check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK); } Because there is inter-procedure register allocation support in LLVM (-enable-ipra), which could effectively eliminate register save/restore for one caller-saved register across function call if the compiler can prove callee or any other childs on the callgraph doesn't use/clobber this particular caller-saved register. Then the later sequence in caller after the call site could just safely read the caller-saved without restoring it from stack etc. But we are marking all caller-saved as NOT_INIT, such read will be treated as reading from uninitialized value, so the program will be rejected. -enable-ipra is disabled at default at the moment. Regards, Jiong > r0 - r5 of the caller will not be pushed to the stack by JITs and > therefore those are really dead. It is an optimization. We could've > connected them but they are already marked as WRITTEN in > check_func_call(). > >> It looks like, without that, the change is "every reg in all frames >> needs to be parented to the new-state", which is somewhat easier to >> understand (though I did have to think for quite a long time before >> it made sense to me why even that was necessary ;-) > > Ack, that was my initial patch actually, but Jiong pointed out it's > unnecessary and I didn't argue :) r1 - r5 are marked as WRITTEN and > UNINIT anyway, and r0 gets its parent from the callees frame > (prepare_func_exit() links callers r0 parentage chain to previous > states in the callee).
On 13/12/18 10:52, Jiong Wang wrote: > Because there is inter-procedure register allocation support in LLVM > (-enable-ipra), which could effectively eliminate register save/restore for > one caller-saved register across function call if the compiler can prove > callee or any other childs on the callgraph doesn't use/clobber this > particular caller-saved register. Then the later sequence in caller after > the call site could just safely read the caller-saved without restoring it > from stack etc. But we are marking all caller-saved as NOT_INIT, such read > will be treated as reading from uninitialized value, so the program will be > rejected. I think "all r1-r5 are clobbered on call" is part of the eBPF ISA. In principle, JITs might use them if they have some fixup they need to do at CALL or RET time. And of course verifier can rewrite insn sequences for various reasons in ways the compiler doesn't know about. So I think you have to keep IPRA disabled, sorry. -Ed
On Thu, Dec 13, 2018 at 04:40:06PM +0000, Edward Cree wrote: > On 13/12/18 10:52, Jiong Wang wrote: > > Because there is inter-procedure register allocation support in LLVM > > (-enable-ipra), which could effectively eliminate register save/restore for > > one caller-saved register across function call if the compiler can prove > > callee or any other childs on the callgraph doesn't use/clobber this > > particular caller-saved register. Then the later sequence in caller after > > the call site could just safely read the caller-saved without restoring it > > from stack etc. But we are marking all caller-saved as NOT_INIT, such read > > will be treated as reading from uninitialized value, so the program will be > > rejected. > I think "all r1-r5 are clobbered on call" is part of the eBPF ISA. In right. it's part of the calling convention. afaik ipra is enabled by default, but it's a nop for bpf backend which doesn't allow leaf function. The backend always requests a frame to be setup from llvm core which effectively disables ipra. When we've been discussing on and off the idea to teach verifier to recognize which registers are used, so JITs can optimize prologue/epilogue. I think it will help, but probably not a lot at this point, since bpf progs are still tiny.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index fc760d00a38c..60d57d9d3663 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5114,11 +5114,19 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx) for (i = 0; i < BPF_REG_FP; i++) cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE; - /* all stack frames are accessible from callee, clear them all */ + /* connect new state to parentage chain: + * - all stack frames are accessible from callee + * - even though other stack frames' registers are not accessible + * liveness must propagate through the callees' states otherwise + * not knowing the liveness callee may prune caller + */ for (j = 0; j <= cur->curframe; j++) { struct bpf_func_state *frame = cur->frame[j]; struct bpf_func_state *newframe = new->frame[j]; + for (i = BPF_REG_6; i < BPF_REG_FP; i++) + frame->regs[i].parent = &newframe->regs[i]; + for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) { frame->stack[i].spilled_ptr.live = REG_LIVE_NONE; frame->stack[i].spilled_ptr.parent = diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index df6f751cc1e8..373611ae9f52 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -13915,6 +13915,34 @@ static struct bpf_test tests[] = { .result_unpriv = REJECT, .result = ACCEPT, }, + { + "calls: cross frame pruning", + .insns = { + /* r8 = !!random(); + * call pruner() + * if (r8) + * do something bad; + */ + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, + BPF_FUNC_get_prandom_u32), + BPF_MOV64_IMM(BPF_REG_8, 0), + BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), + BPF_MOV64_IMM(BPF_REG_8, 1), + BPF_MOV64_REG(BPF_REG_1, BPF_REG_8), + BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_8, 1, 1), + BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, + .errstr_unpriv = "function calls to other bpf functions are allowed for root only", + .result_unpriv = REJECT, + .errstr = "!read_ok", + .result = REJECT, + }, }; static int probe_filter_length(const struct bpf_insn *fp)