Message ID | 20190615191225.2409862-1-ast@kernel.org |
---|---|
Headers | show |
Series | bpf: bounded loops and other features | expand |
On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote: > > v2->v3: fixed issues in backtracking pointed out by Andrii. > The next step is to add a lot more tests for backtracking. > Tests would be great, verifier complexity is at the level, where it's very easy to miss issues. Was fuzzying approach ever discussed for BPF verifier? I.e., have a fuzzer to generate both legal and illegal random small programs. Then re-implement verifier as user-level program with straightforward recursive exhaustive verification (so no state pruning logic, no precise/coarse, etc, just register/stack state tracking) of all possible branches. If kernel verifier's verdict differs from user-level verifier's verdict - flag that as a test case and figure out why they differ. Obviously that would work well only for small programs, but that should be a good first step already. In addition, if this is done, that user-land verifier can be a HUGE help to BPF application developers, as libbpf would (potentially) be able to generate better error messages using it as well. > v1->v2: addressed Andrii's feedback. > > this patch set introduces verifier support for bounded loops and > adds several other improvements. > Ideally they would be introduced one at a time, > but to support bounded loop the verifier needs to 'step back' > in the patch 1. That patch introduces tracking of spill/fill > of constants through the stack. Though it's a useful feature > it hurts cilium tests. > Patch 3 introduces another feature by extending is_branch_taken > logic to 'if rX op rY' conditions. This feature is also > necessary to support bounded loops. > Then patch 4 adds support for the loops while adding > key heuristics with jmp_processed. > Introduction of parentage chain of verifier states in patch 4 > allows patch 9 to add backtracking of precise scalar registers > which finally resolves degradation from patch 1. > > The end result is much faster verifier for existing programs > and new support for loops. > See patch 8 for many kinds of loops that are now validated. > Patch 9 is the most tricky one and could be rewritten with > a different algorithm in the future. > > Alexei Starovoitov (9): > bpf: track spill/fill of constants > selftests/bpf: fix tests due to const spill/fill > bpf: extend is_branch_taken to registers > bpf: introduce bounded loops > bpf: fix callees pruning callers > selftests/bpf: fix tests > selftests/bpf: add basic verifier tests for loops > selftests/bpf: add realistic loop tests > bpf: precise scalar_value tracking > > include/linux/bpf_verifier.h | 69 +- > kernel/bpf/verifier.c | 767 ++++++++++++++++-- > .../bpf/prog_tests/bpf_verif_scale.c | 67 +- > tools/testing/selftests/bpf/progs/loop1.c | 28 + > tools/testing/selftests/bpf/progs/loop2.c | 28 + > tools/testing/selftests/bpf/progs/loop3.c | 22 + > tools/testing/selftests/bpf/progs/pyperf.h | 6 +- > tools/testing/selftests/bpf/progs/pyperf600.c | 9 + > .../selftests/bpf/progs/pyperf600_nounroll.c | 8 + > .../testing/selftests/bpf/progs/strobemeta.c | 10 + > .../testing/selftests/bpf/progs/strobemeta.h | 528 ++++++++++++ > .../bpf/progs/strobemeta_nounroll1.c | 9 + > .../bpf/progs/strobemeta_nounroll2.c | 9 + > .../selftests/bpf/progs/test_seg6_loop.c | 261 ++++++ > .../selftests/bpf/progs/test_sysctl_loop1.c | 71 ++ > .../selftests/bpf/progs/test_sysctl_loop2.c | 72 ++ > .../selftests/bpf/progs/test_xdp_loop.c | 231 ++++++ > tools/testing/selftests/bpf/test_verifier.c | 11 +- > tools/testing/selftests/bpf/verifier/calls.c | 22 +- > tools/testing/selftests/bpf/verifier/cfg.c | 11 +- > .../bpf/verifier/direct_packet_access.c | 3 +- > .../bpf/verifier/helper_access_var_len.c | 28 +- > tools/testing/selftests/bpf/verifier/loops1.c | 161 ++++ > 23 files changed, 2317 insertions(+), 114 deletions(-) > create mode 100644 tools/testing/selftests/bpf/progs/loop1.c > create mode 100644 tools/testing/selftests/bpf/progs/loop2.c > create mode 100644 tools/testing/selftests/bpf/progs/loop3.c > create mode 100644 tools/testing/selftests/bpf/progs/pyperf600.c > create mode 100644 tools/testing/selftests/bpf/progs/pyperf600_nounroll.c > create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.c > create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.h > create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll1.c > create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll2.c > create mode 100644 tools/testing/selftests/bpf/progs/test_seg6_loop.c > create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop1.c > create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop2.c > create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_loop.c > create mode 100644 tools/testing/selftests/bpf/verifier/loops1.c > > -- > 2.20.0 >
On 6/17/19 9:39 AM, Andrii Nakryiko wrote: > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote: >> >> v2->v3: fixed issues in backtracking pointed out by Andrii. >> The next step is to add a lot more tests for backtracking. >> > > Tests would be great, verifier complexity is at the level, where it's > very easy to miss issues. > > Was fuzzying approach ever discussed for BPF verifier? I.e., have a > fuzzer to generate both legal and illegal random small programs. Then > re-implement verifier as user-level program with straightforward > recursive exhaustive verification (so no state pruning logic, no > precise/coarse, etc, just register/stack state tracking) of all > possible branches. If kernel verifier's verdict differs from > user-level verifier's verdict - flag that as a test case and figure > out why they differ. Obviously that would work well only for small > programs, but that should be a good first step already. > > In addition, if this is done, that user-land verifier can be a HUGE > help to BPF application developers, as libbpf would (potentially) be > able to generate better error messages using it as well. In theory that sounds good, but doesn't work in practice. The kernel verifier keeps changing faster than user space can catch up. It's also relying on loaded maps and all sorts of callbacks that check context, allowed helpers, maps, combinations of them from all over the kernel. The last effort to build kernel verifier as-is into .o and link with kmalloc/map wrappers in user space was here: https://github.com/iovisor/bpf-fuzzer It was fuzzing the verifier and was able to find few minor bugs. But it quickly bit rotted. Folks brought up in the past the idea to collect user space verifiers from different kernels, so that user space tooling can check whether particular program will load on a set of kernels without need to run them in VMs. Even if such feature existed today it won't really solve this production headache, since all kernels prior to today will not be covered. I think syzbot is still generating bpf programs. iirc it found one bug in the past in the verifier core. I think the only way to make verifier more robust is to keep adding new test cases manually. Most interesting bugs we found by humans. Another approach to 'better error message' that was considered in the past was to teach llvm to recognize things that verifier will reject and let llvm warn on them. But it's also not practical. We had llvm error on calls. Then we added them to the verifier and had to change llvm. If we had llvm error on loops, now we'd need to change it. imo it's better to let llvm handle everything.
On Mon, Jun 17, 2019 at 11:58 AM Alexei Starovoitov <ast@fb.com> wrote: > > On 6/17/19 9:39 AM, Andrii Nakryiko wrote: > > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote: > >> > >> v2->v3: fixed issues in backtracking pointed out by Andrii. > >> The next step is to add a lot more tests for backtracking. > >> > > > > Tests would be great, verifier complexity is at the level, where it's > > very easy to miss issues. > > > > Was fuzzying approach ever discussed for BPF verifier? I.e., have a > > fuzzer to generate both legal and illegal random small programs. Then > > re-implement verifier as user-level program with straightforward > > recursive exhaustive verification (so no state pruning logic, no > > precise/coarse, etc, just register/stack state tracking) of all > > possible branches. If kernel verifier's verdict differs from > > user-level verifier's verdict - flag that as a test case and figure > > out why they differ. Obviously that would work well only for small > > programs, but that should be a good first step already. > > > > In addition, if this is done, that user-land verifier can be a HUGE > > help to BPF application developers, as libbpf would (potentially) be > > able to generate better error messages using it as well. > > In theory that sounds good, but doesn't work in practice. > The kernel verifier keeps changing faster than user space can catch up. > It's also relying on loaded maps and all sorts of callbacks that > check context, allowed helpers, maps, combinations of them from all > over the kernel. > The last effort to build kernel verifier as-is into .o and link > with kmalloc/map wrappers in user space was here: > https://github.com/iovisor/bpf-fuzzer > It was fuzzing the verifier and was able to find few minor bugs. > But it quickly bit rotted. > > Folks brought up in the past the idea to collect user space > verifiers from different kernels, so that user space tooling can > check whether particular program will load on a set of kernels > without need to run them in VMs. > Even if such feature existed today it won't really solve this production > headache, since all kernels prior to today will not be covered. > > I think syzbot is still generating bpf programs. iirc it found > one bug in the past in the verifier core. > I think the only way to make verifier more robust is to keep > adding new test cases manually. > Most interesting bugs we found by humans. > > Another approach to 'better error message' that was considered > in the past was to teach llvm to recognize things that verifier > will reject and let llvm warn on them. > But it's also not practical. We had llvm error on calls. > Then we added them to the verifier and had to change llvm. > If we had llvm error on loops, now we'd need to change it. > imo it's better to let llvm handle everything. That all makes sense. Thanks for elaborate explanation!
On Mon, Jun 17, 2019 at 06:57PM, Alexei Starovoitov wrote: > On 6/17/19 9:39 AM, Andrii Nakryiko wrote: > > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote: > >> > >> v2->v3: fixed issues in backtracking pointed out by Andrii. > >> The next step is to add a lot more tests for backtracking. > >> > > > > Tests would be great, verifier complexity is at the level, where it's > > very easy to miss issues. > > > > Was fuzzying approach ever discussed for BPF verifier? I.e., have a > > fuzzer to generate both legal and illegal random small programs. Then > > re-implement verifier as user-level program with straightforward > > recursive exhaustive verification (so no state pruning logic, no > > precise/coarse, etc, just register/stack state tracking) of all > > possible branches. If kernel verifier's verdict differs from > > user-level verifier's verdict - flag that as a test case and figure > > out why they differ. Obviously that would work well only for small > > programs, but that should be a good first step already. > > > > In addition, if this is done, that user-land verifier can be a HUGE > > help to BPF application developers, as libbpf would (potentially) be > > able to generate better error messages using it as well. > > In theory that sounds good, but doesn't work in practice. > The kernel verifier keeps changing faster than user space can catch up. > It's also relying on loaded maps and all sorts of callbacks that > check context, allowed helpers, maps, combinations of them from all > over the kernel. > The last effort to build kernel verifier as-is into .o and link > with kmalloc/map wrappers in user space was here: > https://github.com/iovisor/bpf-fuzzer > It was fuzzing the verifier and was able to find few minor bugs. > But it quickly bit rotted. We've been working (with Xiao Han, cc'ed) on an update of bpf-fuzzer. It helped us find a few issues in the verifier (mostly thanks to warn_on("verifier bug")s). I'll probably send a patchset to the main repository in the next few weeks. Updating the glue code (userspace wrappers) is okay so far, as long as done regularly. [...] > > I think syzbot is still generating bpf programs. iirc it found > one bug in the past in the verifier core. syzkaller's template description of BPF programs is very limited. It produces valid BPF instructions, but loaded programs make little sense, are trivial or invalid, and don't go very far in the verifier. We probably can't rely on syzkaller to be effective for the verifier. > I think the only way to make verifier more robust is to keep > adding new test cases manually. > Most interesting bugs we found by humans. Tests and reviews are definitely the most effective ways to find bugs, despite the above two fuzzers. The verifier is complex enough that the random approach of fuzzers has a hard time covering the code. Paul