mbox series

[v3,bpf-next,0/9] bpf: bounded loops and other features

Message ID 20190615191225.2409862-1-ast@kernel.org
Headers show
Series bpf: bounded loops and other features | expand

Message

Alexei Starovoitov June 15, 2019, 7:12 p.m. UTC
v2->v3: fixed issues in backtracking pointed out by Andrii.
The next step is to add a lot more tests for backtracking.

v1->v2: addressed Andrii's feedback.

this patch set introduces verifier support for bounded loops and
adds several other improvements.
Ideally they would be introduced one at a time,
but to support bounded loop the verifier needs to 'step back'
in the patch 1. That patch introduces tracking of spill/fill
of constants through the stack. Though it's a useful feature
it hurts cilium tests.
Patch 3 introduces another feature by extending is_branch_taken
logic to 'if rX op rY' conditions. This feature is also
necessary to support bounded loops.
Then patch 4 adds support for the loops while adding
key heuristics with jmp_processed.
Introduction of parentage chain of verifier states in patch 4
allows patch 9 to add backtracking of precise scalar registers
which finally resolves degradation from patch 1.

The end result is much faster verifier for existing programs
and new support for loops.
See patch 8 for many kinds of loops that are now validated.
Patch 9 is the most tricky one and could be rewritten with
a different algorithm in the future.

Alexei Starovoitov (9):
  bpf: track spill/fill of constants
  selftests/bpf: fix tests due to const spill/fill
  bpf: extend is_branch_taken to registers
  bpf: introduce bounded loops
  bpf: fix callees pruning callers
  selftests/bpf: fix tests
  selftests/bpf: add basic verifier tests for loops
  selftests/bpf: add realistic loop tests
  bpf: precise scalar_value tracking

 include/linux/bpf_verifier.h                  |  69 +-
 kernel/bpf/verifier.c                         | 767 ++++++++++++++++--
 .../bpf/prog_tests/bpf_verif_scale.c          |  67 +-
 tools/testing/selftests/bpf/progs/loop1.c     |  28 +
 tools/testing/selftests/bpf/progs/loop2.c     |  28 +
 tools/testing/selftests/bpf/progs/loop3.c     |  22 +
 tools/testing/selftests/bpf/progs/pyperf.h    |   6 +-
 tools/testing/selftests/bpf/progs/pyperf600.c |   9 +
 .../selftests/bpf/progs/pyperf600_nounroll.c  |   8 +
 .../testing/selftests/bpf/progs/strobemeta.c  |  10 +
 .../testing/selftests/bpf/progs/strobemeta.h  | 528 ++++++++++++
 .../bpf/progs/strobemeta_nounroll1.c          |   9 +
 .../bpf/progs/strobemeta_nounroll2.c          |   9 +
 .../selftests/bpf/progs/test_seg6_loop.c      | 261 ++++++
 .../selftests/bpf/progs/test_sysctl_loop1.c   |  71 ++
 .../selftests/bpf/progs/test_sysctl_loop2.c   |  72 ++
 .../selftests/bpf/progs/test_xdp_loop.c       | 231 ++++++
 tools/testing/selftests/bpf/test_verifier.c   |  11 +-
 tools/testing/selftests/bpf/verifier/calls.c  |  22 +-
 tools/testing/selftests/bpf/verifier/cfg.c    |  11 +-
 .../bpf/verifier/direct_packet_access.c       |   3 +-
 .../bpf/verifier/helper_access_var_len.c      |  28 +-
 tools/testing/selftests/bpf/verifier/loops1.c | 161 ++++
 23 files changed, 2317 insertions(+), 114 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/loop1.c
 create mode 100644 tools/testing/selftests/bpf/progs/loop2.c
 create mode 100644 tools/testing/selftests/bpf/progs/loop3.c
 create mode 100644 tools/testing/selftests/bpf/progs/pyperf600.c
 create mode 100644 tools/testing/selftests/bpf/progs/pyperf600_nounroll.c
 create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.c
 create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.h
 create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll1.c
 create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll2.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_seg6_loop.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop2.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_loop.c
 create mode 100644 tools/testing/selftests/bpf/verifier/loops1.c

Comments

Andrii Nakryiko June 17, 2019, 4:39 p.m. UTC | #1
On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> v2->v3: fixed issues in backtracking pointed out by Andrii.
> The next step is to add a lot more tests for backtracking.
>

Tests would be great, verifier complexity is at the level, where it's
very easy to miss issues.

Was fuzzying approach ever discussed for BPF verifier? I.e., have a
fuzzer to generate both legal and illegal random small programs. Then
re-implement verifier as user-level program with straightforward
recursive exhaustive verification (so no state pruning logic, no
precise/coarse, etc, just register/stack state tracking) of all
possible branches. If kernel verifier's verdict differs from
user-level verifier's verdict - flag that as a test case and figure
out why they differ. Obviously that would work well only for small
programs, but that should be a good first step already.

In addition, if this is done, that user-land verifier can be a HUGE
help to BPF application developers, as libbpf would (potentially) be
able to generate better error messages using it as well.


> v1->v2: addressed Andrii's feedback.
>
> this patch set introduces verifier support for bounded loops and
> adds several other improvements.
> Ideally they would be introduced one at a time,
> but to support bounded loop the verifier needs to 'step back'
> in the patch 1. That patch introduces tracking of spill/fill
> of constants through the stack. Though it's a useful feature
> it hurts cilium tests.
> Patch 3 introduces another feature by extending is_branch_taken
> logic to 'if rX op rY' conditions. This feature is also
> necessary to support bounded loops.
> Then patch 4 adds support for the loops while adding
> key heuristics with jmp_processed.
> Introduction of parentage chain of verifier states in patch 4
> allows patch 9 to add backtracking of precise scalar registers
> which finally resolves degradation from patch 1.
>
> The end result is much faster verifier for existing programs
> and new support for loops.
> See patch 8 for many kinds of loops that are now validated.
> Patch 9 is the most tricky one and could be rewritten with
> a different algorithm in the future.
>
> Alexei Starovoitov (9):
>   bpf: track spill/fill of constants
>   selftests/bpf: fix tests due to const spill/fill
>   bpf: extend is_branch_taken to registers
>   bpf: introduce bounded loops
>   bpf: fix callees pruning callers
>   selftests/bpf: fix tests
>   selftests/bpf: add basic verifier tests for loops
>   selftests/bpf: add realistic loop tests
>   bpf: precise scalar_value tracking
>
>  include/linux/bpf_verifier.h                  |  69 +-
>  kernel/bpf/verifier.c                         | 767 ++++++++++++++++--
>  .../bpf/prog_tests/bpf_verif_scale.c          |  67 +-
>  tools/testing/selftests/bpf/progs/loop1.c     |  28 +
>  tools/testing/selftests/bpf/progs/loop2.c     |  28 +
>  tools/testing/selftests/bpf/progs/loop3.c     |  22 +
>  tools/testing/selftests/bpf/progs/pyperf.h    |   6 +-
>  tools/testing/selftests/bpf/progs/pyperf600.c |   9 +
>  .../selftests/bpf/progs/pyperf600_nounroll.c  |   8 +
>  .../testing/selftests/bpf/progs/strobemeta.c  |  10 +
>  .../testing/selftests/bpf/progs/strobemeta.h  | 528 ++++++++++++
>  .../bpf/progs/strobemeta_nounroll1.c          |   9 +
>  .../bpf/progs/strobemeta_nounroll2.c          |   9 +
>  .../selftests/bpf/progs/test_seg6_loop.c      | 261 ++++++
>  .../selftests/bpf/progs/test_sysctl_loop1.c   |  71 ++
>  .../selftests/bpf/progs/test_sysctl_loop2.c   |  72 ++
>  .../selftests/bpf/progs/test_xdp_loop.c       | 231 ++++++
>  tools/testing/selftests/bpf/test_verifier.c   |  11 +-
>  tools/testing/selftests/bpf/verifier/calls.c  |  22 +-
>  tools/testing/selftests/bpf/verifier/cfg.c    |  11 +-
>  .../bpf/verifier/direct_packet_access.c       |   3 +-
>  .../bpf/verifier/helper_access_var_len.c      |  28 +-
>  tools/testing/selftests/bpf/verifier/loops1.c | 161 ++++
>  23 files changed, 2317 insertions(+), 114 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/progs/loop1.c
>  create mode 100644 tools/testing/selftests/bpf/progs/loop2.c
>  create mode 100644 tools/testing/selftests/bpf/progs/loop3.c
>  create mode 100644 tools/testing/selftests/bpf/progs/pyperf600.c
>  create mode 100644 tools/testing/selftests/bpf/progs/pyperf600_nounroll.c
>  create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.c
>  create mode 100644 tools/testing/selftests/bpf/progs/strobemeta.h
>  create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll1.c
>  create mode 100644 tools/testing/selftests/bpf/progs/strobemeta_nounroll2.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_seg6_loop.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop1.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_sysctl_loop2.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_loop.c
>  create mode 100644 tools/testing/selftests/bpf/verifier/loops1.c
>
> --
> 2.20.0
>
Alexei Starovoitov June 17, 2019, 6:57 p.m. UTC | #2
On 6/17/19 9:39 AM, Andrii Nakryiko wrote:
> On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote:
>>
>> v2->v3: fixed issues in backtracking pointed out by Andrii.
>> The next step is to add a lot more tests for backtracking.
>>
> 
> Tests would be great, verifier complexity is at the level, where it's
> very easy to miss issues.
> 
> Was fuzzying approach ever discussed for BPF verifier? I.e., have a
> fuzzer to generate both legal and illegal random small programs. Then
> re-implement verifier as user-level program with straightforward
> recursive exhaustive verification (so no state pruning logic, no
> precise/coarse, etc, just register/stack state tracking) of all
> possible branches. If kernel verifier's verdict differs from
> user-level verifier's verdict - flag that as a test case and figure
> out why they differ. Obviously that would work well only for small
> programs, but that should be a good first step already.
> 
> In addition, if this is done, that user-land verifier can be a HUGE
> help to BPF application developers, as libbpf would (potentially) be
> able to generate better error messages using it as well.

In theory that sounds good, but doesn't work in practice.
The kernel verifier keeps changing faster than user space can catch up.
It's also relying on loaded maps and all sorts of callbacks that
check context, allowed helpers, maps, combinations of them from all
over the kernel.
The last effort to build kernel verifier as-is into .o and link
with kmalloc/map wrappers in user space was here:
https://github.com/iovisor/bpf-fuzzer
It was fuzzing the verifier and was able to find few minor bugs.
But it quickly bit rotted.

Folks brought up in the past the idea to collect user space
verifiers from different kernels, so that user space tooling can
check whether particular program will load on a set of kernels
without need to run them in VMs.
Even if such feature existed today it won't really solve this production
headache, since all kernels prior to today will not be covered.

I think syzbot is still generating bpf programs. iirc it found
one bug in the past in the verifier core.
I think the only way to make verifier more robust is to keep
adding new test cases manually.
Most interesting bugs we found by humans.

Another approach to 'better error message' that was considered
in the past was to teach llvm to recognize things that verifier
will reject and let llvm warn on them.
But it's also not practical. We had llvm error on calls.
Then we added them to the verifier and had to change llvm.
If we had llvm error on loops, now we'd need to change it.
imo it's better to let llvm handle everything.
Andrii Nakryiko June 17, 2019, 7:06 p.m. UTC | #3
On Mon, Jun 17, 2019 at 11:58 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 6/17/19 9:39 AM, Andrii Nakryiko wrote:
> > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >>
> >> v2->v3: fixed issues in backtracking pointed out by Andrii.
> >> The next step is to add a lot more tests for backtracking.
> >>
> >
> > Tests would be great, verifier complexity is at the level, where it's
> > very easy to miss issues.
> >
> > Was fuzzying approach ever discussed for BPF verifier? I.e., have a
> > fuzzer to generate both legal and illegal random small programs. Then
> > re-implement verifier as user-level program with straightforward
> > recursive exhaustive verification (so no state pruning logic, no
> > precise/coarse, etc, just register/stack state tracking) of all
> > possible branches. If kernel verifier's verdict differs from
> > user-level verifier's verdict - flag that as a test case and figure
> > out why they differ. Obviously that would work well only for small
> > programs, but that should be a good first step already.
> >
> > In addition, if this is done, that user-land verifier can be a HUGE
> > help to BPF application developers, as libbpf would (potentially) be
> > able to generate better error messages using it as well.
>
> In theory that sounds good, but doesn't work in practice.
> The kernel verifier keeps changing faster than user space can catch up.
> It's also relying on loaded maps and all sorts of callbacks that
> check context, allowed helpers, maps, combinations of them from all
> over the kernel.
> The last effort to build kernel verifier as-is into .o and link
> with kmalloc/map wrappers in user space was here:
> https://github.com/iovisor/bpf-fuzzer
> It was fuzzing the verifier and was able to find few minor bugs.
> But it quickly bit rotted.
>
> Folks brought up in the past the idea to collect user space
> verifiers from different kernels, so that user space tooling can
> check whether particular program will load on a set of kernels
> without need to run them in VMs.
> Even if such feature existed today it won't really solve this production
> headache, since all kernels prior to today will not be covered.
>
> I think syzbot is still generating bpf programs. iirc it found
> one bug in the past in the verifier core.
> I think the only way to make verifier more robust is to keep
> adding new test cases manually.
> Most interesting bugs we found by humans.
>
> Another approach to 'better error message' that was considered
> in the past was to teach llvm to recognize things that verifier
> will reject and let llvm warn on them.
> But it's also not practical. We had llvm error on calls.
> Then we added them to the verifier and had to change llvm.
> If we had llvm error on loops, now we'd need to change it.
> imo it's better to let llvm handle everything.

That all makes sense. Thanks for elaborate explanation!
Paul Chaignon June 18, 2019, 2:05 p.m. UTC | #4
On Mon, Jun 17, 2019 at 06:57PM, Alexei Starovoitov wrote:
> On 6/17/19 9:39 AM, Andrii Nakryiko wrote:
> > On Sat, Jun 15, 2019 at 12:12 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >>
> >> v2->v3: fixed issues in backtracking pointed out by Andrii.
> >> The next step is to add a lot more tests for backtracking.
> >>
> >
> > Tests would be great, verifier complexity is at the level, where it's
> > very easy to miss issues.
> >
> > Was fuzzying approach ever discussed for BPF verifier? I.e., have a
> > fuzzer to generate both legal and illegal random small programs. Then
> > re-implement verifier as user-level program with straightforward
> > recursive exhaustive verification (so no state pruning logic, no
> > precise/coarse, etc, just register/stack state tracking) of all
> > possible branches. If kernel verifier's verdict differs from
> > user-level verifier's verdict - flag that as a test case and figure
> > out why they differ. Obviously that would work well only for small
> > programs, but that should be a good first step already.
> >
> > In addition, if this is done, that user-land verifier can be a HUGE
> > help to BPF application developers, as libbpf would (potentially) be
> > able to generate better error messages using it as well.
>
> In theory that sounds good, but doesn't work in practice.
> The kernel verifier keeps changing faster than user space can catch up.
> It's also relying on loaded maps and all sorts of callbacks that
> check context, allowed helpers, maps, combinations of them from all
> over the kernel.
> The last effort to build kernel verifier as-is into .o and link
> with kmalloc/map wrappers in user space was here:
> https://github.com/iovisor/bpf-fuzzer
> It was fuzzing the verifier and was able to find few minor bugs.
> But it quickly bit rotted.

We've been working (with Xiao Han, cc'ed) on an update of bpf-fuzzer.  It
helped us find a few issues in the verifier (mostly thanks to
warn_on("verifier bug")s).  I'll probably send a patchset to the main
repository in the next few weeks.

Updating the glue code (userspace wrappers) is okay so far, as long as
done regularly.

[...]

>
> I think syzbot is still generating bpf programs. iirc it found
> one bug in the past in the verifier core.

syzkaller's template description of BPF programs is very limited.  It
produces valid BPF instructions, but loaded programs make little sense,
are trivial or invalid, and don't go very far in the verifier.  We
probably can't rely on syzkaller to be effective for the verifier.

> I think the only way to make verifier more robust is to keep
> adding new test cases manually.
> Most interesting bugs we found by humans.

Tests and reviews are definitely the most effective ways to find bugs,
despite the above two fuzzers.  The verifier is complex enough that the
random approach of fuzzers has a hard time covering the code.

Paul