bpf debug info

Message ID	20161129064217.GA20124@ast-mbp.thefacebook.com
State	RFC, archived
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> Date: Mon, 28 Nov 2016 22:42:19 -0800 From: Alexei Starovoitov <alexei.starovoitov@gmail.com> To: netdev@vger.kernel.org Cc: Daniel Borkmann <daniel@iogearbox.net>, Brenden Blanco <bblanco@plumgrid.com>, Thomas Graf <tgraf@suug.ch>, Wangnan <wangnan0@huawei.com>, He Kuang <hekuang@huawei.com>, kernel-team@fb.com Subject: bpf debug info Message-ID: <20161129064217.GA20124@ast-mbp.thefacebook.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.24 (2015-08-30) Sender: netdev-owner@vger.kernel.org Precedence: bulk

Alexei Starovoitov Nov. 29, 2016, 6:42 a.m. UTC

Hi All,

The support for debug information in BPF was recently added to llvm.

In order to use it recompile bpf programs with the following patch
in samples/bpf/Makefile
@@ -155,4 +155,4 @@ $(obj)/%.o: $(src)/%.c
        $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
                -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
                -Wno-compare-distinct-pointer-types \
-               -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
+               -O2 -emit-llvm -g -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@

and compiled .o files can be consumed by standard llvm-objdump utility.

$ llvm-objdump -S -no-show-raw-insn samples/bpf/xdp1_kern.o
xdp1_kern.o:    file format ELF64-BPF

Disassembly of section xdp1:
xdp_prog1:
; {
       0:       r2 = *(u32 *)(r1 + 4)
; void *data = (void *)(long)ctx->data;
       8:       r1 = *(u32 *)(r1 + 0)
; if (data + nh_off > data_end)
      10:       r3 = r1
      18:       r3 += 14
      20:       if r3 > r2 goto 55
; h_proto = eth->h_proto;
      28:       r3 = *(u8 *)(r1 + 12)
      30:       r4 = *(u8 *)(r1 + 13)
      38:       r4 <<= 8
      40:       r4 |= r3
; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
      48:       if r4 == 43144 goto 2
      50:       r3 = 14
      58:       if r4 != 129 goto 5

LBB0_3:
; if (data + nh_off > data_end)
      60:       r3 = r1
      68:       r3 += 18
      70:       if r3 > r2 goto 45
      78:       r3 = 18
; h_proto = vhdr->h_vlan_encapsulated_proto;
      80:       r4 = *(u16 *)(r1 + 16)

LBB0_5:
      88:       r5 = r4
      90:       r5 &= 65535
; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
      98:       if r5 == 43144 goto 1
      a0:       if r5 != 129 goto 9

Notice that 'clang -S -o a.s' output and llvm-objdump disassembler
were changed to use kernel verifier style, so now it should be easier
to see what's going on.

The main advantage of debug info is that verifier error messages
are now easier to correlate to original C code.

For example, say, in samples/bpf/parse_varlen.c I forgot
to compare pointer into packet with data_end:
If I try to run samples/bpf/test_cls_bpf.sh the verifier will complain:
R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=pkt_end
112: (0f) r4 += r3
113: (0f) r1 += r4
114: (b7) r0 = 2
115: (69) r2 = *(u16 *)(r1 +2)
invalid access to packet, off=2 size=2, R1(id=3,off=0,r=0)

Now multiply 115 * 8 and convert to hex. This is address 0x398 in llvm-objdump:
; struct udphdr *udp = data + tp_off;
     388:       r1 += r4
     390:       r0 = 2
; if (udp->dest == htons(DEFAULT_PKTGEN_UDP_PORT) ||
     398:       r2 = *(u16 *)(r1 + 2)
     3a0:       if r2 == 2304 goto 16

Now it's clear which line of C code is causing the verifier to reject.

It's still not obvious why register R1 is 'invalid pointer to packet'.
The 'r=0' part of 'R1(id=3,off=0,r=0)' stands for zero bytes were
verified to be valid in this register.
Since 'if (udp + 1 > data_end)' was not done, the kernel doesn't
know that there are valid bytes in the packet after 'udp' pointer.

So next step is to improve verifier messages to be more human friendly.
The step after is to introduce BPF_COMMENT pseudo instruction
that will be ignored by the interpreter yet it will contain the text
of original source code. Then llvm-objdump step won't be necessary.
The bpf loader will load both instructions and pieces of C sources.
Then verifier errors should be even easier to read and humans
can easily understand the purpose of the program.

PS
A year ago He Kuang reported that dwarf emitted by bpf llvm backend is broken.
Sorry it took so long to fix. It's probably still broken on big endian,
since I've only tested on x86.

Daniel Borkmann Nov. 29, 2016, 3:11 p.m. UTC | #1

On 11/29/2016 07:42 AM, Alexei Starovoitov wrote:
[...]
> The support for debug information in BPF was recently added to llvm.
>
> In order to use it recompile bpf programs with the following patch
> in samples/bpf/Makefile
> @@ -155,4 +155,4 @@ $(obj)/%.o: $(src)/%.c
>          $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
>                  -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
>                  -Wno-compare-distinct-pointer-types \
> -               -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> +               -O2 -emit-llvm -g -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
>
> and compiled .o files can be consumed by standard llvm-objdump utility.
>
> $ llvm-objdump -S -no-show-raw-insn samples/bpf/xdp1_kern.o
> xdp1_kern.o:    file format ELF64-BPF
>
> Disassembly of section xdp1:
> xdp_prog1:
> ; {
>         0:       r2 = *(u32 *)(r1 + 4)
> ; void *data = (void *)(long)ctx->data;
>         8:       r1 = *(u32 *)(r1 + 0)
> ; if (data + nh_off > data_end)
>        10:       r3 = r1
>        18:       r3 += 14
>        20:       if r3 > r2 goto 55
> ; h_proto = eth->h_proto;
>        28:       r3 = *(u8 *)(r1 + 12)
>        30:       r4 = *(u8 *)(r1 + 13)
>        38:       r4 <<= 8
>        40:       r4 |= r3
> ; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
>        48:       if r4 == 43144 goto 2
>        50:       r3 = 14
>        58:       if r4 != 129 goto 5
>
> LBB0_3:
> ; if (data + nh_off > data_end)
>        60:       r3 = r1
>        68:       r3 += 18
>        70:       if r3 > r2 goto 45
>        78:       r3 = 18
> ; h_proto = vhdr->h_vlan_encapsulated_proto;
>        80:       r4 = *(u16 *)(r1 + 16)
>
> LBB0_5:
>        88:       r5 = r4
>        90:       r5 &= 65535
> ; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
>        98:       if r5 == 43144 goto 1
>        a0:       if r5 != 129 goto 9
>
> Notice that 'clang -S -o a.s' output and llvm-objdump disassembler
> were changed to use kernel verifier style, so now it should be easier
> to see what's going on.

Sounds really useful, is that scheduled for llvm 3.10 release?

That debugging info is stored in dwarf format into the obj, right?
Would be nice if also pahole could work on bpf object files.

> The main advantage of debug info is that verifier error messages
> are now easier to correlate to original C code.

Does that mean that the old -S output format is not available anymore?
Personally, I liked that one better tbh, was hoping someone would have
added asm parsing for it, so compilation from .S to .o also works.

> For example, say, in samples/bpf/parse_varlen.c I forgot
> to compare pointer into packet with data_end:
> --- a/samples/bpf/parse_varlen.c
> +++ b/samples/bpf/parse_varlen.c
> @@ -33,8 +33,8 @@ static int udp(void *data, uint64_t tp_off, void *data_end)
>   {
>          struct udphdr *udp = data + tp_off;
>
> -       if (udp + 1 > data_end)
> -               return 0;
> +//     if (udp + 1 > data_end)
> +//             return 0;
>          if (udp->dest == htons(DEFAULT_PKTGEN_UDP_PORT) ||
>              udp->source == htons(DEFAULT_PKTGEN_UDP_PORT)) {
>
> If I try to run samples/bpf/test_cls_bpf.sh the verifier will complain:
> R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=pkt_end
> 112: (0f) r4 += r3
> 113: (0f) r1 += r4
> 114: (b7) r0 = 2
> 115: (69) r2 = *(u16 *)(r1 +2)
> invalid access to packet, off=2 size=2, R1(id=3,off=0,r=0)
>
> Now multiply 115 * 8 and convert to hex. This is address 0x398 in llvm-objdump:
> ; struct udphdr *udp = data + tp_off;
>       388:       r1 += r4
>       390:       r0 = 2
> ; if (udp->dest == htons(DEFAULT_PKTGEN_UDP_PORT) ||
>       398:       r2 = *(u16 *)(r1 + 2)
>       3a0:       if r2 == 2304 goto 16
>
> Now it's clear which line of C code is causing the verifier to reject.
[...]

Could llvm-objdump switch line numbering for bpf same way as verifier
output, so mapping step is not really needed? Alternatively, on each
verifier error, there could be a hint with cmd to use regarding llvm-objdump.

> So next step is to improve verifier messages to be more human friendly.
> The step after is to introduce BPF_COMMENT pseudo instruction
> that will be ignored by the interpreter yet it will contain the text
> of original source code. Then llvm-objdump step won't be necessary.
> The bpf loader will load both instructions and pieces of C sources.
> Then verifier errors should be even easier to read and humans
> can easily understand the purpose of the program.

So the BPF_COMMENT pseudo insn will get stripped away from the insn array
after verification step, so we don't need to hold/account for this mem? I
assume in it's ->imm member it will just hold offset into text blob?

Given that the generated verifier log can already become huge nowadays up
to a point where less than 4k insns prog fails to load due to reaching max
kernel allowed verifier buffer size, is the plan to only dump C code for
last few lines or for everything?

> PS
> A year ago He Kuang reported that dwarf emitted by bpf llvm backend is broken.
> Sorry it took so long to fix. It's probably still broken on big endian,
> since I've only tested on x86.
>

Jakub Kicinski Nov. 29, 2016, 3:38 p.m. UTC | #2

On Tue, 29 Nov 2016 16:11:32 +0100, Daniel Borkmann wrote:
> On 11/29/2016 07:42 AM, Alexei Starovoitov wrote:
> [...]
> > The support for debug information in BPF was recently added to llvm.
> >
> > In order to use it recompile bpf programs with the following patch
> > in samples/bpf/Makefile
> > @@ -155,4 +155,4 @@ $(obj)/%.o: $(src)/%.c
> >          $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> >                  -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> >                  -Wno-compare-distinct-pointer-types \
> > -               -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> > +               -O2 -emit-llvm -g -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> >
> > and compiled .o files can be consumed by standard llvm-objdump utility.
> >
> > $ llvm-objdump -S -no-show-raw-insn samples/bpf/xdp1_kern.o
> > xdp1_kern.o:    file format ELF64-BPF
> >
> > Disassembly of section xdp1:
> > xdp_prog1:
> > ; {
> >         0:       r2 = *(u32 *)(r1 + 4)
> > ; void *data = (void *)(long)ctx->data;
> >         8:       r1 = *(u32 *)(r1 + 0)
> > ; if (data + nh_off > data_end)
> >        10:       r3 = r1
> >        18:       r3 += 14
> >        20:       if r3 > r2 goto 55
> > ; h_proto = eth->h_proto;
> >        28:       r3 = *(u8 *)(r1 + 12)
> >        30:       r4 = *(u8 *)(r1 + 13)
> >        38:       r4 <<= 8
> >        40:       r4 |= r3
> > ; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> >        48:       if r4 == 43144 goto 2
> >        50:       r3 = 14
> >        58:       if r4 != 129 goto 5
> >
> > LBB0_3:
> > ; if (data + nh_off > data_end)
> >        60:       r3 = r1
> >        68:       r3 += 18
> >        70:       if r3 > r2 goto 45
> >        78:       r3 = 18
> > ; h_proto = vhdr->h_vlan_encapsulated_proto;
> >        80:       r4 = *(u16 *)(r1 + 16)
> >
> > LBB0_5:
> >        88:       r5 = r4
> >        90:       r5 &= 65535
> > ; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> >        98:       if r5 == 43144 goto 1
> >        a0:       if r5 != 129 goto 9
> >
> > Notice that 'clang -S -o a.s' output and llvm-objdump disassembler
> > were changed to use kernel verifier style, so now it should be easier
> > to see what's going on.  

This sounds super useful!  Thanks a lot!

>>> [...]
> > So next step is to improve verifier messages to be more human friendly.
> > The step after is to introduce BPF_COMMENT pseudo instruction
> > that will be ignored by the interpreter yet it will contain the text
> > of original source code. Then llvm-objdump step won't be necessary.
> > The bpf loader will load both instructions and pieces of C sources.
> > Then verifier errors should be even easier to read and humans
> > can easily understand the purpose of the program.  
> 
> So the BPF_COMMENT pseudo insn will get stripped away from the insn array
> after verification step, so we don't need to hold/account for this mem? I
> assume in it's ->imm member it will just hold offset into text blob?

Associating any form of opaque data with programs always makes me
worried about opening a side channel of communication with a specialized
user space implementations/compilers.  But I guess if the BPF_COMMENTs
are stripped in the verifier as Daniel assumes drivers and JITs will
never see it.

Just to clarify, however - is there any reason why pushing the source
code into the kernel is necessary?  Or is it just for convenience?
Provided the user space loader has access to the debug info it should
have no problems matching the verifier output to code lines?

Alexei Starovoitov Nov. 29, 2016, 5:01 p.m. UTC | #3

On Tue, Nov 29, 2016 at 04:11:32PM +0100, Daniel Borkmann wrote:
> On 11/29/2016 07:42 AM, Alexei Starovoitov wrote:
> [...]
> >The support for debug information in BPF was recently added to llvm.
> >
> >In order to use it recompile bpf programs with the following patch
> >in samples/bpf/Makefile
> >@@ -155,4 +155,4 @@ $(obj)/%.o: $(src)/%.c
> >         $(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) \
> >                 -D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
> >                 -Wno-compare-distinct-pointer-types \
> >-               -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> >+               -O2 -emit-llvm -g -c $< -o -| $(LLC) -march=bpf -filetype=obj -o $@
> >
> >and compiled .o files can be consumed by standard llvm-objdump utility.
> >
> >$ llvm-objdump -S -no-show-raw-insn samples/bpf/xdp1_kern.o
> >xdp1_kern.o:    file format ELF64-BPF
> >
> >Disassembly of section xdp1:
> >xdp_prog1:
> >; {
> >        0:       r2 = *(u32 *)(r1 + 4)
> >; void *data = (void *)(long)ctx->data;
> >        8:       r1 = *(u32 *)(r1 + 0)
> >; if (data + nh_off > data_end)
> >       10:       r3 = r1
> >       18:       r3 += 14
> >       20:       if r3 > r2 goto 55
> >; h_proto = eth->h_proto;
> >       28:       r3 = *(u8 *)(r1 + 12)
> >       30:       r4 = *(u8 *)(r1 + 13)
> >       38:       r4 <<= 8
> >       40:       r4 |= r3
> >; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> >       48:       if r4 == 43144 goto 2
> >       50:       r3 = 14
> >       58:       if r4 != 129 goto 5
> >
> >LBB0_3:
> >; if (data + nh_off > data_end)
> >       60:       r3 = r1
> >       68:       r3 += 18
> >       70:       if r3 > r2 goto 45
> >       78:       r3 = 18
> >; h_proto = vhdr->h_vlan_encapsulated_proto;
> >       80:       r4 = *(u16 *)(r1 + 16)
> >
> >LBB0_5:
> >       88:       r5 = r4
> >       90:       r5 &= 65535
> >; if (h_proto == htons(ETH_P_8021Q) || h_proto == htons(ETH_P_8021AD)) {
> >       98:       if r5 == 43144 goto 1
> >       a0:       if r5 != 129 goto 9
> >
> >Notice that 'clang -S -o a.s' output and llvm-objdump disassembler
> >were changed to use kernel verifier style, so now it should be easier
> >to see what's going on.
> 
> Sounds really useful, is that scheduled for llvm 3.10 release?

llvm 4.0 :)

> That debugging info is stored in dwarf format into the obj, right?

right.

> Would be nice if also pahole could work on bpf object files.

yeah. pahole need to be taught to recognize bpf e_machine type and relocations.

> >The main advantage of debug info is that verifier error messages
> >are now easier to correlate to original C code.
> 
> Does that mean that the old -S output format is not available anymore?
> Personally, I liked that one better tbh, was hoping someone would have
> added asm parsing for it, so compilation from .S to .o also works.

we can still do a proper assembler form .s into .o
llvm infra is flexible enough. I can point somebody on what places
inside llvm bpf backend to extend to add full parsing of .s

> >If I try to run samples/bpf/test_cls_bpf.sh the verifier will complain:
> >R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=pkt_end
> >112: (0f) r4 += r3
> >113: (0f) r1 += r4
> >114: (b7) r0 = 2
> >115: (69) r2 = *(u16 *)(r1 +2)
> >invalid access to packet, off=2 size=2, R1(id=3,off=0,r=0)
> >
> >Now multiply 115 * 8 and convert to hex. This is address 0x398 in llvm-objdump:
> >; struct udphdr *udp = data + tp_off;
> >      388:       r1 += r4
> >      390:       r0 = 2
> >; if (udp->dest == htons(DEFAULT_PKTGEN_UDP_PORT) ||
> >      398:       r2 = *(u16 *)(r1 + 2)
> >      3a0:       if r2 == 2304 goto 16
> >
> >Now it's clear which line of C code is causing the verifier to reject.
> [...]
> 
> Could llvm-objdump switch line numbering for bpf same way as verifier
> output, so mapping step is not really needed?

you mean that llvm-objdump to print 113,114,115 ?
I guess it's doable. Will give it a try.

> So the BPF_COMMENT pseudo insn will get stripped away from the insn array
> after verification step, so we don't need to hold/account for this mem?

that was the idea, but if we keep src in there, we might want to keep
it for some future 'dump program' debugging step?

> I assume in it's ->imm member it will just hold offset into text blob?

yes. 

> Given that the generated verifier log can already become huge nowadays up
> to a point where less than 4k insns prog fails to load due to reaching max
> kernel allowed verifier buffer size, is the plan to only dump C code for
> last few lines or for everything?

right. that's a good point.

Alexei Starovoitov Nov. 29, 2016, 6:51 p.m. UTC | #4

On Tue, Nov 29, 2016 at 03:38:18PM +0000, Jakub Kicinski wrote:
> 
> >>> [...]
> > > So next step is to improve verifier messages to be more human friendly.
> > > The step after is to introduce BPF_COMMENT pseudo instruction
> > > that will be ignored by the interpreter yet it will contain the text
> > > of original source code. Then llvm-objdump step won't be necessary.
> > > The bpf loader will load both instructions and pieces of C sources.
> > > Then verifier errors should be even easier to read and humans
> > > can easily understand the purpose of the program.  
> > 
> > So the BPF_COMMENT pseudo insn will get stripped away from the insn array
> > after verification step, so we don't need to hold/account for this mem? I
> > assume in it's ->imm member it will just hold offset into text blob?
> 
> Associating any form of opaque data with programs always makes me
> worried about opening a side channel of communication with a specialized
> user space implementations/compilers.  But I guess if the BPF_COMMENTs
> are stripped in the verifier as Daniel assumes drivers and JITs will
> never see it.

yes. the idea that it's a comment. It can contain any text,
not only C code, but any other language.
It's definitely going to be stripped before JITs and kernel will
not make any safety or translation decisions based on such comment.

> Just to clarify, however - is there any reason why pushing the source
> code into the kernel is necessary?  Or is it just for convenience?
> Provided the user space loader has access to the debug info it should
> have no problems matching the verifier output to code lines?

correct. just for convenience. The user space has to keep .o around,
since it can crash, would have to reload and so on.
Only for some script that ssh-es into servers and wants to see
what is being loaded, it might help to dump full asm and these comments
along with prog_digest that Daniel is working on in parallel.
Alternatively instead of doing BPF_COMMENT we can load the whole .o
as-is into bpffs as a blob. Later (based on digest) the kernel can
dump such .o back for user space to run objdump on. It all can be
done without kernel involvement. Like tc command can copy .o and so on.
But not everything is using tc.
Another alternative is to do a decompiler from bpf binary
into meaningful C code. It's not trivial and names will be lost.
bpf_comment approach is pretty cheap from kernel point of view
and greatly helps visibility when users don't cheat with debug info.

Daniel Borkmann Nov. 29, 2016, 8:23 p.m. UTC | #5

On 11/29/2016 07:51 PM, Alexei Starovoitov wrote:
> On Tue, Nov 29, 2016 at 03:38:18PM +0000, Jakub Kicinski wrote:
[...]
>>>> So next step is to improve verifier messages to be more human friendly.
>>>> The step after is to introduce BPF_COMMENT pseudo instruction
>>>> that will be ignored by the interpreter yet it will contain the text
>>>> of original source code. Then llvm-objdump step won't be necessary.
>>>> The bpf loader will load both instructions and pieces of C sources.
>>>> Then verifier errors should be even easier to read and humans
>>>> can easily understand the purpose of the program.
>>>
>>> So the BPF_COMMENT pseudo insn will get stripped away from the insn array
>>> after verification step, so we don't need to hold/account for this mem? I
>>> assume in it's ->imm member it will just hold offset into text blob?
>>
>> Associating any form of opaque data with programs always makes me
>> worried about opening a side channel of communication with a specialized
>> user space implementations/compilers.  But I guess if the BPF_COMMENTs
>> are stripped in the verifier as Daniel assumes drivers and JITs will
>> never see it.
>
> yes. the idea that it's a comment. It can contain any text,
> not only C code, but any other language.
> It's definitely going to be stripped before JITs and kernel will
> not make any safety or translation decisions based on such comment.
>
>> Just to clarify, however - is there any reason why pushing the source
>> code into the kernel is necessary?  Or is it just for convenience?
>> Provided the user space loader has access to the debug info it should
>> have no problems matching the verifier output to code lines?
>
> correct. just for convenience. The user space has to keep .o around,
> since it can crash, would have to reload and so on.
> Only for some script that ssh-es into servers and wants to see
> what is being loaded, it might help to dump full asm and these comments
> along with prog_digest that Daniel is working on in parallel.

Which would mean we'd need to keep it around somewhere (prog aux data?)
in post-verification time (so potentially drivers/JITs could see it, too,
just not inside insn stream). Some API glue code could probably blind
this information for the JITing time to stop incentive of playing side
channel games (e.g. core code could encrypt the pointer value and only
core kernel knows how to access that data, no modules, no out-of-tree
code). The other thing I'm wondering is, when we strip this info anyway
from the insn stream to keep it in aux data (so it can later be reconstructed
on a dump), then perhaps that is best done before prog loading time? It
would then allow to keep complexity with stripping that insns out of the
verifier. If semantics are that these comments are acting as a hole/gap
(in a similar sense of what we have with cBPF today), then it can never
become a jmp target and loaders could strip it out already (instead of
teaching DFS, etc about it), and prepare a meta data structure in bpf_attr
for bpf(2), and verifier works based on that one. What makes this problematic
however is when you have rewrites in the kernel (ctx access, constant
blinding, etc), but perhaps they could just adjust the offsets from that
meta data thing as well?

> Alternatively instead of doing BPF_COMMENT we can load the whole .o
> as-is into bpffs as a blob. Later (based on digest) the kernel can
> dump such .o back for user space to run objdump on. It all can be
> done without kernel involvement. Like tc command can copy .o and so on.
> But not everything is using tc.

That means kernel must ensure/verify that loaded insns also come from
that claimed object file; not sure if easily possible w/o parsing elf.
It could work if the kernel loads everything based on the content of
the object file itself, but that is not possible anymore since we have
bpf(2) already for doing that (but also would add a lot of complexity).

> Another alternative is to do a decompiler from bpf binary
> into meaningful C code. It's not trivial and names will be lost.
> bpf_comment approach is pretty cheap from kernel point of view
> and greatly helps visibility when users don't cheat with debug info.

Agree, it's non-trivial, would be really nice to have, though, so also
non-cooperative -g users could be better examined.

Thanks,
Daniel

Alexei Starovoitov Nov. 30, 2016, 12:28 a.m. UTC | #6

On Tue, Nov 29, 2016 at 09:23:10PM +0100, Daniel Borkmann wrote:
> On 11/29/2016 07:51 PM, Alexei Starovoitov wrote:
> >On Tue, Nov 29, 2016 at 03:38:18PM +0000, Jakub Kicinski wrote:
> [...]
> >>>>So next step is to improve verifier messages to be more human friendly.
> >>>>The step after is to introduce BPF_COMMENT pseudo instruction
> >>>>that will be ignored by the interpreter yet it will contain the text
> >>>>of original source code. Then llvm-objdump step won't be necessary.
> >>>>The bpf loader will load both instructions and pieces of C sources.
> >>>>Then verifier errors should be even easier to read and humans
> >>>>can easily understand the purpose of the program.
> >>>
> >>>So the BPF_COMMENT pseudo insn will get stripped away from the insn array
> >>>after verification step, so we don't need to hold/account for this mem? I
> >>>assume in it's ->imm member it will just hold offset into text blob?
> >>
> >>Associating any form of opaque data with programs always makes me
> >>worried about opening a side channel of communication with a specialized
> >>user space implementations/compilers.  But I guess if the BPF_COMMENTs
> >>are stripped in the verifier as Daniel assumes drivers and JITs will
> >>never see it.
> >
> >yes. the idea that it's a comment. It can contain any text,
> >not only C code, but any other language.
> >It's definitely going to be stripped before JITs and kernel will
> >not make any safety or translation decisions based on such comment.
> >
> >>Just to clarify, however - is there any reason why pushing the source
> >>code into the kernel is necessary?  Or is it just for convenience?
> >>Provided the user space loader has access to the debug info it should
> >>have no problems matching the verifier output to code lines?
> >
> >correct. just for convenience. The user space has to keep .o around,
> >since it can crash, would have to reload and so on.
> >Only for some script that ssh-es into servers and wants to see
> >what is being loaded, it might help to dump full asm and these comments
> >along with prog_digest that Daniel is working on in parallel.
> 
> Which would mean we'd need to keep it around somewhere (prog aux data?)
> in post-verification time (so potentially drivers/JITs could see it, too,
> just not inside insn stream). Some API glue code could probably blind
> this information for the JITing time to stop incentive of playing side
> channel games (e.g. core code could encrypt the pointer value and only
> core kernel knows how to access that data, no modules, no out-of-tree
> code). The other thing I'm wondering is, when we strip this info anyway
> from the insn stream to keep it in aux data (so it can later be reconstructed
> on a dump), then perhaps that is best done before prog loading time? It
> would then allow to keep complexity with stripping that insns out of the
> verifier. If semantics are that these comments are acting as a hole/gap
> (in a similar sense of what we have with cBPF today), then it can never
> become a jmp target and loaders could strip it out already (instead of
> teaching DFS, etc about it), and prepare a meta data structure in bpf_attr
> for bpf(2), and verifier works based on that one. What makes this problematic
> however is when you have rewrites in the kernel (ctx access, constant
> blinding, etc), but perhaps they could just adjust the offsets from that
> meta data thing as well?

yes. all correct.
if we keep comment==nop instructions as part of the program, it's not great,
since we'll be wasting performance for no strong reason.
If we remove them from instruction stream after the verifier then they
can only be useful in verifier messages and that's not much better
than existing 'llvm-objdump -S file.o' approach.
Hmm.

> >Alternatively instead of doing BPF_COMMENT we can load the whole .o
> >as-is into bpffs as a blob. Later (based on digest) the kernel can
> >dump such .o back for user space to run objdump on. It all can be
> >done without kernel involvement. Like tc command can copy .o and so on.
> >But not everything is using tc.
> 
> That means kernel must ensure/verify that loaded insns also come from
> that claimed object file; not sure if easily possible w/o parsing elf.
> It could work if the kernel loads everything based on the content of
> the object file itself,

it will only check that program section of file contain valid insns.
the user may still cheat with junk in dwarf section of such elf.
Sounds more and more that this should really be solved by user space
and correlation to be done via prog_digest.

Arnaldo Carvalho de Melo Dec. 6, 2016, 3:12 p.m. UTC | #7

Em Tue, Nov 29, 2016 at 09:01:17AM -0800, Alexei Starovoitov escreveu:
> On Tue, Nov 29, 2016 at 04:11:32PM +0100, Daniel Borkmann wrote:
> > On 11/29/2016 07:42 AM, Alexei Starovoitov wrote:
> > >Notice that 'clang -S -o a.s' output and llvm-objdump disassembler
> > >were changed to use kernel verifier style, so now it should be easier
> > >to see what's going on.

> > Sounds really useful, is that scheduled for llvm 3.10 release?

> llvm 4.0 :)

> > That debugging info is stored in dwarf format into the obj, right?

> right.

> > Would be nice if also pahole could work on bpf object files.
> 
> yeah. pahole need to be taught to recognize bpf e_machine type and relocations.

Coincidentally I'm testing some perf patches wrt builtin BPF and for
that I'm updating to llvm to 4.0, will take a look at what is involved
in that...

- Arnaldo

Commit Message

Comments

Patch