[net] x86: bpf_jit: fix compilation of large bpf programs

Message ID 1432334575-16959-1-git-send-email-ast@plumgrid.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Alexei Starovoitov May 22, 2015, 10:42 p.m. UTC
x86 has variable-length instruction encoding, and the x86 JIT compiler
tries to pick the shortest encoding for each bpf instruction.
Doing so changes the jump targets, so the JIT makes multiple
passes over the program. A typical program needs 3 passes.
Some very short programs converge in 2 passes. Large programs
may need 4 or 5. But specially crafted bpf programs may hit the
pass limit, and if the program converges on the last iteration,
the JIT compiler produces an image full of 'int 3' insns.
Fix this corner case by doing a final iteration over the bpf program.

Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
Daniel wrote the 'Edge hopping nuthouse' test case with 4k jump
instructions that managed to trigger this bug.
The test case is nuts and the bug is real.
It's an old bug, but I think it is worth backporting all the way.
This fix will apply cleanly only back to commit
f3c2af7ba17a ("net: filter: x86: split bpf_jit_compile()").
The older kernels should be similar: they have
'for (pass = 0; pass < 10; pass++) {' at line 153 or so,
and all have a similar problem as far as I can see.

 arch/x86/net/bpf_jit_comp.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
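
The corner case described in the commit message can be sketched with a toy
model. This is not the kernel code: toy_do_jit(), toy_compile(),
shrink_passes and with_fix are hypothetical names invented for illustration,
and the "image" is a plain malloc() buffer pre-filled with the 'int 3' opcode
(0xCC) as an assumed stand-in for the JIT's image allocation. The only parts
taken from the patch are the pass limit of 10 and the two loop conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PASSES 10   /* same limit as the loop in the patch */
#define INT3 0xCC       /* x86 'int 3' opcode, used here as filler */
#define NOP  0x90       /* stands in for real emitted code */

/* Toy stand-in for do_jit(): the length shrinks by one byte per pass for
 * shrink_passes passes, then stays at final_len. When an image buffer is
 * supplied, "code" is written into it. */
static int toy_do_jit(int pass, int shrink_passes, int final_len,
                      unsigned char *image)
{
    int extra = shrink_passes > pass ? shrink_passes - pass : 0;
    int len = final_len + extra;

    if (image)
        memset(image, NOP, (size_t)len);
    return len;
}

/* Runs the pass loop. Returns true only if an image was allocated and
 * real code was emitted into it; false if the image was left as filler
 * or the length never converged within the pass limit. */
static bool toy_compile(int shrink_passes, int final_len, bool with_fix)
{
    unsigned char *image = NULL;
    int oldproglen = 0;
    bool ok;

    assert(final_len > 0 && shrink_passes >= 0);

    /* with_fix selects 'pass < 10 || image' over plain 'pass < 10' */
    for (int pass = 0; pass < MAX_PASSES || (with_fix && image); pass++) {
        int proglen = toy_do_jit(pass, shrink_passes, final_len, image);

        if (image)
            break;              /* this pass emitted the final code */
        if (proglen == oldproglen) {
            /* length converged: allocate the image, filled with int 3 */
            image = malloc((size_t)proglen);
            if (!image)
                break;
            memset(image, INT3, (size_t)proglen);
        }
        oldproglen = proglen;
    }

    ok = image && image[0] != INT3;
    free(image);
    return ok;
}
```

Under this toy model, toy_compile(8, 100, false) returns false because
convergence is detected on the tenth pass and the loop exits before anything
is emitted, while toy_compile(8, 100, true) returns true because the extra
pass fills the image.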

Comments

Daniel Borkmann May 22, 2015, 10:46 p.m. UTC | #1
On 05/23/2015 12:42 AM, Alexei Starovoitov wrote:
> x86 has variable length encoding. x86 JIT compiler is trying
> to pick the shortest encoding for given bpf instruction.
> While doing so the jump targets are changing, so JIT is doing
> multiple passes over the program. Typical program needs 3 passes.
> Some very short programs converge with 2 passes. Large programs
> may need 4 or 5. But specially crafted bpf programs may hit the
> pass limit and if the program converges on the last iteration
> the JIT compiler will be producing an image full of 'int 3' insns.
> Fix this corner case by doing final iteration over bpf program.
>
> Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

LGTM, thanks!

Tested-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller May 25, 2015, 4:19 a.m. UTC | #2
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Fri, 22 May 2015 15:42:55 -0700

> x86 has variable length encoding. x86 JIT compiler is trying
> to pick the shortest encoding for given bpf instruction.
> While doing so the jump targets are changing, so JIT is doing
> multiple passes over the program. Typical program needs 3 passes.
> Some very short programs converge with 2 passes. Large programs
> may need 4 or 5. But specially crafted bpf programs may hit the
> pass limit and if the program converges on the last iteration
> the JIT compiler will be producing an image full of 'int 3' insns.
> Fix this corner case by doing final iteration over bpf program.
> 
> Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
> Reported-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied and queued up for -stable, thanks.
David Laight May 26, 2015, 1:40 p.m. UTC | #3
From: Alexei Starovoitov
> Sent: 22 May 2015 23:43
> x86 has variable length encoding. x86 JIT compiler is trying
> to pick the shortest encoding for given bpf instruction.
> While doing so the jump targets are changing, so JIT is doing
> multiple passes over the program. Typical program needs 3 passes.
> Some very short programs converge with 2 passes. Large programs
> may need 4 or 5. But specially crafted bpf programs may hit the
> pass limit and if the program converges on the last iteration
> the JIT compiler will be producing an image full of 'int 3' insns.
> Fix this corner case by doing final iteration over bpf program.

If the JIT compiler is only changing the encoding of the constants
in the x86 instructions (rather than changing the instructions themselves)
then there is likely to be an unmeasurable change in the execution time.
For instance I don't remember there being a difference in execution time
between long and short branches - the only difference is the amount of
cache they use.

	David
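
For reference, the size difference between the short and long branch forms
discussed above can be sketched as follows. The helper names fits_imm8(),
jump_len() and bytes_saved() are hypothetical and not taken from the patch;
the byte counts are the standard x86 encodings (2 bytes for a rel8 jump,
5 bytes for 'jmp rel32', 6 bytes for a conditional jump with rel32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the displacement fits in a signed 8-bit immediate. */
static bool fits_imm8(int32_t value)
{
    return value >= -128 && value <= 127;
}

/* Encoded size in bytes of a jump with the given displacement. */
static int jump_len(int32_t disp, bool conditional)
{
    if (fits_imm8(disp))
        return 2;               /* EB rel8, or 7x rel8 for jcc */
    return conditional ? 6 : 5; /* 0F 8x rel32, or E9 rel32 */
}

/* Bytes saved when a rel32 jump can be re-encoded as rel8. */
static int bytes_saved(bool conditional)
{
    int saved = jump_len(1000, conditional) - jump_len(0, conditional);

    assert(saved > 0);
    return saved;
}
```

Each jump that shrinks moves every later instruction, which can bring other
displacements within rel8 range; that is why the image length has to be
recomputed over several passes. Here bytes_saved(false) is 3 and
bytes_saved(true) is 4.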

Eric Dumazet May 26, 2015, 2:35 p.m. UTC | #4
On Tue, 2015-05-26 at 13:40 +0000, David Laight wrote:

> If the JIT compiler is only changing the encoding of the constants
> in the x86 instructions (rather than changing the instructions themselves)
> then there is likely to be an unmeasurable change in the execution time.
> For instance I don't remember there being a difference in execution time
> between long and short branches - the only difference is the amount of
> cache they use.

icache is precisely the matter here. In the end, it makes a difference.

You could check this interesting study Ingo did recently :

https://lkml.org/lkml/2015/5/19/1009


Eric Dumazet May 26, 2015, 3:29 p.m. UTC | #5
On Tue, 2015-05-26 at 15:13 +0000, David Laight wrote:

> Yes, interesting, a benchmark that manages to run a lot of code 'cold cache'.

We have binaries here at Google with 400 or 500 MBytes of text.

Not benchmark, super real workloads you know.


David Laight May 26, 2015, 3:47 p.m. UTC | #6
From: Eric Dumazet
> Sent: 26 May 2015 16:30
>
> > Yes, interesting, a benchmark that manages to run a lot of code 'cold cache'.
>
> We have binaries here at Google with 400 or 500 MBytes of text.
>
> Not benchmark, super real workloads you know.

Indeed, and a lot of the code is likely to be running 'cold cache'.

I was alluding to the problem where people will benchmark a small function
by running it 1000s of times in a tight loop with exactly the same data.
Not only is it 'hot cache' but any dynamic branch prediction is 'trained'
to the specific data.

	David

Patch

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 99f76103c6b7..ddeff4844a10 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -966,7 +966,12 @@  void bpf_int_jit_compile(struct bpf_prog *prog)
 	}
 	ctx.cleanup_addr = proglen;
 
-	for (pass = 0; pass < 10; pass++) {
+	/* JITed image shrinks with every pass and the loop iterates
+	 * until the image stops shrinking. Very large bpf programs
+	 * may converge on the last pass. In such case do one more
+	 * pass to emit the final image
+	 */
+	for (pass = 0; pass < 10 || image; pass++) {
 		proglen = do_jit(prog, addrs, image, oldproglen, &ctx);
 		if (proglen <= 0) {
 			image = NULL;