Message ID | 20191002234512.25902-1-daniel@iogearbox.net |
---|---|
State | Accepted |
Delegated to: | BPF Maintainers |
Series | [bpf-next,1/2] bpf, x86: Small optimization in comparing against imm0 |
On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> zero. Saves 1 byte of instruction encoding per occurrence. The flag
> results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> cases except for AF which we don't use/care about. In terms of
> macro-fusibility in combination with a subsequent conditional jump
> instruction, both have the same properties for the jumps used in
> the JIT translation. For example, same JITed Cilium program can
> shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> used quite frequently.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Song Liu <songliubraving@fb.com>
Song Liu wrote:
> On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> > cases except for AF which we don't use/care about. In terms of
> > macro-fusibility in combination with a subsequent conditional jump
> > instruction, both have the same properties for the jumps used in
> > the JIT translation. For example, same JITed Cilium program can
> > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > used quite frequently.
> >
> > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>
> Acked-by: Song Liu <songliubraving@fb.com>

Bonus points for causing me to spend the morning remembering the
differences between cmp, and, or, and test.

Also wonder if at some point we should clean up the jit a bit and
add some defines/helpers for all the open coded opcodes and such.

Acked-by: John Fastabend <john.fastabend@gmail.com>
On Thu, Oct 3, 2019 at 2:08 PM John Fastabend <john.fastabend@gmail.com> wrote:
>
> Song Liu wrote:
> > On Wed, Oct 2, 2019 at 5:30 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >
> > > Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
> > > zero. Saves 1 byte of instruction encoding per occurrence. The flag
> > > results of test 'reg, reg' are identical to 'cmp reg, 0' in all
> > > cases except for AF which we don't use/care about. In terms of
> > > macro-fusibility in combination with a subsequent conditional jump
> > > instruction, both have the same properties for the jumps used in
> > > the JIT translation. For example, same JITed Cilium program can
> > > shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
> > > used quite frequently.
> > >
> > > Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >
> > Acked-by: Song Liu <songliubraving@fb.com>
>
> Bonus points for causing me to spend the morning remembering the
> differences between cmp, and, or, and test.
>
> Also wonder if at some point we should clean up the jit a bit and
> add some defines/helpers for all the open coded opcodes and such.
>
> Acked-by: John Fastabend <john.fastabend@gmail.com>

Applied both. Thanks
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 991549a1c5f3..3ad2ba1ad855 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -909,6 +909,16 @@ xadd: if (is_imm8(insn->off))
 		case BPF_JMP32 | BPF_JSLT | BPF_K:
 		case BPF_JMP32 | BPF_JSGE | BPF_K:
 		case BPF_JMP32 | BPF_JSLE | BPF_K:
+			/* test dst_reg, dst_reg to save one extra byte */
+			if (imm32 == 0) {
+				if (BPF_CLASS(insn->code) == BPF_JMP)
+					EMIT1(add_2mod(0x48, dst_reg, dst_reg));
+				else if (is_ereg(dst_reg))
+					EMIT1(add_2mod(0x40, dst_reg, dst_reg));
+				EMIT2(0x85, add_2reg(0xC0, dst_reg, dst_reg));
+				goto emit_cond_jmp;
+			}
+
 			/* cmp dst_reg, imm8/32 */
 			if (BPF_CLASS(insn->code) == BPF_JMP)
 				EMIT1(add_1mod(0x48, dst_reg));
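To make the one-byte saving concrete, here is a minimal standalone sketch of the two encodings. It is not the kernel code: `is_extended`, `encode_test_self` and `encode_cmp_zero` are hypothetical helpers, and they take raw x86-64 register numbers (0 = rax, 8 = r8) rather than BPF register indices, so the JIT's BPF-to-x86 register mapping is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: r8..r15 need extension bits in a REX prefix. */
static int is_extended(unsigned int reg)
{
	return reg >= 8;
}

/* 'test reg, reg' is [REX] 85 /r with the same register in both ModRM fields. */
static size_t encode_test_self(uint8_t *buf, unsigned int reg, int is64)
{
	size_t n = 0;
	uint8_t rex = 0x40;

	assert(reg < 16);
	if (is64)
		rex |= 0x08;		/* REX.W: 64-bit operand size */
	if (is_extended(reg))
		rex |= 0x05;		/* REX.R | REX.B: both ModRM fields extended */
	if (rex != 0x40)
		buf[n++] = rex;
	buf[n++] = 0x85;
	buf[n++] = 0xC0 | ((reg & 7) << 3) | (reg & 7);
	return n;
}

/* 'cmp reg, 0' with an 8-bit immediate is [REX] 83 /7 ib. */
static size_t encode_cmp_zero(uint8_t *buf, unsigned int reg, int is64)
{
	size_t n = 0;
	uint8_t rex = 0x40;

	assert(reg < 16);
	if (is64)
		rex |= 0x08;		/* REX.W */
	if (is_extended(reg))
		rex |= 0x01;		/* REX.B: only the r/m field holds a register */
	if (rex != 0x40)
		buf[n++] = rex;
	buf[n++] = 0x83;
	buf[n++] = 0xF8 | (reg & 7);	/* ModRM: mod=11, /7, r/m=reg */
	buf[n++] = 0x00;		/* the imm8 that 'test' does not need */
	return n;
}
```

For every register and operand size, the `cmp` form carries one trailing immediate byte that the `test` form omits, which is where the per-occurrence saving comes from.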
Replace 'cmp reg, 0' with 'test reg, reg' for comparisons against
zero. Saves 1 byte of instruction encoding per occurrence. The flag
results of test 'reg, reg' are identical to 'cmp reg, 0' in all
cases except for AF which we don't use/care about. In terms of
macro-fusibility in combination with a subsequent conditional jump
instruction, both have the same properties for the jumps used in
the JIT translation. For example, same JITed Cilium program can
shrink a bit from e.g. 12,455 to 12,317 bytes as tests with 0 are
used quite frequently.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 arch/x86/net/bpf_jit_comp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
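The flag-equivalence claim in the commit message can be checked with a small software model. The sketch below is an illustration only, assuming 64-bit operands; `struct flags`, `cmp_flags`, `test_flags` and the other names are hypothetical and not part of the kernel. AF is left out of the model, matching the commit message's note that the JIT does not use it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Status flags read by conditional jumps; AF is deliberately not modelled. */
struct flags {
	bool zf, sf, cf, of, pf;
};

/* PF is set when the low byte of the result has an even number of 1 bits. */
static bool parity_even(uint64_t v)
{
	unsigned int bits = 0;

	for (int i = 0; i < 8; i++)
		bits += (v >> i) & 1;
	return (bits & 1) == 0;
}

/* Model of 64-bit 'cmp a, b': flags of a - b, result discarded. */
static struct flags cmp_flags(uint64_t a, uint64_t b)
{
	uint64_t r = a - b;
	struct flags f = {
		.zf = r == 0,
		.sf = (r >> 63) != 0,
		.cf = a < b,					/* unsigned borrow */
		.of = (((a ^ b) & (a ^ r)) >> 63) != 0,		/* signed overflow */
		.pf = parity_even(r),
	};
	return f;
}

/* Model of 64-bit 'test a, b': flags of a & b, with CF and OF cleared. */
static struct flags test_flags(uint64_t a, uint64_t b)
{
	uint64_t r = a & b;
	struct flags f = {
		.zf = r == 0,
		.sf = (r >> 63) != 0,
		.cf = false,
		.of = false,
		.pf = parity_even(r),
	};
	return f;
}

static bool flags_equal(struct flags x, struct flags y)
{
	return x.zf == y.zf && x.sf == y.sf && x.cf == y.cf &&
	       x.of == y.of && x.pf == y.pf;
}

/* True if 'cmp v, 0' and 'test v, v' agree on every modelled flag. */
static bool zero_compare_matches(uint64_t v)
{
	return flags_equal(cmp_flags(v, 0), test_flags(v, v));
}

/* Spot-check a few boundary values. */
static void check_samples(void)
{
	static const uint64_t samples[] = {
		0, 1, 3, 0x80, 0x7fffffffffffffffULL,
		0x8000000000000000ULL, UINT64_MAX,
	};

	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(zero_compare_matches(samples[i]));
}
```

The agreement follows from both results being the register value itself: `v - 0` and `v & v` are both `v`, and subtracting zero can neither borrow nor overflow, so CF and OF are clear in both cases. The equivalence holds only for a comparison against zero, which is why the patch guards the new path with `imm32 == 0`.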