diff mbox series

[bpf-next,01/13] bpf: xor of a/x in cbpf can be done in 32 bit alu

Message ID 20180126223348.11250-2-daniel@iogearbox.net
State Accepted, archived
Delegated to: BPF Maintainers
Headers show
Series BPF improvements and fixes | expand

Commit Message

Daniel Borkmann Jan. 26, 2018, 10:33 p.m. UTC
Very minor optimization; saves 1 byte per program in x86_64
JIT in cBPF prologue.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
---
 net/core/filter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Naveen N. Rao Jan. 28, 2018, 6:58 p.m. UTC | #1
in 32 bit alu

Daniel Borkmann wrote:
> Very minor optimization; saves 1 byte per program in x86_64
> JIT in cBPF prologue.

... but increases program size by 4 bytes on ppc64 :(
In general, this is an area I've been wanting to spend some time on.  
Powerpc doesn't have 32-bit sub-registers, so we need to emit an 
additional instruction to clear the higher 32-bits for all 32-bit 
operations. I need to look at the performance impact.

- Naveen

> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  net/core/filter.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 18da42a..cba2f73 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -401,8 +401,8 @@ static int bpf_convert_filter(struct sock_filter *prog, int len,
>  		/* Classic BPF expects A and X to be reset first. These need
>  		 * to be guaranteed to be the first two instructions.
>  		 */
> -		*new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> -		*new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
> +		*new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
> +		*new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
>  
>  		/* All programs must keep CTX in callee saved BPF_REG_CTX.
>  		 * In eBPF case it's done by the compiler, here we need to
> -- 
> 2.9.5
> 
> 
>
Daniel Borkmann Jan. 28, 2018, 8:51 p.m. UTC | #2
On 01/28/2018 07:58 PM, Naveen N. Rao wrote:
> in 32 bit alu
> 
> Daniel Borkmann wrote:
>> Very minor optimization; saves 1 byte per program in x86_64
>> JIT in cBPF prologue.
> 
> ... but increases program size by 4 bytes on ppc64 :(
> In general, this is an area I've been wanting to spend some time on.  Powerpc doesn't have 32-bit sub-registers, so we need to emit an additional instruction to clear the higher 32-bits for all 32-bit operations. I need to look at the performance impact.

Right, I think one way to optimize this could be on JIT level in such
case when CPU doesn't have subregs. There is the bpf_prog_was_classic()
helper that can be used there in order to skip some of the bpf_alu32_trunc
goto cases e.g. for some of the bit ops as an example, since we know that
upper part in cBPF must be zero here anyway, this should definitely be a
low hanging fruit given we use alu32 in the cBPF to eBPF conversion in a
lot of places.

Thanks,
Daniel
diff mbox series

Patch

diff --git a/net/core/filter.c b/net/core/filter.c
index 18da42a..cba2f73 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -401,8 +401,8 @@  static int bpf_convert_filter(struct sock_filter *prog, int len,
 		/* Classic BPF expects A and X to be reset first. These need
 		 * to be guaranteed to be the first two instructions.
 		 */
-		*new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
-		*new_insn++ = BPF_ALU64_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
+		*new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_A, BPF_REG_A);
+		*new_insn++ = BPF_ALU32_REG(BPF_XOR, BPF_REG_X, BPF_REG_X);
 
 		/* All programs must keep CTX in callee saved BPF_REG_CTX.
 		 * In eBPF case it's done by the compiler, here we need to