From patchwork Sat Jan 26 17:25:58 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiong Wang <jiong.wang@netronome.com>
X-Patchwork-Id: 1031491
From: Jiong Wang <jiong.wang@netronome.com>
To: ast@kernel.org, daniel@iogearbox.net
Cc: netdev@vger.kernel.org, oss-drivers@netronome.com,
	Jiong Wang <jiong.wang@netronome.com>,
	"David S . Miller" <davem@davemloft.net>,
	Paul Burton <paul.burton@mips.com>, Wang YanQing <udknight@gmail.com>,
	Zi Shen Lim <zlim.lnx@gmail.com>,
	Shubham Bansal <illusionist.neo@gmail.com>,
	"Naveen N . Rao" <naveen.n.rao@linux.ibm.com>,
	Sandipan Das <sandipan@linux.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>
Subject: [PATCH bpf-next v4 00/16] bpf: propose new jmp32 instructions
Date: Sat, 26 Jan 2019 12:25:58 -0500
Message-Id: <1548523574-18316-1-git-send-email-jiong.wang@netronome.com>
X-Mailer: git-send-email 2.7.4
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org

v3 -> v4:
 - Fixed rebase issue. JMP32 checks were missing in two new functions:
    + kernel/bpf/verifier.c:insn_is_cond_jump 
    + drivers/net/ethernet/netronome/nfp/bpf/main.h:is_mbpf_cond_jump
   (Daniel)
 - Further rebased on top of latest llvm-readelf change.

v2 -> v3:
 - Added missed check on JMP32 inside bpf_jit_build_body. (Sandipan)
 - Wrapped ?: expressions in the s390 port in parentheses. They are used by
   macros which don't guard their operands with parentheses.
 - Fixed the ',' issues in the test_verifier change.
 - Reordered the two selftests patches to be near each other.
 - Rebased on top of latest bpf-next.

v1 -> v2:
 - Updated encoding. Use reserved insn class 0x6 instead of packing with
   existing BPF_JMP. (Alexei)
 - Updated code comments in s390 port. (Martin)
 - Separate JIT function for jeq32_imm in NFP port. (Jakub)
 - Re-implemented auto-testing support. (Jakub)
 - Moved testcases to test_verifier.c, plus more unit tests. (Jakub)
 - Fixed JEQ/JNE range deduction. (Jakub)
 - Also supported JSET in this patch set.
 - Fixed/Improved range deduction for all the other operations. All C
   programs under bpf selftest now pass verification.
 - Improved min/max code implementation.
 - Fixed bpftool/disassembler.

The current eBPF ISA has 32-bit sub-registers and defines a set of ALU32
instructions.

However, there are no JMP32 instructions, so code-gen for 32-bit
sub-registers is inefficient. For example, a signed comparison needs an
explicit sign-extension from 32-bit to 64-bit first.
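
The sign-extension cost can be illustrated with a small model in plain C.
This is only a sketch and is not code from this set: the function names
are made up for the example, and a uint64_t stands in for an eBPF register
whose low half holds the 32-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Signed "less than" on a 32-bit sub-register when only 64-bit jumps exist:
 * the low 32 bits are first sign-extended to 64 bits, then compared.
 * (The narrowing casts assume the usual two's-complement wrap-around.)
 */
static bool slt_with_jmp64(uint64_t reg, int32_t imm)
{
	int64_t extended = (int64_t)(int32_t)(uint32_t)reg;

	return extended < (int64_t)imm;
}

/* The same test with a 32-bit jump: only the low 32 bits are compared. */
static bool slt_with_jmp32(uint64_t reg, int32_t imm)
{
	return (int32_t)(uint32_t)reg < imm;
}

/* A 64-bit jump without the sign-extension step gives the wrong answer. */
static bool slt_with_jmp64_unextended(uint64_t reg, int32_t imm)
{
	return (int64_t)reg < (int64_t)imm;
}

/* Hypothetical usage: call this from main() to check the three variants. */
static void demo_signed_compare(void)
{
	/* Low half is -1 as a 32-bit value; the upper half is zero. */
	uint64_t reg = 0x00000000ffffffffULL;

	assert(slt_with_jmp64(reg, 0));
	assert(slt_with_jmp32(reg, 0));
	assert(!slt_with_jmp64_unextended(reg, 0));
}
```

In this model the 32-bit jump reaches the right answer without the extra
extension step that the 64-bit jump depends on.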

Adding JMP32 instructions would therefore complete eBPF ISA support for
32-bit sub-registers. They also match the 32-bit jump instructions found in
most JIT backends, for example x86-64 and AArch64, so the new eBPF JMP32
instructions can map one-to-one onto them.

A few ALU32-related verifier bugs have been fixed recently, and the JMP32
support introduced by this set further improves the BPF sub-register
ecosystem. Once this has landed, BPF programs using the 32-bit sub-register
ISA should get reasonably good support from the verifier and JIT compilers.
Users can then compare the runtime efficiency of a BPF program under both
modes and use whichever one benchmarks better.

Benchmark results on some Cilium BPF programs show that, on 64-bit arches,
once JMP32 is introduced, programs compiled with -mattr=+alu32 (which
enables sub-register usage) are smaller in code size and generally lower
in verifier processed instruction count.

Benchmark results
===
Text size in bytes (generated by "size")
---
LLVM code-gen option   default  alu32  alu32/jmp32  change Vs.  change Vs.
                                                    alu32       default
bpf_lb-DLB_L3.o:       6456     6280   6160         -1.91%      -4.58%
bpf_lb-DLB_L4.o:       7848     7664   7136         -6.89%      -9.07%
bpf_lb-DUNKNOWN.o:     2680     2664   2568         -3.60%      -4.18%
bpf_lxc.o:             104824   104744 97360        -7.05%      -7.12%
bpf_netdev.o:          23456    23576  21632        -8.25%      -7.78%
bpf_overlay.o:         16184    16304  14648        -10.16%     -9.49%
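
For reference, the "change Vs." columns are plain relative changes against
the named baseline. A minimal sketch of that arithmetic follows; the
helper names are made up for the example, and the first row of the table
above is used as input.

```c
#include <assert.h>

/* Relative change of "value" against "baseline", in percent. */
static double percent_change(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}

/* Hypothetical usage: call this from main() to check the first row. */
static void demo_percent_change(void)
{
	/* bpf_lb-DLB_L3.o: default 6456, alu32 6280, alu32/jmp32 6160 */
	double vs_alu32 = percent_change(6280.0, 6160.0);
	double vs_default = percent_change(6456.0, 6160.0);

	/* These round to the -1.91% and -4.58% shown in the table. */
	assert(vs_alu32 > -1.915 && vs_alu32 < -1.905);
	assert(vs_default > -4.585 && vs_default < -4.575);
}
```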

Processed instruction number
---
LLVM code-gen option   default  alu32  alu32/jmp32  change Vs.  change Vs.
                                                    alu32       default
bpf_lb-DLB_L3.o:       1579     1281   1295         +1.09%      -17.99%
bpf_lb-DLB_L4.o:       2045     1663   1556         -6.43%      -23.91%
bpf_lb-DUNKNOWN.o:     606      513    501          -2.34%      -17.33%
bpf_lxc.o:             85381    103218 94435        -8.51%      +10.60%
bpf_netdev.o:          5246     5809   5200         -10.48%     -0.08%
bpf_overlay.o:         2443     2705   2456         -9.02%      -0.53%

The situation is even better for 32-bit arches such as x32, arm32 and nfp,
as some conditional jumps now become JMP32, which doesn't require code-gen
for the comparison of the high 32 bits.
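
A rough model of why this helps on a 32-bit host, where a 64-bit eBPF
register occupies two machine words, is sketched below. It is not JIT code
from this set; struct reg_pair and the function names are assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 64-bit eBPF register as a 32-bit host sees it: two machine words. */
struct reg_pair {
	uint32_t lo;
	uint32_t hi;
};

/* 64-bit unsigned "greater than": the high words decide unless equal. */
static bool jgt64_on_32bit_host(struct reg_pair a, struct reg_pair b)
{
	if (a.hi != b.hi)
		return a.hi > b.hi;
	return a.lo > b.lo;
}

/* 32-bit unsigned "greater than": only the low words are looked at. */
static bool jgt32_on_32bit_host(struct reg_pair a, struct reg_pair b)
{
	return a.lo > b.lo;
}

/* Hypothetical usage: call this from main() to compare both forms. */
static void demo_32bit_host(void)
{
	struct reg_pair a = { .lo = 1, .hi = 2 };
	struct reg_pair b = { .lo = 5, .hi = 0 };

	assert(jgt64_on_32bit_host(a, b));	/* 0x200000001 > 0x5 */
	assert(!jgt32_on_32bit_host(a, b));	/* 1 > 5 is false */
}
```

The 64-bit form needs a compare on each word, while the 32-bit form needs
a single compare on the low word.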

Encoding
===
The new JMP32 instructions use a new BPF_JMP32 class, which takes the
reserved eBPF class number 0x6. BPF_JA/CALL/EXIT exist only for BPF_JMP;
they are reserved opcodes for BPF_JMP32.
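
As a sketch of how the opcode byte is laid out under this encoding, the
following defines local copies of the opcode field macros (values as in
the uapi headers, with BPF_JMP32 being the class allocated by this set) so
that it builds without kernel headers. is_defined_jmp32() is a
hypothetical helper written for the example; it is not a function from the
verifier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the opcode fields, for illustration only. */
#define BPF_CLASS(code)	((code) & 0x07)
#define BPF_OP(code)	((code) & 0xf0)
#define BPF_JMP		0x05
#define BPF_JMP32	0x06
#define BPF_K		0x00
#define BPF_X		0x08
#define BPF_JA		0x00
#define BPF_JEQ		0x10
#define BPF_CALL	0x80
#define BPF_EXIT	0x90
#define BPF_JSLE	0xd0

/* Hypothetical helper: is this opcode byte a defined JMP32 instruction? */
static bool is_defined_jmp32(uint8_t code)
{
	uint8_t op = BPF_OP(code);

	if (BPF_CLASS(code) != BPF_JMP32)
		return false;
	/* JA, CALL and EXIT are reserved in this class. */
	if (op == BPF_JA || op == BPF_CALL || op == BPF_EXIT)
		return false;
	/* BPF_JSLE is the highest conditional-jump operation code. */
	return op <= BPF_JSLE;
}

/* Hypothetical usage: call this from main() to check a few opcodes. */
static void demo_jmp32_encoding(void)
{
	assert((BPF_JMP32 | BPF_JEQ | BPF_K) == 0x16);
	assert(is_defined_jmp32(BPF_JMP32 | BPF_JEQ | BPF_K));
	assert(is_defined_jmp32(BPF_JMP32 | BPF_JSLE | BPF_X));
	assert(!is_defined_jmp32(BPF_JMP32 | BPF_EXIT));
	assert(!is_defined_jmp32(BPF_JMP | BPF_JEQ | BPF_K));
}
```

The only difference between a 64-bit and a 32-bit conditional jump in this
layout is the class field in the low three bits of the opcode.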

LLVM support
===
A couple of unit tests have been added and are included in this set. LLVM
code-gen for JMP32 has also been added, so you can compile any BPF C
program with both -mcpu=probe and -mattr=+alu32 specified. If you are
compiling on a machine whose kernel is patched with this set, LLVM will
select the ISA automatically based on the host probe results. Otherwise,
specifying -mcpu=v3 and -mattr=+alu32 also forces use of the JMP32 ISA.

   LLVM support can be found at:

     https://github.com/Netronome/llvm/tree/jmp32-v2

   (The clang driver has also been taught about the new "v3" processor.
    Merge requests for both clang and llvm will be sent out once the kernel
    set has landed.)

JIT backends support
===
This set adds JMP32 support to the JIT backends, except for SPARC and
MIPS. This shouldn't be a big issue for these two ports, as LLVM won't
generate JMP32 insns by default; it only generates them when the host
machine is probed to have the support.

Thanks.

Cc: David S. Miller <davem@davemloft.net>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Wang YanQing <udknight@gmail.com>
Cc: Zi Shen Lim <zlim.lnx@gmail.com>
Cc: Shubham Bansal <illusionist.neo@gmail.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>


Jiong Wang (16):
  bpf: allocate 0x06 to new eBPF instruction class JMP32
  bpf: refactor verifier min/max code for condition jump
  bpf: verifier support JMP32
  bpf: disassembler support JMP32
  tools: bpftool: teach cfg code about JMP32
  bpf: interpreter support for JMP32
  bpf: JIT blinds support JMP32
  x86_64: bpf: implement jitting of JMP32
  x32: bpf: implement jitting of JMP32
  arm64: bpf: implement jitting of JMP32
  arm: bpf: implement jitting of JMP32
  ppc: bpf: implement jitting of JMP32
  s390: bpf: implement jitting of JMP32
  nfp: bpf: implement jitting of JMP32
  selftests: bpf: functional and min/max reasoning unit tests for JMP32
  selftests: bpf: makefile support sub-register code-gen test mode

 Documentation/networking/filter.txt           |  15 +-
 arch/arm/net/bpf_jit_32.c                     |  53 +-
 arch/arm/net/bpf_jit_32.h                     |   2 +
 arch/arm64/net/bpf_jit_comp.c                 |  37 +-
 arch/powerpc/include/asm/ppc-opcode.h         |   1 +
 arch/powerpc/net/bpf_jit.h                    |   4 +
 arch/powerpc/net/bpf_jit_comp64.c             | 120 +++-
 arch/s390/net/bpf_jit_comp.c                  |  66 ++-
 arch/x86/net/bpf_jit_comp.c                   |  46 +-
 arch/x86/net/bpf_jit_comp32.c                 | 121 ++--
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  |  97 +++-
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  22 +-
 include/linux/filter.h                        |  20 +
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/core.c                             | 221 +++-----
 kernel/bpf/disasm.c                           |  34 +-
 kernel/bpf/verifier.c                         | 365 ++++++++----
 samples/bpf/bpf_insn.h                        |  20 +
 tools/bpf/bpftool/cfg.c                       |   9 +-
 tools/include/linux/filter.h                  |  20 +
 tools/include/uapi/linux/bpf.h                |   1 +
 tools/testing/selftests/bpf/Makefile          |  95 +++-
 tools/testing/selftests/bpf/test_verifier.c   | 786 +++++++++++++++++++++++++-
 23 files changed, 1736 insertions(+), 420 deletions(-)