Message ID | 20191022043119.2625263-1-yhs@fb.com |
---|---|
State | Accepted |
Delegated to: | BPF Maintainers |
Headers | show |
Series | [bpf-next,v2] tools/bpf: turn on llvm alu32 attribute by default | expand |
On Mon, Oct 21, 2019 at 09:31:19PM -0700, Yonghong Song wrote: > llvm alu32 was introduced in llvm7: > https://reviews.llvm.org/rL325987 > https://reviews.llvm.org/rL325989 > Experiments showed that in general performance > is better with alu32 enabled: > https://lwn.net/Articles/775316/ > > This patch turned on alu32 with no-flavor test_progs > which is tested most often. The flavor test at > no_alu32/test_progs can be used to test without > alu32 enabled. The Makefile check for whether > llvm supports '-mattr=+alu32 -mcpu=v3' is > removed as llvm7 should be available for recent > distributions and also latest llvm is preferred > to run bpf selftests. > > Note that jmp32 is checked by -mcpu=probe and > will be enabled if the host kernel supports it. > > Cc: Jiong Wang <jiong.wang@netronome.com> > Acked-by: Andrii Nakryiko <andriin@fb.com> > Signed-off-by: Yonghong Song <yhs@fb.com> Applied, thanks! Would it make sense to include -mattr=+alu32 also into -mcpu=probe on LLVM side or is the rationale to not do it that this causes a penalty for various other, non-x86 archs when done by default (although they could opt-out at the same time via -mattr=-alu32)?
On 10/22/19 12:29 PM, Daniel Borkmann wrote: > On Mon, Oct 21, 2019 at 09:31:19PM -0700, Yonghong Song wrote: >> llvm alu32 was introduced in llvm7: >> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325987&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=M0ANvh80tDNZb5JzE5vj9IETkKD87L1jFkcRHShC6Rk&e= >> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325989&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=LABlrq9E6tmCwrbU2bCQa_LwchCaL8Tk5GczMCO5Cvs&e= >> Experiments showed that in general performance >> is better with alu32 enabled: >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_775316_&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=qSDIIkauxw9Y_8rYH0AlvB4nvu06reDuhsb0GxSpoBo&e= >> >> This patch turned on alu32 with no-flavor test_progs >> which is tested most often. The flavor test at >> no_alu32/test_progs can be used to test without >> alu32 enabled. The Makefile check for whether >> llvm supports '-mattr=+alu32 -mcpu=v3' is >> removed as llvm7 should be available for recent >> distributions and also latest llvm is preferred >> to run bpf selftests. >> >> Note that jmp32 is checked by -mcpu=probe and >> will be enabled if the host kernel supports it. >> >> Cc: Jiong Wang <jiong.wang@netronome.com> >> Acked-by: Andrii Nakryiko <andriin@fb.com> >> Signed-off-by: Yonghong Song <yhs@fb.com> > > Applied, thanks! > > Would it make sense to include -mattr=+alu32 also into -mcpu=probe > on LLVM side or is the rationale to not do it that this causes a > penalty for various other, non-x86 archs when done by default > (although they could opt-out at the same time via -mattr=-alu32)? The current -mcpu=probe is mostly to provide whether particular instruction(s) are supported by the kernel or not. This follows traditional cpu concept. For -mattr=+alu32 case, instruction set remains the same, but we need to probe verifier capability. But I agree that for bpf probing verifier for alu32 support is totally reasonable. Jiong, could you help do an implementation in llvm side since you are more familiar with what alu32 capability needs to be checked for verifier? Thanks!
On Wed, Oct 23, 2019 at 3:27 AM Yonghong Song <yhs@fb.com> wrote: > > > > On 10/22/19 12:29 PM, Daniel Borkmann wrote: > > On Mon, Oct 21, 2019 at 09:31:19PM -0700, Yonghong Song wrote: > >> llvm alu32 was introduced in llvm7: > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325987&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=M0ANvh80tDNZb5JzE5vj9IETkKD87L1jFkcRHShC6Rk&e= > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325989&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=LABlrq9E6tmCwrbU2bCQa_LwchCaL8Tk5GczMCO5Cvs&e= > >> Experiments showed that in general performance > >> is better with alu32 enabled: > >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_775316_&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=qSDIIkauxw9Y_8rYH0AlvB4nvu06reDuhsb0GxSpoBo&e= > >> > >> This patch turned on alu32 with no-flavor test_progs > >> which is tested most often. The flavor test at > >> no_alu32/test_progs can be used to test without > >> alu32 enabled. The Makefile check for whether > >> llvm supports '-mattr=+alu32 -mcpu=v3' is > >> removed as llvm7 should be available for recent > >> distributions and also latest llvm is preferred > >> to run bpf selftests. > >> > >> Note that jmp32 is checked by -mcpu=probe and > >> will be enabled if the host kernel supports it. > >> > >> Cc: Jiong Wang <jiong.wang@netronome.com> > >> Acked-by: Andrii Nakryiko <andriin@fb.com> > >> Signed-off-by: Yonghong Song <yhs@fb.com> > > > > Applied, thanks! > > > > Would it make sense to include -mattr=+alu32 also into -mcpu=probe > > on LLVM side or is the rationale to not do it that this causes a > > penalty for various other, non-x86 archs when done by default > > (although they could opt-out at the same time via -mattr=-alu32)? > > The current -mcpu=probe is mostly to provide whether particular > instruction(s) are supported by the kernel or not. This follows > traditional cpu concept. For -mattr=+alu32 case, instruction set > remains the same, but we need to probe verifier capability. > > But I agree that for bpf probing verifier for alu32 support > is totally reasonable. > > Jiong, could you help do an implementation in llvm side since > you are more familiar with what alu32 capability needs to be > checked for verifier? Thanks! I think alu32 code-gen becomes good and stable after jmp32 instructions (cpu=v3) supported, so if we want to enable alu32 at default, perhaps could just link it with v3 probe, and also Daniel's opt-out suggestion makes sense. Will try to do one impl but not sure could catch the timeline tomorrow. For what it's worth, tomorrow will be my last day using Netronome email, I will use wong.kwongyuan.tools@gmail.com for bpf/kernel contributing temporarily.
On 10/24/19 3:23 PM, Jiong Wang wrote: > On Wed, Oct 23, 2019 at 3:27 AM Yonghong Song <yhs@fb.com> wrote: >> >> >> >> On 10/22/19 12:29 PM, Daniel Borkmann wrote: >>> On Mon, Oct 21, 2019 at 09:31:19PM -0700, Yonghong Song wrote: >>>> llvm alu32 was introduced in llvm7: >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325987&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=M0ANvh80tDNZb5JzE5vj9IETkKD87L1jFkcRHShC6Rk&e= >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__reviews.llvm.org_rL325989&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=LABlrq9E6tmCwrbU2bCQa_LwchCaL8Tk5GczMCO5Cvs&e= >>>> Experiments showed that in general performance >>>> is better with alu32 enabled: >>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_775316_&d=DwIBAg&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=0VCVs-aItkaVLRJ9Jp7YeX0We2JPKzcY7p_83Hlkso4&s=qSDIIkauxw9Y_8rYH0AlvB4nvu06reDuhsb0GxSpoBo&e= >>>> >>>> This patch turned on alu32 with no-flavor test_progs >>>> which is tested most often. The flavor test at >>>> no_alu32/test_progs can be used to test without >>>> alu32 enabled. The Makefile check for whether >>>> llvm supports '-mattr=+alu32 -mcpu=v3' is >>>> removed as llvm7 should be available for recent >>>> distributions and also latest llvm is preferred >>>> to run bpf selftests. >>>> >>>> Note that jmp32 is checked by -mcpu=probe and >>>> will be enabled if the host kernel supports it. >>>> >>>> Cc: Jiong Wang <jiong.wang@netronome.com> >>>> Acked-by: Andrii Nakryiko <andriin@fb.com> >>>> Signed-off-by: Yonghong Song <yhs@fb.com> >>> >>> Applied, thanks! >>> >>> Would it make sense to include -mattr=+alu32 also into -mcpu=probe >>> on LLVM side or is the rationale to not do it that this causes a >>> penalty for various other, non-x86 archs when done by default >>> (although they could opt-out at the same time via -mattr=-alu32)? >> >> The current -mcpu=probe is mostly to provide whether particular >> instruction(s) are supported by the kernel or not. This follows >> traditional cpu concept. For -mattr=+alu32 case, instruction set >> remains the same, but we need to probe verifier capability. >> >> But I agree that for bpf probing verifier for alu32 support >> is totally reasonable. >> >> Jiong, could you help do an implementation in llvm side since >> you are more familiar with what alu32 capability needs to be >> checked for verifier? Thanks! > > I think alu32 code-gen becomes good and stable after jmp32 > instructions (cpu=v3) supported, so if we want to enable alu32 at > default, perhaps could just link it with v3 probe, and also Daniel's > opt-out suggestion makes sense. > > Will try to do one impl but not sure could catch the timeline > tomorrow. For what it's worth, tomorrow will be my last day using > Netronome email, I will use wong.kwongyuan.tools@gmail.com for > bpf/kernel contributing temporarily. Jiong, thanks for letting us know. Maybe the following diff will be okay? diff --git a/llvm/lib/Target/BPF/BPFSubtarget.cpp b/llvm/lib/Target/BPF/BPFSubtarget.cpp index ab3452501b9..f3cb03b1f1f 100644 --- a/llvm/lib/Target/BPF/BPFSubtarget.cpp +++ b/llvm/lib/Target/BPF/BPFSubtarget.cpp @@ -52,6 +52,7 @@ void BPFSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) { if (CPU == "v3") { HasJmpExt = true; HasJmp32 = true; + HasAlu32 = true; return; } } Considering in general -mattr=+alu32 improves code size and performance. I don't think there is need to disable HasAlu32. Any regression we should just debug and fix. What do you think? Yonghong
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 4ff5f4aada08..11ff34e7311b 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -30,17 +30,7 @@ TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test test_sock test_btf test_sockmap get_cgroup_id_user test_socket_cookie \ test_cgroup_storage test_select_reuseport test_section_names \ test_netcnt test_tcpnotify_user test_sock_fields test_sysctl test_hashmap \ - test_cgroup_attach xdping - -# Also test sub-register code-gen if LLVM has eBPF v3 processor support which -# contains both ALU32 and JMP32 instructions. -SUBREG_CODEGEN := $(shell echo "int cal(int a) { return a > 0; }" | \ - $(CLANG) -target bpf -O2 -emit-llvm -S -x c - -o - | \ - $(LLC) -mattr=+alu32 -mcpu=v3 2>&1 | \ - grep 'if w') -ifneq ($(SUBREG_CODEGEN),) -TEST_GEN_PROGS += test_progs-alu32 -endif + test_cgroup_attach xdping test_progs-no_alu32 # Also test bpf-gcc, if present ifneq ($(BPF_GCC),) @@ -179,7 +169,7 @@ endef # $eval()) and pass control to DEFINE_TEST_RUNNER_RULES. # Parameters: # $1 - test runner base binary name (e.g., test_progs) -# $2 - test runner extra "flavor" (e.g., alu32, gcc-bpf, etc) +# $2 - test runner extra "flavor" (e.g., no_alu32, gcc-bpf, etc) define DEFINE_TEST_RUNNER TRUNNER_OUTPUT := $(OUTPUT)$(if $2,/)$2 @@ -201,7 +191,7 @@ endef # Using TRUNNER_XXX variables, provided by callers of DEFINE_TEST_RUNNER and # set up by DEFINE_TEST_RUNNER itself, create test runner build rules with: # $1 - test runner base binary name (e.g., test_progs) -# $2 - test runner extra "flavor" (e.g., alu32, gcc-bpf, etc) +# $2 - test runner extra "flavor" (e.g., no_alu32, gcc-bpf, etc) define DEFINE_TEST_RUNNER_RULES ifeq ($($(TRUNNER_OUTPUT)-dir),) @@ -272,14 +262,12 @@ TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE TRUNNER_BPF_CFLAGS := -I. -I$(OUTPUT) $(BPF_CFLAGS) $(CLANG_CFLAGS) -TRUNNER_BPF_LDFLAGS := +TRUNNER_BPF_LDFLAGS := -mattr=+alu32 $(eval $(call DEFINE_TEST_RUNNER,test_progs)) -# Define test_progs-alu32 test runner. -ifneq ($(SUBREG_CODEGEN),) -TRUNNER_BPF_LDFLAGS += -mattr=+alu32 -$(eval $(call DEFINE_TEST_RUNNER,test_progs,alu32)) -endif +# Define test_progs-no_alu32 test runner. +TRUNNER_BPF_LDFLAGS := +$(eval $(call DEFINE_TEST_RUNNER,test_progs,no_alu32)) # Define test_progs BPF-GCC-flavored test runner. ifneq ($(BPF_GCC),) @@ -319,4 +307,4 @@ $(OUTPUT)/test_verifier: test_verifier.c verifier/tests.h $(BPFOBJ) | $(OUTPUT) EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) \ prog_tests/tests.h map_tests/tests.h verifier/tests.h \ - feature $(OUTPUT)/*.o $(OUTPUT)/alu32 $(OUTPUT)/bpf_gcc + feature $(OUTPUT)/*.o $(OUTPUT)/no_alu32 $(OUTPUT)/bpf_gcc