Message ID | 20220623124037.252709-1-amusil@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] cpu.c: Build cpu.c without any assumptions about CPU flags | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/intel-ovs-compilation | fail | test: fail |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
> -----Original Message----- > From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil > Sent: Thursday, June 23, 2022 1:41 PM > To: dev@openvswitch.org > Cc: david.marchand@redhat.com > Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about > CPU flags > > The cpu.c was built with multiple feature flags that might > not be available on all hosts. It manifested itself as > Illegal instruction with crashing on "shlx" instruction > which is part of the bmi2 extension, that might not be available > on all CPU that openvswitch runs on. > > Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which > should build it without any assumptions about CPU capabilities. Although this seems a good solution, the CI shows a linking failure; https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html I think the root cause might be that the avx512 library does not depend on the vswitch one, resulting in symbols not being found at link time for avx512 lib? One solution might be to create a standalone library for cpu.c, and then link that in to both avx512 and vswitch libraries? Or there might be alternative methods to expose the cpu.c "has_cpu_isa" function to the avx512 library? > Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ") > Reported-at: https://bugzilla.redhat.com/2100393 > Signed-off-by: Ales Musil <amusil@redhat.com> > --- > lib/automake.mk | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/lib/automake.mk b/lib/automake.mk > index cb50578eb..f082eb75a 100644 > --- a/lib/automake.mk > +++ b/lib/automake.mk > @@ -36,8 +36,6 @@ lib_libopenvswitchavx512_la_CFLAGS = \ > -fPIC \ > $(AM_CFLAGS) > lib_libopenvswitchavx512_la_SOURCES = \ > - lib/cpu.c \ > - lib/cpu.h \ > lib/dpif-netdev-avx512.c > if HAVE_AVX512BW > lib_libopenvswitchavx512_la_CFLAGS += \ > @@ -98,6 +96,8 @@ lib_libopenvswitch_la_SOURCES = \ > lib/csum.h \ > lib/ct-dpif.c \ > lib/ct-dpif.h \ > + lib/cpu.c \ > + lib/cpu.h \ > lib/daemon.c \ > lib/daemon.h \ > lib/daemon-private.h \ > -- > 2.35.3 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On 6/23/22 15:33, Van Haaren, Harry wrote: >> -----Original Message----- >> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil >> Sent: Thursday, June 23, 2022 1:41 PM >> To: dev@openvswitch.org >> Cc: david.marchand@redhat.com >> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about >> CPU flags >> >> The cpu.c was built with multiple feature flags that might >> not be available on all hosts. It manifested itself as >> Illegal instruction with crashing on "shlx" instruction >> which is part of the bmi2 extension, that might not be available >> on all CPU that openvswitch runs on. >> >> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which >> should build it without any assumptions about CPU capabilities. > > Although this seems a good solution, the CI shows a linking failure; > https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html > > I think the root cause might be that the avx512 library does not depend on the > vswitch one, resulting in symbols not being found at link time for avx512 lib? > > One solution might be to create a standalone library for cpu.c, and then link > that in to both avx512 and vswitch libraries? Or there might be alternative > methods to expose the cpu.c "has_cpu_isa" function to the avx512 library? libopenvswitch.a is already linked with libopenvswitchavx512.a together when building binaries, but since cpu.h is an internal header, the 'cpu_has_isa' is not exported, so can not be found while linking. The easiest solution would be to move the cpu.h into include/openvswitch/ adding appropriate 'extern "C"' declarations. This way it should be visible during the linking. Ales, you need to build with CFLAGS='-msse4.2' in order to reproduce the issue, otherwise avx512 code is mostly skipped during the build. Best regards, Ilya Maximets. > > >> Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ") >> Reported-at: https://bugzilla.redhat.com/2100393 >> Signed-off-by: Ales Musil <amusil@redhat.com> >> --- >> lib/automake.mk | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/lib/automake.mk b/lib/automake.mk >> index cb50578eb..f082eb75a 100644 >> --- a/lib/automake.mk >> +++ b/lib/automake.mk >> @@ -36,8 +36,6 @@ lib_libopenvswitchavx512_la_CFLAGS = \ >> -fPIC \ >> $(AM_CFLAGS) >> lib_libopenvswitchavx512_la_SOURCES = \ >> - lib/cpu.c \ >> - lib/cpu.h \ >> lib/dpif-netdev-avx512.c >> if HAVE_AVX512BW >> lib_libopenvswitchavx512_la_CFLAGS += \ >> @@ -98,6 +96,8 @@ lib_libopenvswitch_la_SOURCES = \ >> lib/csum.h \ >> lib/ct-dpif.c \ >> lib/ct-dpif.h \ >> + lib/cpu.c \ >> + lib/cpu.h \ 'p' goes after 'o'. >> lib/daemon.c \ >> lib/daemon.h \ >> lib/daemon-private.h \ >> -- >> 2.35.3
On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry <harry.van.haaren@intel.com> wrote: > > > -----Original Message----- > > From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil > > Sent: Thursday, June 23, 2022 1:41 PM > > To: dev@openvswitch.org > > Cc: david.marchand@redhat.com > > Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about > > CPU flags > > > > The cpu.c was built with multiple feature flags that might > > not be available on all hosts. It manifested itself as > > Illegal instruction with crashing on "shlx" instruction > > which is part of the bmi2 extension, that might not be available > > on all CPU that openvswitch runs on. > > > > Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which > > should build it without any assumptions about CPU capabilities. > > Although this seems a good solution, the CI shows a linking failure; > https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html From the link error, the "normal" ovs archive seems to be passed when linking binaries. libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare -Wshift-negative-value -Wduplicated-cond -Wshadow -Wmultistatement-macros -Wcast-align=strict -mssse3 -I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64 -g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed lib/.libs/libopenvswitch.a -L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto -lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a -lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table -lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack -lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev -lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member -lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag -lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd -lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile -lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer -lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev -lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring -lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5 -lpcap -lnuma -latomic -lm For this error on cpu_has_isa to happen, the linker probably judged no symbol from cpu.c could be used and decided to drop the whole .o. So we need something to make sure those won't be dropped. I just saw Ilya proposal before clicking Send, but it does not seem to work (or maybe I did something wrong on my side..).
On 6/23/22 17:39, David Marchand wrote: > On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry > <harry.van.haaren@intel.com> wrote: >> >>> -----Original Message----- >>> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil >>> Sent: Thursday, June 23, 2022 1:41 PM >>> To: dev@openvswitch.org >>> Cc: david.marchand@redhat.com >>> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about >>> CPU flags >>> >>> The cpu.c was built with multiple feature flags that might >>> not be available on all hosts. It manifested itself as >>> Illegal instruction with crashing on "shlx" instruction >>> which is part of the bmi2 extension, that might not be available >>> on all CPU that openvswitch runs on. >>> >>> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which >>> should build it without any assumptions about CPU capabilities. >> >> Although this seems a good solution, the CI shows a linking failure; >> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html > > From the link error, the "normal" ovs archive seems to be passed when > linking binaries. > > libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare > -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum > -Wunused-parameter -Wbad-function-cast -Wcast-align > -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes > -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool > -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare > -Wshift-negative-value -Wduplicated-cond -Wshadow > -Wmultistatement-macros -Wcast-align=strict -mssse3 > -I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include > rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64 > -g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o > utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed > lib/.libs/libopenvswitch.a > -L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto > -lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a > -lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table > -lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack > -lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev > -lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member > -lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag > -lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd > -lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile > -lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer > -lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev > -lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring > -lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5 > -lpcap -lnuma -latomic -lm > > For this error on cpu_has_isa to happen, the linker probably judged no > symbol from cpu.c could be used and decided to drop the whole .o. > So we need something to make sure those won't be dropped. > > I just saw Ilya proposal before clicking Send, but it does not seem to > work (or maybe I did something wrong on my side..). > Hmm, yes, you're right. The issue is a bit larger, and I think I already brought that up at some point. Basically all the functions that use cpu_has_isa should be moved to a generic libopenvswitch library out of libopenvswitchavx512, because they are supposed to check for CPU capabilities, but are built with all the avx512 flags enabled, hence may contain illegal instructions. So, the correct solution should be, along with moving cpu.{c,h}, to also move: dp_netdev_input_outer_avx512_probe to lib/dpif-netdev-private-dpif.c mfex_avx512_probe to lib/dpif-netdev-private-extract.c dpcls_subtable_avx512_gather_probe to lib/dpif-netdev-lookup.c The last one is a bit tricky, because CHECK_LOOKUP_FUNCTION() and DECLARE_OPTIMIZED_LOOKUP_FUNCTION() will have to be correctly exposed, i.e. DECLARE_OPTIMIZED_LOOKUP_FUNCTION macro will need to declare non-static functions and their prototypes should be declared in the lib/dpif-netdev-lookup.h. Best regards, Ilya Maximets.
On 6/23/22 18:46, Ilya Maximets wrote: > On 6/23/22 17:39, David Marchand wrote: >> On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry >> <harry.van.haaren@intel.com> wrote: >>> >>>> -----Original Message----- >>>> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil >>>> Sent: Thursday, June 23, 2022 1:41 PM >>>> To: dev@openvswitch.org >>>> Cc: david.marchand@redhat.com >>>> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about >>>> CPU flags >>>> >>>> The cpu.c was built with multiple feature flags that might >>>> not be available on all hosts. It manifested itself as >>>> Illegal instruction with crashing on "shlx" instruction >>>> which is part of the bmi2 extension, that might not be available >>>> on all CPU that openvswitch runs on. >>>> >>>> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which >>>> should build it without any assumptions about CPU capabilities. >>> >>> Although this seems a good solution, the CI shows a linking failure; >>> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html >> >> From the link error, the "normal" ovs archive seems to be passed when >> linking binaries. >> >> libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare >> -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum >> -Wunused-parameter -Wbad-function-cast -Wcast-align >> -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes >> -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool >> -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare >> -Wshift-negative-value -Wduplicated-cond -Wshadow >> -Wmultistatement-macros -Wcast-align=strict -mssse3 >> -I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include >> rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64 >> -g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o >> utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed >> lib/.libs/libopenvswitch.a >> -L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto >> -lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a >> -lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table >> -lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack >> -lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev >> -lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member >> -lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag >> -lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd >> -lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile >> -lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer >> -lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev >> -lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring >> -lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5 >> -lpcap -lnuma -latomic -lm >> >> For this error on cpu_has_isa to happen, the linker probably judged no >> symbol from cpu.c could be used and decided to drop the whole .o. >> So we need something to make sure those won't be dropped. >> >> I just saw Ilya proposal before clicking Send, but it does not seem to >> work (or maybe I did something wrong on my side..). >> > > Hmm, yes, you're right. The issue is a bit larger, and I think > I already brought that up at some point. Basically all the > functions that use cpu_has_isa should be moved to a generic > libopenvswitch library out of libopenvswitchavx512, because they > are supposed to check for CPU capabilities, but are built with > all the avx512 flags enabled, hence may contain illegal instructions. > > So, the correct solution should be, along with moving cpu.{c,h}, to > also move: > dp_netdev_input_outer_avx512_probe to lib/dpif-netdev-private-dpif.c > mfex_avx512_probe to lib/dpif-netdev-private-extract.c > dpcls_subtable_avx512_gather_probe to lib/dpif-netdev-lookup.c > > The last one is a bit tricky, because CHECK_LOOKUP_FUNCTION() and > DECLARE_OPTIMIZED_LOOKUP_FUNCTION() will have to be correctly > exposed, i.e. DECLARE_OPTIMIZED_LOOKUP_FUNCTION macro will need > to declare non-static functions and their prototypes should be > declared in the lib/dpif-netdev-lookup.h. Or, probably, easier to split the dpcls_subtable_avx512_gather_probe function like this: diff --git a/lib/dpif-netdev-lookup-avx512-gather.c b/lib/dpif-netdev-lookup-avx512-gather.c index 1e86be207..2bc0af51d 100644 --- a/lib/dpif-netdev-lookup-avx512-gather.c +++ b/lib/dpif-netdev-lookup-avx512-gather.c @@ -415,15 +415,21 @@ dpcls_avx512_gather_mf_any(struct dpcls_subtable *subtable, uint32_t keys_map, dpcls_subtable_lookup_func dpcls_subtable_avx512_gather_probe(uint32_t u0_bits, uint32_t u1_bits) { - dpcls_subtable_lookup_func f = NULL; - int avx512f_available = cpu_has_isa(OVS_CPU_ISA_X86_AVX512F); int bmi2_available = cpu_has_isa(OVS_CPU_ISA_X86_BMI2); + if (!avx512f_available || !bmi2_available) { return NULL; } + return dpcls_subtable_avx512_gather_probe__(u0_bits, u1_bits); +} + +dpcls_subtable_lookup_func +dpcls_subtable_avx512_gather_probe__(uint32_t u0_bits, uint32_t u1_bits) +{ int use_vpop = cpu_has_isa(OVS_CPU_ISA_X86_VPOPCNTDQ); + dpcls_subtable_lookup_func f = NULL; CHECK_LOOKUP_FUNCTION(9, 4, use_vpop); CHECK_LOOKUP_FUNCTION(9, 1, use_vpop); --- After that, the first function can be moved to lib/dpif-netdev-lookup.c. (It's probably OK to leave the 'use_vpop' check inside, because we're not building the whole file with that instruction set, only specific functions.) Best regards, Ilya Maximets.
diff --git a/lib/automake.mk b/lib/automake.mk index cb50578eb..f082eb75a 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -36,8 +36,6 @@ lib_libopenvswitchavx512_la_CFLAGS = \ -fPIC \ $(AM_CFLAGS) lib_libopenvswitchavx512_la_SOURCES = \ - lib/cpu.c \ - lib/cpu.h \ lib/dpif-netdev-avx512.c if HAVE_AVX512BW lib_libopenvswitchavx512_la_CFLAGS += \ @@ -98,6 +96,8 @@ lib_libopenvswitch_la_SOURCES = \ lib/csum.h \ lib/ct-dpif.c \ lib/ct-dpif.h \ + lib/cpu.c \ + lib/cpu.h \ lib/daemon.c \ lib/daemon.h \ lib/daemon-private.h \
The cpu.c was built with multiple feature flags that might not be available on all hosts. It manifested itself as Illegal instruction with crashing on "shlx" instruction which is part of the bmi2 extension, that might not be available on all CPU that openvswitch runs on. Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which should build it without any assumptions about CPU capabilities. Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ") Reported-at: https://bugzilla.redhat.com/2100393 Signed-off-by: Ales Musil <amusil@redhat.com> --- lib/automake.mk | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)