diff mbox series

[ovs-dev] cpu.c: Build cpu.c without any assumptions about CPU flags

Message ID 20220623124037.252709-1-amusil@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] cpu.c: Build cpu.c without any assumptions about CPU flags | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/intel-ovs-compilation fail test: fail
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Ales Musil June 23, 2022, 12:40 p.m. UTC
The cpu.c was built with multiple feature flags that might
not be available on all hosts. It manifested itself as
Illegal instruction with crashing on "shlx" instruction
which is part of the bmi2 extension, that might not be available
on all CPU that openvswitch runs on.

Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
should build it without any assumptions about CPU capabilities.

Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ")
Reported-at: https://bugzilla.redhat.com/2100393
Signed-off-by: Ales Musil <amusil@redhat.com>
---
 lib/automake.mk | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Van Haaren, Harry June 23, 2022, 1:33 p.m. UTC | #1
> -----Original Message-----
> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil
> Sent: Thursday, June 23, 2022 1:41 PM
> To: dev@openvswitch.org
> Cc: david.marchand@redhat.com
> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about
> CPU flags
> 
> The cpu.c was built with multiple feature flags that might
> not be available on all hosts. It manifested itself as
> Illegal instruction with crashing on "shlx" instruction
> which is part of the bmi2 extension, that might not be available
> on all CPU that openvswitch runs on.
> 
> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
> should build it without any assumptions about CPU capabilities.

Although this seems a good solution, the CI shows a linking failure;
https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html

I think the root cause might be that the avx512 library does not depend on the
vswitch one, resulting in symbols not being found at link time for avx512 lib?

One solution might be to create a standalone library for cpu.c, and then link
that in to both avx512 and vswitch libraries? Or there might be alternative
methods to expose the cpu.c "has_cpu_isa" function to the avx512 library?


> Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ")
> Reported-at: https://bugzilla.redhat.com/2100393
> Signed-off-by: Ales Musil <amusil@redhat.com>
> ---
>  lib/automake.mk | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/automake.mk b/lib/automake.mk
> index cb50578eb..f082eb75a 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -36,8 +36,6 @@ lib_libopenvswitchavx512_la_CFLAGS = \
>  	-fPIC \
>  	$(AM_CFLAGS)
>  lib_libopenvswitchavx512_la_SOURCES = \
> -	lib/cpu.c \
> -	lib/cpu.h \
>  	lib/dpif-netdev-avx512.c
>  if HAVE_AVX512BW
>  lib_libopenvswitchavx512_la_CFLAGS += \
> @@ -98,6 +96,8 @@ lib_libopenvswitch_la_SOURCES = \
>  	lib/csum.h \
>  	lib/ct-dpif.c \
>  	lib/ct-dpif.h \
> +	lib/cpu.c \
> +	lib/cpu.h \
>  	lib/daemon.c \
>  	lib/daemon.h \
>  	lib/daemon-private.h \
> --
> 2.35.3
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ilya Maximets June 23, 2022, 3:29 p.m. UTC | #2
On 6/23/22 15:33, Van Haaren, Harry wrote:
>> -----Original Message-----
>> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil
>> Sent: Thursday, June 23, 2022 1:41 PM
>> To: dev@openvswitch.org
>> Cc: david.marchand@redhat.com
>> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about
>> CPU flags
>>
>> The cpu.c was built with multiple feature flags that might
>> not be available on all hosts. It manifested itself as
>> Illegal instruction with crashing on "shlx" instruction
>> which is part of the bmi2 extension, that might not be available
>> on all CPU that openvswitch runs on.
>>
>> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
>> should build it without any assumptions about CPU capabilities.
> 
> Although this seems a good solution, the CI shows a linking failure;
> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html
> 
> I think the root cause might be that the avx512 library does not depend on the
> vswitch one, resulting in symbols not being found at link time for avx512 lib?
> 
> One solution might be to create a standalone library for cpu.c, and then link
> that in to both avx512 and vswitch libraries? Or there might be alternative
> methods to expose the cpu.c "has_cpu_isa" function to the avx512 library?

libopenvswitch.a is already linked with libopenvswitchavx512.a together
when building binaries, but since cpu.h is an internal header, the
'cpu_has_isa' is not exported, so can not be found while linking.
The easiest solution would be to move the cpu.h into include/openvswitch/
adding appropriate 'extern "C"' declarations.  This way it should
be visible during the linking.

Ales, you need to build with CFLAGS='-msse4.2' in order to reproduce the
issue, otherwise avx512 code is mostly skipped during the build.

Best regards, Ilya Maximets.

> 
> 
>> Fixes: b366fa2f494 ("dpif-netdev: Call cpuid for x86 isa availability. ")
>> Reported-at: https://bugzilla.redhat.com/2100393
>> Signed-off-by: Ales Musil <amusil@redhat.com>
>> ---
>>  lib/automake.mk | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/automake.mk b/lib/automake.mk
>> index cb50578eb..f082eb75a 100644
>> --- a/lib/automake.mk
>> +++ b/lib/automake.mk
>> @@ -36,8 +36,6 @@ lib_libopenvswitchavx512_la_CFLAGS = \
>>  	-fPIC \
>>  	$(AM_CFLAGS)
>>  lib_libopenvswitchavx512_la_SOURCES = \
>> -	lib/cpu.c \
>> -	lib/cpu.h \
>>  	lib/dpif-netdev-avx512.c
>>  if HAVE_AVX512BW
>>  lib_libopenvswitchavx512_la_CFLAGS += \
>> @@ -98,6 +96,8 @@ lib_libopenvswitch_la_SOURCES = \
>>  	lib/csum.h \
>>  	lib/ct-dpif.c \
>>  	lib/ct-dpif.h \
>> +	lib/cpu.c \
>> +	lib/cpu.h \

'p' goes after 'o'.

>>  	lib/daemon.c \
>>  	lib/daemon.h \
>>  	lib/daemon-private.h \
>> --
>> 2.35.3
David Marchand June 23, 2022, 3:39 p.m. UTC | #3
On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry
<harry.van.haaren@intel.com> wrote:
>
> > -----Original Message-----
> > From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil
> > Sent: Thursday, June 23, 2022 1:41 PM
> > To: dev@openvswitch.org
> > Cc: david.marchand@redhat.com
> > Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about
> > CPU flags
> >
> > The cpu.c was built with multiple feature flags that might
> > not be available on all hosts. It manifested itself as
> > Illegal instruction with crashing on "shlx" instruction
> > which is part of the bmi2 extension, that might not be available
> > on all CPU that openvswitch runs on.
> >
> > Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
> > should build it without any assumptions about CPU capabilities.
>
> Although this seems a good solution, the CI shows a linking failure;
> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html

From the link error, the "normal" ovs archive seems to be passed when
linking binaries.

libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare
-Wpointer-arith -Wformat -Wformat-security -Wswitch-enum
-Wunused-parameter -Wbad-function-cast -Wcast-align
-Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes
-Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool
-Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare
-Wshift-negative-value -Wduplicated-cond -Wshadow
-Wmultistatement-macros -Wcast-align=strict -mssse3
-I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include
rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64
-g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o
utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed
lib/.libs/libopenvswitch.a
-L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto
-lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a
-lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table
-lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack
-lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev
-lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member
-lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag
-lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd
-lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile
-lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer
-lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev
-lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring
-lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5
-lpcap -lnuma -latomic -lm

For this error on cpu_has_isa to happen, the linker probably judged no
symbol from cpu.c could be used and decided to drop the whole .o.
So we need something to make sure those won't be dropped.

I just saw Ilya proposal before clicking Send, but it does not seem to
work (or maybe I did something wrong on my side..).
Ilya Maximets June 23, 2022, 4:46 p.m. UTC | #4
On 6/23/22 17:39, David Marchand wrote:
> On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry
> <harry.van.haaren@intel.com> wrote:
>>
>>> -----Original Message-----
>>> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil
>>> Sent: Thursday, June 23, 2022 1:41 PM
>>> To: dev@openvswitch.org
>>> Cc: david.marchand@redhat.com
>>> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about
>>> CPU flags
>>>
>>> The cpu.c was built with multiple feature flags that might
>>> not be available on all hosts. It manifested itself as
>>> Illegal instruction with crashing on "shlx" instruction
>>> which is part of the bmi2 extension, that might not be available
>>> on all CPU that openvswitch runs on.
>>>
>>> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
>>> should build it without any assumptions about CPU capabilities.
>>
>> Although this seems a good solution, the CI shows a linking failure;
>> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html
> 
> From the link error, the "normal" ovs archive seems to be passed when
> linking binaries.
> 
> libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare
> -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum
> -Wunused-parameter -Wbad-function-cast -Wcast-align
> -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes
> -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool
> -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare
> -Wshift-negative-value -Wduplicated-cond -Wshadow
> -Wmultistatement-macros -Wcast-align=strict -mssse3
> -I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include
> rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64
> -g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o
> utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed
> lib/.libs/libopenvswitch.a
> -L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto
> -lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a
> -lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table
> -lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack
> -lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev
> -lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member
> -lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag
> -lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd
> -lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile
> -lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer
> -lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev
> -lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring
> -lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5
> -lpcap -lnuma -latomic -lm
> 
> For this error on cpu_has_isa to happen, the linker probably judged no
> symbol from cpu.c could be used and decided to drop the whole .o.
> So we need something to make sure those won't be dropped.
> 
> I just saw Ilya proposal before clicking Send, but it does not seem to
> work (or maybe I did something wrong on my side..).
> 

Hmm, yes, you're right.  The issue is a bit larger, and I think
I already brought that up at some point.  Basically all the
functions that use cpu_has_isa should be moved to a generic
libopenvswitch library out of libopenvswitchavx512, because they
are supposed to check for CPU capabilities, but are built with
all the avx512 flags enabled, hence may contain illegal instructions.

So, the correct solution should be, along with moving cpu.{c,h}, to
also move:
  dp_netdev_input_outer_avx512_probe to lib/dpif-netdev-private-dpif.c
  mfex_avx512_probe to lib/dpif-netdev-private-extract.c
  dpcls_subtable_avx512_gather_probe to lib/dpif-netdev-lookup.c

The last one is a bit tricky, because CHECK_LOOKUP_FUNCTION() and
DECLARE_OPTIMIZED_LOOKUP_FUNCTION() will have to be correctly
exposed, i.e. DECLARE_OPTIMIZED_LOOKUP_FUNCTION macro will need
to declare non-static functions and their prototypes should be
declared in the lib/dpif-netdev-lookup.h.

Best regards, Ilya Maximets.
Ilya Maximets June 23, 2022, 5:10 p.m. UTC | #5
On 6/23/22 18:46, Ilya Maximets wrote:
> On 6/23/22 17:39, David Marchand wrote:
>> On Thu, Jun 23, 2022 at 3:33 PM Van Haaren, Harry
>> <harry.van.haaren@intel.com> wrote:
>>>
>>>> -----Original Message-----
>>>> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Ales Musil
>>>> Sent: Thursday, June 23, 2022 1:41 PM
>>>> To: dev@openvswitch.org
>>>> Cc: david.marchand@redhat.com
>>>> Subject: [ovs-dev] [PATCH] cpu.c: Build cpu.c without any assumptions about
>>>> CPU flags
>>>>
>>>> The cpu.c was built with multiple feature flags that might
>>>> not be available on all hosts. It manifested itself as
>>>> Illegal instruction with crashing on "shlx" instruction
>>>> which is part of the bmi2 extension, that might not be available
>>>> on all CPU that openvswitch runs on.
>>>>
>>>> Move the cpu.c and cpu.h to lib_libopenvswitch_la_SOURCES which
>>>> should build it without any assumptions about CPU capabilities.
>>>
>>> Although this seems a good solution, the CI shows a linking failure;
>>> https://mail.openvswitch.org/pipermail/ovs-build/2022-June/022795.html
>>
>> From the link error, the "normal" ovs archive seems to be passed when
>> linking binaries.
>>
>> libtool: link: gcc -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare
>> -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum
>> -Wunused-parameter -Wbad-function-cast -Wcast-align
>> -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes
>> -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool
>> -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare
>> -Wshift-negative-value -Wduplicated-cond -Wshadow
>> -Wmultistatement-macros -Wcast-align=strict -mssse3
>> -I/home/dmarchan/git/pub/dpdk.org/v21.11/install/include -include
>> rte_config.h -I/usr/usr/include -Werror -Werror -D_FILE_OFFSET_BITS=64
>> -g -DOPENSSL_API_COMPAT=0x10100000L -march=x86-64-v2 -o
>> utilities/ovs-appctl utilities/ovs-appctl.o -Wl,--as-needed
>> lib/.libs/libopenvswitch.a
>> -L/home/dmarchan/git/pub/dpdk.org/v21.11/install/lib64 -lssl -lcrypto
>> -lbpf /home/dmarchan/git/pub/ovs/master/lib/.libs/libopenvswitchavx512.a
>> -lrte_node -lrte_graph -lrte_flow_classify -lrte_pipeline -lrte_table
>> -lrte_pdump -lrte_port -lrte_fib -lrte_ipsec -lrte_vhost -lrte_stack
>> -lrte_security -lrte_sched -lrte_reorder -lrte_rib -lrte_dmadev
>> -lrte_regexdev -lrte_rawdev -lrte_power -lrte_pcapng -lrte_member
>> -lrte_lpm -lrte_latencystats -lrte_kni -lrte_jobstats -lrte_ip_frag
>> -lrte_gso -lrte_gro -lrte_gpudev -lrte_eventdev -lrte_efd
>> -lrte_distributor -lrte_cryptodev -lrte_compressdev -lrte_cfgfile
>> -lrte_bpf -lrte_bitratestats -lrte_bbdev -lrte_acl -lrte_timer
>> -lrte_hash -lrte_metrics -lrte_cmdline -lrte_pci -lrte_ethdev
>> -lrte_meter -lrte_net -lrte_mbuf -lrte_mempool -lrte_rcu -lrte_ring
>> -lrte_eal -lrte_telemetry -lrte_kvargs -lbsd -lmlx4 -libverbs -lmlx5
>> -lpcap -lnuma -latomic -lm
>>
>> For this error on cpu_has_isa to happen, the linker probably judged no
>> symbol from cpu.c could be used and decided to drop the whole .o.
>> So we need something to make sure those won't be dropped.
>>
>> I just saw Ilya proposal before clicking Send, but it does not seem to
>> work (or maybe I did something wrong on my side..).
>>
> 
> Hmm, yes, you're right.  The issue is a bit larger, and I think
> I already brought that up at some point.  Basically all the
> functions that use cpu_has_isa should be moved to a generic
> libopenvswitch library out of libopenvswitchavx512, because they
> are supposed to check for CPU capabilities, but are built with
> all the avx512 flags enabled, hence may contain illegal instructions.
> 
> So, the correct solution should be, along with moving cpu.{c,h}, to
> also move:
>   dp_netdev_input_outer_avx512_probe to lib/dpif-netdev-private-dpif.c
>   mfex_avx512_probe to lib/dpif-netdev-private-extract.c
>   dpcls_subtable_avx512_gather_probe to lib/dpif-netdev-lookup.c
> 
> The last one is a bit tricky, because CHECK_LOOKUP_FUNCTION() and
> DECLARE_OPTIMIZED_LOOKUP_FUNCTION() will have to be correctly
> exposed, i.e. DECLARE_OPTIMIZED_LOOKUP_FUNCTION macro will need
> to declare non-static functions and their prototypes should be
> declared in the lib/dpif-netdev-lookup.h.

Or, probably, easier to split the dpcls_subtable_avx512_gather_probe
function like this:

diff --git a/lib/dpif-netdev-lookup-avx512-gather.c b/lib/dpif-netdev-lookup-avx512-gather.c
index 1e86be207..2bc0af51d 100644
--- a/lib/dpif-netdev-lookup-avx512-gather.c
+++ b/lib/dpif-netdev-lookup-avx512-gather.c
@@ -415,15 +415,21 @@ dpcls_avx512_gather_mf_any(struct dpcls_subtable *subtable, uint32_t keys_map,
 dpcls_subtable_lookup_func
 dpcls_subtable_avx512_gather_probe(uint32_t u0_bits, uint32_t u1_bits)
 {
-    dpcls_subtable_lookup_func f = NULL;
-
     int avx512f_available = cpu_has_isa(OVS_CPU_ISA_X86_AVX512F);
     int bmi2_available = cpu_has_isa(OVS_CPU_ISA_X86_BMI2);
+
     if (!avx512f_available || !bmi2_available) {
         return NULL;
     }
+    return dpcls_subtable_avx512_gather_probe__(u0_bits, u1_bits);
+}
 
+
+dpcls_subtable_lookup_func
+dpcls_subtable_avx512_gather_probe__(uint32_t u0_bits, uint32_t u1_bits)
+{
     int use_vpop = cpu_has_isa(OVS_CPU_ISA_X86_VPOPCNTDQ);
+    dpcls_subtable_lookup_func f = NULL;
 
     CHECK_LOOKUP_FUNCTION(9, 4, use_vpop);
     CHECK_LOOKUP_FUNCTION(9, 1, use_vpop);
---

After that, the first function can be moved to lib/dpif-netdev-lookup.c.
(It's probably OK to leave the 'use_vpop' check inside, because we're
not building the whole file with that instruction set, only specific
functions.)

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/lib/automake.mk b/lib/automake.mk
index cb50578eb..f082eb75a 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -36,8 +36,6 @@  lib_libopenvswitchavx512_la_CFLAGS = \
 	-fPIC \
 	$(AM_CFLAGS)
 lib_libopenvswitchavx512_la_SOURCES = \
-	lib/cpu.c \
-	lib/cpu.h \
 	lib/dpif-netdev-avx512.c
 if HAVE_AVX512BW
 lib_libopenvswitchavx512_la_CFLAGS += \
@@ -98,6 +96,8 @@  lib_libopenvswitch_la_SOURCES = \
 	lib/csum.h \
 	lib/ct-dpif.c \
 	lib/ct-dpif.h \
+	lib/cpu.c \
+	lib/cpu.h \
 	lib/daemon.c \
 	lib/daemon.h \
 	lib/daemon-private.h \