diff mbox series

[ovs-dev,v7,2/4] mfex_avx512: Calculate miniflow_bits at compile time.

Message ID 20220506052326.3191931-3-kumar.amber@intel.com
State Superseded
Headers show
Series MFEX Optimizations IPv6 + Hashing Optimizations | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Kumar Amber May 6, 2022, 5:23 a.m. UTC
The patch removes magic numbers from miniflow_bits
and calculates the bits at compile time. This also
makes it easier to handle any ABI changes.

Signed-off-by: Kumar Amber <kumar.amber@intel.com>
---
 lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

Comments

Van Haaren, Harry May 25, 2022, 2:29 p.m. UTC | #1
> -----Original Message-----
> From: Amber, Kumar <kumar.amber@intel.com>
> Sent: Friday, May 6, 2022 6:23 AM
> To: ovs-dev@openvswitch.org
> Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org; Ferriter,
> Cian <cian.ferriter@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
> david.marchand@redhat.com; fbl@sysclose.org; Van Haaren, Harry
> <harry.van.haaren@intel.com>; Amber, Kumar <kumar.amber@intel.com>
> Subject: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile time.
> 
> The patch removes magic numbers from miniflow_bits
> and calculates the bits at compile time. This also
> makes it easier to handle any ABI changes.
> 
> Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> ---
>  lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++----
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
> index 6ae15a4db..e77bb3214 100644
> --- a/lib/dpif-netdev-extract-avx512.c
> +++ b/lib/dpif-netdev-extract-avx512.c
> @@ -256,6 +256,19 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64
> kmask, __m512i idx, __m512i a)
>  #define PKT_MIN_ETH_IPv4_TCP      (PKT_OFFSET_L4_IPv4 + TCP_HEADER_LEN)
>  #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 +
> TCP_HEADER_LEN)
> 
> +/* MF bits. */
> +#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) %         \
> +                       MAP_T_BITS))

Had a look to see if "flow.c" had any re-usable macro defines for this, but didn't spot any
as the macros in flow.c all operate on struct flowmap's or other more complex structs,
not a simple u64.

Given the result of this calculation is checked by autovalidation, we can have good
confidence in the correctness of the arithmetic here.

By calculating the miniflow bits at compile-time like this, the SIMD optimized implementation
becomes a lot more robust to "Miniflow ABI" breakages, as the implementation will re-adjust
bits as required at compile time (instead of requiring manual patching).

Nice work.

<snip>
Kumar Amber May 25, 2022, 6:11 p.m. UTC | #2
Hi Harry,

> -----Original Message-----
> From: Van Haaren, Harry <harry.van.haaren@intel.com>
> Sent: Wednesday, May 25, 2022 8:00 PM
> To: Amber, Kumar <kumar.amber@intel.com>; ovs-dev@openvswitch.org
> Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org;
> Ferriter, Cian <cian.ferriter@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
> david.marchand@redhat.com; fbl@sysclose.org
> Subject: RE: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile
> time.
> 
> > -----Original Message-----
> > From: Amber, Kumar <kumar.amber@intel.com>
> > Sent: Friday, May 6, 2022 6:23 AM
> > To: ovs-dev@openvswitch.org
> > Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org;
> > Ferriter, Cian <cian.ferriter@intel.com>; Stokes, Ian
> > <ian.stokes@intel.com>; david.marchand@redhat.com; fbl@sysclose.org;
> > Van Haaren, Harry <harry.van.haaren@intel.com>; Amber, Kumar
> > <kumar.amber@intel.com>
> > Subject: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile
> time.
> >
> > The patch removes magic numbers from miniflow_bits and calculates the
> > bits at compile time. This also makes it easier to handle any ABI
> > changes.
> >
> > Signed-off-by: Kumar Amber <kumar.amber@intel.com>
> > ---
> >  lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++----
> >  1 file changed, 17 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/dpif-netdev-extract-avx512.c
> > b/lib/dpif-netdev-extract-avx512.c
> > index 6ae15a4db..e77bb3214 100644
> > --- a/lib/dpif-netdev-extract-avx512.c
> > +++ b/lib/dpif-netdev-extract-avx512.c
> > @@ -256,6 +256,19 @@
> _mm512_maskz_permutexvar_epi8_wrap(__mmask64
> > kmask, __m512i idx, __m512i a)
> >  #define PKT_MIN_ETH_IPv4_TCP      (PKT_OFFSET_L4_IPv4 +
> TCP_HEADER_LEN)
> >  #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 +
> > TCP_HEADER_LEN)
> >
> > +/* MF bits. */
> > +#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) %         \
> > +                       MAP_T_BITS))
> 
> Had a look to see if "flow.c" had any re-usable macro defines for this, but
> didn't spot any as the macros in flow.c all operate on struct flowmap's or
> other more complex structs, not a simple u64.
> 
> Given the result of this calculation is checked by autovalidation, we can have
> good confidence in the correctness of the arithmetic here.
> 
> By calculating the miniflow bits at compile-time like this, the SIMD optimized
> implementation becomes a lot more robust to "Miniflow ABI" breakages, as
> the implementation will re-adjust bits as required at compile time (instead of
> requiring manual patching).
> 
> Nice work.
> 

Thanks, Harry, for the reviews .

Regards
Amber
> <snip>
diff mbox series

Patch

diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c
index 6ae15a4db..e77bb3214 100644
--- a/lib/dpif-netdev-extract-avx512.c
+++ b/lib/dpif-netdev-extract-avx512.c
@@ -256,6 +256,19 @@  _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a)
 #define PKT_MIN_ETH_IPv4_TCP      (PKT_OFFSET_L4_IPv4 + TCP_HEADER_LEN)
 #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 + TCP_HEADER_LEN)
 
+/* MF bits. */
+#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) %         \
+                       MAP_T_BITS))
+
+#define MF_ETH        (MF_BIT(dp_hash) | MF_BIT(in_port) | MF_BIT(packet_type)\
+                       | MF_BIT(dl_dst) | MF_BIT(dl_src)| MF_BIT(dl_type))
+
+#define MF_ETH_VLAN   (MF_ETH | MF_BIT(vlans))
+#define MF_IPV4_UDP   (MF_BIT(nw_src) | MF_BIT(ipv6_label) | MF_BIT(tp_src) | \
+                       MF_BIT(tp_dst))
+
+#define MF_IPV4_TCP   (MF_IPV4_UDP | MF_BIT(tcp_flags) | MF_BIT(arp_tha.ea[2]))
+
 /* This union allows initializing static data as u8, but easily loading it
  * into AVX512 registers too. The union ensures proper alignment for the zmm.
  */
@@ -353,7 +366,7 @@  static const struct mfex_profile mfex_profiles[PROFILE_COUNT] =
         .strip_mask.u8_data = { PATTERN_STRIP_IPV4_MASK },
         .store_kmsk = PATTERN_IPV4_UDP_KMASK,
 
-        .mf_bits = { 0x18a0000000000000, 0x0000000000040401},
+        .mf_bits = { MF_ETH, MF_IPV4_UDP},
         .dp_pkt_offs = {
             0, UINT16_MAX, PKT_OFFSET_L2, PKT_OFFSET_L4_IPv4,
         },
@@ -376,7 +389,7 @@  static const struct mfex_profile mfex_profiles[PROFILE_COUNT] =
         .strip_mask.u8_data = { PATTERN_STRIP_IPV4_MASK },
         .store_kmsk = PATTERN_IPV4_TCP_KMASK,
 
-        .mf_bits = { 0x18a0000000000000, 0x0000000000044401},
+        .mf_bits = { MF_ETH, MF_IPV4_TCP},
         .dp_pkt_offs = {
             0, UINT16_MAX, PKT_OFFSET_L2, PKT_OFFSET_L4_IPv4,
         },
@@ -395,7 +408,7 @@  static const struct mfex_profile mfex_profiles[PROFILE_COUNT] =
         .strip_mask.u8_data = { PATTERN_STRIP_DOT1Q_IPV4_MASK },
         .store_kmsk = PATTERN_DT1Q_IPV4_UDP_KMASK,
 
-        .mf_bits = { 0x38a0000000000000, 0x0000000000040401},
+        .mf_bits = { MF_ETH_VLAN, MF_IPV4_UDP},
         .dp_pkt_offs = {
             PKT_OFFSET_L2, UINT16_MAX, PKT_OFFSET_L3_VLAN,
             PKT_OFFSET_L4_VLAN_IPv4,
@@ -421,7 +434,7 @@  static const struct mfex_profile mfex_profiles[PROFILE_COUNT] =
         .strip_mask.u8_data = { PATTERN_STRIP_DOT1Q_IPV4_MASK },
         .store_kmsk = PATTERN_DT1Q_IPV4_TCP_KMASK,
 
-        .mf_bits = { 0x38a0000000000000, 0x0000000000044401},
+        .mf_bits = { MF_ETH_VLAN, MF_IPV4_TCP},
         .dp_pkt_offs = {
             PKT_OFFSET_L2, UINT16_MAX, PKT_OFFSET_L3_VLAN,
             PKT_OFFSET_L4_VLAN_IPv4,