Message ID | 20220506052326.3191931-3-kumar.amber@intel.com |
---|---|
State | Superseded |
Headers | show |
Series | MFEX Optimizations IPv6 + Hashing Optimizations | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
> -----Original Message----- > From: Amber, Kumar <kumar.amber@intel.com> > Sent: Friday, May 6, 2022 6:23 AM > To: ovs-dev@openvswitch.org > Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org; Ferriter, > Cian <cian.ferriter@intel.com>; Stokes, Ian <ian.stokes@intel.com>; > david.marchand@redhat.com; fbl@sysclose.org; Van Haaren, Harry > <harry.van.haaren@intel.com>; Amber, Kumar <kumar.amber@intel.com> > Subject: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile time. > > The patch removes magic numbers from miniflow_bits > and calculates the bits at compile time. This also > makes it easier to handle any ABI changes. > > Signed-off-by: Kumar Amber <kumar.amber@intel.com> > --- > lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++---- > 1 file changed, 17 insertions(+), 4 deletions(-) > > diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c > index 6ae15a4db..e77bb3214 100644 > --- a/lib/dpif-netdev-extract-avx512.c > +++ b/lib/dpif-netdev-extract-avx512.c > @@ -256,6 +256,19 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 > kmask, __m512i idx, __m512i a) > #define PKT_MIN_ETH_IPv4_TCP (PKT_OFFSET_L4_IPv4 + TCP_HEADER_LEN) > #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 + > TCP_HEADER_LEN) > > +/* MF bits. */ > +#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) % \ > + MAP_T_BITS)) Had a look to see if "flow.c" had any re-usable macro defines for this, but didn't spot any as the macros in flow.c all operate on struct flowmap's or other more complex structs, not a simple u64. Given the result of this calculation is checked by autovalidation, we can have good confidence in the correctness of the arithmetic here. By calculating the miniflow bits at compile-time like this, the SIMD optimized implementation becomes a lot more robust to "Miniflow ABI" breakages, as the implementation will re-adjust bits as required at compile time (instead of requiring manual patching). Nice work. <snip>
Hi Harry, > -----Original Message----- > From: Van Haaren, Harry <harry.van.haaren@intel.com> > Sent: Wednesday, May 25, 2022 8:00 PM > To: Amber, Kumar <kumar.amber@intel.com>; ovs-dev@openvswitch.org > Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org; > Ferriter, Cian <cian.ferriter@intel.com>; Stokes, Ian <ian.stokes@intel.com>; > david.marchand@redhat.com; fbl@sysclose.org > Subject: RE: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile > time. > > > -----Original Message----- > > From: Amber, Kumar <kumar.amber@intel.com> > > Sent: Friday, May 6, 2022 6:23 AM > > To: ovs-dev@openvswitch.org > > Cc: echaudro@redhat.com; ktraynor@redhat.com; i.maximets@ovn.org; > > Ferriter, Cian <cian.ferriter@intel.com>; Stokes, Ian > > <ian.stokes@intel.com>; david.marchand@redhat.com; fbl@sysclose.org; > > Van Haaren, Harry <harry.van.haaren@intel.com>; Amber, Kumar > > <kumar.amber@intel.com> > > Subject: [PATCH v7 2/4] mfex_avx512: Calculate miniflow_bits at compile > time. > > > > The patch removes magic numbers from miniflow_bits and calculates the > > bits at compile time. This also makes it easier to handle any ABI > > changes. > > > > Signed-off-by: Kumar Amber <kumar.amber@intel.com> > > --- > > lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++---- > > 1 file changed, 17 insertions(+), 4 deletions(-) > > > > diff --git a/lib/dpif-netdev-extract-avx512.c > > b/lib/dpif-netdev-extract-avx512.c > > index 6ae15a4db..e77bb3214 100644 > > --- a/lib/dpif-netdev-extract-avx512.c > > +++ b/lib/dpif-netdev-extract-avx512.c > > @@ -256,6 +256,19 @@ > _mm512_maskz_permutexvar_epi8_wrap(__mmask64 > > kmask, __m512i idx, __m512i a) > > #define PKT_MIN_ETH_IPv4_TCP (PKT_OFFSET_L4_IPv4 + > TCP_HEADER_LEN) > > #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 + > > TCP_HEADER_LEN) > > > > +/* MF bits. */ > > +#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) % \ > > + MAP_T_BITS)) > > Had a look to see if "flow.c" had any re-usable macro defines for this, but > didn't spot any as the macros in flow.c all operate on struct flowmap's or > other more complex structs, not a simple u64. > > Given the result of this calculation is checked by autovalidation, we can have > good confidence in the correctness of the arithmetic here. > > By calculating the miniflow bits at compile-time like this, the SIMD optimized > implementation becomes a lot more robust to "Miniflow ABI" breakages, as > the implementation will re-adjust bits as required at compile time (instead of > requiring manual patching). > > Nice work. > Thanks, Harry, for the reviews . Regards Amber > <snip>
diff --git a/lib/dpif-netdev-extract-avx512.c b/lib/dpif-netdev-extract-avx512.c index 6ae15a4db..e77bb3214 100644 --- a/lib/dpif-netdev-extract-avx512.c +++ b/lib/dpif-netdev-extract-avx512.c @@ -256,6 +256,19 @@ _mm512_maskz_permutexvar_epi8_wrap(__mmask64 kmask, __m512i idx, __m512i a) #define PKT_MIN_ETH_IPv4_TCP (PKT_OFFSET_L4_IPv4 + TCP_HEADER_LEN) #define PKT_MIN_ETH_VLAN_IPv4_TCP (PKT_OFFSET_L4_VLAN_IPv4 + TCP_HEADER_LEN) +/* MF bits. */ +#define MF_BIT(field) (MAP_1 << ((offsetof(struct flow, field) / 8) % \ + MAP_T_BITS)) + +#define MF_ETH (MF_BIT(dp_hash) | MF_BIT(in_port) | MF_BIT(packet_type)\ + | MF_BIT(dl_dst) | MF_BIT(dl_src)| MF_BIT(dl_type)) + +#define MF_ETH_VLAN (MF_ETH | MF_BIT(vlans)) +#define MF_IPV4_UDP (MF_BIT(nw_src) | MF_BIT(ipv6_label) | MF_BIT(tp_src) | \ + MF_BIT(tp_dst)) + +#define MF_IPV4_TCP (MF_IPV4_UDP | MF_BIT(tcp_flags) | MF_BIT(arp_tha.ea[2])) + /* This union allows initializing static data as u8, but easily loading it * into AVX512 registers too. The union ensures proper alignment for the zmm. */ @@ -353,7 +366,7 @@ static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = .strip_mask.u8_data = { PATTERN_STRIP_IPV4_MASK }, .store_kmsk = PATTERN_IPV4_UDP_KMASK, - .mf_bits = { 0x18a0000000000000, 0x0000000000040401}, + .mf_bits = { MF_ETH, MF_IPV4_UDP}, .dp_pkt_offs = { 0, UINT16_MAX, PKT_OFFSET_L2, PKT_OFFSET_L4_IPv4, }, @@ -376,7 +389,7 @@ static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = .strip_mask.u8_data = { PATTERN_STRIP_IPV4_MASK }, .store_kmsk = PATTERN_IPV4_TCP_KMASK, - .mf_bits = { 0x18a0000000000000, 0x0000000000044401}, + .mf_bits = { MF_ETH, MF_IPV4_TCP}, .dp_pkt_offs = { 0, UINT16_MAX, PKT_OFFSET_L2, PKT_OFFSET_L4_IPv4, }, @@ -395,7 +408,7 @@ static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = .strip_mask.u8_data = { PATTERN_STRIP_DOT1Q_IPV4_MASK }, .store_kmsk = PATTERN_DT1Q_IPV4_UDP_KMASK, - .mf_bits = { 0x38a0000000000000, 0x0000000000040401}, + .mf_bits = { MF_ETH_VLAN, MF_IPV4_UDP}, .dp_pkt_offs = { PKT_OFFSET_L2, UINT16_MAX, PKT_OFFSET_L3_VLAN, PKT_OFFSET_L4_VLAN_IPv4, @@ -421,7 +434,7 @@ static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = .strip_mask.u8_data = { PATTERN_STRIP_DOT1Q_IPV4_MASK }, .store_kmsk = PATTERN_DT1Q_IPV4_TCP_KMASK, - .mf_bits = { 0x38a0000000000000, 0x0000000000044401}, + .mf_bits = { MF_ETH_VLAN, MF_IPV4_TCP}, .dp_pkt_offs = { PKT_OFFSET_L2, UINT16_MAX, PKT_OFFSET_L3_VLAN, PKT_OFFSET_L4_VLAN_IPv4,
The patch removes magic numbers from miniflow_bits and calculates the bits at compile time. This also makes it easier to handle any ABI changes. Signed-off-by: Kumar Amber <kumar.amber@intel.com> --- lib/dpif-netdev-extract-avx512.c | 21 +++++++++++++++++---- 1 file changed, 17 insertions(+), 4 deletions(-)