diff mbox series

[ovs-dev] netdev-offload-dpdk: Fix for broken ethernet matching HWOL for XL710 NIC

Message ID 1596727030-120460-1-git-send-email-emma.finn@intel.com
State Changes Requested
Headers show
Series [ovs-dev] netdev-offload-dpdk: Fix for broken ethernet matching HWOL for XL710 NIC | expand

Commit Message

Emma Finn Aug. 6, 2020, 3:17 p.m. UTC
The following 2 commits introduced changes which caused a regression
for XL710 devices and functionality ceases for partial offload as a result.
864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type only.")
a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns function.")

Fixed by partial reversion of these changes.

Signed-off-by: Emma Finn <emma.finn@intel.com>
---
 lib/netdev-offload-dpdk.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Eli Britstein Aug. 6, 2020, 3:22 p.m. UTC | #1
On 8/6/2020 6:17 PM, Emma Finn wrote:
> The following 2 commits introduced changes which caused a regression
> for XL710 devices and functionality ceases for partial offload as a result.
> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type only.")
> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns function.")
OVS is vendor agnostic. That kind of workaround belongs in intel PMD in 
dpdk, not in OVS.
>
> Fixed by partial reversion of these changes.
>
> Signed-off-by: Emma Finn<emma.finn@intel.com>
Stokes, Ian Aug. 6, 2020, 5:28 p.m. UTC | #2
> On 8/6/2020 6:17 PM, Emma Finn wrote:
> > The following 2 commits introduced changes which caused a regression
> > for XL710 devices and functionality ceases for partial offload as a result.
> > 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type
> > only.")
> > a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns
> > function.")
> OVS is vendor agnostic. That kind of workaround belongs in intel PMD in dpdk,
> not in OVS.

Hi Eli, 

Yes OVS looks to be vendor agnostic, but this code I believe was already in place and working for this usecase. As such it's removal introduced a regression from an OVS point of view between the releases.

We have had examples in the past where workarounds are permissible if there is a clear path to fixing this in the future on the DPDK side (which is what I suggest here) (for example scatter gather support for MTUs in the past raised similar issue where we hand to handle specific NIC until the next DPDK LTS release).

So my suggestion is to re-instate the original workaround and remove its when fixed in the next DPDK LTS which supports the change for i40e at the PMD layer or if it's backported to the next 19.11 stable release which would be validated for use with OVS.

I've included the i40e DPDK maintainers here for their thoughts also.

@Beilei/Jia Is this something we can look at to introduce in the i40e PMD?

Regards
Ian

> >
> > Fixed by partial reversion of these changes.
> >
> > Signed-off-by: Emma Finn<emma.finn@intel.com>
Xing, Beilei Aug. 7, 2020, 2:05 a.m. UTC | #3
> -----Original Message-----
> From: Stokes, Ian <ian.stokes@intel.com>
> Sent: Friday, August 7, 2020 1:29 AM
> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma <emma.finn@intel.com>;
> dev@openvswitch.org; Xing, Beilei <beilei.xing@intel.com>; Guo, Jia
> <jia.guo@intel.com>
> Cc: i.maximets@ovn.org
> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
> HWOL for XL710 NIC
> 
> > On 8/6/2020 6:17 PM, Emma Finn wrote:
> > > The following 2 commits introduced changes which caused a regression
> > > for XL710 devices and functionality ceases for partial offload as a result.
> > > 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type
> > > only.")
> > > a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns
> > > function.")
> > OVS is vendor agnostic. That kind of workaround belongs in intel PMD
> > in dpdk, not in OVS.
> 
> Hi Eli,
> 
> Yes OVS looks to be vendor agnostic, but this code I believe was already in
> place and working for this usecase. As such it's removal introduced a
> regression from an OVS point of view between the releases.
> 
> We have had examples in the past where workarounds are permissible if there
> is a clear path to fixing this in the future on the DPDK side (which is what I
> suggest here) (for example scatter gather support for MTUs in the past raised
> similar issue where we hand to handle specific NIC until the next DPDK LTS
> release).
> 
> So my suggestion is to re-instate the original workaround and remove its when
> fixed in the next DPDK LTS which supports the change for i40e at the PMD
> layer or if it's backported to the next 19.11 stable release which would be
> validated for use with OVS.
> 
> I've included the i40e DPDK maintainers here for their thoughts also.
> 
> @Beilei/Jia Is this something we can look at to introduce in the i40e PMD?

Hi Ian,

Sorry I'm not very clear about what's the relationship between the three OVS patches
(864852a0624a, a79eae87abe4, this fix patch) and DPDK issue. What's the i40e issue?
Did you create a JIRA to track it? 

BR,
Beilei

> 
> Regards
> Ian
> 
> > >
> > > Fixed by partial reversion of these changes.
> > >
> > > Signed-off-by: Emma Finn<emma.finn@intel.com>
Eli Britstein Aug. 7, 2020, 4:55 a.m. UTC | #4
On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>> The following 2 commits introduced changes which caused a regression
>>> for XL710 devices and functionality ceases for partial offload as a result.
>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type
>>> only.")
>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns
>>> function.")
>> OVS is vendor agnostic. That kind of workaround belongs in intel PMD in dpdk,
>> not in OVS.
> Hi Eli,
>
> Yes OVS looks to be vendor agnostic, but this code I believe was already in place and working for this usecase. As such it's removal introduced a regression from an OVS point of view between the releases.
>
> We have had examples in the past where workarounds are permissible if there is a clear path to fixing this in the future on the DPDK side (which is what I suggest here) (for example scatter gather support for MTUs in the past raised similar issue where we hand to handle specific NIC until the next DPDK LTS release).
>
> So my suggestion is to re-instate the original workaround and remove its when fixed in the next DPDK LTS which supports the change for i40e at the PMD layer or if it's backported to the next 19.11 stable release which would be validated for use with OVS.

Hi,

There was a bug with this WA.

Please see 
https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-28632-1-git-send-email-JackerDune@gmail.com/.

Is it possible to address it in DPDK instead of reverting in OVS and 
later re-reverting?

Thanks,

Eli

>
> I've included the i40e DPDK maintainers here for their thoughts also.
>
> @Beilei/Jia Is this something we can look at to introduce in the i40e PMD?
>
> Regards
> Ian
>
>>> Fixed by partial reversion of these changes.
>>>
>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
Eli Britstein Aug. 9, 2020, 6:04 a.m. UTC | #5
On 8/7/2020 7:55 AM, Eli Britstein wrote:
>
> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>> The following 2 commits introduced changes which caused a regression
>>>> for XL710 devices and functionality ceases for partial offload as a 
>>>> result.
>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type
>>>> only.")
>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns
>>>> function.")
>>> OVS is vendor agnostic. That kind of workaround belongs in intel PMD 
>>> in dpdk,
>>> not in OVS.
>> Hi Eli,
>>
>> Yes OVS looks to be vendor agnostic, but this code I believe was 
>> already in place and working for this usecase. As such it's removal 
>> introduced a regression from an OVS point of view between the releases.
>>
>> We have had examples in the past where workarounds are permissible if 
>> there is a clear path to fixing this in the future on the DPDK side 
>> (which is what I suggest here) (for example scatter gather support 
>> for MTUs in the past raised similar issue where we hand to handle 
>> specific NIC until the next DPDK LTS release).
>>
>> So my suggestion is to re-instate the original workaround and remove 
>> its when fixed in the next DPDK LTS which supports the change for 
>> i40e at the PMD layer or if it's backported to the next 19.11 stable 
>> release which would be validated for use with OVS.
>
> Hi,
>
> There was a bug with this WA.
>
> Please see 
> https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-28632-1-git-send-email-JackerDune@gmail.com/.
>
> Is it possible to address it in DPDK instead of reverting in OVS and 
> later re-reverting?
>
> Thanks,
>
> Eli
>
>>
>> I've included the i40e DPDK maintainers here for their thoughts also.
>>
>> @Beilei/Jia Is this something we can look at to introduce in the i40e 
>> PMD?

Hello,

Please take a look at:

http://mails.dpdk.org/archives/dev/2020-May/166272.html

For MLX it was only an optimization. For i40e something similar may be a 
workaround for this issue.

Thanks,

Eli

>>
>> Regards
>> Ian
>>
>>>> Fixed by partial reversion of these changes.
>>>>
>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
Stokes, Ian Aug. 10, 2020, 8:52 p.m. UTC | #6
> On 8/7/2020 7:55 AM, Eli Britstein wrote:
> >
> > On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> >>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> >>>> The following 2 commits introduced changes which caused a
> >>>> regression for XL710 devices and functionality ceases for partial
> >>>> offload as a result.
> >>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type
> >>>> only.")
> >>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
> patterns
> >>>> function.")
> >>> OVS is vendor agnostic. That kind of workaround belongs in intel PMD
> >>> in dpdk, not in OVS.
> >> Hi Eli,
> >>
> >> Yes OVS looks to be vendor agnostic, but this code I believe was
> >> already in place and working for this usecase. As such it's removal
> >> introduced a regression from an OVS point of view between the releases.
> >>
> >> We have had examples in the past where workarounds are permissible if
> >> there is a clear path to fixing this in the future on the DPDK side
> >> (which is what I suggest here) (for example scatter gather support
> >> for MTUs in the past raised similar issue where we hand to handle
> >> specific NIC until the next DPDK LTS release).
> >>
> >> So my suggestion is to re-instate the original workaround and remove
> >> its when fixed in the next DPDK LTS which supports the change for
> >> i40e at the PMD layer or if it's backported to the next 19.11 stable
> >> release which would be validated for use with OVS.
> >
> > Hi,
> >
> > There was a bug with this WA.
> >
> > Please see
> > https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-
> 28632-1-git-send-email-JackerDune@gmail.com/.
> >
> > Is it possible to address it in DPDK instead of reverting in OVS and
> > later re-reverting?
> >
> > Thanks,
> >
> > Eli
> >
> >>
> >> I've included the i40e DPDK maintainers here for their thoughts also.
> >>
> >> @Beilei/Jia Is this something we can look at to introduce in the i40e
> >> PMD?
> 
> Hello,
> 
> Please take a look at:
> 
> http://mails.dpdk.org/archives/dev/2020-May/166272.html
> 
> For MLX it was only an optimization. For i40e something similar may be a
> workaround for this issue.

Thanks for this Eli, let me sync with Beilei on this.

If it's something we can resolve in the PMD then I think we can add an errata or known issue for the 2.14 release (and possibly the 2.13 release as I think the same issue is present there).

If it was fixed in the future we could then remove the issue notice.

Regards
Ian
> 
> Thanks,
> 
> Eli
> 
> >>
> >> Regards
> >> Ian
> >>
> >>>> Fixed by partial reversion of these changes.
> >>>>
> >>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
Xing, Beilei Aug. 11, 2020, 3:35 a.m. UTC | #7
> -----Original Message-----
> From: Stokes, Ian <ian.stokes@intel.com>
> Sent: Tuesday, August 11, 2020 4:52 AM
> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma <emma.finn@intel.com>;
> dev@openvswitch.org; Xing, Beilei <beilei.xing@intel.com>; Guo, Jia
> <jia.guo@intel.com>
> Cc: i.maximets@ovn.org
> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
> HWOL for XL710 NIC
> 
> > On 8/7/2020 7:55 AM, Eli Britstein wrote:
> > >
> > > On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> > >>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> > >>>> The following 2 commits introduced changes which caused a
> > >>>> regression for XL710 devices and functionality ceases for partial
> > >>>> offload as a result.
> > >>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
> > >>>> type
> > >>>> only.")
> > >>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
> > patterns
> > >>>> function.")
> > >>> OVS is vendor agnostic. That kind of workaround belongs in intel
> > >>> PMD in dpdk, not in OVS.
> > >> Hi Eli,
> > >>
> > >> Yes OVS looks to be vendor agnostic, but this code I believe was
> > >> already in place and working for this usecase. As such it's removal
> > >> introduced a regression from an OVS point of view between the releases.
> > >>
> > >> We have had examples in the past where workarounds are permissible
> > >> if there is a clear path to fixing this in the future on the DPDK
> > >> side (which is what I suggest here) (for example scatter gather
> > >> support for MTUs in the past raised similar issue where we hand to
> > >> handle specific NIC until the next DPDK LTS release).
> > >>
> > >> So my suggestion is to re-instate the original workaround and
> > >> remove its when fixed in the next DPDK LTS which supports the
> > >> change for i40e at the PMD layer or if it's backported to the next
> > >> 19.11 stable release which would be validated for use with OVS.
> > >
> > > Hi,
> > >
> > > There was a bug with this WA.
> > >
> > > Please see
> > > https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-
> > 28632-1-git-send-email-JackerDune@gmail.com/.
> > >
> > > Is it possible to address it in DPDK instead of reverting in OVS and
> > > later re-reverting?
> > >
> > > Thanks,
> > >
> > > Eli
> > >
> > >>
> > >> I've included the i40e DPDK maintainers here for their thoughts also.
> > >>
> > >> @Beilei/Jia Is this something we can look at to introduce in the
> > >> i40e PMD?
> >
> > Hello,
> >
> > Please take a look at:
> >
> > http://mails.dpdk.org/archives/dev/2020-May/166272.html
> >
> > For MLX it was only an optimization. For i40e something similar may be
> > a workaround for this issue.
> 
> Thanks for this Eli, let me sync with Beilei on this.
> 
> If it's something we can resolve in the PMD then I think we can add an errata
> or known issue for the 2.14 release (and possibly the 2.13 release as I think the
> same issue is present there).
> 
> If it was fixed in the future we could then remove the issue notice.


Hi,

According to the OVS patch and mlx DPDK patch, is the requirement to
support rte flow pattern like 'pattern IPv4 / UDP src is 32 / end', no need
to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
please correct me if I'm wrong.

If yes, could you please tell me how OVS adds a flow which doesn’t
include ETH item? I'm not very familiar with OVS, I run some simple
commands before, and find eth will always exist. E.g:
flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224 skb_priority(0),skb_mark(0),\
ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),\
dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:80,dst=ff:ff:ff:ff:ff:ff),
eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=0x10,ttl=128,frag=no),
udp(src=68,dst=67), actions:drop

Thanks,
Beilei

> 
> Regards
> Ian
> >
> > Thanks,
> >
> > Eli
> >
> > >>
> > >> Regards
> > >> Ian
> > >>
> > >>>> Fixed by partial reversion of these changes.
> > >>>>
> > >>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
Ilya Maximets Aug. 11, 2020, 10:01 a.m. UTC | #8
On 8/11/20 5:35 AM, Xing, Beilei wrote:
> 
> 
>> -----Original Message-----
>> From: Stokes, Ian <ian.stokes@intel.com>
>> Sent: Tuesday, August 11, 2020 4:52 AM
>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma <emma.finn@intel.com>;
>> dev@openvswitch.org; Xing, Beilei <beilei.xing@intel.com>; Guo, Jia
>> <jia.guo@intel.com>
>> Cc: i.maximets@ovn.org
>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>> HWOL for XL710 NIC
>>
>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>
>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>> The following 2 commits introduced changes which caused a
>>>>>>> regression for XL710 devices and functionality ceases for partial
>>>>>>> offload as a result.
>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
>>>>>>> type
>>>>>>> only.")
>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
>>> patterns
>>>>>>> function.")
>>>>>> OVS is vendor agnostic. That kind of workaround belongs in intel
>>>>>> PMD in dpdk, not in OVS.
>>>>> Hi Eli,
>>>>>
>>>>> Yes OVS looks to be vendor agnostic, but this code I believe was
>>>>> already in place and working for this usecase. As such it's removal
>>>>> introduced a regression from an OVS point of view between the releases.
>>>>>
>>>>> We have had examples in the past where workarounds are permissible
>>>>> if there is a clear path to fixing this in the future on the DPDK
>>>>> side (which is what I suggest here) (for example scatter gather
>>>>> support for MTUs in the past raised similar issue where we hand to
>>>>> handle specific NIC until the next DPDK LTS release).
>>>>>
>>>>> So my suggestion is to re-instate the original workaround and
>>>>> remove its when fixed in the next DPDK LTS which supports the
>>>>> change for i40e at the PMD layer or if it's backported to the next
>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>
>>>> Hi,
>>>>
>>>> There was a bug with this WA.
>>>>
>>>> Please see
>>>> https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-
>>> 28632-1-git-send-email-JackerDune@gmail.com/.

I'm worried about this bug.  Current version of a workaround is not correct
from the OVS point of view since it wildcards dl_type during offloading that
is not expected by higher layers, causing HW flow being much more wide than
requested.  In case we going to have this workaround for xl710, we should
have another workaround for this issue too.


>>>>
>>>> Is it possible to address it in DPDK instead of reverting in OVS and
>>>> later re-reverting?
>>>>
>>>> Thanks,
>>>>
>>>> Eli
>>>>
>>>>>
>>>>> I've included the i40e DPDK maintainers here for their thoughts also.
>>>>>
>>>>> @Beilei/Jia Is this something we can look at to introduce in the
>>>>> i40e PMD?
>>>
>>> Hello,
>>>
>>> Please take a look at:
>>>
>>> http://mails.dpdk.org/archives/dev/2020-May/166272.html
>>>
>>> For MLX it was only an optimization. For i40e something similar may be
>>> a workaround for this issue.
>>
>> Thanks for this Eli, let me sync with Beilei on this.
>>
>> If it's something we can resolve in the PMD then I think we can add an errata
>> or known issue for the 2.14 release (and possibly the 2.13 release as I think the
>> same issue is present there).
>>
>> If it was fixed in the future we could then remove the issue notice.> 
> 
> Hi,
> 
> According to the OVS patch and mlx DPDK patch, is the requirement to
> support rte flow pattern like 'pattern IPv4 / UDP src is 32 / end', no need
> to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
> please correct me if I'm wrong.
> 
> If yes, could you please tell me how OVS adds a flow which doesn’t
> include ETH item? I'm not very familiar with OVS, I run some simple
> commands before, and find eth will always exist. E.g:
> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224 skb_priority(0),skb_mark(0),\
> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),ct_label(0),recirc_id(0),\
> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:80,dst=ff:ff:ff:ff:ff:ff),
> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=0x10,ttl=128,frag=no),
> udp(src=68,dst=67), actions:drop

Emma, Ian, could you please provide the pattern that fails to offload on
current OVS master so it will be easier for everyone to understand the issue.
It's not obvious how to construct the bad pattern by only looking at the i40e
pmd code.  Also, please, enable debug logs and provide the testpmd-style
arguments constructed by OVS.

It might be also good to have all this information in the commit message.

Thanks.

Best regards, Ilya Maximets.

> 
> Thanks,
> Beilei
> 
>>
>> Regards
>> Ian
>>>
>>> Thanks,
>>>
>>> Eli
>>>
>>>>>
>>>>> Regards
>>>>> Ian
>>>>>
>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>
>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>
Ilya Maximets Aug. 11, 2020, 11:51 a.m. UTC | #9
On 8/6/20 5:17 PM, Emma Finn wrote:
> The following 2 commits introduced changes which caused a regression
> for XL710 devices and functionality ceases for partial offload as a result.
> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for type only.")
> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of patterns function.")
> 
> Fixed by partial reversion of these changes.
> 
> Signed-off-by: Emma Finn <emma.finn@intel.com>
> ---
>  lib/netdev-offload-dpdk.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> index de6101e..b300884 100644
> --- a/lib/netdev-offload-dpdk.c
> +++ b/lib/netdev-offload-dpdk.c
> @@ -691,8 +691,7 @@ parse_flow_match(struct flow_patterns *patterns,
>      consumed_masks->packet_type = 0;
>  
>      /* Eth */
> -    if (match->wc.masks.dl_type ||
> -        !eth_addr_is_zero(match->wc.masks.dl_src) ||
> +    if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>          struct rte_flow_item_eth *spec, *mask;
>  
> @@ -712,6 +711,16 @@ parse_flow_match(struct flow_patterns *patterns,
>          consumed_masks->dl_type = 0;
>  
>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
> +    } else {
> +        /*
> +        * If user specifies a flow (like UDP flow) without L2 patterns,
> +        * OVS will at least set the dl_type. Normally, it's enough to
> +        * create an eth pattern just with it. Unluckily, some Intel's
> +        * NIC (such as XL710) doesn't support that. Below is a workaround,
> +        * which simply matches any L2 pkts.
> +        */
> +        consumed_masks->dl_type = 0;
> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
>      }
>  
>      /* VLAN */
> 


What about alternative workaround (not even compiled):

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 44ebf96da..dfb803c98 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -5198,6 +5198,22 @@ netdev_dpdk_rte_flow_create(struct netdev *netdev,
     return flow;
 }
 
+struct int
+netdev_dpdk_rte_flow_validate(struct netdev *netdev,
+                              const struct rte_flow_attr *attr,
+                              const struct rte_flow_item *items,
+                              const struct rte_flow_action *actions,
+                              struct rte_flow_error *error)
+{
+    struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
+    int res;
+
+    ovs_mutex_lock(&dev->mutex);
+    res = rte_flow_validate(dev->port_id, attr, items, actions, error);
+    ovs_mutex_unlock(&dev->mutex);
+    return res;
+}
+
 int
 netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
                                  struct rte_flow *rte_flow,
diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h
index 848346cb4..49e3d2bfa 100644
--- a/lib/netdev-dpdk.h
+++ b/lib/netdev-dpdk.h
@@ -48,6 +48,13 @@ netdev_dpdk_rte_flow_create(struct netdev *netdev,
                             const struct rte_flow_item *items,
                             const struct rte_flow_action *actions,
                             struct rte_flow_error *error);
+
+struct int
+netdev_dpdk_rte_flow_validate(struct netdev *netdev,
+                              const struct rte_flow_attr *attr,
+                              const struct rte_flow_item *items,
+                              const struct rte_flow_action *actions,
+                              struct rte_flow_error *error);
 int
 netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
                                  struct rte_flow *rte_flow,
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index de6101e4d..5df1d06de 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -553,6 +553,59 @@ dump_flow(struct ds *s, struct ds *s_extra,
     return s;
 }
 
+/* XXX:  Comment why this function needed and what it does. */
+static struct flow *
+try_workaround_eth_pattern__(struct netdev *netdev,
+                             const struct rte_flow_attr *attr,
+                             const struct rte_flow_item *items,
+                             const struct rte_flow_action *actions,
+                             struct rte_flow_error *error)
+{
+    struct rte_flow_item_eth *mask  = items[0].mask;
+    if (items[0].type != RTE_FLOW_ITEM_TYPE_ETH
+        || mask->type != 0xffff
+        || !eth_addr_is_zero(mask->src)
+        || !eth_addr_is_zero(mask->dst)) {
+        return NULL;
+    }
+
+    struct rte_flow_item_eth *eth_spec = xzalloc(sizeof *eth_spec);
+    struct rte_flow_item_eth *eth_mask = xzalloc(sizeof *eth_mask);
+    const struct rte_flow_attr dummy_attr = { .ingress = 1, };
+    const struct rte_flow_action dummy_actions[] = {
+        { .type = RTE_FLOW_ACTION_TYPE_DROP, },
+        { .type = RTE_FLOW_ACTION_TYPE_END,  },
+    };
+    const struct rte_flow_item bad_items[] = {
+        {
+            .type = RTE_FLOW_ITEM_TYPE_ETH,
+            .spec = eth_spec,
+            .mask = eth_mask,
+        },
+        /* XXX: Add extra items to trigger the issue here. */
+    };
+    bool may_need_workaround = false;
+    static struct flow *flow = NULL;
+
+    eth_spec->type = htons(ETH_TYPE_IP);
+    eth_mask->type = 0xffff;
+
+    if (netdev_dpdk_rte_flow_validate(netdev, &dummy_attr, bad_items,
+                                      dummy_actions, NULL) == /* XXX: Error type */) {
+        /* Seems PMD doesn't support ETH pattern with only 'type' set. */
+        free(items[0].spec);
+        free(items[0].mask);
+        items[0].spec = NULL;
+        items[0].mask = NULL;
+        flow = netdev_dpdk_rte_flow_create(netdev, attr, items,
+                                           actions, error);
+    }
+    free(eth_spec);
+    free(eth_mask);
+
+    return flow;
+}
+
 static struct rte_flow *
 netdev_offload_dpdk_flow_create(struct netdev *netdev,
                                 const struct rte_flow_attr *attr,
@@ -566,6 +619,10 @@ netdev_offload_dpdk_flow_create(struct netdev *netdev,
     char *extra_str;
 
     flow = netdev_dpdk_rte_flow_create(netdev, attr, items, actions, error);
+    if (!flow && error->type == /* XXX: Put correct error type here */ ) {
+        flow = try_workaround_eth_pattern__(netdev, attr, items,
+                                            actions, error);
+    }
     if (flow) {
         if (!VLOG_DROP_DBG(&rl)) {
             dump_flow(&s, &s_extra, attr, items, actions);
---

This should re-introduce the workaround for XL710 while keeping other NICs that
correctly supports ETH paterns untouched, i.e. avoid bugs with too wide matching
pattern for them.  With a small additional overhead for the failed flows.  A bit
higher overhead for XL710, but better than not having offloading at all.  Overhead
will be removed along with the workaround once DPDK i40e PMD fixed.

Thoughts?


For the implementation, we could try to remember which netdev requires a workaround,
but this will require more changes.


Best regards, Ilya Maximets.
Emma Finn Aug. 11, 2020, 1:12 p.m. UTC | #10
> -----Original Message-----
> From: Ilya Maximets <i.maximets@ovn.org>
> Sent: Tuesday 11 August 2020 11:02
> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
> Eli Britstein <elibr@mellanox.com>; Finn, Emma <emma.finn@intel.com>;
> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> Cc: i.maximets@ovn.org
> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
> HWOL for XL710 NIC
> 
> On 8/11/20 5:35 AM, Xing, Beilei wrote:
> >
> >
> >> -----Original Message-----
> >> From: Stokes, Ian <ian.stokes@intel.com>
> >> Sent: Tuesday, August 11, 2020 4:52 AM
> >> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
> >> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
> >> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
> >> Cc: i.maximets@ovn.org
> >> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >> matching HWOL for XL710 NIC
> >>
> >>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
> >>>>
> >>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> >>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> >>>>>>> The following 2 commits introduced changes which caused a
> >>>>>>> regression for XL710 devices and functionality ceases for
> >>>>>>> partial offload as a result.
> >>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
> >>>>>>> type
> >>>>>>> only.")
> >>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
> >>> patterns
> >>>>>>> function.")
> >>>>>> OVS is vendor agnostic. That kind of workaround belongs in intel
> >>>>>> PMD in dpdk, not in OVS.
> >>>>> Hi Eli,
> >>>>>
> >>>>> Yes OVS looks to be vendor agnostic, but this code I believe was
> >>>>> already in place and working for this usecase. As such it's
> >>>>> removal introduced a regression from an OVS point of view between
> the releases.
> >>>>>
> >>>>> We have had examples in the past where workarounds are
> permissible
> >>>>> if there is a clear path to fixing this in the future on the DPDK
> >>>>> side (which is what I suggest here) (for example scatter gather
> >>>>> support for MTUs in the past raised similar issue where we hand to
> >>>>> handle specific NIC until the next DPDK LTS release).
> >>>>>
> >>>>> So my suggestion is to re-instate the original workaround and
> >>>>> remove its when fixed in the next DPDK LTS which supports the
> >>>>> change for i40e at the PMD layer or if it's backported to the next
> >>>>> 19.11 stable release which would be validated for use with OVS.
> >>>>
> >>>> Hi,
> >>>>
> >>>> There was a bug with this WA.
> >>>>
> >>>> Please see
> >>>> https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-
> >>> 28632-1-git-send-email-JackerDune@gmail.com/.
> 
> I'm worried about this bug.  Current version of a workaround is not correct
> from the OVS point of view since it wildcards dl_type during offloading that is
> not expected by higher layers, causing HW flow being much more wide than
> requested.  In case we going to have this workaround for xl710, we should
> have another workaround for this issue too.
> 
> 
> >>>>
> >>>> Is it possible to address it in DPDK instead of reverting in OVS
> >>>> and later re-reverting?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Eli
> >>>>
> >>>>>
> >>>>> I've included the i40e DPDK maintainers here for their thoughts also.
> >>>>>
> >>>>> @Beilei/Jia Is this something we can look at to introduce in the
> >>>>> i40e PMD?
> >>>
> >>> Hello,
> >>>
> >>> Please take a look at:
> >>>
> >>> http://mails.dpdk.org/archives/dev/2020-May/166272.html
> >>>
> >>> For MLX it was only an optimization. For i40e something similar may
> >>> be a workaround for this issue.
> >>
> >> Thanks for this Eli, let me sync with Beilei on this.
> >>
> >> If it's something we can resolve in the PMD then I think we can add
> >> an errata or known issue for the 2.14 release (and possibly the 2.13
> >> release as I think the same issue is present there).
> >>
> >> If it was fixed in the future we could then remove the issue notice.>
> >
> > Hi,
> >
> > According to the OVS patch and mlx DPDK patch, is the requirement to
> > support rte flow pattern like 'pattern IPv4 / UDP src is 32 / end', no
> > need to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
> > please correct me if I'm wrong.
> >
> > If yes, could you please tell me how OVS adds a flow which doesn’t
> > include ETH item? I'm not very familiar with OVS, I run some simple
> > commands before, and find eth will always exist. E.g:
> > flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
> > skb_priority(0),skb_mark(0),\
> > ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),ct_
> > label(0),recirc_id(0),\
> > dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:80
> > ,dst=ff:ff:ff:ff:ff:ff),
> > eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=0x1
> > 0,ttl=128,frag=no),
> > udp(src=68,dst=67), actions:drop
> 
> Emma, Ian, could you please provide the pattern that fails to offload on
> current OVS master so it will be easier for everyone to understand the issue.
> It's not obvious how to construct the bad pattern by only looking at the i40e
> pmd code.  Also, please, enable debug logs and provide the testpmd-style
> arguments constructed by OVS.
> 
> It might be also good to have all this information in the commit message.
> 

Following flow pattern is failing from logs:

  Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1
rte flow eth pattern:
  Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
  Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
rte flow ipv4 pattern:
  Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
  Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
rte flow count action:
  Count: shared=0, id=0
rte flow port-id action:
  Port-id: original=0, id=1
2020-08-11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WARN|dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).

Or from dump-flows :

flow-dump from pmd on cpu core: 23
ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_type(ns=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.0/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0), packets:42575487, bytes:2554529220, used:0.000s, offloaded:partial, dp:ovs, actions:dpdk1, dp-extra-info:miniflow_bits(4,2)


with regards to " provide the testpmd-style arguments constructed by OVS."
Can you confirm what you mean?

Thanks, 
Emma

> Thanks.
> 
> Best regards, Ilya Maximets.
> 
> >
> > Thanks,
> > Beilei
> >
> >>
> >> Regards
> >> Ian
> >>>
> >>> Thanks,
> >>>
> >>> Eli
> >>>
> >>>>>
> >>>>> Regards
> >>>>> Ian
> >>>>>
> >>>>>>> Fixed by partial reversion of these changes.
> >>>>>>>
> >>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
> >
Ilya Maximets Aug. 11, 2020, 1:18 p.m. UTC | #11
On 8/11/20 3:12 PM, Finn, Emma wrote:
> 
> 
>> -----Original Message-----
>> From: Ilya Maximets <i.maximets@ovn.org>
>> Sent: Tuesday 11 August 2020 11:02
>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
>> Eli Britstein <elibr@mellanox.com>; Finn, Emma <emma.finn@intel.com>;
>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>> Cc: i.maximets@ovn.org
>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>> HWOL for XL710 NIC
>>
>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Stokes, Ian <ian.stokes@intel.com>
>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>>>> Cc: i.maximets@ovn.org
>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>> matching HWOL for XL710 NIC
>>>>
>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>>>
>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>>>> The following 2 commits introduced changes which caused a
>>>>>>>>> regression for XL710 devices and functionality ceases for
>>>>>>>>> partial offload as a result.
>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
>>>>>>>>> type
>>>>>>>>> only.")
>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
>>>>> patterns
>>>>>>>>> function.")
>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in intel
>>>>>>>> PMD in dpdk, not in OVS.
>>>>>>> Hi Eli,
>>>>>>>
>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe was
>>>>>>> already in place and working for this usecase. As such it's
>>>>>>> removal introduced a regression from an OVS point of view between
>> the releases.
>>>>>>>
>>>>>>> We have had examples in the past where workarounds are
>> permissible
>>>>>>> if there is a clear path to fixing this in the future on the DPDK
>>>>>>> side (which is what I suggest here) (for example scatter gather
>>>>>>> support for MTUs in the past raised similar issue where we hand to
>>>>>>> handle specific NIC until the next DPDK LTS release).
>>>>>>>
>>>>>>> So my suggestion is to re-instate the original workaround and
>>>>>>> remove its when fixed in the next DPDK LTS which supports the
>>>>>>> change for i40e at the PMD layer or if it's backported to the next
>>>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> There was a bug with this WA.
>>>>>>
>>>>>> Please see
>>>>>> https://patchwork.ozlabs.org/project/openvswitch/patch/1587180266-
>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>>
>> I'm worried about this bug.  Current version of a workaround is not correct
>> from the OVS point of view since it wildcards dl_type during offloading that is
>> not expected by higher layers, causing HW flow being much more wide than
>> requested.  In case we going to have this workaround for xl710, we should
>> have another workaround for this issue too.
>>
>>
>>>>>>
>>>>>> Is it possible to address it in DPDK instead of reverting in OVS
>>>>>> and later re-reverting?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Eli
>>>>>>
>>>>>>>
>>>>>>> I've included the i40e DPDK maintainers here for their thoughts also.
>>>>>>>
>>>>>>> @Beilei/Jia Is this something we can look at to introduce in the
>>>>>>> i40e PMD?
>>>>>
>>>>> Hello,
>>>>>
>>>>> Please take a look at:
>>>>>
>>>>> http://mails.dpdk.org/archives/dev/2020-May/166272.html
>>>>>
>>>>> For MLX it was only an optimization. For i40e something similar may
>>>>> be a workaround for this issue.
>>>>
>>>> Thanks for this Eli, let me sync with Beilei on this.
>>>>
>>>> If it's something we can resolve in the PMD then I think we can add
>>>> an errata or known issue for the 2.14 release (and possibly the 2.13
>>>> release as I think the same issue is present there).
>>>>
>>>> If it was fixed in the future we could then remove the issue notice.>
>>>
>>> Hi,
>>>
>>> According to the OVS patch and mlx DPDK patch, is the requirement to
>>> support rte flow pattern like 'pattern IPv4 / UDP src is 32 / end', no
>>> need to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
>>> please correct me if I'm wrong.
>>>
>>> If yes, could you please tell me how OVS adds a flow which doesn’t
>>> include ETH item? I'm not very familiar with OVS, I run some simple
>>> commands before, and find eth will always exist. E.g:
>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>>> skb_priority(0),skb_mark(0),\
>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),ct_
>>> label(0),recirc_id(0),\
>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:80
>>> ,dst=ff:ff:ff:ff:ff:ff),
>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=0x1
>>> 0,ttl=128,frag=no),
>>> udp(src=68,dst=67), actions:drop
>>
>> Emma, Ian, could you please provide the pattern that fails to offload on
>> current OVS master so it will be easier for everyone to understand the issue.
>> It's not obvious how to construct the bad pattern by only looking at the i40e
>> pmd code.  Also, please, enable debug logs and provide the testpmd-style
>> arguments constructed by OVS.
>>
>> It might be also good to have all this information in the commit message.
>>
> 
> Following flow pattern is failing from logs:
> 
>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1
> rte flow eth pattern:
>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
> rte flow ipv4 pattern:
>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
> rte flow count action:
>   Count: shared=0, id=0
> rte flow port-id action:
>   Port-id: original=0, id=1
> 2020-08-11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WARN|dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
> 
> Or from dump-flows :
> 
> flow-dump from pmd on cpu core: 23
> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0, skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_type(ns=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.0/0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0), packets:42575487, bytes:2554529220, used:0.000s, offloaded:partial, dp:ovs, actions:dpdk1, dp-extra-info:miniflow_bits(4,2)
> 
> 
> with regards to " provide the testpmd-style arguments constructed by OVS."
> Can you confirm what you mean?

We changed the way of logging in a following commit:
d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow create/destroy.")

It's on master and branch-2.14.

> 
> Thanks, 
> Emma
> 
>> Thanks.
>>
>> Best regards, Ilya Maximets.
>>
>>>
>>> Thanks,
>>> Beilei
>>>
>>>>
>>>> Regards
>>>> Ian
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Eli
>>>>>
>>>>>>>
>>>>>>> Regards
>>>>>>> Ian
>>>>>>>
>>>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>>>
>
Eli Britstein Aug. 11, 2020, 3:48 p.m. UTC | #12
>-----Original Message-----
>From: Ilya Maximets <i.maximets@ovn.org>
>Sent: Tuesday, August 11, 2020 4:19 PM
>To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets <i.maximets@ovn.org>;
>Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian <ian.stokes@intel.com>; Eli
>Britstein <elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia
><jia.guo@intel.com>
>Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>HWOL for XL710 NIC
>
>On 8/11/20 3:12 PM, Finn, Emma wrote:
>>
>>
>>> -----Original Message-----
>>> From: Ilya Maximets <i.maximets@ovn.org>
>>> Sent: Tuesday 11 August 2020 11:02
>>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
>>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
>>> <jia.guo@intel.com>
>>> Cc: i.maximets@ovn.org
>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>> matching HWOL for XL710 NIC
>>>
>>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Stokes, Ian <ian.stokes@intel.com>
>>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>>>>> Cc: i.maximets@ovn.org
>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>> matching HWOL for XL710 NIC
>>>>>
>>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>>>>
>>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>>>>> The following 2 commits introduced changes which caused a
>>>>>>>>>> regression for XL710 devices and functionality ceases for
>>>>>>>>>> partial offload as a result.
>>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
>>>>>>>>>> type
>>>>>>>>>> only.")
>>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
>>>>>> patterns
>>>>>>>>>> function.")
>>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
>>>>>>>>> intel PMD in dpdk, not in OVS.
>>>>>>>> Hi Eli,
>>>>>>>>
>>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe was
>>>>>>>> already in place and working for this usecase. As such it's
>>>>>>>> removal introduced a regression from an OVS point of view
>>>>>>>> between
>>> the releases.
>>>>>>>>
>>>>>>>> We have had examples in the past where workarounds are
>>> permissible
>>>>>>>> if there is a clear path to fixing this in the future on the
>>>>>>>> DPDK side (which is what I suggest here) (for example scatter
>>>>>>>> gather support for MTUs in the past raised similar issue where
>>>>>>>> we hand to handle specific NIC until the next DPDK LTS release).
>>>>>>>>
>>>>>>>> So my suggestion is to re-instate the original workaround and
>>>>>>>> remove its when fixed in the next DPDK LTS which supports the
>>>>>>>> change for i40e at the PMD layer or if it's backported to the
>>>>>>>> next
>>>>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> There was a bug with this WA.
>>>>>>>
>>>>>>> Please see
>>>>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2
>>>>>>>
>Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
>>>>>>> 6-
>&amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08
>>>>>>>
>d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327
>487
>>>>>>>
>263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69%2
>B9kb
>>>>>>> NM%3D&amp;reserved=0
>>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>>>
>>> I'm worried about this bug.  Current version of a workaround is not
>>> correct from the OVS point of view since it wildcards dl_type during
>>> offloading that is not expected by higher layers, causing HW flow
>>> being much more wide than requested.  In case we going to have this
>>> workaround for xl710, we should have another workaround for this issue too.
>>>
>>>
>>>>>>>
>>>>>>> Is it possible to address it in DPDK instead of reverting in OVS
>>>>>>> and later re-reverting?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Eli
>>>>>>>
>>>>>>>>
>>>>>>>> I've included the i40e DPDK maintainers here for their thoughts also.
>>>>>>>>
>>>>>>>> @Beilei/Jia Is this something we can look at to introduce in the
>>>>>>>> i40e PMD?
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Please take a look at:
>>>>>>
>>>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fm
>>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
>May%2F166272.html&amp;data=0
>>>>>>
>2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9%7
>C4
>>>>>>
>3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244&a
>mp;s
>>>>>>
>data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserved=
>0
>>>>>>
>>>>>> For MLX it was only an optimization. For i40e something similar
>>>>>> may be a workaround for this issue.
>>>>>
>>>>> Thanks for this Eli, let me sync with Beilei on this.
>>>>>
>>>>> If it's something we can resolve in the PMD then I think we can add
>>>>> an errata or known issue for the 2.14 release (and possibly the
>>>>> 2.13 release as I think the same issue is present there).
>>>>>
>>>>> If it was fixed in the future we could then remove the issue
>>>>> notice.>
>>>>
>>>> Hi,
>>>>
>>>> According to the OVS patch and mlx DPDK patch, is the requirement to
>>>> support rte flow pattern like 'pattern IPv4 / UDP src is 32 / end',
>>>> no need to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
>>>> please correct me if I'm wrong.
>>>>
>>>> If yes, could you please tell me how OVS adds a flow which doesn’t
>>>> include ETH item? I'm not very familiar with OVS, I run some simple
>>>> commands before, and find eth will always exist. E.g:
>>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>>>> skb_priority(0),skb_mark(0),\
>>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),c
>>>> t_
>>>> label(0),recirc_id(0),\
>>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
>>>> 80
>>>> ,dst=ff:ff:ff:ff:ff:ff),
>>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=0
>>>> x1
>>>> 0,ttl=128,frag=no),
>>>> udp(src=68,dst=67), actions:drop
>>>
>>> Emma, Ian, could you please provide the pattern that fails to offload
>>> on current OVS master so it will be easier for everyone to understand the
>issue.
>>> It's not obvious how to construct the bad pattern by only looking at
>>> the i40e pmd code.  Also, please, enable debug logs and provide the
>>> testpmd-style arguments constructed by OVS.
>>>
>>> It might be also good to have all this information in the commit message.
>>>
>>
>> Following flow pattern is failing from logs:
>>
>>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
>> flow eth pattern:
>>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
>> flow ipv4 pattern:
>>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>> rte flow count action:
>>   Count: shared=0, id=0
>> rte flow port-id action:
>>   Port-id: original=0, id=1
>> 2020-08-
>11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WARN|
>dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
Hi

Maybe another workaround until fixed in i40e PMD is to mask out the ether_type match if there is IPv4 or IPv6.
It means that patterns of:
eth type is 0x0800 / ipv4 / ...
will be
eth / ipv4 / ...

For IPv6 the same, but for other ether types no removal of the match.
eth type is 0x1234 /
kept the same.

I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an optimization, for XL710 it can be used as a workaround.
http://mails.dpdk.org/archives/dev/2020-May/166272.html

A similar masking out of protocol type is done today for TCP/UDP/SCTP/ICMP. For example:
        /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */    
        if (next_proto_mask) {                                                 
            *next_proto_mask = 0;                                              
        }                                                                      
Please take a look at https://patchwork.ozlabs.org/project/openvswitch/patch/20200710120718.38633-3-sriharsha.basavapatna@broadcom.com/
That posted commit suggest not to mask out the protocol type and let the PMDs do it if they wish.

The proposed workaround (not tested):
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index de6101e4d..1af7bebcd 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -676,6 +676,7 @@ static int
 parse_flow_match(struct flow_patterns *patterns,
                  struct match *match)
 {
+    uint8_t *next_eth_type_mask = NULL;
     uint8_t *next_proto_mask = NULL;
     struct flow *consumed_masks;
     uint8_t proto = 0;
@@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns *patterns,
         consumed_masks->dl_type = 0;
 
         add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
+        next_eth_type_mask = &mask->type;
     }
 
     /* VLAN */
@@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns *patterns,
     if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
         struct rte_flow_item_ipv4 *spec, *mask;
 
+        /* IPv4. No need to match ether type. */
+        if (next_eth_type_mask) {
+            *next_eth_type_mask = 0;
+        }
+
         spec = xzalloc(sizeof *spec);
         mask = xzalloc(sizeof *mask);
 
@@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns *patterns,
     if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
         struct rte_flow_item_ipv6 *spec, *mask;
 
+        /* IPv6. No need to match ether type. */
+        if (next_eth_type_mask) {
+            *next_eth_type_mask = 0;
+        }
+
         spec = xzalloc(sizeof *spec);
         mask = xzalloc(sizeof *mask);


>>
>> Or from dump-flows :
>>
>> flow-dump from pmd on cpu core: 23
>> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0
>> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_type(n
>> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
>> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.0
>> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0),
>> packets:42575487, bytes:2554529220, used:0.000s, offloaded:partial,
>> dp:ovs, actions:dpdk1, dp-extra-info:miniflow_bits(4,2)
>>
>>
>> with regards to " provide the testpmd-style arguments constructed by OVS."
>> Can you confirm what you mean?
>
>We changed the way of logging in a following commit:
>d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
>create/destroy.")
>
>It's on master and branch-2.14.
>
>>
>> Thanks,
>> Emma
>>
>>> Thanks.
>>>
>>> Best regards, Ilya Maximets.
>>>
>>>>
>>>> Thanks,
>>>> Beilei
>>>>
>>>>>
>>>>> Regards
>>>>> Ian
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Eli
>>>>>>
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Ian
>>>>>>>>
>>>>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>>>>
>>
Emma Finn Aug. 13, 2020, 8:47 a.m. UTC | #13
> -----Original Message-----
> From: Eli Britstein <elibr@nvidia.com>
> Sent: Tuesday 11 August 2020 16:49
> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
> HWOL for XL710 NIC
> 
> 
> 
> >-----Original Message-----
> >From: Ilya Maximets <i.maximets@ovn.org>
> >Sent: Tuesday, August 11, 2020 4:19 PM
> >To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
> ><i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> ><ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >matching HWOL for XL710 NIC
> >
> >On 8/11/20 3:12 PM, Finn, Emma wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Ilya Maximets <i.maximets@ovn.org>
> >>> Sent: Tuesday 11 August 2020 11:02
> >>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> >>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
> >>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
> >>> <jia.guo@intel.com>
> >>> Cc: i.maximets@ovn.org
> >>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>> matching HWOL for XL710 NIC
> >>>
> >>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
> >>>>
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Stokes, Ian <ian.stokes@intel.com>
> >>>>> Sent: Tuesday, August 11, 2020 4:52 AM
> >>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
> >>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
> >>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
> >>>>> Cc: i.maximets@ovn.org
> >>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>>>> matching HWOL for XL710 NIC
> >>>>>
> >>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
> >>>>>>>
> >>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> >>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> >>>>>>>>>> The following 2 commits introduced changes which caused a
> >>>>>>>>>> regression for XL710 devices and functionality ceases for
> >>>>>>>>>> partial offload as a result.
> >>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching for
> >>>>>>>>>> type
> >>>>>>>>>> only.")
> >>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
> >>>>>> patterns
> >>>>>>>>>> function.")
> >>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
> >>>>>>>>> intel PMD in dpdk, not in OVS.
> >>>>>>>> Hi Eli,
> >>>>>>>>
> >>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe
> >>>>>>>> was already in place and working for this usecase. As such it's
> >>>>>>>> removal introduced a regression from an OVS point of view
> >>>>>>>> between
> >>> the releases.
> >>>>>>>>
> >>>>>>>> We have had examples in the past where workarounds are
> >>> permissible
> >>>>>>>> if there is a clear path to fixing this in the future on the
> >>>>>>>> DPDK side (which is what I suggest here) (for example scatter
> >>>>>>>> gather support for MTUs in the past raised similar issue where
> >>>>>>>> we hand to handle specific NIC until the next DPDK LTS release).
> >>>>>>>>
> >>>>>>>> So my suggestion is to re-instate the original workaround and
> >>>>>>>> remove its when fixed in the next DPDK LTS which supports the
> >>>>>>>> change for i40e at the PMD layer or if it's backported to the
> >>>>>>>> next
> >>>>>>>> 19.11 stable release which would be validated for use with OVS.
> >>>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> There was a bug with this WA.
> >>>>>>>
> >>>>>>> Please see
> >>>>>>>
> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%
> >>>>>>> 2
> >>>>>>>
> >Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
> >>>>>>> 6-
> >&amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08
> >>>>>>>
> >d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C63732
> 7
> >487
> >>>>>>>
> >263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69%
> 2
> >B9kb
> >>>>>>> NM%3D&amp;reserved=0
> >>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
> >>>
> >>> I'm worried about this bug.  Current version of a workaround is not
> >>> correct from the OVS point of view since it wildcards dl_type during
> >>> offloading that is not expected by higher layers, causing HW flow
> >>> being much more wide than requested.  In case we going to have this
> >>> workaround for xl710, we should have another workaround for this
> issue too.
> >>>
> >>>
> >>>>>>>
> >>>>>>> Is it possible to address it in DPDK instead of reverting in OVS
> >>>>>>> and later re-reverting?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>>
> >>>>>>> Eli
> >>>>>>>
> >>>>>>>>
> >>>>>>>> I've included the i40e DPDK maintainers here for their thoughts
> also.
> >>>>>>>>
> >>>>>>>> @Beilei/Jia Is this something we can look at to introduce in
> >>>>>>>> the i40e PMD?
> >>>>>>
> >>>>>> Hello,
> >>>>>>
> >>>>>> Please take a look at:
> >>>>>>
> >>>>>>
> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
> >>>>>> m
> >>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
> >May%2F166272.html&amp;data=0
> >>>>>>
> >2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9%
> 7
> >C4
> >>>>>>
> >3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244&
> a
> >mp;s
> >>>>>>
> >data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
> ed=
> >0
> >>>>>>
> >>>>>> For MLX it was only an optimization. For i40e something similar
> >>>>>> may be a workaround for this issue.
> >>>>>
> >>>>> Thanks for this Eli, let me sync with Beilei on this.
> >>>>>
> >>>>> If it's something we can resolve in the PMD then I think we can
> >>>>> add an errata or known issue for the 2.14 release (and possibly
> >>>>> the
> >>>>> 2.13 release as I think the same issue is present there).
> >>>>>
> >>>>> If it was fixed in the future we could then remove the issue
> >>>>> notice.>
> >>>>
> >>>> Hi,
> >>>>
> >>>> According to the OVS patch and mlx DPDK patch, is the requirement
> >>>> to support rte flow pattern like 'pattern IPv4 / UDP src is 32 /
> >>>> end', no need to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
> >>>> please correct me if I'm wrong.
> >>>>
> >>>> If yes, could you please tell me how OVS adds a flow which doesn’t
> >>>> include ETH item? I'm not very familiar with OVS, I run some simple
> >>>> commands before, and find eth will always exist. E.g:
> >>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
> >>>> skb_priority(0),skb_mark(0),\
> >>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0),
> >>>> c
> >>>> t_
> >>>> label(0),recirc_id(0),\
> >>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
> >>>> 80
> >>>> ,dst=ff:ff:ff:ff:ff:ff),
> >>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,tos=
> >>>> 0
> >>>> x1
> >>>> 0,ttl=128,frag=no),
> >>>> udp(src=68,dst=67), actions:drop
> >>>
> >>> Emma, Ian, could you please provide the pattern that fails to
> >>> offload on current OVS master so it will be easier for everyone to
> >>> understand the
> >issue.
> >>> It's not obvious how to construct the bad pattern by only looking at
> >>> the i40e pmd code.  Also, please, enable debug logs and provide the
> >>> testpmd-style arguments constructed by OVS.
> >>>
> >>> It might be also good to have all this information in the commit message.
> >>>
> >>
> >> Following flow pattern is failing from logs:
> >>
> >>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
> >> flow eth pattern:
> >>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
> >>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
> >> flow ipv4 pattern:
> >>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
> >>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
> >> rte flow count action:
> >>   Count: shared=0, id=0
> >> rte flow port-id action:
> >>   Port-id: original=0, id=1
> >> 2020-08-
> >11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
> N|
> >dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
> Hi
> 
> Maybe another workaround until fixed in i40e PMD is to mask out the
> ether_type match if there is IPv4 or IPv6.
> It means that patterns of:
> eth type is 0x0800 / ipv4 / ...
> will be
> eth / ipv4 / ...
> 
> For IPv6 the same, but for other ether types no removal of the match.
> eth type is 0x1234 /
> kept the same.
> 
> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an
> optimization, for XL710 it can be used as a workaround.
> http://mails.dpdk.org/archives/dev/2020-May/166272.html
> 
> A similar masking out of protocol type is done today for
> TCP/UDP/SCTP/ICMP. For example:
>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */
>         if (next_proto_mask) {
>             *next_proto_mask = 0;
>         }
> Please take a look at
> https://patchwork.ozlabs.org/project/openvswitch/patch/20200710120718.3
> 8633-3-sriharsha.basavapatna@broadcom.com/
> That posted commit suggest not to mask out the protocol type and let the
> PMDs do it if they wish.
> 
> The proposed workaround (not tested):
> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index
> de6101e4d..1af7bebcd 100644
> --- a/lib/netdev-offload-dpdk.c
> +++ b/lib/netdev-offload-dpdk.c
> @@ -676,6 +676,7 @@ static int
>  parse_flow_match(struct flow_patterns *patterns,
>                   struct match *match)
>  {
> +    uint8_t *next_eth_type_mask = NULL;
>      uint8_t *next_proto_mask = NULL;
>      struct flow *consumed_masks;
>      uint8_t proto = 0;
> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns *patterns,
>          consumed_masks->dl_type = 0;
> 
>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
> +        next_eth_type_mask = &mask->type;
>      }
> 
>      /* VLAN */
> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns *patterns,
>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>          struct rte_flow_item_ipv4 *spec, *mask;
> 
> +        /* IPv4. No need to match ether type. */
> +        if (next_eth_type_mask) {
> +            *next_eth_type_mask = 0;
> +        }
> +
>          spec = xzalloc(sizeof *spec);
>          mask = xzalloc(sizeof *mask);
> 
> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns *patterns,
>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>          struct rte_flow_item_ipv6 *spec, *mask;
> 
> +        /* IPv6. No need to match ether type. */
> +        if (next_eth_type_mask) {
> +            *next_eth_type_mask = 0;
> +        }
> +
>          spec = xzalloc(sizeof *spec);
>          mask = xzalloc(sizeof *mask);
> 
> 
> >>
> >> Or from dump-flows :
> >>
> >> flow-dump from pmd on cpu core: 23
> >> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
> >> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/
> >> 0
> >> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_type(
> >> n
> >> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
> >> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
> >> 0
> >> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0),
> >> packets:42575487, bytes:2554529220, used:0.000s, offloaded:partial,
> >> dp:ovs, actions:dpdk1, dp-extra-info:miniflow_bits(4,2)
> >>
> >>
> >> with regards to " provide the testpmd-style arguments constructed by
> OVS."
> >> Can you confirm what you mean?
> >
> >We changed the way of logging in a following commit:
> >d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
> >create/destroy.")
> >
> >It's on master and branch-2.14.
> >
> >>

Tested both of these patches and neither fixed the issue.

The problem is that parse_flow_match() is setting an ethernet pattern 
that the i40e PMD doesn't support.
Working Flow:
rte flow eth pattern:
Spec = null
Mask = null
rte flow ipv4 pattern:
Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
 
Broken Flow:
flow eth pattern:
Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte 
flow ipv4 pattern:
Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0

I will continue to try rework the patches to fix this.

Thanks, 
Emma

> >> Thanks,
> >> Emma
> >>
> >>> Thanks.
> >>>
> >>> Best regards, Ilya Maximets.
> >>>
> >>>>
> >>>> Thanks,
> >>>> Beilei
> >>>>
> >>>>>
> >>>>> Regards
> >>>>> Ian
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Eli
> >>>>>>
> >>>>>>>>
> >>>>>>>> Regards
> >>>>>>>> Ian
> >>>>>>>>
> >>>>>>>>>> Fixed by partial reversion of these changes.
> >>>>>>>>>>
> >>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
> >>>>
> >>
Eli Britstein Aug. 13, 2020, 9:26 a.m. UTC | #14
>-----Original Message-----
>From: Finn, Emma <emma.finn@intel.com>
>Sent: Thursday, August 13, 2020 11:47 AM
>To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets <i.maximets@ovn.org>; Xing,
>Beilei <beilei.xing@intel.com>; Stokes, Ian <ian.stokes@intel.com>; Eli Britstein
><elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>HWOL for XL710 NIC
>
>> -----Original Message-----
>> From: Eli Britstein <elibr@nvidia.com>
>> Sent: Tuesday 11 August 2020 16:49
>> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
>> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>> matching HWOL for XL710 NIC
>>
>>
>>
>> >-----Original Message-----
>> >From: Ilya Maximets <i.maximets@ovn.org>
>> >Sent: Tuesday, August 11, 2020 4:19 PM
>> >To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>> ><i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>> >Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>> >dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>> >Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>> >matching HWOL for XL710 NIC
>> >
>> >On 8/11/20 3:12 PM, Finn, Emma wrote:
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: Ilya Maximets <i.maximets@ovn.org>
>> >>> Sent: Tuesday 11 August 2020 11:02
>> >>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>> >>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
>> >>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
>> >>> <jia.guo@intel.com>
>> >>> Cc: i.maximets@ovn.org
>> >>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>> >>> matching HWOL for XL710 NIC
>> >>>
>> >>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>> >>>>
>> >>>>
>> >>>>> -----Original Message-----
>> >>>>> From: Stokes, Ian <ian.stokes@intel.com>
>> >>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>> >>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>> >>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>> >>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>> >>>>> Cc: i.maximets@ovn.org
>> >>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
>> >>>>> ethernet matching HWOL for XL710 NIC
>> >>>>>
>> >>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>> >>>>>>>
>> >>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>> >>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>> >>>>>>>>>> The following 2 commits introduced changes which caused a
>> >>>>>>>>>> regression for XL710 devices and functionality ceases for
>> >>>>>>>>>> partial offload as a result.
>> >>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching
>> >>>>>>>>>> for type
>> >>>>>>>>>> only.")
>> >>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate of
>> >>>>>> patterns
>> >>>>>>>>>> function.")
>> >>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
>> >>>>>>>>> intel PMD in dpdk, not in OVS.
>> >>>>>>>> Hi Eli,
>> >>>>>>>>
>> >>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe
>> >>>>>>>> was already in place and working for this usecase. As such
>> >>>>>>>> it's removal introduced a regression from an OVS point of
>> >>>>>>>> view between
>> >>> the releases.
>> >>>>>>>>
>> >>>>>>>> We have had examples in the past where workarounds are
>> >>> permissible
>> >>>>>>>> if there is a clear path to fixing this in the future on the
>> >>>>>>>> DPDK side (which is what I suggest here) (for example scatter
>> >>>>>>>> gather support for MTUs in the past raised similar issue
>> >>>>>>>> where we hand to handle specific NIC until the next DPDK LTS
>release).
>> >>>>>>>>
>> >>>>>>>> So my suggestion is to re-instate the original workaround and
>> >>>>>>>> remove its when fixed in the next DPDK LTS which supports the
>> >>>>>>>> change for i40e at the PMD layer or if it's backported to the
>> >>>>>>>> next
>> >>>>>>>> 19.11 stable release which would be validated for use with OVS.
>> >>>>>>>
>> >>>>>>> Hi,
>> >>>>>>>
>> >>>>>>> There was a bug with this WA.
>> >>>>>>>
>> >>>>>>> Please see
>> >>>>>>>
>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
>> >>>>>>> 2
>> >>>>>>>
>> >Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
>> >>>>>>> 6-
>>
>>&amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08
>> >>>>>>>
>>
>>d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C63732
>> 7
>> >487
>> >>>>>>>
>>
>>263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69%
>> 2
>> >B9kb
>> >>>>>>> NM%3D&amp;reserved=0
>> >>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>> >>>
>> >>> I'm worried about this bug.  Current version of a workaround is
>> >>> not correct from the OVS point of view since it wildcards dl_type
>> >>> during offloading that is not expected by higher layers, causing
>> >>> HW flow being much more wide than requested.  In case we going to
>> >>> have this workaround for xl710, we should have another workaround
>> >>> for this
>> issue too.
>> >>>
>> >>>
>> >>>>>>>
>> >>>>>>> Is it possible to address it in DPDK instead of reverting in
>> >>>>>>> OVS and later re-reverting?
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>>
>> >>>>>>> Eli
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>> I've included the i40e DPDK maintainers here for their
>> >>>>>>>> thoughts
>> also.
>> >>>>>>>>
>> >>>>>>>> @Beilei/Jia Is this something we can look at to introduce in
>> >>>>>>>> the i40e PMD?
>> >>>>>>
>> >>>>>> Hello,
>> >>>>>>
>> >>>>>> Please take a look at:
>> >>>>>>
>> >>>>>>
>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
>> >>>>>> m
>> >>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
>> >May%2F166272.html&amp;data=0
>> >>>>>>
>>
>>2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9%
>> 7
>> >C4
>> >>>>>>
>>
>>3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
>&
>> a
>> >mp;s
>> >>>>>>
>> >data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
>> ed=
>> >0
>> >>>>>>
>> >>>>>> For MLX it was only an optimization. For i40e something similar
>> >>>>>> may be a workaround for this issue.
>> >>>>>
>> >>>>> Thanks for this Eli, let me sync with Beilei on this.
>> >>>>>
>> >>>>> If it's something we can resolve in the PMD then I think we can
>> >>>>> add an errata or known issue for the 2.14 release (and possibly
>> >>>>> the
>> >>>>> 2.13 release as I think the same issue is present there).
>> >>>>>
>> >>>>> If it was fixed in the future we could then remove the issue
>> >>>>> notice.>
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> According to the OVS patch and mlx DPDK patch, is the requirement
>> >>>> to support rte flow pattern like 'pattern IPv4 / UDP src is 32 /
>> >>>> end', no need to use ''pattern ETH / IPv4 / UDP src is 32 / end '?
>> >>>> please correct me if I'm wrong.
>> >>>>
>> >>>> If yes, could you please tell me how OVS adds a flow which
>> >>>> doesn’t include ETH item? I'm not very familiar with OVS, I run
>> >>>> some simple commands before, and find eth will always exist. E.g:
>> >>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>> >>>> skb_priority(0),skb_mark(0),\
>> >>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(0
>> >>>> ),
>> >>>> c
>> >>>> t_
>> >>>> label(0),recirc_id(0),\
>> >>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
>> >>>> 80
>> >>>> ,dst=ff:ff:ff:ff:ff:ff),
>> >>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,to
>> >>>> s=
>> >>>> 0
>> >>>> x1
>> >>>> 0,ttl=128,frag=no),
>> >>>> udp(src=68,dst=67), actions:drop
>> >>>
>> >>> Emma, Ian, could you please provide the pattern that fails to
>> >>> offload on current OVS master so it will be easier for everyone to
>> >>> understand the
>> >issue.
>> >>> It's not obvious how to construct the bad pattern by only looking
>> >>> at the i40e pmd code.  Also, please, enable debug logs and provide
>> >>> the testpmd-style arguments constructed by OVS.
>> >>>
>> >>> It might be also good to have all this information in the commit message.
>> >>>
>> >>
>> >> Following flow pattern is failing from logs:
>> >>
>> >>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
>> >> flow eth pattern:
>> >>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>> >>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
>> >> rte flow ipv4 pattern:
>> >>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>> >>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>> >> rte flow count action:
>> >>   Count: shared=0, id=0
>> >> rte flow port-id action:
>> >>   Port-id: original=0, id=1
>> >> 2020-08-
>> >11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
>> N|
>> >dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
>> Hi
>>
>> Maybe another workaround until fixed in i40e PMD is to mask out the
>> ether_type match if there is IPv4 or IPv6.
>> It means that patterns of:
>> eth type is 0x0800 / ipv4 / ...
>> will be
>> eth / ipv4 / ...
>>
>> For IPv6 the same, but for other ether types no removal of the match.
>> eth type is 0x1234 /
>> kept the same.
>>
>> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an
>> optimization, for XL710 it can be used as a workaround.
>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails
>> .dpdk.org%2Farchives%2Fdev%2F2020-
>May%2F166272.html&amp;data=02%7C01%7
>>
>Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d15
>72734
>>
>0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2H
>XS8Wo9Si
>> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
>>
>> A similar masking out of protocol type is done today for
>> TCP/UDP/SCTP/ICMP. For example:
>>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match. */
>>         if (next_proto_mask) {
>>             *next_proto_mask = 0;
>>         }
>> Please take a look at
>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatc
>>
>hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718.3
>&am
>>
>p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f6
>57fdd
>>
>%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
>934&amp;s
>>
>data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reserved
>=0
>> 8633-3-sriharsha.basavapatna@broadcom.com/
>> That posted commit suggest not to mask out the protocol type and let
>> the PMDs do it if they wish.
>>
>> The proposed workaround (not tested):
>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>> index de6101e4d..1af7bebcd 100644
>> --- a/lib/netdev-offload-dpdk.c
>> +++ b/lib/netdev-offload-dpdk.c
>> @@ -676,6 +676,7 @@ static int
>>  parse_flow_match(struct flow_patterns *patterns,
>>                   struct match *match)  {
>> +    uint8_t *next_eth_type_mask = NULL;
>>      uint8_t *next_proto_mask = NULL;
>>      struct flow *consumed_masks;
>>      uint8_t proto = 0;
>> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns *patterns,
>>          consumed_masks->dl_type = 0;
>>
>>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>> mask);
>> +        next_eth_type_mask = &mask->type;
>>      }
>>
>>      /* VLAN */
>> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns *patterns,
>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>          struct rte_flow_item_ipv4 *spec, *mask;
>>
>> +        /* IPv4. No need to match ether type. */
>> +        if (next_eth_type_mask) {
>> +            *next_eth_type_mask = 0;
>> +        }
>> +
>>          spec = xzalloc(sizeof *spec);
>>          mask = xzalloc(sizeof *mask);
>>
>> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns *patterns,
>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>          struct rte_flow_item_ipv6 *spec, *mask;
>>
>> +        /* IPv6. No need to match ether type. */
>> +        if (next_eth_type_mask) {
>> +            *next_eth_type_mask = 0;
>> +        }
>> +
>>          spec = xzalloc(sizeof *spec);
>>          mask = xzalloc(sizeof *mask);
>>
>>
>> >>
>> >> Or from dump-flows :
>> >>
>> >> flow-dump from pmd on cpu core: 23
>> >> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
>> >> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(
>> >> 0/
>> >> 0
>> >> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_typ
>> >> e(
>> >> n
>> >> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
>> >> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
>> >> 0
>> >> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/0
>> >> ), packets:42575487, bytes:2554529220, used:0.000s,
>> >> offloaded:partial, dp:ovs, actions:dpdk1,
>> >> dp-extra-info:miniflow_bits(4,2)
>> >>
>> >>
>> >> with regards to " provide the testpmd-style arguments constructed
>> >> by
>> OVS."
>> >> Can you confirm what you mean?
>> >
>> >We changed the way of logging in a following commit:
>> >d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
>> >create/destroy.")
>> >
>> >It's on master and branch-2.14.
>> >
>> >>
>
>Tested both of these patches and neither fixed the issue.
>
>The problem is that parse_flow_match() is setting an ethernet pattern that the
>i40e PMD doesn't support.
>Working Flow:
>rte flow eth pattern:
>Spec = null
>Mask = null
>rte flow ipv4 pattern:
>Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
>Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
>
>Broken Flow:
>flow eth pattern:
>Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte flow ipv4
>pattern:
>Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>
>I will continue to try rework the patches to fix this.
>
>Thanks,
>Emma
>
Hi Emma,

My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I have prepared another one according to your inputs (also not tested). Please give it a try, and if works, finalize it with comments etc.
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index de6101e4d..f4b04a880 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
     actions->cnt = 0;
 }
 
+/* write a comment this is a workaround... */
+#define NULL_ETH_MASK_IF_ZEROS \
+    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) && \
+        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
+        patterns->items[0].mask = NULL; \
+        free(eth_mask); \
+    }
+
 static int
 parse_flow_match(struct flow_patterns *patterns,
                  struct match *match)
 {
+    struct rte_flow_item_eth *eth_mask = NULL;
     uint8_t *next_proto_mask = NULL;
     struct flow *consumed_masks;
     uint8_t proto = 0;
@@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns *patterns,
     if (match->wc.masks.dl_type ||
         !eth_addr_is_zero(match->wc.masks.dl_src) ||
         !eth_addr_is_zero(match->wc.masks.dl_dst)) {
-        struct rte_flow_item_eth *spec, *mask;
+        struct rte_flow_item_eth *spec;
 
         spec = xzalloc(sizeof *spec);
-        mask = xzalloc(sizeof *mask);
+        eth_mask = xzalloc(sizeof *eth_mask);
 
         memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
         memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
         spec->type = match->flow.dl_type;
 
-        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
-        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
-        mask->type = match->wc.masks.dl_type;
+        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof eth_mask->dst);
+        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof eth_mask->src);
+        eth_mask->type = match->wc.masks.dl_type;
 
         memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks->dl_dst);
         memset(&consumed_masks->dl_src, 0, sizeof consumed_masks->dl_src);
         consumed_masks->dl_type = 0;
 
-        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
+        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, eth_mask);
     }
 
     /* VLAN */
@@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
     if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
         struct rte_flow_item_ipv4 *spec, *mask;
 
+        NULL_ETH_MASK_IF_ZEROS;
+
         spec = xzalloc(sizeof *spec);
         mask = xzalloc(sizeof *mask);
 
@@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
     if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
         struct rte_flow_item_ipv6 *spec, *mask;
 
+        NULL_ETH_MASK_IF_ZEROS;
+
         spec = xzalloc(sizeof *spec);
         mask = xzalloc(sizeof *mask);


Thanks,
Eli

>> >> Thanks,
>> >> Emma
>> >>
>> >>> Thanks.
>> >>>
>> >>> Best regards, Ilya Maximets.
>> >>>
>> >>>>
>> >>>> Thanks,
>> >>>> Beilei
>> >>>>
>> >>>>>
>> >>>>> Regards
>> >>>>> Ian
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>>
>> >>>>>> Eli
>> >>>>>>
>> >>>>>>>>
>> >>>>>>>> Regards
>> >>>>>>>> Ian
>> >>>>>>>>
>> >>>>>>>>>> Fixed by partial reversion of these changes.
>> >>>>>>>>>>
>> >>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>> >>>>
>> >>
Emma Finn Aug. 13, 2020, 11:12 a.m. UTC | #15
> -----Original Message-----
> From: Eli Britstein <elibr@nvidia.com>
> Sent: Thursday 13 August 2020 10:26
> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
> HWOL for XL710 NIC
> 
> 
> 
> >-----Original Message-----
> >From: Finn, Emma <emma.finn@intel.com>
> >Sent: Thursday, August 13, 2020 11:47 AM
> >To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets
> ><i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> ><ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >matching HWOL for XL710 NIC
> >
> >> -----Original Message-----
> >> From: Eli Britstein <elibr@nvidia.com>
> >> Sent: Tuesday 11 August 2020 16:49
> >> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
> >> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
> >> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >> matching HWOL for XL710 NIC
> >>
> >>
> >>
> >> >-----Original Message-----
> >> >From: Ilya Maximets <i.maximets@ovn.org>
> >> >Sent: Tuesday, August 11, 2020 4:19 PM
> >> >To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
> >> ><i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
> >> >Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >> >dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >> >Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >> >matching HWOL for XL710 NIC
> >> >
> >> >On 8/11/20 3:12 PM, Finn, Emma wrote:
> >> >>
> >> >>
> >> >>> -----Original Message-----
> >> >>> From: Ilya Maximets <i.maximets@ovn.org>
> >> >>> Sent: Tuesday 11 August 2020 11:02
> >> >>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> >> >>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
> >> >>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
> >> >>> <jia.guo@intel.com>
> >> >>> Cc: i.maximets@ovn.org
> >> >>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >> >>> matching HWOL for XL710 NIC
> >> >>>
> >> >>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
> >> >>>>
> >> >>>>
> >> >>>>> -----Original Message-----
> >> >>>>> From: Stokes, Ian <ian.stokes@intel.com>
> >> >>>>> Sent: Tuesday, August 11, 2020 4:52 AM
> >> >>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
> >> >>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
> >> >>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
> >> >>>>> Cc: i.maximets@ovn.org
> >> >>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
> >> >>>>> ethernet matching HWOL for XL710 NIC
> >> >>>>>
> >> >>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
> >> >>>>>>>
> >> >>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> >> >>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> >> >>>>>>>>>> The following 2 commits introduced changes which caused a
> >> >>>>>>>>>> regression for XL710 devices and functionality ceases for
> >> >>>>>>>>>> partial offload as a result.
> >> >>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching
> >> >>>>>>>>>> for type
> >> >>>>>>>>>> only.")
> >> >>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate
> of
> >> >>>>>> patterns
> >> >>>>>>>>>> function.")
> >> >>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
> >> >>>>>>>>> intel PMD in dpdk, not in OVS.
> >> >>>>>>>> Hi Eli,
> >> >>>>>>>>
> >> >>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe
> >> >>>>>>>> was already in place and working for this usecase. As such
> >> >>>>>>>> it's removal introduced a regression from an OVS point of
> >> >>>>>>>> view between
> >> >>> the releases.
> >> >>>>>>>>
> >> >>>>>>>> We have had examples in the past where workarounds are
> >> >>> permissible
> >> >>>>>>>> if there is a clear path to fixing this in the future on the
> >> >>>>>>>> DPDK side (which is what I suggest here) (for example
> >> >>>>>>>> scatter gather support for MTUs in the past raised similar
> >> >>>>>>>> issue where we hand to handle specific NIC until the next
> >> >>>>>>>> DPDK LTS
> >release).
> >> >>>>>>>>
> >> >>>>>>>> So my suggestion is to re-instate the original workaround
> >> >>>>>>>> and remove its when fixed in the next DPDK LTS which
> >> >>>>>>>> supports the change for i40e at the PMD layer or if it's
> >> >>>>>>>> backported to the next
> >> >>>>>>>> 19.11 stable release which would be validated for use with OVS.
> >> >>>>>>>
> >> >>>>>>> Hi,
> >> >>>>>>>
> >> >>>>>>> There was a bug with this WA.
> >> >>>>>>>
> >> >>>>>>> Please see
> >> >>>>>>>
> >> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
> >> >>>>>>> 2
> >> >>>>>>>
> >>
> >Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
> >> >>>>>>> 6-
> >>
> >>&amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a0
> 8
> >> >>>>>>>
> >>
> >>d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6373
> 2
> >> 7
> >> >487
> >> >>>>>>>
> >>
> >>263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69
> %
> >> 2
> >> >B9kb
> >> >>>>>>> NM%3D&amp;reserved=0
> >> >>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
> >> >>>
> >> >>> I'm worried about this bug.  Current version of a workaround is
> >> >>> not correct from the OVS point of view since it wildcards dl_type
> >> >>> during offloading that is not expected by higher layers, causing
> >> >>> HW flow being much more wide than requested.  In case we going to
> >> >>> have this workaround for xl710, we should have another workaround
> >> >>> for this
> >> issue too.
> >> >>>
> >> >>>
> >> >>>>>>>
> >> >>>>>>> Is it possible to address it in DPDK instead of reverting in
> >> >>>>>>> OVS and later re-reverting?
> >> >>>>>>>
> >> >>>>>>> Thanks,
> >> >>>>>>>
> >> >>>>>>> Eli
> >> >>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> I've included the i40e DPDK maintainers here for their
> >> >>>>>>>> thoughts
> >> also.
> >> >>>>>>>>
> >> >>>>>>>> @Beilei/Jia Is this something we can look at to introduce in
> >> >>>>>>>> the i40e PMD?
> >> >>>>>>
> >> >>>>>> Hello,
> >> >>>>>>
> >> >>>>>> Please take a look at:
> >> >>>>>>
> >> >>>>>>
> >> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
> >> >>>>>> m
> >> >>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
> >> >May%2F166272.html&amp;data=0
> >> >>>>>>
> >>
> >>2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9
> %
> >> 7
> >> >C4
> >> >>>>>>
> >>
> >>3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
> >&
> >> a
> >> >mp;s
> >> >>>>>>
> >>
> >data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
> >> ed=
> >> >0
> >> >>>>>>
> >> >>>>>> For MLX it was only an optimization. For i40e something
> >> >>>>>> similar may be a workaround for this issue.
> >> >>>>>
> >> >>>>> Thanks for this Eli, let me sync with Beilei on this.
> >> >>>>>
> >> >>>>> If it's something we can resolve in the PMD then I think we can
> >> >>>>> add an errata or known issue for the 2.14 release (and possibly
> >> >>>>> the
> >> >>>>> 2.13 release as I think the same issue is present there).
> >> >>>>>
> >> >>>>> If it was fixed in the future we could then remove the issue
> >> >>>>> notice.>
> >> >>>>
> >> >>>> Hi,
> >> >>>>
> >> >>>> According to the OVS patch and mlx DPDK patch, is the
> >> >>>> requirement to support rte flow pattern like 'pattern IPv4 / UDP
> >> >>>> src is 32 / end', no need to use ''pattern ETH / IPv4 / UDP src is 32 /
> end '?
> >> >>>> please correct me if I'm wrong.
> >> >>>>
> >> >>>> If yes, could you please tell me how OVS adds a flow which
> >> >>>> doesn’t include ETH item? I'm not very familiar with OVS, I run
> >> >>>> some simple commands before, and find eth will always exist. E.g:
> >> >>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
> >> >>>> skb_priority(0),skb_mark(0),\
> >> >>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(
> >> >>>> 0
> >> >>>> ),
> >> >>>> c
> >> >>>> t_
> >> >>>> label(0),recirc_id(0),\
> >> >>>>
> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
> >> >>>> 80
> >> >>>> ,dst=ff:ff:ff:ff:ff:ff),
> >> >>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,t
> >> >>>> o
> >> >>>> s=
> >> >>>> 0
> >> >>>> x1
> >> >>>> 0,ttl=128,frag=no),
> >> >>>> udp(src=68,dst=67), actions:drop
> >> >>>
> >> >>> Emma, Ian, could you please provide the pattern that fails to
> >> >>> offload on current OVS master so it will be easier for everyone
> >> >>> to understand the
> >> >issue.
> >> >>> It's not obvious how to construct the bad pattern by only looking
> >> >>> at the i40e pmd code.  Also, please, enable debug logs and
> >> >>> provide the testpmd-style arguments constructed by OVS.
> >> >>>
> >> >>> It might be also good to have all this information in the commit
> message.
> >> >>>
> >> >>
> >> >> Following flow pattern is failing from logs:
> >> >>
> >> >>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
> >> >> flow eth pattern:
> >> >>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
> >> >>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
> >> >> rte flow ipv4 pattern:
> >> >>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
> >> >>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255,
> >> >> dst=0.0.0.0 rte flow count action:
> >> >>   Count: shared=0, id=0
> >> >> rte flow port-id action:
> >> >>   Port-id: original=0, id=1
> >> >> 2020-08-
> >>
> >11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
> >> N|
> >> >dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
> >> Hi
> >>
> >> Maybe another workaround until fixed in i40e PMD is to mask out the
> >> ether_type match if there is IPv4 or IPv6.
> >> It means that patterns of:
> >> eth type is 0x0800 / ipv4 / ...
> >> will be
> >> eth / ipv4 / ...
> >>
> >> For IPv6 the same, but for other ether types no removal of the match.
> >> eth type is 0x1234 /
> >> kept the same.
> >>
> >> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an
> >> optimization, for XL710 it can be used as a workaround.
> >>
> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail
> >> s
> >> .dpdk.org%2Farchives%2Fdev%2F2020-
> >May%2F166272.html&amp;data=02%7C01%7
> >>
> >Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d1
> 5
> >72734
> >>
> >0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2
> H
> >XS8Wo9Si
> >> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
> >>
> >> A similar masking out of protocol type is done today for
> >> TCP/UDP/SCTP/ICMP. For example:
> >>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match.
> */
> >>         if (next_proto_mask) {
> >>             *next_proto_mask = 0;
> >>         }
> >> Please take a look at
> >>
> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpat
> >> c
> >>
> >hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718
> .3
> >&am
> >>
> >p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f
> 6
> >57fdd
> >>
> >%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
> >934&amp;s
> >>
> >data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reser
> ved
> >=0
> >> 8633-3-sriharsha.basavapatna@broadcom.com/
> >> That posted commit suggest not to mask out the protocol type and let
> >> the PMDs do it if they wish.
> >>
> >> The proposed workaround (not tested):
> >> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >> index de6101e4d..1af7bebcd 100644
> >> --- a/lib/netdev-offload-dpdk.c
> >> +++ b/lib/netdev-offload-dpdk.c
> >> @@ -676,6 +676,7 @@ static int
> >>  parse_flow_match(struct flow_patterns *patterns,
> >>                   struct match *match)  {
> >> +    uint8_t *next_eth_type_mask = NULL;
> >>      uint8_t *next_proto_mask = NULL;
> >>      struct flow *consumed_masks;
> >>      uint8_t proto = 0;
> >> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns
> *patterns,
> >>          consumed_masks->dl_type = 0;
> >>
> >>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
> >> mask);
> >> +        next_eth_type_mask = &mask->type;
> >>      }
> >>
> >>      /* VLAN */
> >> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns
> *patterns,
> >>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
> >>          struct rte_flow_item_ipv4 *spec, *mask;
> >>
> >> +        /* IPv4. No need to match ether type. */
> >> +        if (next_eth_type_mask) {
> >> +            *next_eth_type_mask = 0;
> >> +        }
> >> +
> >>          spec = xzalloc(sizeof *spec);
> >>          mask = xzalloc(sizeof *mask);
> >>
> >> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns
> *patterns,
> >>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
> >>          struct rte_flow_item_ipv6 *spec, *mask;
> >>
> >> +        /* IPv6. No need to match ether type. */
> >> +        if (next_eth_type_mask) {
> >> +            *next_eth_type_mask = 0;
> >> +        }
> >> +
> >>          spec = xzalloc(sizeof *spec);
> >>          mask = xzalloc(sizeof *mask);
> >>
> >>
> >> >>
> >> >> Or from dump-flows :
> >> >>
> >> >> flow-dump from pmd on cpu core: 23
> >> >> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
> >> >> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark
> >> >> (
> >> >> 0/
> >> >> 0
> >> >> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_ty
> >> >> p
> >> >> e(
> >> >> n
> >> >>
> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
> >> >> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
> >> >> 0
> >> >> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/
> >> >> 0 ), packets:42575487, bytes:2554529220, used:0.000s,
> >> >> offloaded:partial, dp:ovs, actions:dpdk1,
> >> >> dp-extra-info:miniflow_bits(4,2)
> >> >>
> >> >>
> >> >> with regards to " provide the testpmd-style arguments constructed
> >> >> by
> >> OVS."
> >> >> Can you confirm what you mean?
> >> >
> >> >We changed the way of logging in a following commit:
> >> >d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
> >> >create/destroy.")
> >> >
> >> >It's on master and branch-2.14.
> >> >
> >> >>
> >
> >Tested both of these patches and neither fixed the issue.
> >
> >The problem is that parse_flow_match() is setting an ethernet pattern
> >that the i40e PMD doesn't support.
> >Working Flow:
> >rte flow eth pattern:
> >Spec = null
> >Mask = null
> >rte flow ipv4 pattern:
> >Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
> >Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
> >
> >Broken Flow:
> >flow eth pattern:
> >Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
> >Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
> >flow ipv4
> >pattern:
> >Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
> >Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
> >
> >I will continue to try rework the patches to fix this.
> >
> >Thanks,
> >Emma
> >
> Hi Emma,
> 
> My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I have
> prepared another one according to your inputs (also not tested). Please give
> it a try, and if works, finalize it with comments etc.
> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index
> de6101e4d..f4b04a880 100644
> --- a/lib/netdev-offload-dpdk.c
> +++ b/lib/netdev-offload-dpdk.c
> @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
>      actions->cnt = 0;
>  }
> 
> +/* write a comment this is a workaround... */ #define
> +NULL_ETH_MASK_IF_ZEROS \
> +    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) && \
> +        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
> +        patterns->items[0].mask = NULL; \
> +        free(eth_mask); \
> +    }
> +
>  static int
>  parse_flow_match(struct flow_patterns *patterns,
>                   struct match *match)
>  {
> +    struct rte_flow_item_eth *eth_mask = NULL;
>      uint8_t *next_proto_mask = NULL;
>      struct flow *consumed_masks;
>      uint8_t proto = 0;
> @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns *patterns,
>      if (match->wc.masks.dl_type ||
>          !eth_addr_is_zero(match->wc.masks.dl_src) ||
>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
> -        struct rte_flow_item_eth *spec, *mask;
> +        struct rte_flow_item_eth *spec;
> 
>          spec = xzalloc(sizeof *spec);
> -        mask = xzalloc(sizeof *mask);
> +        eth_mask = xzalloc(sizeof *eth_mask);
> 
>          memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
>          memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
>          spec->type = match->flow.dl_type;
> 
> -        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
> -        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
> -        mask->type = match->wc.masks.dl_type;
> +        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof eth_mask-
> >dst);
> +        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof eth_mask-
> >src);
> +        eth_mask->type = match->wc.masks.dl_type;
> 
>          memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks-
> >dl_dst);
>          memset(&consumed_masks->dl_src, 0, sizeof consumed_masks-
> >dl_src);
>          consumed_masks->dl_type = 0;
> 
> -        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
> + eth_mask);
>      }
> 
>      /* VLAN */
> @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>          struct rte_flow_item_ipv4 *spec, *mask;
> 
> +        NULL_ETH_MASK_IF_ZEROS;
> +
>          spec = xzalloc(sizeof *spec);
>          mask = xzalloc(sizeof *mask);
> 
> @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>          struct rte_flow_item_ipv6 *spec, *mask;
> 
> +        NULL_ETH_MASK_IF_ZEROS;
> +
>          spec = xzalloc(sizeof *spec);
>          mask = xzalloc(sizeof *mask);
> 
> 

Great. Thanks Eli, I tested this and it works. I will also have to test this for backporting to v.2.13.1.
I can provide test/review tags for this patch.

Thanks, 
Emma

> Thanks,
> Eli
> 
> >> >> Thanks,
> >> >> Emma
> >> >>
> >> >>> Thanks.
> >> >>>
> >> >>> Best regards, Ilya Maximets.
> >> >>>
> >> >>>>
> >> >>>> Thanks,
> >> >>>> Beilei
> >> >>>>
> >> >>>>>
> >> >>>>> Regards
> >> >>>>> Ian
> >> >>>>>>
> >> >>>>>> Thanks,
> >> >>>>>>
> >> >>>>>> Eli
> >> >>>>>>
> >> >>>>>>>>
> >> >>>>>>>> Regards
> >> >>>>>>>> Ian
> >> >>>>>>>>
> >> >>>>>>>>>> Fixed by partial reversion of these changes.
> >> >>>>>>>>>>
> >> >>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
> >> >>>>
> >> >>
Ilya Maximets Aug. 13, 2020, 11:46 a.m. UTC | #16
On 8/13/20 1:12 PM, Finn, Emma wrote:
> 
> 
>> -----Original Message-----
>> From: Eli Britstein <elibr@nvidia.com>
>> Sent: Thursday 13 August 2020 10:26
>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>> HWOL for XL710 NIC
>>
>>
>>
>>> -----Original Message-----
>>> From: Finn, Emma <emma.finn@intel.com>
>>> Sent: Thursday, August 13, 2020 11:47 AM
>>> To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets
>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>> matching HWOL for XL710 NIC
>>>
>>>> -----Original Message-----
>>>> From: Eli Britstein <elibr@nvidia.com>
>>>> Sent: Tuesday 11 August 2020 16:49
>>>> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
>>>> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>> matching HWOL for XL710 NIC
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>> Sent: Tuesday, August 11, 2020 4:19 PM
>>>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>> matching HWOL for XL710 NIC
>>>>>
>>>>> On 8/11/20 3:12 PM, Finn, Emma wrote:
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>>>> Sent: Tuesday 11 August 2020 11:02
>>>>>>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>>>>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
>>>>>>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
>>>>>>> <jia.guo@intel.com>
>>>>>>> Cc: i.maximets@ovn.org
>>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>>> matching HWOL for XL710 NIC
>>>>>>>
>>>>>>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Stokes, Ian <ian.stokes@intel.com>
>>>>>>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>>>>>>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>>>>>>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>>>>>>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>>>>>>>>> Cc: i.maximets@ovn.org
>>>>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
>>>>>>>>> ethernet matching HWOL for XL710 NIC
>>>>>>>>>
>>>>>>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>>>>>>>>
>>>>>>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>>>>>>>>> The following 2 commits introduced changes which caused a
>>>>>>>>>>>>>> regression for XL710 devices and functionality ceases for
>>>>>>>>>>>>>> partial offload as a result.
>>>>>>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching
>>>>>>>>>>>>>> for type
>>>>>>>>>>>>>> only.")
>>>>>>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate
>> of
>>>>>>>>>> patterns
>>>>>>>>>>>>>> function.")
>>>>>>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
>>>>>>>>>>>>> intel PMD in dpdk, not in OVS.
>>>>>>>>>>>> Hi Eli,
>>>>>>>>>>>>
>>>>>>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe
>>>>>>>>>>>> was already in place and working for this usecase. As such
>>>>>>>>>>>> it's removal introduced a regression from an OVS point of
>>>>>>>>>>>> view between
>>>>>>> the releases.
>>>>>>>>>>>>
>>>>>>>>>>>> We have had examples in the past where workarounds are
>>>>>>> permissible
>>>>>>>>>>>> if there is a clear path to fixing this in the future on the
>>>>>>>>>>>> DPDK side (which is what I suggest here) (for example
>>>>>>>>>>>> scatter gather support for MTUs in the past raised similar
>>>>>>>>>>>> issue where we hand to handle specific NIC until the next
>>>>>>>>>>>> DPDK LTS
>>> release).
>>>>>>>>>>>>
>>>>>>>>>>>> So my suggestion is to re-instate the original workaround
>>>>>>>>>>>> and remove its when fixed in the next DPDK LTS which
>>>>>>>>>>>> supports the change for i40e at the PMD layer or if it's
>>>>>>>>>>>> backported to the next
>>>>>>>>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> There was a bug with this WA.
>>>>>>>>>>>
>>>>>>>>>>> Please see
>>>>>>>>>>>
>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
>>>>>>>>>>> 2
>>>>>>>>>>>
>>>>
>>> Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
>>>>>>>>>>> 6-
>>>>
>>>> &amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a0
>> 8
>>>>>>>>>>>
>>>>
>>>> d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6373
>> 2
>>>> 7
>>>>> 487
>>>>>>>>>>>
>>>>
>>>> 263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69
>> %
>>>> 2
>>>>> B9kb
>>>>>>>>>>> NM%3D&amp;reserved=0
>>>>>>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>>>>>>>
>>>>>>> I'm worried about this bug.  Current version of a workaround is
>>>>>>> not correct from the OVS point of view since it wildcards dl_type
>>>>>>> during offloading that is not expected by higher layers, causing
>>>>>>> HW flow being much more wide than requested.  In case we going to
>>>>>>> have this workaround for xl710, we should have another workaround
>>>>>>> for this
>>>> issue too.
>>>>>>>
>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is it possible to address it in DPDK instead of reverting in
>>>>>>>>>>> OVS and later re-reverting?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Eli
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I've included the i40e DPDK maintainers here for their
>>>>>>>>>>>> thoughts
>>>> also.
>>>>>>>>>>>>
>>>>>>>>>>>> @Beilei/Jia Is this something we can look at to introduce in
>>>>>>>>>>>> the i40e PMD?
>>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> Please take a look at:
>>>>>>>>>>
>>>>>>>>>>
>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
>>>>>>>>>> m
>>>>>>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
>>>>> May%2F166272.html&amp;data=0
>>>>>>>>>>
>>>>
>>>> 2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9
>> %
>>>> 7
>>>>> C4
>>>>>>>>>>
>>>>
>>>> 3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
>>> &
>>>> a
>>>>> mp;s
>>>>>>>>>>
>>>>
>>> data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
>>>> ed=
>>>>> 0
>>>>>>>>>>
>>>>>>>>>> For MLX it was only an optimization. For i40e something
>>>>>>>>>> similar may be a workaround for this issue.
>>>>>>>>>
>>>>>>>>> Thanks for this Eli, let me sync with Beilei on this.
>>>>>>>>>
>>>>>>>>> If it's something we can resolve in the PMD then I think we can
>>>>>>>>> add an errata or known issue for the 2.14 release (and possibly
>>>>>>>>> the
>>>>>>>>> 2.13 release as I think the same issue is present there).
>>>>>>>>>
>>>>>>>>> If it was fixed in the future we could then remove the issue
>>>>>>>>> notice.>
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> According to the OVS patch and mlx DPDK patch, is the
>>>>>>>> requirement to support rte flow pattern like 'pattern IPv4 / UDP
>>>>>>>> src is 32 / end', no need to use ''pattern ETH / IPv4 / UDP src is 32 /
>> end '?
>>>>>>>> please correct me if I'm wrong.
>>>>>>>>
>>>>>>>> If yes, could you please tell me how OVS adds a flow which
>>>>>>>> doesn’t include ETH item? I'm not very familiar with OVS, I run
>>>>>>>> some simple commands before, and find eth will always exist. E.g:
>>>>>>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>>>>>>>> skb_priority(0),skb_mark(0),\
>>>>>>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(
>>>>>>>> 0
>>>>>>>> ),
>>>>>>>> c
>>>>>>>> t_
>>>>>>>> label(0),recirc_id(0),\
>>>>>>>>
>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
>>>>>>>> 80
>>>>>>>> ,dst=ff:ff:ff:ff:ff:ff),
>>>>>>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,t
>>>>>>>> o
>>>>>>>> s=
>>>>>>>> 0
>>>>>>>> x1
>>>>>>>> 0,ttl=128,frag=no),
>>>>>>>> udp(src=68,dst=67), actions:drop
>>>>>>>
>>>>>>> Emma, Ian, could you please provide the pattern that fails to
>>>>>>> offload on current OVS master so it will be easier for everyone
>>>>>>> to understand the
>>>>> issue.
>>>>>>> It's not obvious how to construct the bad pattern by only looking
>>>>>>> at the i40e pmd code.  Also, please, enable debug logs and
>>>>>>> provide the testpmd-style arguments constructed by OVS.
>>>>>>>
>>>>>>> It might be also good to have all this information in the commit
>> message.
>>>>>>>
>>>>>>
>>>>>> Following flow pattern is failing from logs:
>>>>>>
>>>>>>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
>>>>>> flow eth pattern:
>>>>>>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>>>>>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
>>>>>> rte flow ipv4 pattern:
>>>>>>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>>>>>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255,
>>>>>> dst=0.0.0.0 rte flow count action:
>>>>>>   Count: shared=0, id=0
>>>>>> rte flow port-id action:
>>>>>>   Port-id: original=0, id=1
>>>>>> 2020-08-
>>>>
>>> 11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
>>>> N|
>>>>> dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
>>>> Hi
>>>>
>>>> Maybe another workaround until fixed in i40e PMD is to mask out the
>>>> ether_type match if there is IPv4 or IPv6.
>>>> It means that patterns of:
>>>> eth type is 0x0800 / ipv4 / ...
>>>> will be
>>>> eth / ipv4 / ...
>>>>
>>>> For IPv6 the same, but for other ether types no removal of the match.
>>>> eth type is 0x1234 /
>>>> kept the same.
>>>>
>>>> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an
>>>> optimization, for XL710 it can be used as a workaround.
>>>>
>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail
>>>> s
>>>> .dpdk.org%2Farchives%2Fdev%2F2020-
>>> May%2F166272.html&amp;data=02%7C01%7
>>>>
>>> Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d1
>> 5
>>> 72734
>>>>
>>> 0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2
>> H
>>> XS8Wo9Si
>>>> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
>>>>
>>>> A similar masking out of protocol type is done today for
>>>> TCP/UDP/SCTP/ICMP. For example:
>>>>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match.
>> */
>>>>         if (next_proto_mask) {
>>>>             *next_proto_mask = 0;
>>>>         }
>>>> Please take a look at
>>>>
>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpat
>>>> c
>>>>
>>> hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718
>> .3
>>> &am
>>>>
>>> p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f
>> 6
>>> 57fdd
>>>>
>>> %7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
>>> 934&amp;s
>>>>
>>> data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reser
>> ved
>>> =0
>>>> 8633-3-sriharsha.basavapatna@broadcom.com/
>>>> That posted commit suggest not to mask out the protocol type and let
>>>> the PMDs do it if they wish.
>>>>
>>>> The proposed workaround (not tested):
>>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>>> index de6101e4d..1af7bebcd 100644
>>>> --- a/lib/netdev-offload-dpdk.c
>>>> +++ b/lib/netdev-offload-dpdk.c
>>>> @@ -676,6 +676,7 @@ static int
>>>>  parse_flow_match(struct flow_patterns *patterns,
>>>>                   struct match *match)  {
>>>> +    uint8_t *next_eth_type_mask = NULL;
>>>>      uint8_t *next_proto_mask = NULL;
>>>>      struct flow *consumed_masks;
>>>>      uint8_t proto = 0;
>>>> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns
>> *patterns,
>>>>          consumed_masks->dl_type = 0;
>>>>
>>>>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>>>> mask);
>>>> +        next_eth_type_mask = &mask->type;
>>>>      }
>>>>
>>>>      /* VLAN */
>>>> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns
>> *patterns,
>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>>>          struct rte_flow_item_ipv4 *spec, *mask;
>>>>
>>>> +        /* IPv4. No need to match ether type. */
>>>> +        if (next_eth_type_mask) {
>>>> +            *next_eth_type_mask = 0;
>>>> +        }
>>>> +
>>>>          spec = xzalloc(sizeof *spec);
>>>>          mask = xzalloc(sizeof *mask);
>>>>
>>>> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns
>> *patterns,
>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>>>          struct rte_flow_item_ipv6 *spec, *mask;
>>>>
>>>> +        /* IPv6. No need to match ether type. */
>>>> +        if (next_eth_type_mask) {
>>>> +            *next_eth_type_mask = 0;
>>>> +        }
>>>> +
>>>>          spec = xzalloc(sizeof *spec);
>>>>          mask = xzalloc(sizeof *mask);
>>>>
>>>>
>>>>>>
>>>>>> Or from dump-flows :
>>>>>>
>>>>>> flow-dump from pmd on cpu core: 23
>>>>>> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
>>>>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark
>>>>>> (
>>>>>> 0/
>>>>>> 0
>>>>>> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_ty
>>>>>> p
>>>>>> e(
>>>>>> n
>>>>>>
>> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
>>>>>> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
>>>>>> 0
>>>>>> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/
>>>>>> 0 ), packets:42575487, bytes:2554529220, used:0.000s,
>>>>>> offloaded:partial, dp:ovs, actions:dpdk1,
>>>>>> dp-extra-info:miniflow_bits(4,2)
>>>>>>
>>>>>>
>>>>>> with regards to " provide the testpmd-style arguments constructed
>>>>>> by
>>>> OVS."
>>>>>> Can you confirm what you mean?
>>>>>
>>>>> We changed the way of logging in a following commit:
>>>>> d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
>>>>> create/destroy.")
>>>>>
>>>>> It's on master and branch-2.14.
>>>>>
>>>>>>
>>>
>>> Tested both of these patches and neither fixed the issue.
>>>
>>> The problem is that parse_flow_match() is setting an ethernet pattern
>>> that the i40e PMD doesn't support.
>>> Working Flow:
>>> rte flow eth pattern:
>>> Spec = null
>>> Mask = null
>>> rte flow ipv4 pattern:
>>> Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
>>> Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
>>>
>>> Broken Flow:
>>> flow eth pattern:
>>> Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>> Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
>>> flow ipv4
>>> pattern:
>>> Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>> Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>>>
>>> I will continue to try rework the patches to fix this.
>>>
>>> Thanks,
>>> Emma
>>>
>> Hi Emma,
>>
>> My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I have
>> prepared another one according to your inputs (also not tested). Please give
>> it a try, and if works, finalize it with comments etc.
>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index
>> de6101e4d..f4b04a880 100644
>> --- a/lib/netdev-offload-dpdk.c
>> +++ b/lib/netdev-offload-dpdk.c
>> @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
>>      actions->cnt = 0;
>>  }
>>
>> +/* write a comment this is a workaround... */ #define
>> +NULL_ETH_MASK_IF_ZEROS \
>> +    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) && \
>> +        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
>> +        patterns->items[0].mask = NULL; \
>> +        free(eth_mask); \
>> +    }

If we're going with this approach, please, make it a function that accepts
'struct rte_flow_item *', i.e. something like:

    clear_eth_pattern_if_type_only(&patterns->items[0]);

One extra check that it's actually an eth pattern should be added inside the
function.

Is it OK to pass spec, but not mask?   I do not see anything about this kind
of configuration in the rte_flow API definition.  I think, we should clean
up spec too in this case, just to be sure that we're not leaking any resources.

>> +
>>  static int
>>  parse_flow_match(struct flow_patterns *patterns,
>>                   struct match *match)
>>  {
>> +    struct rte_flow_item_eth *eth_mask = NULL;

In case of a function, there is no need to move this definition outside of
its block.

>>      uint8_t *next_proto_mask = NULL;
>>      struct flow *consumed_masks;
>>      uint8_t proto = 0;
>> @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns *patterns,
>>      if (match->wc.masks.dl_type ||
>>          !eth_addr_is_zero(match->wc.masks.dl_src) ||
>>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>> -        struct rte_flow_item_eth *spec, *mask;
>> +        struct rte_flow_item_eth *spec;
>>
>>          spec = xzalloc(sizeof *spec);
>> -        mask = xzalloc(sizeof *mask);
>> +        eth_mask = xzalloc(sizeof *eth_mask);
>>
>>          memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
>>          memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
>>          spec->type = match->flow.dl_type;
>>
>> -        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
>> -        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
>> -        mask->type = match->wc.masks.dl_type;
>> +        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof eth_mask-
>>> dst);
>> +        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof eth_mask-
>>> src);
>> +        eth_mask->type = match->wc.masks.dl_type;
>>
>>          memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks-
>>> dl_dst);
>>          memset(&consumed_masks->dl_src, 0, sizeof consumed_masks-
>>> dl_src);
>>          consumed_masks->dl_type = 0;
>>
>> -        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
>> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>> + eth_mask);
>>      }
>>
>>      /* VLAN */
>> @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>          struct rte_flow_item_ipv4 *spec, *mask;
>>
>> +        NULL_ETH_MASK_IF_ZEROS;
>> +
>>          spec = xzalloc(sizeof *spec);
>>          mask = xzalloc(sizeof *mask);
>>
>> @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>          struct rte_flow_item_ipv6 *spec, *mask;
>>
>> +        NULL_ETH_MASK_IF_ZEROS;
>> +
>>          spec = xzalloc(sizeof *spec);
>>          mask = xzalloc(sizeof *mask);
>>
>>
> 
> Great. Thanks Eli, I tested this and it works. I will also have to test this for backporting to v.2.13.1.
> I can provide test/review tags for this patch.
> 
> Thanks, 
> Emma
> 
>> Thanks,
>> Eli
>>
>>>>>> Thanks,
>>>>>> Emma
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Beilei
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Ian
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Eli
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Ian
>>>>>>>>>>>>
>>>>>>>>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>>>>>>>>
>>>>>>
>
Ilya Maximets Aug. 13, 2020, 2:40 p.m. UTC | #17
On 8/13/20 1:46 PM, Ilya Maximets wrote:
> On 8/13/20 1:12 PM, Finn, Emma wrote:
>>
>>
>>> -----Original Message-----
>>> From: Eli Britstein <elibr@nvidia.com>
>>> Sent: Thursday 13 August 2020 10:26
>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet matching
>>> HWOL for XL710 NIC
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Finn, Emma <emma.finn@intel.com>
>>>> Sent: Thursday, August 13, 2020 11:47 AM
>>>> To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets
>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>> matching HWOL for XL710 NIC
>>>>
>>>>> -----Original Message-----
>>>>> From: Eli Britstein <elibr@nvidia.com>
>>>>> Sent: Tuesday 11 August 2020 16:49
>>>>> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
>>>>> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>> matching HWOL for XL710 NIC
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>>> Sent: Tuesday, August 11, 2020 4:19 PM
>>>>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>> matching HWOL for XL710 NIC
>>>>>>
>>>>>> On 8/11/20 3:12 PM, Finn, Emma wrote:
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>> Sent: Tuesday 11 August 2020 11:02
>>>>>>>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>>>>>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>; Finn,
>>>>>>>> Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
>>>>>>>> <jia.guo@intel.com>
>>>>>>>> Cc: i.maximets@ovn.org
>>>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>>>> matching HWOL for XL710 NIC
>>>>>>>>
>>>>>>>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Stokes, Ian <ian.stokes@intel.com>
>>>>>>>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>>>>>>>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>>>>>>>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>>>>>>>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>>>>>>>>>> Cc: i.maximets@ovn.org
>>>>>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
>>>>>>>>>> ethernet matching HWOL for XL710 NIC
>>>>>>>>>>
>>>>>>>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>>>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>>>>>>>>>> The following 2 commits introduced changes which caused a
>>>>>>>>>>>>>>> regression for XL710 devices and functionality ceases for
>>>>>>>>>>>>>>> partial offload as a result.
>>>>>>>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet matching
>>>>>>>>>>>>>>> for type
>>>>>>>>>>>>>>> only.")
>>>>>>>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate
>>> of
>>>>>>>>>>> patterns
>>>>>>>>>>>>>>> function.")
>>>>>>>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs in
>>>>>>>>>>>>>> intel PMD in dpdk, not in OVS.
>>>>>>>>>>>>> Hi Eli,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes OVS looks to be vendor agnostic, but this code I believe
>>>>>>>>>>>>> was already in place and working for this usecase. As such
>>>>>>>>>>>>> it's removal introduced a regression from an OVS point of
>>>>>>>>>>>>> view between
>>>>>>>> the releases.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We have had examples in the past where workarounds are
>>>>>>>> permissible
>>>>>>>>>>>>> if there is a clear path to fixing this in the future on the
>>>>>>>>>>>>> DPDK side (which is what I suggest here) (for example
>>>>>>>>>>>>> scatter gather support for MTUs in the past raised similar
>>>>>>>>>>>>> issue where we hand to handle specific NIC until the next
>>>>>>>>>>>>> DPDK LTS
>>>> release).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So my suggestion is to re-instate the original workaround
>>>>>>>>>>>>> and remove its when fixed in the next DPDK LTS which
>>>>>>>>>>>>> supports the change for i40e at the PMD layer or if it's
>>>>>>>>>>>>> backported to the next
>>>>>>>>>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> There was a bug with this WA.
>>>>>>>>>>>>
>>>>>>>>>>>> Please see
>>>>>>>>>>>>
>>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
>>>>>>>>>>>> 2
>>>>>>>>>>>>
>>>>>
>>>> Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
>>>>>>>>>>>> 6-
>>>>>
>>>>> &amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a0
>>> 8
>>>>>>>>>>>>
>>>>>
>>>>> d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6373
>>> 2
>>>>> 7
>>>>>> 487
>>>>>>>>>>>>
>>>>>
>>>>> 263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69
>>> %
>>>>> 2
>>>>>> B9kb
>>>>>>>>>>>> NM%3D&amp;reserved=0
>>>>>>>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>>>>>>>>
>>>>>>>> I'm worried about this bug.  Current version of a workaround is
>>>>>>>> not correct from the OVS point of view since it wildcards dl_type
>>>>>>>> during offloading that is not expected by higher layers, causing
>>>>>>>> HW flow being much more wide than requested.  In case we going to
>>>>>>>> have this workaround for xl710, we should have another workaround
>>>>>>>> for this
>>>>> issue too.
>>>>>>>>
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Is it possible to address it in DPDK instead of reverting in
>>>>>>>>>>>> OVS and later re-reverting?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Eli
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've included the i40e DPDK maintainers here for their
>>>>>>>>>>>>> thoughts
>>>>> also.
>>>>>>>>>>>>>
>>>>>>>>>>>>> @Beilei/Jia Is this something we can look at to introduce in
>>>>>>>>>>>>> the i40e PMD?
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Please take a look at:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
>>>>>>>>>>> m
>>>>>>>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
>>>>>> May%2F166272.html&amp;data=0
>>>>>>>>>>>
>>>>>
>>>>> 2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9
>>> %
>>>>> 7
>>>>>> C4
>>>>>>>>>>>
>>>>>
>>>>> 3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
>>>> &
>>>>> a
>>>>>> mp;s
>>>>>>>>>>>
>>>>>
>>>> data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
>>>>> ed=
>>>>>> 0
>>>>>>>>>>>
>>>>>>>>>>> For MLX it was only an optimization. For i40e something
>>>>>>>>>>> similar may be a workaround for this issue.
>>>>>>>>>>
>>>>>>>>>> Thanks for this Eli, let me sync with Beilei on this.
>>>>>>>>>>
>>>>>>>>>> If it's something we can resolve in the PMD then I think we can
>>>>>>>>>> add an errata or known issue for the 2.14 release (and possibly
>>>>>>>>>> the
>>>>>>>>>> 2.13 release as I think the same issue is present there).
>>>>>>>>>>
>>>>>>>>>> If it was fixed in the future we could then remove the issue
>>>>>>>>>> notice.>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> According to the OVS patch and mlx DPDK patch, is the
>>>>>>>>> requirement to support rte flow pattern like 'pattern IPv4 / UDP
>>>>>>>>> src is 32 / end', no need to use ''pattern ETH / IPv4 / UDP src is 32 /
>>> end '?
>>>>>>>>> please correct me if I'm wrong.
>>>>>>>>>
>>>>>>>>> If yes, could you please tell me how OVS adds a flow which
>>>>>>>>> doesn’t include ETH item? I'm not very familiar with OVS, I run
>>>>>>>>> some simple commands before, and find eth will always exist. E.g:
>>>>>>>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>>>>>>>>> skb_priority(0),skb_mark(0),\
>>>>>>>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mark(
>>>>>>>>> 0
>>>>>>>>> ),
>>>>>>>>> c
>>>>>>>>> t_
>>>>>>>>> label(0),recirc_id(0),\
>>>>>>>>>
>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
>>>>>>>>> 80
>>>>>>>>> ,dst=ff:ff:ff:ff:ff:ff),
>>>>>>>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17,t
>>>>>>>>> o
>>>>>>>>> s=
>>>>>>>>> 0
>>>>>>>>> x1
>>>>>>>>> 0,ttl=128,frag=no),
>>>>>>>>> udp(src=68,dst=67), actions:drop
>>>>>>>>
>>>>>>>> Emma, Ian, could you please provide the pattern that fails to
>>>>>>>> offload on current OVS master so it will be easier for everyone
>>>>>>>> to understand the
>>>>>> issue.
>>>>>>>> It's not obvious how to construct the bad pattern by only looking
>>>>>>>> at the i40e pmd code.  Also, please, enable debug logs and
>>>>>>>> provide the testpmd-style arguments constructed by OVS.
>>>>>>>>
>>>>>>>> It might be also good to have all this information in the commit
>>> message.
>>>>>>>>
>>>>>>>
>>>>>>> Following flow pattern is failing from logs:
>>>>>>>
>>>>>>>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1 rte
>>>>>>> flow eth pattern:
>>>>>>>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>>>>>>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff
>>>>>>> rte flow ipv4 pattern:
>>>>>>>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>>>>>>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255,
>>>>>>> dst=0.0.0.0 rte flow count action:
>>>>>>>   Count: shared=0, id=0
>>>>>>> rte flow port-id action:
>>>>>>>   Port-id: original=0, id=1
>>>>>>> 2020-08-
>>>>>
>>>> 11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
>>>>> N|
>>>>>> dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
>>>>> Hi
>>>>>
>>>>> Maybe another workaround until fixed in i40e PMD is to mask out the
>>>>> ether_type match if there is IPv4 or IPv6.
>>>>> It means that patterns of:
>>>>> eth type is 0x0800 / ipv4 / ...
>>>>> will be
>>>>> eth / ipv4 / ...
>>>>>
>>>>> For IPv6 the same, but for other ether types no removal of the match.
>>>>> eth type is 0x1234 /
>>>>> kept the same.
>>>>>
>>>>> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only an
>>>>> optimization, for XL710 it can be used as a workaround.
>>>>>
>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmail
>>>>> s
>>>>> .dpdk.org%2Farchives%2Fdev%2F2020-
>>>> May%2F166272.html&amp;data=02%7C01%7
>>>>>
>>>> Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d1
>>> 5
>>>> 72734
>>>>>
>>>> 0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2
>>> H
>>>> XS8Wo9Si
>>>>> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
>>>>>
>>>>> A similar masking out of protocol type is done today for
>>>>> TCP/UDP/SCTP/ICMP. For example:
>>>>>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match.
>>> */
>>>>>         if (next_proto_mask) {
>>>>>             *next_proto_mask = 0;
>>>>>         }
>>>>> Please take a look at
>>>>>
>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpat
>>>>> c
>>>>>
>>>> hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718
>>> .3
>>>> &am
>>>>>
>>>> p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f
>>> 6
>>>> 57fdd
>>>>>
>>>> %7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
>>>> 934&amp;s
>>>>>
>>>> data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reser
>>> ved
>>>> =0
>>>>> 8633-3-sriharsha.basavapatna@broadcom.com/
>>>>> That posted commit suggest not to mask out the protocol type and let
>>>>> the PMDs do it if they wish.
>>>>>
>>>>> The proposed workaround (not tested):
>>>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>>>> index de6101e4d..1af7bebcd 100644
>>>>> --- a/lib/netdev-offload-dpdk.c
>>>>> +++ b/lib/netdev-offload-dpdk.c
>>>>> @@ -676,6 +676,7 @@ static int
>>>>>  parse_flow_match(struct flow_patterns *patterns,
>>>>>                   struct match *match)  {
>>>>> +    uint8_t *next_eth_type_mask = NULL;
>>>>>      uint8_t *next_proto_mask = NULL;
>>>>>      struct flow *consumed_masks;
>>>>>      uint8_t proto = 0;
>>>>> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns
>>> *patterns,
>>>>>          consumed_masks->dl_type = 0;
>>>>>
>>>>>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>>>>> mask);
>>>>> +        next_eth_type_mask = &mask->type;
>>>>>      }
>>>>>
>>>>>      /* VLAN */
>>>>> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns
>>> *patterns,
>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>>>>          struct rte_flow_item_ipv4 *spec, *mask;
>>>>>
>>>>> +        /* IPv4. No need to match ether type. */
>>>>> +        if (next_eth_type_mask) {
>>>>> +            *next_eth_type_mask = 0;
>>>>> +        }
>>>>> +
>>>>>          spec = xzalloc(sizeof *spec);
>>>>>          mask = xzalloc(sizeof *mask);
>>>>>
>>>>> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns
>>> *patterns,
>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>>>>          struct rte_flow_item_ipv6 *spec, *mask;
>>>>>
>>>>> +        /* IPv6. No need to match ether type. */
>>>>> +        if (next_eth_type_mask) {
>>>>> +            *next_eth_type_mask = 0;
>>>>> +        }
>>>>> +
>>>>>          spec = xzalloc(sizeof *spec);
>>>>>          mask = xzalloc(sizeof *mask);
>>>>>
>>>>>
>>>>>>>
>>>>>>> Or from dump-flows :
>>>>>>>
>>>>>>> flow-dump from pmd on cpu core: 23
>>>>>>> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
>>>>>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark
>>>>>>> (
>>>>>>> 0/
>>>>>>> 0
>>>>>>> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_ty
>>>>>>> p
>>>>>>> e(
>>>>>>> n
>>>>>>>
>>> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
>>>>>>> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
>>>>>>> 0
>>>>>>> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=63/
>>>>>>> 0 ), packets:42575487, bytes:2554529220, used:0.000s,
>>>>>>> offloaded:partial, dp:ovs, actions:dpdk1,
>>>>>>> dp-extra-info:miniflow_bits(4,2)
>>>>>>>
>>>>>>>
>>>>>>> with regards to " provide the testpmd-style arguments constructed
>>>>>>> by
>>>>> OVS."
>>>>>>> Can you confirm what you mean?
>>>>>>
>>>>>> We changed the way of logging in a following commit:
>>>>>> d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
>>>>>> create/destroy.")
>>>>>>
>>>>>> It's on master and branch-2.14.
>>>>>>
>>>>>>>
>>>>
>>>> Tested both of these patches and neither fixed the issue.
>>>>
>>>> The problem is that parse_flow_match() is setting an ethernet pattern
>>>> that the i40e PMD doesn't support.
>>>> Working Flow:
>>>> rte flow eth pattern:
>>>> Spec = null
>>>> Mask = null
>>>> rte flow ipv4 pattern:
>>>> Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
>>>> Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
>>>>
>>>> Broken Flow:
>>>> flow eth pattern:
>>>> Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>>> Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
>>>> flow ipv4
>>>> pattern:
>>>> Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>>> Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>>>>
>>>> I will continue to try rework the patches to fix this.
>>>>
>>>> Thanks,
>>>> Emma
>>>>
>>> Hi Emma,
>>>
>>> My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I have
>>> prepared another one according to your inputs (also not tested). Please give
>>> it a try, and if works, finalize it with comments etc.
>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c index
>>> de6101e4d..f4b04a880 100644
>>> --- a/lib/netdev-offload-dpdk.c
>>> +++ b/lib/netdev-offload-dpdk.c
>>> @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
>>>      actions->cnt = 0;
>>>  }
>>>
>>> +/* write a comment this is a workaround... */ #define
>>> +NULL_ETH_MASK_IF_ZEROS \
>>> +    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) && \
>>> +        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
>>> +        patterns->items[0].mask = NULL; \
>>> +        free(eth_mask); \
>>> +    }
> 
> If we're going with this approach, please, make it a function that accepts
> 'struct rte_flow_item *', i.e. something like:
> 
>     clear_eth_pattern_if_type_only(&patterns->items[0]);
> 
> One extra check that it's actually an eth pattern should be added inside the
> function.
> 
> Is it OK to pass spec, but not mask?   I do not see anything about this kind
> of configuration in the rte_flow API definition.  I think, we should clean
> up spec too in this case, just to be sure that we're not leaking any resources.
> 
>>> +
>>>  static int
>>>  parse_flow_match(struct flow_patterns *patterns,
>>>                   struct match *match)
>>>  {
>>> +    struct rte_flow_item_eth *eth_mask = NULL;
> 
> In case of a function, there is no need to move this definition outside of
> its block.
> 
>>>      uint8_t *next_proto_mask = NULL;
>>>      struct flow *consumed_masks;
>>>      uint8_t proto = 0;
>>> @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns *patterns,
>>>      if (match->wc.masks.dl_type ||
>>>          !eth_addr_is_zero(match->wc.masks.dl_src) ||
>>>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>>> -        struct rte_flow_item_eth *spec, *mask;
>>> +        struct rte_flow_item_eth *spec;
>>>
>>>          spec = xzalloc(sizeof *spec);
>>> -        mask = xzalloc(sizeof *mask);
>>> +        eth_mask = xzalloc(sizeof *eth_mask);
>>>
>>>          memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
>>>          memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
>>>          spec->type = match->flow.dl_type;
>>>
>>> -        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
>>> -        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
>>> -        mask->type = match->wc.masks.dl_type;
>>> +        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof eth_mask-
>>>> dst);
>>> +        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof eth_mask-
>>>> src);
>>> +        eth_mask->type = match->wc.masks.dl_type;
>>>
>>>          memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks-
>>>> dl_dst);
>>>          memset(&consumed_masks->dl_src, 0, sizeof consumed_masks-
>>>> dl_src);
>>>          consumed_masks->dl_type = 0;
>>>
>>> -        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
>>> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>>> + eth_mask);
>>>      }
>>>
>>>      /* VLAN */
>>> @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>>          struct rte_flow_item_ipv4 *spec, *mask;
>>>
>>> +        NULL_ETH_MASK_IF_ZEROS;
>>> +
>>>          spec = xzalloc(sizeof *spec);
>>>          mask = xzalloc(sizeof *mask);
>>>
>>> @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>>          struct rte_flow_item_ipv6 *spec, *mask;
>>>
>>> +        NULL_ETH_MASK_IF_ZEROS;
>>> +
>>>          spec = xzalloc(sizeof *spec);
>>>          mask = xzalloc(sizeof *mask);
>>>
>>>
>>
>> Great. Thanks Eli, I tested this and it works. I will also have to test this for backporting to v.2.13.1.
>> I can provide test/review tags for this patch.

Just for a clarification.

Emma, will you finalize and send a formal patch for this?
IIUC, Eli expected you to finish this work.

We have a tight time frame since we have 2.14 release scheduled on Monday.
It'll be good to have a patch today, so it could be tested.

>>
>> Thanks, 
>> Emma
>>
>>> Thanks,
>>> Eli
>>>
>>>>>>> Thanks,
>>>>>>> Emma
>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Beilei
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Ian
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Eli
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Ian
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>>>>>>>>>
>>>>>>>
>>
>
Stokes, Ian Aug. 13, 2020, 2:54 p.m. UTC | #18
8/13/20 1:46 PM, Ilya Maximets wrote:
> > On 8/13/20 1:12 PM, Finn, Emma wrote:
> >>
> >>
> >>> -----Original Message-----
> >>> From: Eli Britstein <elibr@nvidia.com>
> >>> Sent: Thursday 13 August 2020 10:26
> >>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
> >>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
> >>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>> matching HWOL for XL710 NIC
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Finn, Emma <emma.finn@intel.com>
> >>>> Sent: Thursday, August 13, 2020 11:47 AM
> >>>> To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets
> >>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
> >>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
> >>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>>> matching HWOL for XL710 NIC
> >>>>
> >>>>> -----Original Message-----
> >>>>> From: Eli Britstein <elibr@nvidia.com>
> >>>>> Sent: Tuesday 11 August 2020 16:49
> >>>>> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
> >>>>> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
> >>>>> Stokes, Ian <ian.stokes@intel.com>; Eli Britstein
> >>>>> <elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia
> >>>>> <jia.guo@intel.com>
> >>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>>>> matching HWOL for XL710 NIC
> >>>>>
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Ilya Maximets <i.maximets@ovn.org>
> >>>>>> Sent: Tuesday, August 11, 2020 4:19 PM
> >>>>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
> >>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>;
> >>>>>> Stokes, Ian <ian.stokes@intel.com>; Eli Britstein
> >>>>>> <elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia
> >>>>>> <jia.guo@intel.com>
> >>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
> >>>>>> matching HWOL for XL710 NIC
> >>>>>>
> >>>>>> On 8/11/20 3:12 PM, Finn, Emma wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>> -----Original Message-----
> >>>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
> >>>>>>>> Sent: Tuesday 11 August 2020 11:02
> >>>>>>>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
> >>>>>>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
> >>>>>>>> Finn, Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
> >>>>>>>> <jia.guo@intel.com>
> >>>>>>>> Cc: i.maximets@ovn.org
> >>>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken
> >>>>>>>> ethernet matching HWOL for XL710 NIC
> >>>>>>>>
> >>>>>>>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> -----Original Message-----
> >>>>>>>>>> From: Stokes, Ian <ian.stokes@intel.com>
> >>>>>>>>>> Sent: Tuesday, August 11, 2020 4:52 AM
> >>>>>>>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
> >>>>>>>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
> >>>>>>>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
> >>>>>>>>>> Cc: i.maximets@ovn.org
> >>>>>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
> >>>>>>>>>> ethernet matching HWOL for XL710 NIC
> >>>>>>>>>>
> >>>>>>>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
> >>>>>>>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
> >>>>>>>>>>>>>>> The following 2 commits introduced changes which caused
> >>>>>>>>>>>>>>> a regression for XL710 devices and functionality ceases
> >>>>>>>>>>>>>>> for partial offload as a result.
> >>>>>>>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet
> >>>>>>>>>>>>>>> matching for type
> >>>>>>>>>>>>>>> only.")
> >>>>>>>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate
> >>> of
> >>>>>>>>>>> patterns
> >>>>>>>>>>>>>>> function.")
> >>>>>>>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs
> >>>>>>>>>>>>>> in intel PMD in dpdk, not in OVS.
> >>>>>>>>>>>>> Hi Eli,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Yes OVS looks to be vendor agnostic, but this code I
> >>>>>>>>>>>>> believe was already in place and working for this usecase.
> >>>>>>>>>>>>> As such it's removal introduced a regression from an OVS
> >>>>>>>>>>>>> point of view between
> >>>>>>>> the releases.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> We have had examples in the past where workarounds are
> >>>>>>>> permissible
> >>>>>>>>>>>>> if there is a clear path to fixing this in the future on
> >>>>>>>>>>>>> the DPDK side (which is what I suggest here) (for example
> >>>>>>>>>>>>> scatter gather support for MTUs in the past raised similar
> >>>>>>>>>>>>> issue where we hand to handle specific NIC until the next
> >>>>>>>>>>>>> DPDK LTS
> >>>> release).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So my suggestion is to re-instate the original workaround
> >>>>>>>>>>>>> and remove its when fixed in the next DPDK LTS which
> >>>>>>>>>>>>> supports the change for i40e at the PMD layer or if it's
> >>>>>>>>>>>>> backported to the next
> >>>>>>>>>>>>> 19.11 stable release which would be validated for use with OVS.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> There was a bug with this WA.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please see
> >>>>>>>>>>>>
> >>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
> >>>>>>>>>>>> 2
> >>>>>>>>>>>>
> >>>>>
> >>>>
> Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
> >>>>>>>>>>>> 6-
> >>>>>
> >>>>>
> &amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a0
> >>> 8
> >>>>>>>>>>>>
> >>>>>
> >>>>>
> d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6373
> >>> 2
> >>>>> 7
> >>>>>> 487
> >>>>>>>>>>>>
> >>>>>
> >>>>>
> 263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69
> >>> %
> >>>>> 2
> >>>>>> B9kb
> >>>>>>>>>>>> NM%3D&amp;reserved=0
> >>>>>>>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
> >>>>>>>>
> >>>>>>>> I'm worried about this bug.  Current version of a workaround is
> >>>>>>>> not correct from the OVS point of view since it wildcards
> >>>>>>>> dl_type during offloading that is not expected by higher
> >>>>>>>> layers, causing HW flow being much more wide than requested.
> >>>>>>>> In case we going to have this workaround for xl710, we should
> >>>>>>>> have another workaround for this
> >>>>> issue too.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Is it possible to address it in DPDK instead of reverting
> >>>>>>>>>>>> in OVS and later re-reverting?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Eli
> >>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've included the i40e DPDK maintainers here for their
> >>>>>>>>>>>>> thoughts
> >>>>> also.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> @Beilei/Jia Is this something we can look at to introduce
> >>>>>>>>>>>>> in the i40e PMD?
> >>>>>>>>>>>
> >>>>>>>>>>> Hello,
> >>>>>>>>>>>
> >>>>>>>>>>> Please take a look at:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
> >>>>>>>>>>> m
> >>>>>>>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
> >>>>>> May%2F166272.html&amp;data=0
> >>>>>>>>>>>
> >>>>>
> >>>>>
> 2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9
> >>> %
> >>>>> 7
> >>>>>> C4
> >>>>>>>>>>>
> >>>>>
> >>>>>
> 3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
> >>>> &
> >>>>> a
> >>>>>> mp;s
> >>>>>>>>>>>
> >>>>>
> >>>>
> data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
> >>>>> ed=
> >>>>>> 0
> >>>>>>>>>>>
> >>>>>>>>>>> For MLX it was only an optimization. For i40e something
> >>>>>>>>>>> similar may be a workaround for this issue.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for this Eli, let me sync with Beilei on this.
> >>>>>>>>>>
> >>>>>>>>>> If it's something we can resolve in the PMD then I think we
> >>>>>>>>>> can add an errata or known issue for the 2.14 release (and
> >>>>>>>>>> possibly the
> >>>>>>>>>> 2.13 release as I think the same issue is present there).
> >>>>>>>>>>
> >>>>>>>>>> If it was fixed in the future we could then remove the issue
> >>>>>>>>>> notice.>
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> According to the OVS patch and mlx DPDK patch, is the
> >>>>>>>>> requirement to support rte flow pattern like 'pattern IPv4 /
> >>>>>>>>> UDP src is 32 / end', no need to use ''pattern ETH / IPv4 /
> >>>>>>>>> UDP src is 32 /
> >>> end '?
> >>>>>>>>> please correct me if I'm wrong.
> >>>>>>>>>
> >>>>>>>>> If yes, could you please tell me how OVS adds a flow which
> >>>>>>>>> doesn’t include ETH item? I'm not very familiar with OVS, I
> >>>>>>>>> run some simple commands before, and find eth will always exist.
> E.g:
> >>>>>>>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
> >>>>>>>>> skb_priority(0),skb_mark(0),\
> >>>>>>>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mar
> >>>>>>>>> k(
> >>>>>>>>> 0
> >>>>>>>>> ),
> >>>>>>>>> c
> >>>>>>>>> t_
> >>>>>>>>> label(0),recirc_id(0),\
> >>>>>>>>>
> >>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
> >>>>>>>>> 80
> >>>>>>>>> ,dst=ff:ff:ff:ff:ff:ff),
> >>>>>>>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17
> >>>>>>>>> ,t
> >>>>>>>>> o
> >>>>>>>>> s=
> >>>>>>>>> 0
> >>>>>>>>> x1
> >>>>>>>>> 0,ttl=128,frag=no),
> >>>>>>>>> udp(src=68,dst=67), actions:drop
> >>>>>>>>
> >>>>>>>> Emma, Ian, could you please provide the pattern that fails to
> >>>>>>>> offload on current OVS master so it will be easier for everyone
> >>>>>>>> to understand the
> >>>>>> issue.
> >>>>>>>> It's not obvious how to construct the bad pattern by only
> >>>>>>>> looking at the i40e pmd code.  Also, please, enable debug logs
> >>>>>>>> and provide the testpmd-style arguments constructed by OVS.
> >>>>>>>>
> >>>>>>>> It might be also good to have all this information in the
> >>>>>>>> commit
> >>> message.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Following flow pattern is failing from logs:
> >>>>>>>
> >>>>>>>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1
> >>>>>>> rte flow eth pattern:
> >>>>>>>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
> >>>>>>>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00,
> >>>>>>> type=0xffff rte flow ipv4 pattern:
> >>>>>>>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
> >>>>>>>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255,
> >>>>>>> dst=0.0.0.0 rte flow count action:
> >>>>>>>   Count: shared=0, id=0
> >>>>>>> rte flow port-id action:
> >>>>>>>   Port-id: original=0, id=1
> >>>>>>> 2020-08-
> >>>>>
> >>>>
> 11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
> >>>>> N|
> >>>>>> dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
> >>>>> Hi
> >>>>>
> >>>>> Maybe another workaround until fixed in i40e PMD is to mask out
> >>>>> the ether_type match if there is IPv4 or IPv6.
> >>>>> It means that patterns of:
> >>>>> eth type is 0x0800 / ipv4 / ...
> >>>>> will be
> >>>>> eth / ipv4 / ...
> >>>>>
> >>>>> For IPv6 the same, but for other ether types no removal of the match.
> >>>>> eth type is 0x1234 /
> >>>>> kept the same.
> >>>>>
> >>>>> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only
> >>>>> an optimization, for XL710 it can be used as a workaround.
> >>>>>
> >>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmai
> >>> l
> >>>>> s
> >>>>> .dpdk.org%2Farchives%2Fdev%2F2020-
> >>>> May%2F166272.html&amp;data=02%7C01%7
> >>>>>
> >>>>
> Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d1
> >>> 5
> >>>> 72734
> >>>>>
> >>>>
> 0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2
> >>> H
> >>>> XS8Wo9Si
> >>>>> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
> >>>>>
> >>>>> A similar masking out of protocol type is done today for
> >>>>> TCP/UDP/SCTP/ICMP. For example:
> >>>>>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match.
> >>> */
> >>>>>         if (next_proto_mask) {
> >>>>>             *next_proto_mask = 0;
> >>>>>         }
> >>>>> Please take a look at
> >>>>>
> >>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpa
> >>> t
> >>>>> c
> >>>>>
> >>>>
> hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718
> >>> .3
> >>>> &am
> >>>>>
> >>>>
> p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f
> >>> 6
> >>>> 57fdd
> >>>>>
> >>>>
> %7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
> >>>> 934&amp;s
> >>>>>
> >>>>
> data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reser
> >>> ved
> >>>> =0
> >>>>> 8633-3-sriharsha.basavapatna@broadcom.com/
> >>>>> That posted commit suggest not to mask out the protocol type and
> >>>>> let the PMDs do it if they wish.
> >>>>>
> >>>>> The proposed workaround (not tested):
> >>>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >>>>> index de6101e4d..1af7bebcd 100644
> >>>>> --- a/lib/netdev-offload-dpdk.c
> >>>>> +++ b/lib/netdev-offload-dpdk.c
> >>>>> @@ -676,6 +676,7 @@ static int
> >>>>>  parse_flow_match(struct flow_patterns *patterns,
> >>>>>                   struct match *match)  {
> >>>>> +    uint8_t *next_eth_type_mask = NULL;
> >>>>>      uint8_t *next_proto_mask = NULL;
> >>>>>      struct flow *consumed_masks;
> >>>>>      uint8_t proto = 0;
> >>>>> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns
> >>> *patterns,
> >>>>>          consumed_masks->dl_type = 0;
> >>>>>
> >>>>>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
> >>>>> mask);
> >>>>> +        next_eth_type_mask = &mask->type;
> >>>>>      }
> >>>>>
> >>>>>      /* VLAN */
> >>>>> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns
> >>> *patterns,
> >>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
> >>>>>          struct rte_flow_item_ipv4 *spec, *mask;
> >>>>>
> >>>>> +        /* IPv4. No need to match ether type. */
> >>>>> +        if (next_eth_type_mask) {
> >>>>> +            *next_eth_type_mask = 0;
> >>>>> +        }
> >>>>> +
> >>>>>          spec = xzalloc(sizeof *spec);
> >>>>>          mask = xzalloc(sizeof *mask);
> >>>>>
> >>>>> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns
> >>> *patterns,
> >>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
> >>>>>          struct rte_flow_item_ipv6 *spec, *mask;
> >>>>>
> >>>>> +        /* IPv6. No need to match ether type. */
> >>>>> +        if (next_eth_type_mask) {
> >>>>> +            *next_eth_type_mask = 0;
> >>>>> +        }
> >>>>> +
> >>>>>          spec = xzalloc(sizeof *spec);
> >>>>>          mask = xzalloc(sizeof *mask);
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>> Or from dump-flows :
> >>>>>>>
> >>>>>>> flow-dump from pmd on cpu core: 23
> >>>>>>> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
> >>>>>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_ma
> >>>>>>> rk
> >>>>>>> (
> >>>>>>> 0/
> >>>>>>> 0
> >>>>>>> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_
> >>>>>>> ty
> >>>>>>> p
> >>>>>>> e(
> >>>>>>> n
> >>>>>>>
> >>> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
> >>>>>>> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
> >>>>>>> 0
> >>>>>>> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=6
> >>>>>>> 3/
> >>>>>>> 0 ), packets:42575487, bytes:2554529220, used:0.000s,
> >>>>>>> offloaded:partial, dp:ovs, actions:dpdk1,
> >>>>>>> dp-extra-info:miniflow_bits(4,2)
> >>>>>>>
> >>>>>>>
> >>>>>>> with regards to " provide the testpmd-style arguments
> >>>>>>> constructed by
> >>>>> OVS."
> >>>>>>> Can you confirm what you mean?
> >>>>>>
> >>>>>> We changed the way of logging in a following commit:
> >>>>>> d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
> >>>>>> create/destroy.")
> >>>>>>
> >>>>>> It's on master and branch-2.14.
> >>>>>>
> >>>>>>>
> >>>>
> >>>> Tested both of these patches and neither fixed the issue.
> >>>>
> >>>> The problem is that parse_flow_match() is setting an ethernet
> >>>> pattern that the i40e PMD doesn't support.
> >>>> Working Flow:
> >>>> rte flow eth pattern:
> >>>> Spec = null
> >>>> Mask = null
> >>>> rte flow ipv4 pattern:
> >>>> Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
> >>>> Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
> >>>>
> >>>> Broken Flow:
> >>>> flow eth pattern:
> >>>> Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
> >>>> Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
> >>>> flow ipv4
> >>>> pattern:
> >>>> Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
> >>>> Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
> >>>>
> >>>> I will continue to try rework the patches to fix this.
> >>>>
> >>>> Thanks,
> >>>> Emma
> >>>>
> >>> Hi Emma,
> >>>
> >>> My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I
> >>> have prepared another one according to your inputs (also not
> >>> tested). Please give it a try, and if works, finalize it with comments etc.
> >>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >>> index
> >>> de6101e4d..f4b04a880 100644
> >>> --- a/lib/netdev-offload-dpdk.c
> >>> +++ b/lib/netdev-offload-dpdk.c
> >>> @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
> >>>      actions->cnt = 0;
> >>>  }
> >>>
> >>> +/* write a comment this is a workaround... */ #define
> >>> +NULL_ETH_MASK_IF_ZEROS \
> >>> +    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) &&
> \
> >>> +        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
> >>> +        patterns->items[0].mask = NULL; \
> >>> +        free(eth_mask); \
> >>> +    }
> >
> > If we're going with this approach, please, make it a function that
> > accepts 'struct rte_flow_item *', i.e. something like:
> >
> >     clear_eth_pattern_if_type_only(&patterns->items[0]);
> >
> > One extra check that it's actually an eth pattern should be added
> > inside the function.
> >
> > Is it OK to pass spec, but not mask?   I do not see anything about this kind
> > of configuration in the rte_flow API definition.  I think, we should
> > clean up spec too in this case, just to be sure that we're not leaking any
> resources.
> >
> >>> +
> >>>  static int
> >>>  parse_flow_match(struct flow_patterns *patterns,
> >>>                   struct match *match)  {
> >>> +    struct rte_flow_item_eth *eth_mask = NULL;
> >
> > In case of a function, there is no need to move this definition
> > outside of its block.
> >
> >>>      uint8_t *next_proto_mask = NULL;
> >>>      struct flow *consumed_masks;
> >>>      uint8_t proto = 0;
> >>> @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns
> *patterns,
> >>>      if (match->wc.masks.dl_type ||
> >>>          !eth_addr_is_zero(match->wc.masks.dl_src) ||
> >>>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
> >>> -        struct rte_flow_item_eth *spec, *mask;
> >>> +        struct rte_flow_item_eth *spec;
> >>>
> >>>          spec = xzalloc(sizeof *spec);
> >>> -        mask = xzalloc(sizeof *mask);
> >>> +        eth_mask = xzalloc(sizeof *eth_mask);
> >>>
> >>>          memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
> >>>          memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
> >>>          spec->type = match->flow.dl_type;
> >>>
> >>> -        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
> >>> -        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
> >>> -        mask->type = match->wc.masks.dl_type;
> >>> +        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof
> >>> + eth_mask-
> >>>> dst);
> >>> +        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof
> >>> + eth_mask-
> >>>> src);
> >>> +        eth_mask->type = match->wc.masks.dl_type;
> >>>
> >>>          memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks-
> >>>> dl_dst);
> >>>          memset(&consumed_masks->dl_src, 0, sizeof consumed_masks-
> >>>> dl_src);
> >>>          consumed_masks->dl_type = 0;
> >>>
> >>> -        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
> >>> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
> >>> + eth_mask);
> >>>      }
> >>>
> >>>      /* VLAN */
> >>> @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
> >>>          struct rte_flow_item_ipv4 *spec, *mask;
> >>>
> >>> +        NULL_ETH_MASK_IF_ZEROS;
> >>> +
> >>>          spec = xzalloc(sizeof *spec);
> >>>          mask = xzalloc(sizeof *mask);
> >>>
> >>> @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
> >>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
> >>>          struct rte_flow_item_ipv6 *spec, *mask;
> >>>
> >>> +        NULL_ETH_MASK_IF_ZEROS;
> >>> +
> >>>          spec = xzalloc(sizeof *spec);
> >>>          mask = xzalloc(sizeof *mask);
> >>>
> >>>
> >>
> >> Great. Thanks Eli, I tested this and it works. I will also have to test this for
> backporting to v.2.13.1.
> >> I can provide test/review tags for this patch.
> 
> Just for a clarification.
> 
> Emma, will you finalize and send a formal patch for this?
> IIUC, Eli expected you to finish this work.

Hi Ilya,

Emma is reworking the patch now and will send on this evening, however Emma will be out of office tomorrow, I can help with the testing and re-work if needed.

> 
> We have a tight time frame since we have 2.14 release scheduled on Monday.
> It'll be good to have a patch today, so it could be tested.

Agreed, once Emma sends the patch the plan is to test and re-work this evening/tomorrow.

Thanks
Ian
> 
> >>
> >> Thanks,
> >> Emma
> >>
> >>> Thanks,
> >>> Eli
> >>>
> >>>>>>> Thanks,
> >>>>>>> Emma
> >>>>>>>
> >>>>>>>> Thanks.
> >>>>>>>>
> >>>>>>>> Best regards, Ilya Maximets.
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Beilei
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Regards
> >>>>>>>>>> Ian
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>>
> >>>>>>>>>>> Eli
> >>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards
> >>>>>>>>>>>>> Ian
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Fixed by partial reversion of these changes.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
> >>>>>>>>>
> >>>>>>>
> >>
> >
Ilya Maximets Aug. 13, 2020, 4:11 p.m. UTC | #19
On 8/13/20 4:54 PM, Stokes, Ian wrote:
> 8/13/20 1:46 PM, Ilya Maximets wrote:
>>> On 8/13/20 1:12 PM, Finn, Emma wrote:
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Eli Britstein <elibr@nvidia.com>
>>>>> Sent: Thursday 13 August 2020 10:26
>>>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>> matching HWOL for XL710 NIC
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Finn, Emma <emma.finn@intel.com>
>>>>>> Sent: Thursday, August 13, 2020 11:47 AM
>>>>>> To: Eli Britstein <elibr@nvidia.com>; Ilya Maximets
>>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>; Stokes,
>>>>>> Ian <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>>> dev@openvswitch.org; Guo, Jia <jia.guo@intel.com>
>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>> matching HWOL for XL710 NIC
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Eli Britstein <elibr@nvidia.com>
>>>>>>> Sent: Tuesday 11 August 2020 16:49
>>>>>>> To: Ilya Maximets <i.maximets@ovn.org>; Finn, Emma
>>>>>>> <emma.finn@intel.com>; Xing, Beilei <beilei.xing@intel.com>;
>>>>>>> Stokes, Ian <ian.stokes@intel.com>; Eli Britstein
>>>>>>> <elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia
>>>>>>> <jia.guo@intel.com>
>>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>>> matching HWOL for XL710 NIC
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>> Sent: Tuesday, August 11, 2020 4:19 PM
>>>>>>>> To: Finn, Emma <emma.finn@intel.com>; Ilya Maximets
>>>>>>>> <i.maximets@ovn.org>; Xing, Beilei <beilei.xing@intel.com>;
>>>>>>>> Stokes, Ian <ian.stokes@intel.com>; Eli Britstein
>>>>>>>> <elibr@mellanox.com>; dev@openvswitch.org; Guo, Jia
>>>>>>>> <jia.guo@intel.com>
>>>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken ethernet
>>>>>>>> matching HWOL for XL710 NIC
>>>>>>>>
>>>>>>>> On 8/11/20 3:12 PM, Finn, Emma wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: Ilya Maximets <i.maximets@ovn.org>
>>>>>>>>>> Sent: Tuesday 11 August 2020 11:02
>>>>>>>>>> To: Xing, Beilei <beilei.xing@intel.com>; Stokes, Ian
>>>>>>>>>> <ian.stokes@intel.com>; Eli Britstein <elibr@mellanox.com>;
>>>>>>>>>> Finn, Emma <emma.finn@intel.com>; dev@openvswitch.org; Guo, Jia
>>>>>>>>>> <jia.guo@intel.com>
>>>>>>>>>> Cc: i.maximets@ovn.org
>>>>>>>>>> Subject: Re: [PATCH] netdev-offload-dpdk: Fix for broken
>>>>>>>>>> ethernet matching HWOL for XL710 NIC
>>>>>>>>>>
>>>>>>>>>> On 8/11/20 5:35 AM, Xing, Beilei wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: Stokes, Ian <ian.stokes@intel.com>
>>>>>>>>>>>> Sent: Tuesday, August 11, 2020 4:52 AM
>>>>>>>>>>>> To: Eli Britstein <elibr@mellanox.com>; Finn, Emma
>>>>>>>>>>>> <emma.finn@intel.com>; dev@openvswitch.org; Xing, Beilei
>>>>>>>>>>>> <beilei.xing@intel.com>; Guo, Jia <jia.guo@intel.com>
>>>>>>>>>>>> Cc: i.maximets@ovn.org
>>>>>>>>>>>> Subject: RE: [PATCH] netdev-offload-dpdk: Fix for broken
>>>>>>>>>>>> ethernet matching HWOL for XL710 NIC
>>>>>>>>>>>>
>>>>>>>>>>>>> On 8/7/2020 7:55 AM, Eli Britstein wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 8/6/2020 8:28 PM, Stokes, Ian wrote:
>>>>>>>>>>>>>>>> On 8/6/2020 6:17 PM, Emma Finn wrote:
>>>>>>>>>>>>>>>>> The following 2 commits introduced changes which caused
>>>>>>>>>>>>>>>>> a regression for XL710 devices and functionality ceases
>>>>>>>>>>>>>>>>> for partial offload as a result.
>>>>>>>>>>>>>>>>> 864852a0624a ("netdev-offload-dpdk: Fix Ethernet
>>>>>>>>>>>>>>>>> matching for type
>>>>>>>>>>>>>>>>> only.")
>>>>>>>>>>>>>>>>> a79eae87abe4 ("netdev-offload-dpdk: Remove pre-validate
>>>>> of
>>>>>>>>>>>>> patterns
>>>>>>>>>>>>>>>>> function.")
>>>>>>>>>>>>>>>> OVS is vendor agnostic. That kind of workaround belongs
>>>>>>>>>>>>>>>> in intel PMD in dpdk, not in OVS.
>>>>>>>>>>>>>>> Hi Eli,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes OVS looks to be vendor agnostic, but this code I
>>>>>>>>>>>>>>> believe was already in place and working for this usecase.
>>>>>>>>>>>>>>> As such it's removal introduced a regression from an OVS
>>>>>>>>>>>>>>> point of view between
>>>>>>>>>> the releases.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We have had examples in the past where workarounds are
>>>>>>>>>> permissible
>>>>>>>>>>>>>>> if there is a clear path to fixing this in the future on
>>>>>>>>>>>>>>> the DPDK side (which is what I suggest here) (for example
>>>>>>>>>>>>>>> scatter gather support for MTUs in the past raised similar
>>>>>>>>>>>>>>> issue where we hand to handle specific NIC until the next
>>>>>>>>>>>>>>> DPDK LTS
>>>>>> release).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So my suggestion is to re-instate the original workaround
>>>>>>>>>>>>>>> and remove its when fixed in the next DPDK LTS which
>>>>>>>>>>>>>>> supports the change for i40e at the PMD layer or if it's
>>>>>>>>>>>>>>> backported to the next
>>>>>>>>>>>>>>> 19.11 stable release which would be validated for use with OVS.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> There was a bug with this WA.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please see
>>>>>>>>>>>>>>
>>>>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%25
>>>>>>>>>>>>>> 2
>>>>>>>>>>>>>>
>>>>>>>
>>>>>>
>> Fpatchwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F158718026
>>>>>>>>>>>>>> 6-
>>>>>>>
>>>>>>>
>> &amp;data=02%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a0
>>>>> 8
>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>> d83df912c9%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6373
>>>>> 2
>>>>>>> 7
>>>>>>>> 487
>>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>> 263661244&amp;sdata=fgnwXLy3xt%2B8H8h9BJwDzlPmp3dtWv6AMTQ69
>>>>> %
>>>>>>> 2
>>>>>>>> B9kb
>>>>>>>>>>>>>> NM%3D&amp;reserved=0
>>>>>>>>>>>>> 28632-1-git-send-email-JackerDune@gmail.com/.
>>>>>>>>>>
>>>>>>>>>> I'm worried about this bug.  Current version of a workaround is
>>>>>>>>>> not correct from the OVS point of view since it wildcards
>>>>>>>>>> dl_type during offloading that is not expected by higher
>>>>>>>>>> layers, causing HW flow being much more wide than requested.
>>>>>>>>>> In case we going to have this workaround for xl710, we should
>>>>>>>>>> have another workaround for this
>>>>>>> issue too.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is it possible to address it in DPDK instead of reverting
>>>>>>>>>>>>>> in OVS and later re-reverting?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Eli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've included the i40e DPDK maintainers here for their
>>>>>>>>>>>>>>> thoughts
>>>>>>> also.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @Beilei/Jia Is this something we can look at to introduce
>>>>>>>>>>>>>>> in the i40e PMD?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please take a look at:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2F
>>>>>>>>>>>>> m
>>>>>>>>>>>>> ails.dpdk.org%2Farchives%2Fdev%2F2020-
>>>>>>>> May%2F166272.html&amp;data=0
>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>> 2%7C01%7Celibr%40nvidia.com%7C2eb7f5d9553c44e4980a08d83df912c9
>>>>> %
>>>>>>> 7
>>>>>>>> C4
>>>>>>>>>>>>>
>>>>>>>
>>>>>>>
>> 3083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637327487263661244
>>>>>> &
>>>>>>> a
>>>>>>>> mp;s
>>>>>>>>>>>>>
>>>>>>>
>>>>>>
>> data=g2Yiy07cL4yGIQYsJzNkZqimUibNMXsDQLNdtJ8Evmw%3D&amp;reserv
>>>>>>> ed=
>>>>>>>> 0
>>>>>>>>>>>>>
>>>>>>>>>>>>> For MLX it was only an optimization. For i40e something
>>>>>>>>>>>>> similar may be a workaround for this issue.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for this Eli, let me sync with Beilei on this.
>>>>>>>>>>>>
>>>>>>>>>>>> If it's something we can resolve in the PMD then I think we
>>>>>>>>>>>> can add an errata or known issue for the 2.14 release (and
>>>>>>>>>>>> possibly the
>>>>>>>>>>>> 2.13 release as I think the same issue is present there).
>>>>>>>>>>>>
>>>>>>>>>>>> If it was fixed in the future we could then remove the issue
>>>>>>>>>>>> notice.>
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> According to the OVS patch and mlx DPDK patch, is the
>>>>>>>>>>> requirement to support rte flow pattern like 'pattern IPv4 /
>>>>>>>>>>> UDP src is 32 / end', no need to use ''pattern ETH / IPv4 /
>>>>>>>>>>> UDP src is 32 /
>>>>> end '?
>>>>>>>>>>> please correct me if I'm wrong.
>>>>>>>>>>>
>>>>>>>>>>> If yes, could you please tell me how OVS adds a flow which
>>>>>>>>>>> doesn’t include ETH item? I'm not very familiar with OVS, I
>>>>>>>>>>> run some simple commands before, and find eth will always exist.
>> E.g:
>>>>>>>>>>> flow_add: ufid:fced09a8-9b8a-420d-9cb6-454ab9bed224
>>>>>>>>>>> skb_priority(0),skb_mark(0),\
>>>>>>>>>>> ct_state(-new-est-rel-rpl-inv-trk-snat-dnat),ct_zone(0),ct_mar
>>>>>>>>>>> k(
>>>>>>>>>>> 0
>>>>>>>>>>> ),
>>>>>>>>>>> c
>>>>>>>>>>> t_
>>>>>>>>>>> label(0),recirc_id(0),\
>>>>>>>>>>>
>>>>> dp_hash(0),in_port(2),packet_type(ns=0,id=0),eth(src=00:e8:ca:11:ba:
>>>>>>>>>>> 80
>>>>>>>>>>> ,dst=ff:ff:ff:ff:ff:ff),
>>>>>>>>>>> eth_type(0x0800),ipv4(src=0.0.0.0,dst=255.255.255.255,proto=17
>>>>>>>>>>> ,t
>>>>>>>>>>> o
>>>>>>>>>>> s=
>>>>>>>>>>> 0
>>>>>>>>>>> x1
>>>>>>>>>>> 0,ttl=128,frag=no),
>>>>>>>>>>> udp(src=68,dst=67), actions:drop
>>>>>>>>>>
>>>>>>>>>> Emma, Ian, could you please provide the pattern that fails to
>>>>>>>>>> offload on current OVS master so it will be easier for everyone
>>>>>>>>>> to understand the
>>>>>>>> issue.
>>>>>>>>>> It's not obvious how to construct the bad pattern by only
>>>>>>>>>> looking at the i40e pmd code.  Also, please, enable debug logs
>>>>>>>>>> and provide the testpmd-style arguments constructed by OVS.
>>>>>>>>>>
>>>>>>>>>> It might be also good to have all this information in the
>>>>>>>>>> commit
>>>>> message.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Following flow pattern is failing from logs:
>>>>>>>>>
>>>>>>>>>   Attributes: ingress=1, egress=0, prio=0, group=0, transfer=1
>>>>>>>>> rte flow eth pattern:
>>>>>>>>>   Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>>>>>>>>   Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00,
>>>>>>>>> type=0xffff rte flow ipv4 pattern:
>>>>>>>>>   Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>>>>>>>>   Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255,
>>>>>>>>> dst=0.0.0.0 rte flow count action:
>>>>>>>>>   Count: shared=0, id=0
>>>>>>>>> rte flow port-id action:
>>>>>>>>>   Port-id: original=0, id=1
>>>>>>>>> 2020-08-
>>>>>>>
>>>>>>
>> 11T10:49:28.002Z|00003|netdev_offload_dpdk(dp_netdev_flow_18)|WAR
>>>>>>> N|
>>>>>>>> dpdk0: rte_flow creation failed: 13 (Unsupported ether_type.).
>>>>>>> Hi
>>>>>>>
>>>>>>> Maybe another workaround until fixed in i40e PMD is to mask out
>>>>>>> the ether_type match if there is IPv4 or IPv6.
>>>>>>> It means that patterns of:
>>>>>>> eth type is 0x0800 / ipv4 / ...
>>>>>>> will be
>>>>>>> eth / ipv4 / ...
>>>>>>>
>>>>>>> For IPv6 the same, but for other ether types no removal of the match.
>>>>>>> eth type is 0x1234 /
>>>>>>> kept the same.
>>>>>>>
>>>>>>> I referred to MLX5 PMD patch. As I mentioned, for MLX5 it's only
>>>>>>> an optimization, for XL710 it can be used as a workaround.
>>>>>>>
>>>>> https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmai
>>>>> l
>>>>>>> s
>>>>>>> .dpdk.org%2Farchives%2Fdev%2F2020-
>>>>>> May%2F166272.html&amp;data=02%7C01%7
>>>>>>>
>>>>>>
>> Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f657fdd%7C43083d1
>>>>> 5
>>>>>> 72734
>>>>>>>
>>>>>>
>> 0c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016934&amp;sdata=D2
>>>>> H
>>>>>> XS8Wo9Si
>>>>>>> aD3BCtY8%2BvX5gKH8a%2B7VmAVG8YPxkjAE%3D&amp;reserved=0
>>>>>>>
>>>>>>> A similar masking out of protocol type is done today for
>>>>>>> TCP/UDP/SCTP/ICMP. For example:
>>>>>>>         /* proto == TCP and ITEM_TYPE_TCP, thus no need for proto match.
>>>>> */
>>>>>>>         if (next_proto_mask) {
>>>>>>>             *next_proto_mask = 0;
>>>>>>>         }
>>>>>>> Please take a look at
>>>>>>>
>>>>> https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpa
>>>>> t
>>>>>>> c
>>>>>>>
>>>>>>
>> hwork.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20200710120718
>>>>> .3
>>>>>> &am
>>>>>>>
>>>>>>
>> p;data=02%7C01%7Celibr%40nvidia.com%7C3ce7437a70c14475299e08d83f
>>>>> 6
>>>>>> 57fdd
>>>>>>>
>>>>>>
>> %7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C637329052456016
>>>>>> 934&amp;s
>>>>>>>
>>>>>>
>> data=I8EPuHTyuiP7e3NeEQ86fXIKyAzf5%2BOQ2B8CHo7ilI4%3D&amp;reser
>>>>> ved
>>>>>> =0
>>>>>>> 8633-3-sriharsha.basavapatna@broadcom.com/
>>>>>>> That posted commit suggest not to mask out the protocol type and
>>>>>>> let the PMDs do it if they wish.
>>>>>>>
>>>>>>> The proposed workaround (not tested):
>>>>>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>>>>>> index de6101e4d..1af7bebcd 100644
>>>>>>> --- a/lib/netdev-offload-dpdk.c
>>>>>>> +++ b/lib/netdev-offload-dpdk.c
>>>>>>> @@ -676,6 +676,7 @@ static int
>>>>>>>  parse_flow_match(struct flow_patterns *patterns,
>>>>>>>                   struct match *match)  {
>>>>>>> +    uint8_t *next_eth_type_mask = NULL;
>>>>>>>      uint8_t *next_proto_mask = NULL;
>>>>>>>      struct flow *consumed_masks;
>>>>>>>      uint8_t proto = 0;
>>>>>>> @@ -712,6 +713,7 @@ parse_flow_match(struct flow_patterns
>>>>> *patterns,
>>>>>>>          consumed_masks->dl_type = 0;
>>>>>>>
>>>>>>>          add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>>>>>>> mask);
>>>>>>> +        next_eth_type_mask = &mask->type;
>>>>>>>      }
>>>>>>>
>>>>>>>      /* VLAN */
>>>>>>> @@ -739,6 +741,11 @@ parse_flow_match(struct flow_patterns
>>>>> *patterns,
>>>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>>>>>>          struct rte_flow_item_ipv4 *spec, *mask;
>>>>>>>
>>>>>>> +        /* IPv4. No need to match ether type. */
>>>>>>> +        if (next_eth_type_mask) {
>>>>>>> +            *next_eth_type_mask = 0;
>>>>>>> +        }
>>>>>>> +
>>>>>>>          spec = xzalloc(sizeof *spec);
>>>>>>>          mask = xzalloc(sizeof *mask);
>>>>>>>
>>>>>>> @@ -777,6 +784,11 @@ parse_flow_match(struct flow_patterns
>>>>> *patterns,
>>>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>>>>>>          struct rte_flow_item_ipv6 *spec, *mask;
>>>>>>>
>>>>>>> +        /* IPv6. No need to match ether type. */
>>>>>>> +        if (next_eth_type_mask) {
>>>>>>> +            *next_eth_type_mask = 0;
>>>>>>> +        }
>>>>>>> +
>>>>>>>          spec = xzalloc(sizeof *spec);
>>>>>>>          mask = xzalloc(sizeof *mask);
>>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> Or from dump-flows :
>>>>>>>>>
>>>>>>>>> flow-dump from pmd on cpu core: 23
>>>>>>>>> ufid:b4ba667d-8f4d-4f45-8896-c541d9fcecc0,
>>>>>>>>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_ma
>>>>>>>>> rk
>>>>>>>>> (
>>>>>>>>> 0/
>>>>>>>>> 0
>>>>>>>>> ),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(dpdk0),packet_
>>>>>>>>> ty
>>>>>>>>> p
>>>>>>>>> e(
>>>>>>>>> n
>>>>>>>>>
>>>>> s=0,id=0),eth(src=00:00:04:00:0b:00/00:00:00:00:00:00,dst=00:00:04:00:
>>>>>>>>> 0a:00/00:00:00:00:00:00),eth_type(0x0800),ipv4(src=1.1.0.0,dst=0.0.0.
>>>>>>>>> 0
>>>>>>>>> /0.0.0.0,proto=17/0,tos=0/0,ttl=64/0,frag=no),udp(src=63/0,dst=6
>>>>>>>>> 3/
>>>>>>>>> 0 ), packets:42575487, bytes:2554529220, used:0.000s,
>>>>>>>>> offloaded:partial, dp:ovs, actions:dpdk1,
>>>>>>>>> dp-extra-info:miniflow_bits(4,2)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> with regards to " provide the testpmd-style arguments
>>>>>>>>> constructed by
>>>>>>> OVS."
>>>>>>>>> Can you confirm what you mean?
>>>>>>>>
>>>>>>>> We changed the way of logging in a following commit:
>>>>>>>> d8ad173fb9c1 ("netdev-offload-dpdk: Log testpmd format for flow
>>>>>>>> create/destroy.")
>>>>>>>>
>>>>>>>> It's on master and branch-2.14.
>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>> Tested both of these patches and neither fixed the issue.
>>>>>>
>>>>>> The problem is that parse_flow_match() is setting an ethernet
>>>>>> pattern that the i40e PMD doesn't support.
>>>>>> Working Flow:
>>>>>> rte flow eth pattern:
>>>>>> Spec = null
>>>>>> Mask = null
>>>>>> rte flow ipv4 pattern:
>>>>>> Spec: tos=0x0, ttl=40, proto=0x11, src=21.2.5.1, dst=0.0.0.0
>>>>>> Mask: tos=0x0, ttl=0, proto=0x0, src=240.0.0.0, dst=0.0.0.0
>>>>>>
>>>>>> Broken Flow:
>>>>>> flow eth pattern:
>>>>>> Spec: src=00:00:04:00:0b:00, dst=00:00:04:00:0a:00, type=0x0800
>>>>>> Mask: src=00:00:00:00:00:00, dst=00:00:00:00:00:00, type=0xffff rte
>>>>>> flow ipv4
>>>>>> pattern:
>>>>>> Spec: tos=0x0, ttl=40, proto=0x11, src=1.1.0.0, dst=0.0.0.0
>>>>>> Mask: tos=0x0, ttl=0, proto=0x0, src=255.255.255.255, dst=0.0.0.0
>>>>>>
>>>>>> I will continue to try rework the patches to fix this.
>>>>>>
>>>>>> Thanks,
>>>>>> Emma
>>>>>>
>>>>> Hi Emma,
>>>>>
>>>>> My previous patch had a bug (uint8_t instead of uint16_t). Anyway, I
>>>>> have prepared another one according to your inputs (also not
>>>>> tested). Please give it a try, and if works, finalize it with comments etc.
>>>>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>>>>> index
>>>>> de6101e4d..f4b04a880 100644
>>>>> --- a/lib/netdev-offload-dpdk.c
>>>>> +++ b/lib/netdev-offload-dpdk.c
>>>>> @@ -672,10 +672,19 @@ free_flow_actions(struct flow_actions *actions)
>>>>>      actions->cnt = 0;
>>>>>  }
>>>>>
>>>>> +/* write a comment this is a workaround... */ #define
>>>>> +NULL_ETH_MASK_IF_ZEROS \
>>>>> +    if (eth_mask && is_all_zeros(&eth_mask->src, sizeof eth_mask->src) &&
>> \
>>>>> +        is_all_zeros(&eth_mask->dst, sizeof eth_mask->dst)) { \
>>>>> +        patterns->items[0].mask = NULL; \
>>>>> +        free(eth_mask); \
>>>>> +    }
>>>
>>> If we're going with this approach, please, make it a function that
>>> accepts 'struct rte_flow_item *', i.e. something like:
>>>
>>>     clear_eth_pattern_if_type_only(&patterns->items[0]);
>>>
>>> One extra check that it's actually an eth pattern should be added
>>> inside the function.
>>>
>>> Is it OK to pass spec, but not mask?   I do not see anything about this kind
>>> of configuration in the rte_flow API definition.  I think, we should
>>> clean up spec too in this case, just to be sure that we're not leaking any
>> resources.
>>>
>>>>> +
>>>>>  static int
>>>>>  parse_flow_match(struct flow_patterns *patterns,
>>>>>                   struct match *match)  {
>>>>> +    struct rte_flow_item_eth *eth_mask = NULL;
>>>
>>> In case of a function, there is no need to move this definition
>>> outside of its block.
>>>
>>>>>      uint8_t *next_proto_mask = NULL;
>>>>>      struct flow *consumed_masks;
>>>>>      uint8_t proto = 0;
>>>>> @@ -694,24 +703,24 @@ parse_flow_match(struct flow_patterns
>> *patterns,
>>>>>      if (match->wc.masks.dl_type ||
>>>>>          !eth_addr_is_zero(match->wc.masks.dl_src) ||
>>>>>          !eth_addr_is_zero(match->wc.masks.dl_dst)) {
>>>>> -        struct rte_flow_item_eth *spec, *mask;
>>>>> +        struct rte_flow_item_eth *spec;
>>>>>
>>>>>          spec = xzalloc(sizeof *spec);
>>>>> -        mask = xzalloc(sizeof *mask);
>>>>> +        eth_mask = xzalloc(sizeof *eth_mask);
>>>>>
>>>>>          memcpy(&spec->dst, &match->flow.dl_dst, sizeof spec->dst);
>>>>>          memcpy(&spec->src, &match->flow.dl_src, sizeof spec->src);
>>>>>          spec->type = match->flow.dl_type;
>>>>>
>>>>> -        memcpy(&mask->dst, &match->wc.masks.dl_dst, sizeof mask->dst);
>>>>> -        memcpy(&mask->src, &match->wc.masks.dl_src, sizeof mask->src);
>>>>> -        mask->type = match->wc.masks.dl_type;
>>>>> +        memcpy(&eth_mask->dst, &match->wc.masks.dl_dst, sizeof
>>>>> + eth_mask-
>>>>>> dst);
>>>>> +        memcpy(&eth_mask->src, &match->wc.masks.dl_src, sizeof
>>>>> + eth_mask-
>>>>>> src);
>>>>> +        eth_mask->type = match->wc.masks.dl_type;
>>>>>
>>>>>          memset(&consumed_masks->dl_dst, 0, sizeof consumed_masks-
>>>>>> dl_dst);
>>>>>          memset(&consumed_masks->dl_src, 0, sizeof consumed_masks-
>>>>>> dl_src);
>>>>>          consumed_masks->dl_type = 0;
>>>>>
>>>>> -        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
>>>>> +        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec,
>>>>> + eth_mask);
>>>>>      }
>>>>>
>>>>>      /* VLAN */
>>>>> @@ -739,6 +748,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IP)) {
>>>>>          struct rte_flow_item_ipv4 *spec, *mask;
>>>>>
>>>>> +        NULL_ETH_MASK_IF_ZEROS;
>>>>> +
>>>>>          spec = xzalloc(sizeof *spec);
>>>>>          mask = xzalloc(sizeof *mask);
>>>>>
>>>>> @@ -777,6 +788,8 @@ parse_flow_match(struct flow_patterns *patterns,
>>>>>      if (match->flow.dl_type == htons(ETH_TYPE_IPV6)) {
>>>>>          struct rte_flow_item_ipv6 *spec, *mask;
>>>>>
>>>>> +        NULL_ETH_MASK_IF_ZEROS;
>>>>> +
>>>>>          spec = xzalloc(sizeof *spec);
>>>>>          mask = xzalloc(sizeof *mask);
>>>>>
>>>>>
>>>>
>>>> Great. Thanks Eli, I tested this and it works. I will also have to test this for
>> backporting to v.2.13.1.
>>>> I can provide test/review tags for this patch.
>>
>> Just for a clarification.
>>
>> Emma, will you finalize and send a formal patch for this?
>> IIUC, Eli expected you to finish this work.
> 
> Hi Ilya,
> 
> Emma is reworking the patch now and will send on this evening, however Emma will be out of office tomorrow, I can help with the testing and re-work if needed.

Great.  Thanks!

> 
>>
>> We have a tight time frame since we have 2.14 release scheduled on Monday.
>> It'll be good to have a patch today, so it could be tested.
> 
> Agreed, once Emma sends the patch the plan is to test and re-work this evening/tomorrow.
> 
> Thanks
> Ian
>>
>>>>
>>>> Thanks,
>>>> Emma
>>>>
>>>>> Thanks,
>>>>> Eli
>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Emma
>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Beilei
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>> Ian
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Eli
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>> Ian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fixed by partial reversion of these changes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Signed-off-by: Emma Finn<emma.finn@intel.com>
>>>>>>>>>>>
>>>>>>>>>
>>>>
>>>
>
diff mbox series

Patch

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index de6101e..b300884 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -691,8 +691,7 @@  parse_flow_match(struct flow_patterns *patterns,
     consumed_masks->packet_type = 0;
 
     /* Eth */
-    if (match->wc.masks.dl_type ||
-        !eth_addr_is_zero(match->wc.masks.dl_src) ||
+    if (!eth_addr_is_zero(match->wc.masks.dl_src) ||
         !eth_addr_is_zero(match->wc.masks.dl_dst)) {
         struct rte_flow_item_eth *spec, *mask;
 
@@ -712,6 +711,16 @@  parse_flow_match(struct flow_patterns *patterns,
         consumed_masks->dl_type = 0;
 
         add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, spec, mask);
+    } else {
+        /*
+        * If user specifies a flow (like UDP flow) without L2 patterns,
+        * OVS will at least set the dl_type. Normally, it's enough to
+        * create an eth pattern just with it. Unluckily, some Intel's
+        * NIC (such as XL710) doesn't support that. Below is a workaround,
+        * which simply matches any L2 pkts.
+        */
+        consumed_masks->dl_type = 0;
+        add_flow_pattern(patterns, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
     }
 
     /* VLAN */