
[ovs-dev,V2,00/14] Netdev vxlan-decap offload

Message ID 20210210152702.4898-1-elibr@nvidia.com

Message

Eli Britstein Feb. 10, 2021, 3:26 p.m. UTC
VXLAN decap in OVS-DPDK configuration consists of two flows:
F1: in_port(ens1f0),eth(),ipv4(),udp(), actions:tnl_pop(vxlan_sys_4789)
F2: tunnel(),in_port(vxlan_sys_4789),eth(),ipv4(), actions:ens1f0_0

F1 is a classification flow. It matches on the outer headers and
classifies the packet as a VXLAN packet, and via the tnl_pop action the
packet continues processing in F2.
F2 is a flow that matches on the tunnel metadata as well as on the inner
packet headers (like any other flow).

In order to fully offload the VXLAN decap path, both F1 and F2 should be
offloaded. As there is more than one flow in HW, it is possible that
F1 is done by HW but F2 is not. The packet is then received by SW, and
should be processed starting from F2, as F1 was already done by HW.
Rte_flows are applicable only to physical port IDs. Keeping the original
physical in_port on which the packet was received enables applying
vport flows (e.g. F2) on that physical port.
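
A minimal sketch of that SW recovery step on top of the DPDK 20.11 tunnel
offload API referenced below; the helper name and the metadata handling are
assumptions made for illustration, not this series' actual code:

    /* Hypothetical sketch: recover HW state for a packet that HW matched
     * against F1 but missed on F2 (assumes DPDK >= 20.11). */
    #include <rte_flow.h>
    #include <rte_mbuf.h>

    static int
    hw_miss_packet_recover(uint16_t port_id, struct rte_mbuf *m)
    {
        struct rte_flow_restore_info info;
        struct rte_flow_error error;

        /* Ask the PMD whether HW already partially processed this packet. */
        if (rte_flow_get_restore_info(port_id, m, &info, &error)) {
            return -1;  /* No HW state to recover; SW starts from F1. */
        }
        if (info.flags & RTE_FLOW_RESTORE_INFO_TUNNEL) {
            /* F1 was already done by HW: attach info.tunnel as the
             * packet's tunnel metadata so SW processing resumes at F2. */
        }
        return 0;
    }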

This patch-set makes use of the tunnel offload API [1], introduced in
DPDK 20.11.
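
For reference, a minimal sketch of how F1 (the tnl_pop flow) maps onto that
API; the VXLAN tunnel description and the merging of actions are simplified
assumptions, not the exact code in this series:

    /* Hypothetical sketch of offloading F1 with the tunnel offload API.
     * The match pattern[] is elided; error handling is minimal. */
    #include <rte_byteorder.h>
    #include <rte_flow.h>

    static struct rte_flow *
    offload_tnl_pop(uint16_t port_id, const struct rte_flow_attr *attr,
                    const struct rte_flow_item *pattern)
    {
        struct rte_flow_tunnel tunnel = {
            .type = RTE_FLOW_ITEM_TYPE_VXLAN,
            .tp_dst = RTE_BE16(4789),   /* vxlan_sys_4789 */
        };
        struct rte_flow_action *pmd_actions = NULL;
        struct rte_flow_action actions[8];
        uint32_t num_pmd_actions = 0;
        struct rte_flow_error error;
        struct rte_flow *flow;
        uint32_t i;

        /* The PMD supplies private actions that implement the decap set. */
        if (rte_flow_tunnel_decap_set(port_id, &tunnel, &pmd_actions,
                                      &num_pmd_actions, &error)) {
            return NULL;
        }
        /* Terminate with END; a real rule would also append its own
         * actions (e.g. a JUMP to the tunnel group) before END. */
        for (i = 0; i < num_pmd_actions && i < 7; i++) {
            actions[i] = pmd_actions[i];
        }
        actions[i] = (struct rte_flow_action) {
            .type = RTE_FLOW_ACTION_TYPE_END,
        };

        flow = rte_flow_create(port_id, attr, pattern, actions, &error);
        rte_flow_tunnel_action_decap_release(port_id, pmd_actions,
                                             num_pmd_actions, &error);
        return flow;
    }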

v1->v2:
- Track the original in_port, and apply vport flows on that physical port instead of on all PFs.

Travis:
v1: https://travis-ci.org/github/elibritstein/OVS/builds/756418552
v2: https://travis-ci.org/github/elibritstein/OVS/builds/758382963

GitHub Actions:
v1: https://github.com/elibritstein/OVS/actions/runs/515334647
v2: https://github.com/elibritstein/OVS/actions/runs/554986007

[1] https://mails.dpdk.org/archives/dev/2020-October/187314.html

Eli Britstein (11):
  netdev-offload: Add HW miss packet state recover API
  netdev-dpdk: Introduce DPDK tunnel APIs
  netdev-offload-dpdk: Implement flow dump create/destroy APIs
  netdev-dpdk: Add flow_api support for netdev vxlan vports
  netdev-offload-dpdk: Implement HW miss packet recover for vport
  dpif-netdev: Add HW miss packet state recover logic
  netdev-offload-dpdk: Change log rate limits
  netdev-offload-dpdk: Support tunnel pop action
  netdev-offload-dpdk: Refactor offload rule creation
  netdev-offload-dpdk: Support vports flows offload
  netdev-dpdk-offload: Add vxlan pattern matching function

Ilya Maximets (2):
  netdev-offload: Allow offloading to netdev without ifindex.
  netdev-offload: Disallow offloading to unrelated tunneling vports.

Sriharsha Basavapatna (1):
  dpif-netdev: Provide orig_in_port in metadata for tunneled packets

 Documentation/howto/dpdk.rst  |   1 +
 NEWS                          |   2 +
 lib/dpif-netdev.c             |  69 ++--
 lib/netdev-dpdk.c             | 118 ++++++
 lib/netdev-dpdk.h             | 102 ++++-
 lib/netdev-offload-dpdk.c     | 706 +++++++++++++++++++++++++++++-----
 lib/netdev-offload-provider.h |   5 +
 lib/netdev-offload-tc.c       |   8 +
 lib/netdev-offload.c          |  29 +-
 lib/netdev-offload.h          |   2 +
 lib/packets.h                 |   8 +-
 11 files changed, 921 insertions(+), 129 deletions(-)

Comments

Pai G, Sunil Feb. 18, 2021, 1:41 p.m. UTC | #1
Sending to Marko, as he wasn't subscribed to ovs-dev then.

> -----Original Message-----
> From: dev <ovs-dev-bounces@openvswitch.org> On Behalf Of Eli Britstein
> Sent: Wednesday, February 10, 2021 8:57 PM
> To: dev@openvswitch.org; Ilya Maximets <i.maximets@ovn.org>
> Cc: Eli Britstein <elibr@nvidia.com>; Ameer Mahagneh
> <ameerm@nvidia.com>; Majd Dibbiny <majd@nvidia.com>; Gaetan Rivet
> <gaetanr@nvidia.com>
> Subject: [ovs-dev] [PATCH V2 00/14] Netdev vxlan-decap offload
> 
> <...>
Kovacevic, Marko Feb. 18, 2021, 4:38 p.m. UTC | #2
<...> 
> Sending to Marko, as he wasn't subscribed to ovs-dev then.
> 
<...>

Hi Eli,

After testing the patchset, it seems that after the tenth patch I start seeing a drop of ~4% in scatter performance across all packet sizes tested (112, 256, 512, 1518).
The burst measurement sees a decrease too, but not as much as scatter does.

Patch10
fff1f9168 netdev-offload-dpdk: Support tunnel pop action

The test used for this is 32 virtio-user ports with 1 million flows.

Traffic @ Phy NIC Rx:
Ether()/IP()/UDP()/VXLAN()/Ether()/IP()

Burst: on the outer IP we send a burst of 32 packets with the same IP, then switch for the next 32, and so on.
Scatter: for scatter the outer IP changes incrementally across each group of 32.
And on the inner packet we have a total of 1048576 flows.

I can send a diagram of the test setup directly; I'm restricted from sending HTML here, so I can't include it inline.

Thanks 
Marko K
Eli Britstein Feb. 21, 2021, 1:34 p.m. UTC | #3
On 2/18/2021 6:38 PM, Kovacevic, Marko wrote:
> <...>
>
> Hi Eli,
>
> After testing the patchset, it seems that after the tenth patch I start seeing a drop of ~4% in scatter performance across all packet sizes tested (112, 256, 512, 1518).
> The burst measurement sees a decrease too, but not as much as scatter does.

Hi Marko,

Thanks for testing this series.

>
> Patch10
> fff1f9168 netdev-offload-dpdk: Support tunnel pop action

It doesn't make sense that this commit causes any degradation as it only 
enhances offloads that are not in the datapath and not done for 
virtio-user ports in any case.

Could you please double check?

I would expect maybe a degradation with:

Patch 12: 8a21a377c dpif-netdev: Provide orig_in_port in metadata for 
tunneled packets

Patch 6: e548c079d dpif-netdev: Add HW miss packet state recover logic

Could you please double check what is the offending commit?

Do you compile with ALLOW_EXPERIMENTAL_API defined or not?

> The test used for this is 32 virtio-user ports with 1 million flows.

Could you please elaborate on the exact details of your setup and test?

What are "1M flows"? what are the differences between them?

What are the OpenFlow rules you use?

Are there any other configurations set (other_config for example)?

What is being done with the packets on the guest side? Are all the ports in
the same VM?

>
> <...>

As commented above, I would appreciate more details about your tests and 
setup.

Thanks,

Eli

Sriharsha Basavapatna Feb. 23, 2021, 10:48 a.m. UTC | #4
On Sun, Feb 21, 2021 at 7:04 PM Eli Britstein <elibr@nvidia.com> wrote:
>
> <...>
> >
> > Patch10
> > fff1f9168 netdev-offload-dpdk: Support tunnel pop action
>
> It doesn't make sense that this commit causes any degradation as it only
> enhances offloads that are not in the datapath and not done for
> virtio-user ports in any case.

Patch 10 enables offload for flow F1 with tnl_pop action. If
hw_offload is enabled, then the new code to offload this flow would be
executed for virtio-user ports as well, since this flow is independent
of the end point port (whether virtio or vf-rep).

Before this patch (i.e., with the original code in master/2.15),
parse_flow_actions() would fail for the TUNNEL_POP action. But with the
new code, this action is processed by the function
add_tnl_pop_action(). There is some processing in this function,
including a new rte_flow API call (rte_flow_tunnel_decap_set) to the PMD.
Maybe this is adding some overhead?

Thanks,
-Harsha
> <...>
Kovacevic, Marko Feb. 23, 2021, 11:38 a.m. UTC | #5
> <...>
> 
> Hi Marko,
> 
> Thanks for testing this series.
> 
> >
> > Patch10
> > fff1f9168 netdev-offload-dpdk: Support tunnel pop action
> 
> It doesn't make sense that this commit causes any degradation as it only
> enhances offloads that are not in the datapath and not done for
> virtio-user ports in any case.
> 
> Could you please double check?
> 
> I would expect maybe a degradation with:
> 
> Patch 12: 8a21a377c dpif-netdev: Provide orig_in_port in metadata for
> tunneled packets
> 
> Patch 6: e548c079d dpif-netdev: Add HW miss packet state recover logic
> 
> Could you please double check what is the offending commit?

So what I did after the initial testing was roll back to patch 3, then 7, and didn't see any perf drop; then I went to 10, and once I tested it I saw the perf drop from the initial test.
I'll double check it again with more commits to be sure.

> 
> Do you compile with ALLOW_EXPERIMENTAL_API defined or not?

This is what I compile with: 
./boot.sh
./configure --with-dpdk=static CFLAGS="-g -Ofast -march=native"
make -j 10 
make install
> 
> > The test used for this is 32 virtio-user ports with 1 million flows.
> 
> Could you please elaborate on the exact details of your setup and test?
> 
> What are "1M flows"? What are the differences between them?
	There is no difference in the million flows; that is just how many are generated.
	The IPs mostly change with each flow.

I commented below about the 1M flows on the test.


> 
> What are the OpenFlow rules you use?
> 
> Are there any other configurations set (other_config for example)?
> 
> What is being done with the packets on the guest side? Are all the ports in
> the same VM?
	Encap/decap on the vhost connection; it's 32 virtio-user ports, not VMs, which simulates the VM connection without using one.

Traffic @ Phy NIC Rx: 
Ether()/IP()/UDP()/VXLAN()/Ether()/IP() 

BURST: 

On the outer IP, the source address remains the same for 32 instances before changing by .1, 1024 times, while the destination remains the same. For the inner IP, both the destination and source IPs increment by .1 for a total of 1048576, creating the 1 million flows for the test.

SCATTER: 

On the outer IP, the source address changes incrementally from 1 to 32, unlike burst, which uses the same IP for 32 instances; the destination address remains the same on the outer IP. For the inner IP, the source IP remains the same while the destination address increments, for a total of 1048576 flows, creating the million flows for the test.
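
In rough pseudo-C, the two patterns above (a sketch of the logic as
described, with addresses as integer offsets from a base address; this is
not the actual generator code):

    #include <stdint.h>

    #define GROUP  32
    #define FLOWS  1048576  /* total inner flows (1024 * 1024) */

    /* Burst: outer source constant for 32 packets, then steps by .1,
     * 1024 times; both inner addresses step by .1 with every flow. */
    static void
    burst_addrs(uint32_t i, uint32_t *outer_src, uint32_t *inner_src,
                uint32_t *inner_dst)
    {
        *outer_src = (i / GROUP) % 1024;  /* changes once per 32 packets */
        *inner_src = i;
        *inner_dst = i;
    }

    /* Scatter: outer source cycles 1..32 with every packet; inner
     * destination steps with every flow, inner source stays fixed. */
    static void
    scatter_addrs(uint32_t i, uint32_t *outer_src, uint32_t *inner_dst)
    {
        *outer_src = i % GROUP + 1;
        *inner_dst = i;
    }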

> <...>
Eli Britstein Feb. 23, 2021, 11:44 a.m. UTC | #6
On 2/23/2021 12:48 PM, Sriharsha Basavapatna wrote:
> On Sun, Feb 21, 2021 at 7:04 PM Eli Britstein <elibr@nvidia.com> wrote:
>> <...>
>>> Patch10
>>> fff1f9168 netdev-offload-dpdk: Support tunnel pop action
>> It doesn't make sense that this commit causes any degradation as it only
>> enhances offloads that are not in the datapath and not done for
>> virtio-user ports in any case.
> Patch 10 enables offload for flow F1 with tnl_pop action. If
> hw_offload is enabled, then the new code to offload this flow would be
> executed for virtio-user ports as well, since this flow is independent
> of the end point port (whether virtio or vf-rep).
No. virtio-user ports won't have "flow_api" function pointer to dpdk 
offload provider. Although, this tnl_pop flow is on the PF, so it is not 
virtio-user.
>
> Before this patch (i.e., with the original code in master/2.15),
> parse_flow_actions() would fail for TUNNEL_POP action. But with the
> new code, this action is processed by the function -
> add_tnl_pop_action(). There is some processing in this function,
> including a new rte_flow API (rte_flow_tunnel_decap_set) to the PMD.
> Maybe this is adding some overhead?

The new API is processed in the offload thread, not in the datapath.
Indeed it can affect the datapath, depending on if/how the PF's PMD
supports/implements it.

As seen from Marko's configuration line, there is no experimental
support, so there are no new offloads either.

> <...>
Eli Britstein Feb. 23, 2021, 11:50 a.m. UTC | #7
On 2/23/2021 1:38 PM, Kovacevic, Marko wrote:
> <...>
>>> Patch10
>>> fff1f9168 netdev-offload-dpdk: Support tunnel pop action
>> It doesn't make sense that this commit causes any degradation as it only
>> enhances offloads that are not in the datapath and not done for
>> virtio-user ports in any case.
>>
>> Could you please double check?
>>
>> I would expect maybe a degradation with:
>>
>> Patch 12: 8a21a377c dpif-netdev: Provide orig_in_port in metadata for
>> tunneled packets
>>
>> Patch 6: e548c079d dpif-netdev: Add HW miss packet state recover logic
>>
>> Could you please double check what is the offending commit?
> So what I did after the initial testing was roll back to patch 3, then 7, and didn't see any perf drop; then I went to 10, and once I tested it I saw the perf drop from the initial test.
> I'll double check it again with more commits to be sure.
>
>> Do you compile with ALLOW_EXPERIMENTAL_API defined or not?
> This is what I compile with:
> ./boot.sh
> ./configure --with-dpdk=static CFLAGS="-g -Ofast -march=native"
> make -j 10
> make install

Thanks. There is no ALLOW_EXPERIMENTAL_API defined, so there are no new 
offloads introduced by this series for this compilation.
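
For illustration only, assuming the define is simply passed through CFLAGS
(not a tested configure line), enabling them would look something like:

./configure --with-dpdk=static CFLAGS="-g -Ofast -march=native -DALLOW_EXPERIMENTAL_API"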

The only datapath commits that might be suspect are then the ones noted
above (6, 12). Please double check.

I would expect your PF (NIC) to support partial offload. Could you
please verify that this is not affected before and after the series?

>>> The test used for this is 32 virtio-user ports with 1 million flows.
>> Could you please elaborate on the exact details of your setup and test?
>>
>> What are "1M flows"? What are the differences between them?
>          There is no difference in the million flows; that is just how many are generated.
>          The IPs mostly change with each flow.
>
> I commented below about the 1M flows on the test.
>
>
>> What are the OpenFlow rules you use?
>>
>> Are there any other configurations set (other_config for example)?
>>
>> What is being done with the packets on the guest side? Are all the ports in
>> the same VM?
>          Encap/decap on the vhost connection; it's 32 virtio-user ports, not VMs, which simulates the VM connection without using one.
Could you please elaborate on the details of this? Is it testpmd? If so,
what's the execution line? Or some other app?
> <...>
Sriharsha Basavapatna Feb. 23, 2021, 1:35 p.m. UTC | #8
On Tue, Feb 23, 2021 at 5:14 PM Eli Britstein <elibr@nvidia.com> wrote:
>
>
> On 2/23/2021 12:48 PM, Sriharsha Basavapatna wrote:
> > <...>
> > Patch 10 enables offload for flow F1 with tnl_pop action. If
> > hw_offload is enabled, then the new code to offload this flow would be
> > executed for virtio-user ports as well, since this flow is independent
> > of the end point port (whether virtio or vf-rep).
> No. virtio-user ports won't have "flow_api" function pointer to dpdk
> offload provider. Although, this tnl_pop flow is on the PF, so it is not
> virtio-user.

I know that virtio-user ports won't have "flow-api" function pointers.
That's not what I meant. While offloading flow-F1, we don't really
know what the final endpoint port is (virtio or vf-rep), since the
in_port for flow-F1 is a PF port. So, add_tnl_pop_action() would be
executed independent of the target destination port (which is
available as out_port in flow-F2). So, even if the packet is
eventually destined to a virtio-user port (in F2), F1 still executes
add_tnl_pop_action().

> >
> > Before this patch (i.e., with the original code in master/2.15),
> > parse_flow_actions() would fail for TUNNEL_POP action. But with the
> > new code, this action is processed by the function -
> > add_tnl_pop_action(). There is some processing in this function,
> > including a new rte_flow API (rte_flow_tunnel_decap_set) to the PMD.
> > Maybe this is adding some overhead?
>
> The new API is processed in the offload thread, not in the datapath.
> Indeed it can affect the datapath, depending on if/how the PF's PMD
> supports/implements it.
>
> As seen from Marko's configuration line, there is no experimental
> support, so there are no new offloads either.

Even if experimental API support is not enabled, if hw-offload is
enabled in OVS, then add_tnl_pop_action() would still be called? And
at the very least these three functions would be invoked in that function:
netdev_ports_get(), vport_to_rte_tunnel() and
netdev_dpdk_rte_flow_tunnel_decap_set(); the last one returns -1.
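
Roughly, my reading of the compile-time guard (a sketch of the mechanism
only, not the exact patch code; the port-id helper below is made up):

    int
    netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev *netdev,
                                          struct rte_flow_tunnel *tunnel,
                                          struct rte_flow_action **actions,
                                          uint32_t *num_of_actions,
                                          struct rte_flow_error *error)
    {
    #ifdef ALLOW_EXPERIMENTAL_API
        /* Hand off to the PMD on the underlying DPDK port;
         * get_dpdk_port_id() is a placeholder name. */
        return rte_flow_tunnel_decap_set(get_dpdk_port_id(netdev), tunnel,
                                         actions, num_of_actions, error);
    #else
        return -1;  /* Experimental API not compiled in. */
    #endif
    }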

Is hw-offload enabled in Marko's configuration?


> <...>
Eli Britstein Feb. 23, 2021, 1:41 p.m. UTC | #9
On 2/23/2021 3:35 PM, Sriharsha Basavapatna wrote:
> On Tue, Feb 23, 2021 at 5:14 PM Eli Britstein <elibr@nvidia.com> wrote:
>> <...>
> I know that virtio-user ports won't have "flow-api" function pointers.
> That's not what I meant. While offloading flow-F1, we don't really
> know what the final endpoint port is (virtio or vf-rep), since the
> in_port for flow-F1 is a PF port. So, add_tnl_pop_action() would be
> executed independent of the target destination port (which is
> available as out_port in flow-F2). So, even if the packet is
> eventually destined to a virtio-user port (in F2), F1 still executes
> add_tnl_pop_action().
Right, see my comment below.
>
>>> Before this patch (i.e., with the original code in master/2.15),
>>> parse_flow_actions() would fail for TUNNEL_POP action. But with the
>>> new code, this action is processed by the function -
>>> add_tnl_pop_action(). There is some processing in this function,
>>> including a new rte_flow API (rte_flow_tunnel_decap_set) to the PMD.
>>> Maybe this is adding some overhead?
>> The new API is processed in the offload thread, not in the datapath.
>> Indeed it can affect the datapath, depending on if/how the PF's PMD
>> supports/implements it.
>>
>> As seen from Marko's configuration line, there is no experimental
>> support, so there are no new offloads either.
> Even if experimental API support is not enabled, if hw-offload is
> enabled in OVS, then add_tnl_pop_action() would still be called? And
> at the very least these three functions would be invoked in that function:
> netdev_ports_get(), vport_to_rte_tunnel() and
> netdev_dpdk_rte_flow_tunnel_decap_set(); the last one returns -1.
Those calls are right, but they occur only when the flow is created, in
the offload thread and not in the datapath, so they should not affect it.
>
> Is hw-offload enabled in Marko's configuration?
I suppose it is.
> <...>