
[ovs-dev,v7,0/6] OVS-DPDK flow offload with rte_flow

Message ID 1517209188-16608-1-git-send-email-yliu@fridaylinux.org

Message

Yuanhan Liu Jan. 29, 2018, 6:59 a.m. UTC
Hi,

Here is a joint work from Mellanox and Napatech, to enable the flow hw
offload with the DPDK generic flow interface (rte_flow).

The basic idea is to associate each flow with a mark id (a uint32_t
number). Later, we retrieve the flow directly from the mark id, which
bypasses some heavy CPU operations, including but not limited to
miniflow extraction, EMC lookup, and dpcls lookup.
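The fast-path lookup described above can be sketched as follows. This is a
minimal, single-threaded stand-in: OVS actually uses a concurrent map (CMAP)
keyed with hash_int(), and the entry type, table size, and function names
here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, single-threaded sketch of the mark-to-flow association.
 * A small open-addressing table stands in for the CMAP used in patch 1. */
#define MARK_TABLE_SIZE 256     /* power of two, illustrative only */

struct flow_entry {             /* stand-in for the real flow struct */
    uint32_t mark;              /* mark id programmed into the NIC */
    int in_use;
    /* ... the real entry carries actions, stats, etc. ... */
};

static struct flow_entry mark_table[MARK_TABLE_SIZE];

static size_t
mark_hash(uint32_t mark)
{
    mark *= 0x9e3779b1u;        /* cheap integer hash; OVS uses hash_int() */
    return mark & (MARK_TABLE_SIZE - 1);
}

/* Associate 'mark' with a flow slot; returns the entry, or NULL if full. */
static struct flow_entry *
mark_to_flow_insert(uint32_t mark)
{
    size_t i = mark_hash(mark);
    for (size_t n = 0; n < MARK_TABLE_SIZE; n++) {
        struct flow_entry *e = &mark_table[(i + n) & (MARK_TABLE_SIZE - 1)];
        if (!e->in_use) {
            e->in_use = 1;
            e->mark = mark;
            return e;
        }
    }
    return NULL;
}

/* Datapath fast path: recover the flow directly from the packet's mark,
 * skipping miniflow extraction and EMC/dpcls lookups. */
static struct flow_entry *
mark_to_flow_find(uint32_t mark)
{
    size_t i = mark_hash(mark);
    for (size_t n = 0; n < MARK_TABLE_SIZE; n++) {
        struct flow_entry *e = &mark_table[(i + n) & (MARK_TABLE_SIZE - 1)];
        if (e->in_use && e->mark == mark) {
            return e;
        }
        if (!e->in_use) {
            return NULL;        /* open addressing: empty slot ends probe */
        }
    }
    return NULL;
}
```

The point of the design is the second function: once the NIC stamps the mark
into the packet metadata, finding the flow is a single cheap lookup.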

The association is done with CMAP in patch 1. The CPU workload bypassing
is done in patch 2. The flow offload is done in patch 3, which mainly does
two things:

- translate the ovs match to DPDK rte flow patterns
- bind those patterns with an RSS + MARK action.

Patch 5 moves the offload work into a separate thread, to keep the
datapath as light as possible.
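Such an offload thread can be sketched with a mutex-protected request queue:
the datapath only appends and signals, while the dedicated thread is the one
that would talk to the NIC. All names below are illustrative, not the actual
symbols from patch 5.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

enum offload_op { OFFLOAD_ADD, OFFLOAD_DEL };

struct offload_req {
    enum offload_op op;
    uint32_t mark;
    struct offload_req *next;
};

static struct offload_req *queue_head, *queue_tail;
static pthread_mutex_t queue_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cond = PTHREAD_COND_INITIALIZER;
static int processed;           /* requests handled by the worker */

/* Called from the datapath: cheap, just a list append under a mutex. */
static void
offload_enqueue(struct offload_req *req)
{
    pthread_mutex_lock(&queue_mutex);
    req->next = NULL;
    if (queue_tail) {
        queue_tail->next = req;
    } else {
        queue_head = req;
    }
    queue_tail = req;
    pthread_cond_signal(&queue_cond);
    pthread_mutex_unlock(&queue_mutex);
}

/* Offload thread: drains the queue; this is where rte_flow_create() /
 * rte_flow_destroy() would be invoked.  'arg' is how many requests to
 * handle before returning (real code would loop forever). */
static void *
offload_main(void *arg)
{
    int remaining = *(int *) arg;

    while (remaining-- > 0) {
        pthread_mutex_lock(&queue_mutex);
        while (!queue_head) {
            pthread_cond_wait(&queue_cond, &queue_mutex);
        }
        struct offload_req *req = queue_head;
        queue_head = req->next;
        if (!queue_head) {
            queue_tail = NULL;
        }
        pthread_mutex_unlock(&queue_mutex);
        processed++;            /* stand-in for the rte_flow call */
    }
    return NULL;
}

/* Demo: two requests enqueued from "the datapath", both handled off-path. */
static int
run_demo(void)
{
    pthread_t t;
    int n = 2;
    struct offload_req add = { OFFLOAD_ADD, 1, NULL };
    struct offload_req del = { OFFLOAD_DEL, 1, NULL };

    pthread_create(&t, NULL, offload_main, &n);
    offload_enqueue(&add);
    offload_enqueue(&del);
    pthread_join(t, NULL);
    return processed;
}
```

The trade-off Flavio raises below is exactly this mutex: it sits on the path
the PMD threads take when they hand off a flow.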

A PHY-PHY forwarding test with 1000 megaflows (udp,tp_src=1000-1999) and
1 million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than a
260% performance boost.

Note that the feature is disabled by default; it can be enabled by:

    $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

v7: - fixed the wrong hash for mark_to_flow, which was refactored in v6
    - set the rss_conf for the RSS action to NULL, to work around an mlx5
      change in DPDK v17.11. Note that it will still obey the RSS settings
      OVS-DPDK set at the beginning; thus, nothing should be affected.

v6: - fixed a sparse warning
    - added documentation
    - used hash_int to compute mark to flow hash
    - added more comments
    - added lock for pot lookup
    - rebased on top of the latest code

v5: - fixed an issue where repeated flow add/remove operations took too
      long.
    - removed an unused mutex lock
    - lowered most log levels to DBG
    - rebased on top of the latest code

v4: - use RSS action instead of QUEUE action with MARK
    - make it work with multiple queue (see patch 1)
    - rebased on top of latest code

v3: - The mark-to-flow association is done with an array instead of a CMAP.
    - Added a thread to do hw offload operations
    - Removed macros completely
    - dropped the patch to set FDIR_CONF, which was a workaround for some
      Intel NICs.
    - Added a debug patch to show all flow patterns we have created.
    - Misc fixes

v2: - workaround the queue action issue
    - fixed the tcp_flags being skipped issue, which also fixed the
      build warnings
    - fixed L2 patterns for Intel NICs
    - Converted some macros to functions
    - did not hardcode the max number of flow/action
    - rebased on top of the latest code

Thanks.

    --yliu

---
Finn Christensen (1):
  netdev-dpdk: implement flow offload with rte flow

Yuanhan Liu (5):
  dpif-netdev: associate flow with a mark id
  dpif-netdev: retrieve flow directly from the flow mark
  netdev-dpdk: add debug for rte flow patterns
  dpif-netdev: do hw flow offload in a thread
  Documentation: document ovs-dpdk flow offload

 Documentation/howto/dpdk.rst |  17 +
 NEWS                         |   1 +
 lib/dp-packet.h              |  13 +
 lib/dpif-netdev.c            | 495 ++++++++++++++++++++++++++++-
 lib/flow.c                   | 155 +++++++--
 lib/flow.h                   |   1 +
 lib/netdev-dpdk.c            | 736 ++++++++++++++++++++++++++++++++++++++++++-
 lib/netdev.h                 |   6 +
 8 files changed, 1385 insertions(+), 39 deletions(-)

Comments

Flavio Leitner Feb. 1, 2018, 3:35 a.m. UTC | #1
On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> Hi,
> 
> Here is a joint work from Mellanox and Napatech, to enable the flow hw
> offload with the DPDK generic flow interface (rte_flow).
> 
> The basic idea is to associate the flow with a mark id (a unit32_t number).
> Later, we then get the flow directly from the mark id, which could bypass
> some heavy CPU operations, including but not limiting to mini flow extract,
> emc lookup, dpcls lookup, etc.
> 
> The association is done with CMAP in patch 1. The CPU workload bypassing
> is done in patch 2. The flow offload is done in patch 3, which mainly does
> two things:
> 
> - translate the ovs match to DPDK rte flow patterns
> - bind those patterns with a RSS + MARK action.
> 
> Patch 5 makes the offload work happen in another thread, for leaving the
> datapath as light as possible.
> 
> A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and 1
> million streams (tp_src=1000-1999, tp_dst=2000-2999) show more than 260%
> performance boost.
> 
> Note that it's disabled by default, which can be enabled by:

Hi,

First of all, thanks for working on this feature.

I have some general comments I'd like to discuss before going deeper
on it.

The documentation is too simple.  It should mention the HW requirements
in order to use this feature. Also some important limitations, like no
support for IP frags, MPLS or conntrack, for instance.

It seems it would be possible to keep the HW offloading code outside
of dpif-netdev.c, which is quite long already. I hope that would
improve isolation and code clarity too.

So far there is no synchronization between PMDs in the fast path.
However, we got a new mutex to sync PMDs and a new thread to manage.
Even without the patch adding the thread, there would be a new mutex
in the fast path.  It seems the slow path today causes issues, so maybe
the whole upcall processing could be pushed to another thread. I
realize this is outside of the scope of this patchset, but it is
something we should consider.

As an alternative solution, maybe we could use a DPDK ring to have a
lockless way to push flows to the auxiliary thread.
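A minimal single-producer/single-consumer ring along those lines, using
plain C11 atomics as a stand-in for rte_ring (sizes and names are
illustrative, and rte_ring also supports multi-producer modes this sketch
does not):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8             /* power of two */

struct spsc_ring {
    void *slot[RING_SIZE];
    _Atomic uint32_t head;      /* advanced by the consumer */
    _Atomic uint32_t tail;      /* advanced by the producer */
};

/* Producer (PMD thread): returns 0 on success, -1 if the ring is full.
 * No mutex: the release store publishes the slot to the consumer. */
static int
ring_enqueue(struct spsc_ring *r, void *obj)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (tail - head == RING_SIZE) {
        return -1;              /* full */
    }
    r->slot[tail & (RING_SIZE - 1)] = obj;
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer (offload thread): returns NULL when the ring is empty. */
static void *
ring_dequeue(struct spsc_ring *r)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head == tail) {
        return NULL;            /* empty */
    }
    void *obj = r->slot[head & (RING_SIZE - 1)];
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return obj;
}

/* Single-threaded smoke test of the empty/full/FIFO behavior; real usage
 * runs enqueue and dequeue on two different threads. */
static int
ring_demo(void)
{
    static struct spsc_ring r;  /* zero-initialized */
    int payload = 5;

    if (ring_dequeue(&r) != NULL) { return -1; }        /* starts empty  */
    if (ring_enqueue(&r, &payload) != 0) { return -2; } /* fits          */
    if (ring_dequeue(&r) != &payload) { return -3; }    /* FIFO handoff  */
    return 0;
}
```

As Yuanhan notes in his reply, the catch is that using rte_ring itself would
tie the netdev datapath to DPDK; a hand-rolled ring like this would avoid
that dependency at the cost of reimplementing it.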

There are some memory allocations and deallocations in the fast path
using OVS functions.  Perhaps it would be better to use rte_* functions
instead (another reason to split the code out of dpif-netdev.c).

I am curious to know why there is no flow dump or flush?

The function to help debugging (dump_flow_pattern) should use an
initial condition to return asap if debug is not enabled.
E.g.:
    if (VLOG_DROP_DBG(rl)) {
        return;
    }   

I am still wrapping my head around the RSS+MARK action and rte_flow
API, so I can't really comment those yet.

Thanks!
fbl

> 
>     $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> 
> v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
>     - set the rss_conf for rss action to NULL, to workaround a mlx5 change
>       in DPDK v17.11. Note that it will obey the rss settings OVS-DPDK has
>       set in the beginning. Thus, nothing should be effected.
> 
> v6: - fixed a sparse warning
>     - added documentation
>     - used hash_int to compute mark to flow hash
>     - added more comments
>     - added lock for pot lookup
>     - rebased on top of the latest code
> 
> v5: - fixed an issue that it took too long if we do flow add/remove
>       repeatedly.
>     - removed an unused mutex lock
>     - turned most of the log level to DBG
>     - rebased on top of the latest code
> 
> v4: - use RSS action instead of QUEUE action with MARK
>     - make it work with multiple queue (see patch 1)
>     - rebased on top of latest code
> 
> v3: - The mark and id association is done with array instead of CMAP.
>     - Added a thread to do hw offload operations
>     - Removed macros completely
>     - dropped the patch to set FDIR_CONF, which is a workround some
>       Intel NICs.
>     - Added a debug patch to show all flow patterns we have created.
>     - Misc fixes
> 
> v2: - workaround the queue action issue
>     - fixed the tcp_flags being skipped issue, which also fixed the
>       build warnings
>     - fixed l2 patterns for Intel nic
>     - Converted some macros to functions
>     - did not hardcode the max number of flow/action
>     - rebased on top of the latest code
> 
> Thanks.
> 
>     --yliu
> 
> ---
> Finn Christensen (1):
>   netdev-dpdk: implement flow offload with rte flow
> 
> Yuanhan Liu (5):
>   dpif-netdev: associate flow with a mark id
>   dpif-netdev: retrieve flow directly from the flow mark
>   netdev-dpdk: add debug for rte flow patterns
>   dpif-netdev: do hw flow offload in a thread
>   Documentation: document ovs-dpdk flow offload
> 
>  Documentation/howto/dpdk.rst |  17 +
>  NEWS                         |   1 +
>  lib/dp-packet.h              |  13 +
>  lib/dpif-netdev.c            | 495 ++++++++++++++++++++++++++++-
>  lib/flow.c                   | 155 +++++++--
>  lib/flow.h                   |   1 +
>  lib/netdev-dpdk.c            | 736 ++++++++++++++++++++++++++++++++++++++++++-
>  lib/netdev.h                 |   6 +
>  8 files changed, 1385 insertions(+), 39 deletions(-)
> 
> -- 
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Yuanhan Liu Feb. 6, 2018, 9:10 a.m. UTC | #2
On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > Hi,
> > 
> > Here is a joint work from Mellanox and Napatech, to enable the flow hw
> > offload with the DPDK generic flow interface (rte_flow).
> > 
> > The basic idea is to associate the flow with a mark id (a unit32_t number).
> > Later, we then get the flow directly from the mark id, which could bypass
> > some heavy CPU operations, including but not limiting to mini flow extract,
> > emc lookup, dpcls lookup, etc.
> > 
> > The association is done with CMAP in patch 1. The CPU workload bypassing
> > is done in patch 2. The flow offload is done in patch 3, which mainly does
> > two things:
> > 
> > - translate the ovs match to DPDK rte flow patterns
> > - bind those patterns with a RSS + MARK action.
> > 
> > Patch 5 makes the offload work happen in another thread, for leaving the
> > datapath as light as possible.
> > 
> > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and 1
> > million streams (tp_src=1000-1999, tp_dst=2000-2999) show more than 260%
> > performance boost.
> > 
> > Note that it's disabled by default, which can be enabled by:
> 
> Hi,
> 
> First of all, thanks for working on this feature.
> 
> I have some general comments I'd like to discuss before going deeper
> on it.
> 
> The documentation is too simple.  It should mention the HW requirements
> in order to use this feature.

It listed the NICs that support this feature.

>  Also some important limitations, like no
> support for IP frags, MPLS or conntrack, for instance.

Yes, that could be added.

> It seems it would be possible to leave the HW offloading code outside
> of dpif-netdev.c which is quite long already. I hope it will improve
> isolation and code clarity too.

Yes. My thought was that we could do it later, when we are going to add
some other flow models (say, full offload) in the future. Moreover, it
seems to be a custom in OVS to have long source files; I was just
following that.

But I have no objections to breaking it out. Ian, should I break it out
now? What's your plan for merging it?

> So far there is no synchronization between PMDs in the fast path.
> However, we got a new mutex to sync PMDs and a new thread to manage.
> Even without the patch adding the thread, there would be a new mutex
> in the fast path.  It seems the slow path today causes issues, so maybe
> the whole upcall processing could be pushed to another thread. I
> realize this is outside of the scope of this patchset, but it is
> something we should consider.
> 
> As an alternative solution, maybe we could use a DPDK ring to have a
> lockless way to push flows to the auxiliary thread.

It could be an option, but it brings a hard dependency on DPDK. Note that
the netdev datapath can live without DPDK, IIUC.

> There are some memory allocations and deallocations in the fast path
> using OVS functions.  Perhaps it is better to use rte_* functions
> instead (another reason to split the code out of dpif-netdev.c)

Why do you think the DPDK ones are better? Also, it brings a DPDK
dependency into the netdev datapath.

> I am curious to know why there is no flow dump or flush?

Not quite sure I got it.

> The function to help debugging (dump_flow_pattern) should use an
> initial condition to return asap if debug is not enabled.
> E.g.:
>     if (VLOG_DROP_DBG(rl)) {
>         return;
>     }   

Good tip. Thanks!

> I am still wrapping my head around the RSS+MARK action and rte_flow
> API, so I can't really comment those yet.

Thanks for looking at it!

	--yliu
Stokes, Ian Feb. 6, 2018, 9:34 p.m. UTC | #3
> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> Sent: Tuesday, February 6, 2018 9:10 AM
> To: Flavio Leitner <fbl@sysclose.org>; Stokes, Ian <ian.stokes@intel.com>
> Cc: dev@openvswitch.org; Simon Horman <simon.horman@netronome.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > Hi,
> > >
> > > Here is a joint work from Mellanox and Napatech, to enable the flow
> > > hw offload with the DPDK generic flow interface (rte_flow).
> > >
> > > The basic idea is to associate the flow with a mark id (a unit32_t
> number).
> > > Later, we then get the flow directly from the mark id, which could
> > > bypass some heavy CPU operations, including but not limiting to mini
> > > flow extract, emc lookup, dpcls lookup, etc.
> > >
> > > The association is done with CMAP in patch 1. The CPU workload
> > > bypassing is done in patch 2. The flow offload is done in patch 3,
> > > which mainly does two things:
> > >
> > > - translate the ovs match to DPDK rte flow patterns
> > > - bind those patterns with a RSS + MARK action.
> > >
> > > Patch 5 makes the offload work happen in another thread, for leaving
> > > the datapath as light as possible.
> > >
> > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and
> > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > than 260% performance boost.
> > >
> > > Note that it's disabled by default, which can be enabled by:
> >
> > Hi,
> >
> > First of all, thanks for working on this feature.
> >
> > I have some general comments I'd like to discuss before going deeper
> > on it.
> >
> > The documentation is too simple.  It should mention the HW
> > requirements in order to use this feature.
> 
> It listed the NICs that support this feature.
> 
> >  Also some important limitations, like no support for IP frags, MPLS
> > or conntrack, for instance.
> 
> Yes, that could be added.
> 
> > It seems it would be possible to leave the HW offloading code outside
> > of dpif-netdev.c which is quite long already. I hope it will improve
> > isolation and code clarity too.
> 
> Yes. My thoughts was we can do it later, when we are going to add some
> other flow models (say, full offload) in future. Moreover, it seems that
> it's a custom in OVS to have long source code files. I was just following
> it.
> 
> But I hve no objections to break it. Ian, should I break it now?
> What's your plan to merge it?

Hi Yuanhan, I believe I gave the same feedback on the first review pass I did. Previously I was happy to hold off on this work in order to make it into the 2.9 release, but as that window has passed, I think it makes sense to break it out now. It will save the refactor work later.

I should have more time later this week to continue reviewing, I'll provide some more feedback on the rest of the patchset then.

Thanks
Ian


> 
> > So far there is no synchronization between PMDs in the fast path.
> > However, we got a new mutex to sync PMDs and a new thread to manage.
> > Even without the patch adding the thread, there would be a new mutex
> > in the fast path.  It seems the slow path today causes issues, so
> > maybe the whole upcall processing could be pushed to another thread. I
> > realize this is outside of the scope of this patchset, but it is
> > something we should consider.
> >
> > As an alternative solution, maybe we could use a DPDK ring to have a
> > lockless way to push flows to the auxiliary thread.
> 
> It could be an option, but it brings a hard dependency on DPDK. Note that
> the netdev datapath could leave without DPDK, IIUC.
> 
> > There are some memory allocations and deallocations in the fast path
> > using OVS functions.  Perhaps it is better to use rte_* functions
> > instead (another reason to split the code out of dpif-netdev.c)
> 
> Why do you think the one from DPDK is better? Also, it brings the DPDK
> dependency to the netdev datapth.
> 
> > I am curious to know why there is no flow dump or flush?
> 
> Not quite sure I got it.
> 
> > The function to help debugging (dump_flow_pattern) should use an
> > initial condition to return asap if debug is not enabled.
> > E.g.:
> >     if (VLOG_DROP_DBG(rl)) {
> >         return;
> >     }
> 
> Good tip. Thanks!
> 
> > I am still wrapping my head around the RSS+MARK action and rte_flow
> > API, so I can't really comment those yet.
> 
> Thanks for looking at it!
> 
> 	--yliu
Yuanhan Liu Feb. 7, 2018, 2:11 a.m. UTC | #4
On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > Sent: Tuesday, February 6, 2018 9:10 AM
> > To: Flavio Leitner <fbl@sysclose.org>; Stokes, Ian <ian.stokes@intel.com>
> > Cc: dev@openvswitch.org; Simon Horman <simon.horman@netronome.com>
> > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> > 
> > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > Hi,
> > > >
> > > > Here is a joint work from Mellanox and Napatech, to enable the flow
> > > > hw offload with the DPDK generic flow interface (rte_flow).
> > > >
> > > > The basic idea is to associate the flow with a mark id (a unit32_t
> > number).
> > > > Later, we then get the flow directly from the mark id, which could
> > > > bypass some heavy CPU operations, including but not limiting to mini
> > > > flow extract, emc lookup, dpcls lookup, etc.
> > > >
> > > > The association is done with CMAP in patch 1. The CPU workload
> > > > bypassing is done in patch 2. The flow offload is done in patch 3,
> > > > which mainly does two things:
> > > >
> > > > - translate the ovs match to DPDK rte flow patterns
> > > > - bind those patterns with a RSS + MARK action.
> > > >
> > > > Patch 5 makes the offload work happen in another thread, for leaving
> > > > the datapath as light as possible.
> > > >
> > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and
> > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > than 260% performance boost.
> > > >
> > > > Note that it's disabled by default, which can be enabled by:
> > >
> > > Hi,
> > >
> > > First of all, thanks for working on this feature.
> > >
> > > I have some general comments I'd like to discuss before going deeper
> > > on it.
> > >
> > > The documentation is too simple.  It should mention the HW
> > > requirements in order to use this feature.
> > 
> > It listed the NICs that support this feature.
> > 
> > >  Also some important limitations, like no support for IP frags, MPLS
> > > or conntrack, for instance.
> > 
> > Yes, that could be added.
> > 
> > > It seems it would be possible to leave the HW offloading code outside
> > > of dpif-netdev.c which is quite long already. I hope it will improve
> > > isolation and code clarity too.
> > 
> > Yes. My thoughts was we can do it later, when we are going to add some
> > other flow models (say, full offload) in future. Moreover, it seems that
> > it's a custom in OVS to have long source code files. I was just following
> > it.
> > 
> > But I hve no objections to break it. Ian, should I break it now?
> > What's your plan to merge it?
> 
> Hi Yuanhan, I believe I gave the same feedback on the first review pass I did. Previously I was happy to hold off on this work in order to make it into the 2.9 release but as that window has passed I think it makes sense to break it out now. It will save the refactor work later.

So it won't be in the OVS 2.9 release?

	--yliu

> I should have more time later this week to continue reviewing, I'll provide some more feedback on the rest of the patchset then.
> 
> Thanks
> Ian
> 
> 
> > 
> > > So far there is no synchronization between PMDs in the fast path.
> > > However, we got a new mutex to sync PMDs and a new thread to manage.
> > > Even without the patch adding the thread, there would be a new mutex
> > > in the fast path.  It seems the slow path today causes issues, so
> > > maybe the whole upcall processing could be pushed to another thread. I
> > > realize this is outside of the scope of this patchset, but it is
> > > something we should consider.
> > >
> > > As an alternative solution, maybe we could use a DPDK ring to have a
> > > lockless way to push flows to the auxiliary thread.
> > 
> > It could be an option, but it brings a hard dependency on DPDK. Note that
> > the netdev datapath could leave without DPDK, IIUC.
> > 
> > > There are some memory allocations and deallocations in the fast path
> > > using OVS functions.  Perhaps it is better to use rte_* functions
> > > instead (another reason to split the code out of dpif-netdev.c)
> > 
> > Why do you think the one from DPDK is better? Also, it brings the DPDK
> > dependency to the netdev datapth.
> > 
> > > I am curious to know why there is no flow dump or flush?
> > 
> > Not quite sure I got it.
> > 
> > > The function to help debugging (dump_flow_pattern) should use an
> > > initial condition to return asap if debug is not enabled.
> > > E.g.:
> > >     if (VLOG_DROP_DBG(rl)) {
> > >         return;
> > >     }
> > 
> > Good tip. Thanks!
> > 
> > > I am still wrapping my head around the RSS+MARK action and rte_flow
> > > API, so I can't really comment those yet.
> > 
> > Thanks for looking at it!
> > 
> > 	--yliu
Stokes, Ian Feb. 7, 2018, 9:24 a.m. UTC | #5
> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> Sent: Wednesday, February 7, 2018 2:12 AM
> To: Stokes, Ian <ian.stokes@intel.com>
> Cc: Flavio Leitner <fbl@sysclose.org>; dev@openvswitch.org; Simon Horman
> <simon.horman@netronome.com>; Olga Shern <olgas@mellanox.com>; Shahaf
> Shuler <shahafs@mellanox.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > -----Original Message-----
> > > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > To: Flavio Leitner <fbl@sysclose.org>; Stokes, Ian
> > > <ian.stokes@intel.com>
> > > Cc: dev@openvswitch.org; Simon Horman <simon.horman@netronome.com>
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow
> > >
> > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > Hi,
> > > > >
> > > > > Here is a joint work from Mellanox and Napatech, to enable the
> > > > > flow hw offload with the DPDK generic flow interface (rte_flow).
> > > > >
> > > > > The basic idea is to associate the flow with a mark id (a
> > > > > unit32_t
> > > number).
> > > > > Later, we then get the flow directly from the mark id, which
> > > > > could bypass some heavy CPU operations, including but not
> > > > > limiting to mini flow extract, emc lookup, dpcls lookup, etc.
> > > > >
> > > > > The association is done with CMAP in patch 1. The CPU workload
> > > > > bypassing is done in patch 2. The flow offload is done in patch
> > > > > 3, which mainly does two things:
> > > > >
> > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > - bind those patterns with a RSS + MARK action.
> > > > >
> > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > leaving the datapath as light as possible.
> > > > >
> > > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999)
> > > > > and
> > > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > > than 260% performance boost.
> > > > >
> > > > > Note that it's disabled by default, which can be enabled by:
> > > >
> > > > Hi,
> > > >
> > > > First of all, thanks for working on this feature.
> > > >
> > > > I have some general comments I'd like to discuss before going
> > > > deeper on it.
> > > >
> > > > The documentation is too simple.  It should mention the HW
> > > > requirements in order to use this feature.
> > >
> > > It listed the NICs that support this feature.
> > >
> > > >  Also some important limitations, like no support for IP frags,
> > > > MPLS or conntrack, for instance.
> > >
> > > Yes, that could be added.
> > >
> > > > It seems it would be possible to leave the HW offloading code
> > > > outside of dpif-netdev.c which is quite long already. I hope it
> > > > will improve isolation and code clarity too.
> > >
> > > Yes. My thoughts was we can do it later, when we are going to add
> > > some other flow models (say, full offload) in future. Moreover, it
> > > seems that it's a custom in OVS to have long source code files. I
> > > was just following it.
> > >
> > > But I hve no objections to break it. Ian, should I break it now?
> > > What's your plan to merge it?
> >
> > Hi Yuanhan, I believe I gave the same feedback on the first review pass
> I did. Previously I was happy to hold off on this work in order to make it
> into the 2.9 release but as that window has passed I think it makes sense
> to break it out now. It will save the refactor work later.
> 
> So it won't in ovs 2.9 release?

No, the code freeze for new features was the 31st of January; only bug fixes are accepted for the remainder of the 2.9 window.  This would be part of the 2.10 release.

Thanks
Ian
> 
> 	--yliu
> 
> > I should have more time later this week to continue reviewing, I'll
> provide some more feedback on the rest of the patchset then.
> >
> > Thanks
> > Ian
> >
> >
> > >
> > > > So far there is no synchronization between PMDs in the fast path.
> > > > However, we got a new mutex to sync PMDs and a new thread to manage.
> > > > Even without the patch adding the thread, there would be a new
> > > > mutex in the fast path.  It seems the slow path today causes
> > > > issues, so maybe the whole upcall processing could be pushed to
> > > > another thread. I realize this is outside of the scope of this
> > > > patchset, but it is something we should consider.
> > > >
> > > > As an alternative solution, maybe we could use a DPDK ring to have
> > > > a lockless way to push flows to the auxiliary thread.
> > >
> > > It could be an option, but it brings a hard dependency on DPDK. Note
> > > that the netdev datapath could leave without DPDK, IIUC.
> > >
> > > > There are some memory allocations and deallocations in the fast
> > > > path using OVS functions.  Perhaps it is better to use rte_*
> > > > functions instead (another reason to split the code out of
> > > > dpif-netdev.c)
> > >
> > > Why do you think the one from DPDK is better? Also, it brings the
> > > DPDK dependency to the netdev datapth.
> > >
> > > > I am curious to know why there is no flow dump or flush?
> > >
> > > Not quite sure I got it.
> > >
> > > > The function to help debugging (dump_flow_pattern) should use an
> > > > initial condition to return asap if debug is not enabled.
> > > > E.g.:
> > > >     if (VLOG_DROP_DBG(rl)) {
> > > >         return;
> > > >     }
> > >
> > > Good tip. Thanks!
> > >
> > > > I am still wrapping my head around the RSS+MARK action and
> > > > rte_flow API, so I can't really comment those yet.
> > >
> > > Thanks for looking at it!
> > >
> > > 	--yliu
Yuanhan Liu Feb. 7, 2018, 9:36 a.m. UTC | #6
On Wed, Feb 07, 2018 at 09:24:52AM +0000, Stokes, Ian wrote:
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > Sent: Wednesday, February 7, 2018 2:12 AM
> > To: Stokes, Ian <ian.stokes@intel.com>
> > Cc: Flavio Leitner <fbl@sysclose.org>; dev@openvswitch.org; Simon Horman
> > <simon.horman@netronome.com>; Olga Shern <olgas@mellanox.com>; Shahaf
> > Shuler <shahafs@mellanox.com>
> > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> > 
> > On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > > -----Original Message-----
> > > > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > > To: Flavio Leitner <fbl@sysclose.org>; Stokes, Ian
> > > > <ian.stokes@intel.com>
> > > > Cc: dev@openvswitch.org; Simon Horman <simon.horman@netronome.com>
> > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > rte_flow
> > > >
> > > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Here is a joint work from Mellanox and Napatech, to enable the
> > > > > > flow hw offload with the DPDK generic flow interface (rte_flow).
> > > > > >
> > > > > > The basic idea is to associate the flow with a mark id (a
> > > > > > unit32_t
> > > > number).
> > > > > > Later, we then get the flow directly from the mark id, which
> > > > > > could bypass some heavy CPU operations, including but not
> > > > > > limiting to mini flow extract, emc lookup, dpcls lookup, etc.
> > > > > >
> > > > > > The association is done with CMAP in patch 1. The CPU workload
> > > > > > bypassing is done in patch 2. The flow offload is done in patch
> > > > > > 3, which mainly does two things:
> > > > > >
> > > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > > - bind those patterns with a RSS + MARK action.
> > > > > >
> > > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > > leaving the datapath as light as possible.
> > > > > >
> > > > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999)
> > > > > > and
> > > > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > > > than 260% performance boost.
> > > > > >
> > > > > > Note that it's disabled by default, which can be enabled by:
> > > > >
> > > > > Hi,
> > > > >
> > > > > First of all, thanks for working on this feature.
> > > > >
> > > > > I have some general comments I'd like to discuss before going
> > > > > deeper on it.
> > > > >
> > > > > The documentation is too simple.  It should mention the HW
> > > > > requirements in order to use this feature.
> > > >
> > > > It listed the NICs that support this feature.
> > > >
> > > > >  Also some important limitations, like no support for IP frags,
> > > > > MPLS or conntrack, for instance.
> > > >
> > > > Yes, that could be added.
> > > >
> > > > > It seems it would be possible to leave the HW offloading code
> > > > > outside of dpif-netdev.c which is quite long already. I hope it
> > > > > will improve isolation and code clarity too.
> > > >
> > > > Yes. My thoughts was we can do it later, when we are going to add
> > > > some other flow models (say, full offload) in future. Moreover, it
> > > > seems that it's a custom in OVS to have long source code files. I
> > > > was just following it.
> > > >
> > > > But I hve no objections to break it. Ian, should I break it now?
> > > > What's your plan to merge it?
> > >
> > > Hi Yuanhan, I believe I gave the same feedback on the first review pass
> > I did. Previously I was happy to hold off on this work in order to make it
> > into the 2.9 release but as that window has passed I think it makes sense
> > to break it out now. It will save the refactor work later.
> > 
> > So it won't in ovs 2.9 release?
> 
> No, the code freeze was the 31st of January for new features. Only bug fixes are accepted for the remaining 2.9 window.  This would be part of the 2.10 release.

I thought we had all come to an agreement that this would be part of the
2.9 release as an experimental feature. What made us miss it?

Thanks.

	--yliu
Stokes, Ian Feb. 7, 2018, 10:02 a.m. UTC | #7
> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> Sent: Wednesday, February 7, 2018 9:37 AM
> To: Stokes, Ian <ian.stokes@intel.com>
> Cc: Flavio Leitner <fbl@sysclose.org>; dev@openvswitch.org; Simon Horman
> <simon.horman@netronome.com>; Olga Shern <olgas@mellanox.com>; Shahaf
> Shuler <shahafs@mellanox.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Wed, Feb 07, 2018 at 09:24:52AM +0000, Stokes, Ian wrote:
> > > -----Original Message-----
> > > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > > Sent: Wednesday, February 7, 2018 2:12 AM
> > > To: Stokes, Ian <ian.stokes@intel.com>
> > > Cc: Flavio Leitner <fbl@sysclose.org>; dev@openvswitch.org; Simon
> > > Horman <simon.horman@netronome.com>; Olga Shern
> > > <olgas@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow
> > >
> > > On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > > > -----Original Message-----
> > > > > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > > > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > > > To: Flavio Leitner <fbl@sysclose.org>; Stokes, Ian
> > > > > <ian.stokes@intel.com>
> > > > > Cc: dev@openvswitch.org; Simon Horman
> > > > > <simon.horman@netronome.com>
> > > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > > rte_flow
> > > > >
> > > > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Here is a joint work from Mellanox and Napatech, to enable
> > > > > > > the flow hw offload with the DPDK generic flow interface
> > > > > > > (rte_flow).
> > > > > > >
> > > > > > > The basic idea is to associate the flow with a mark id (a
> > > > > > > uint32_t number). Later, we then get the flow directly from
> > > > > > > the mark id, which can bypass some heavy CPU operations,
> > > > > > > including but not limited to miniflow extract, EMC lookup,
> > > > > > > dpcls lookup, etc.
> > > > > > >
> > > > > > > The association is done with CMAP in patch 1. The CPU
> > > > > > > workload bypassing is done in patch 2. The flow offload is
> > > > > > > done in patch 3, which mainly does two things:
> > > > > > >
> > > > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > > > - bind those patterns with a RSS + MARK action.
> > > > > > >
> > > > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > > > leaving the datapath as light as possible.
> > > > > > >
> > > > > > > A PHY-PHY forwarding with 1000 megaflows
> > > > > > > (udp,tp_src=1000-1999) and 1 million streams
> > > > > > > (tp_src=1000-1999, tp_dst=2000-2999) shows more than a 260%
> > > > > > > performance boost.
> > > > > > >
> > > > > > > Note that it's disabled by default, which can be enabled by:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > First of all, thanks for working on this feature.
> > > > > >
> > > > > > I have some general comments I'd like to discuss before going
> > > > > > deeper on it.
> > > > > >
> > > > > > The documentation is too simple.  It should mention the HW
> > > > > > requirements in order to use this feature.
> > > > >
> > > > > It listed the NICs that support this feature.
> > > > >
> > > > > >  Also some important limitations, like no support for IP
> > > > > > frags, MPLS or conntrack, for instance.
> > > > >
> > > > > Yes, that could be added.
> > > > >
> > > > > > It seems it would be possible to leave the HW offloading code
> > > > > > outside of dpif-netdev.c which is quite long already. I hope
> > > > > > it will improve isolation and code clarity too.
> > > > >
> > > > > Yes. My thought was that we could do it later, when we are going
> > > > > to add some other flow models (say, full offload) in the future.
> > > > > Moreover, it seems to be customary in OVS to have long source
> > > > > code files. I was just following that.
> > > > >
> > > > > But I have no objections to breaking it out. Ian, should I break
> > > > > it now? What's your plan for merging it?
> > > >
> > > > Hi Yuanhan, I believe I gave the same feedback on the first review
> > > > pass I did. Previously I was happy to hold off on this work in
> > > > order to make it into the 2.9 release, but as that window has
> > > > passed I think it makes sense to break it out now. It will save
> > > > the refactor work later.
> > >
> > > So it won't be in the OVS 2.9 release?
> >
> > No, the code freeze was the 31st of January for new features. Only
> > bug fixes are accepted for the remaining 2.9 window. This would be
> > part of the 2.10 release.
> 
> I thought we had all come to an agreement that this would be part of the
> 2.9 release as an experimental feature. What made us miss it?

So there was agreement that the feature would be experimental when upstreamed to master (not just for OVS 2.9).

Unfortunately it missed the 2.9 deadline due to the lack of reviews from community members other than the authors.

I was hoping to review it myself in the run-up to the 31st; however, a separate issue relating to DPDK mempools took priority, as it was raised by the community and was a blocking issue for users upgrading from OVS 2.8 to OVS 2.9.

I've only had time to review the first patch and the documentation patch of the series at this point. I understand it's experimental, but I think it's important that it's still reviewed and feedback given before its introduction to OVS. AFAIK there are still issues with compilation for sparse etc., which were breaking the OVS Travis build. This alone would keep it from being upstreamed at this point. I'm planning to review the rest of this set later this week and provide more feedback.

Ian



> 
> Thanks.
> 
> 	--yliu
Stokes, Ian March 7, 2018, 4:59 p.m. UTC | #8
> -----Original Message-----
> From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> Sent: Monday, January 29, 2018 7:00 AM
> To: dev@openvswitch.org
> Cc: Stokes, Ian <ian.stokes@intel.com>; Finn Christensen
> <fc@napatech.com>; Darrell Ball <dball@vmware.com>; Chandran, Sugesh
> <sugesh.chandran@intel.com>; Simon Horman <simon.horman@netronome.com>;
> Yuanhan Liu <yliu@fridaylinux.org>
> Subject: [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> Hi,
> 
> Here is a joint work from Mellanox and Napatech, to enable the flow hw
> offload with the DPDK generic flow interface (rte_flow).
> 

Thanks for this Yuanhan,

I'm currently in the middle of reviewing and testing the series. I have some feedback on patches 1 and 2, so I decided to provide it now rather than wait.

Just a general comment: I'm seeing compilation failures for sparse, clang, gcc, etc. Maybe you already run these (I know sparse can be a bit hit and miss), but if you don't, it's worth giving them a go before submitting; it will catch smaller bugs that break the OVS Travis build and avoid version re-spins.

I'll have more comments on the rest of the series this week.

Thanks
Ian

> The basic idea is to associate the flow with a mark id (a uint32_t
> number). Later, we then get the flow directly from the mark id, which
> can bypass some heavy CPU operations, including but not limited to
> miniflow extract, EMC lookup, dpcls lookup, etc.
> 
> The association is done with CMAP in patch 1. The CPU workload bypassing
> is done in patch 2. The flow offload is done in patch 3, which mainly does
> two things:
> 
> - translate the ovs match to DPDK rte flow patterns
> - bind those patterns with a RSS + MARK action.
> 
> Patch 5 makes the offload work happen in another thread, for leaving the
> datapath as light as possible.
> 
> A PHY-PHY forwarding with 1000 megaflows (udp,tp_src=1000-1999) and 1
> million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than a
> 260% performance boost.
> 
> Note that it's disabled by default, which can be enabled by:
> 
>     $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> 
> v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
>     - set the rss_conf for rss action to NULL, to workaround a mlx5 change
>       in DPDK v17.11. Note that it will obey the rss settings OVS-DPDK has
>       set in the beginning. Thus, nothing should be affected.
> 
> v6: - fixed a sparse warning
>     - added documentation
>     - used hash_int to compute mark to flow hash
>     - added more comments
>     - added lock for pot lookup
>     - rebased on top of the latest code
> 
> v5: - fixed an issue that it took too long if we do flow add/remove
>       repeatedly.
>     - removed an unused mutex lock
>     - turned most of the log level to DBG
>     - rebased on top of the latest code
> 
> v4: - use RSS action instead of QUEUE action with MARK
>     - make it work with multiple queue (see patch 1)
>     - rebased on top of latest code
> 
> v3: - The mark and id association is done with array instead of CMAP.
>     - Added a thread to do hw offload operations
>     - Removed macros completely
>     - dropped the patch to set FDIR_CONF, which is a workaround for
>       some Intel NICs.
>     - Added a debug patch to show all flow patterns we have created.
>     - Misc fixes
> 
> v2: - workaround the queue action issue
>     - fixed the tcp_flags being skipped issue, which also fixed the
>       build warnings
>     - fixed l2 patterns for Intel nic
>     - Converted some macros to functions
>     - did not hardcode the max number of flow/action
>     - rebased on top of the latest code
> 
> Thanks.
> 
>     --yliu
> 
> ---
> Finn Christensen (1):
>   netdev-dpdk: implement flow offload with rte flow
> 
> Yuanhan Liu (5):
>   dpif-netdev: associate flow with a mark id
>   dpif-netdev: retrieve flow directly from the flow mark
>   netdev-dpdk: add debug for rte flow patterns
>   dpif-netdev: do hw flow offload in a thread
>   Documentation: document ovs-dpdk flow offload
> 
>  Documentation/howto/dpdk.rst |  17 +
>  NEWS                         |   1 +
>  lib/dp-packet.h              |  13 +
>  lib/dpif-netdev.c            | 495 ++++++++++++++++++++++++++++-
>  lib/flow.c                   | 155 +++++++--
>  lib/flow.h                   |   1 +
>  lib/netdev-dpdk.c            | 736 ++++++++++++++++++++++++++++++++++++++++++-
>  lib/netdev.h                 |   6 +
>  8 files changed, 1385 insertions(+), 39 deletions(-)
> 
> --
> 2.7.4
Shahaf Shuler March 13, 2018, 8:53 a.m. UTC | #9
Wednesday, February 7, 2018 4:12 AM, Yuanhan Liu:
> On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow

Hi,

Just to inform you that I am taking over this patchset for integration in the OVS 2.10 release.

I would appreciate it if you could send again the comments/questions you have on this patchset, so I can fix or answer them.
Stokes, Ian March 15, 2018, 11:52 a.m. UTC | #10
> -----Original Message-----
> From: Shahaf Shuler [mailto:shahafs@mellanox.com]
> Sent: Tuesday, March 13, 2018 8:53 AM
> To: Yuanhan Liu <yliu@fridaylinux.org>; Stokes, Ian <ian.stokes@intel.com>
> Cc: Flavio Leitner <fbl@sysclose.org>; dev@openvswitch.org; Simon Horman
> <simon.horman@netronome.com>; Olga Shern <olgas@mellanox.com>
> Subject: RE: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> Wednesday, February 7, 2018 4:12 AM, Yuanhan Liu:
> > On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > rte_flow
> 
> Hi,
> 
> Just to inform I am taking over this patchset for integration in ovs 2.10
> release.
> 
> Appreciate if you can sent again the comments/question you have on this
> patchset so I can fix/answer.

Thanks for helping with this, I'll add you to the patch threads to continue the review discussion.

Ian
>
Stokes, Ian March 15, 2018, 11:54 a.m. UTC | #11
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yliu@fridaylinux.org]
> > Sent: Monday, January 29, 2018 7:00 AM
> > To: dev@openvswitch.org
> > Cc: Stokes, Ian <ian.stokes@intel.com>; Finn Christensen
> > <fc@napatech.com>; Darrell Ball <dball@vmware.com>; Chandran, Sugesh
> > <sugesh.chandran@intel.com>; Simon Horman
> > <simon.horman@netronome.com>; Yuanhan Liu <yliu@fridaylinux.org>
> > Subject: [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> >
> > Hi,
> >
> > Here is a joint work from Mellanox and Napatech, to enable the flow hw
> > offload with the DPDK generic flow interface (rte_flow).
> >
> 
> Thanks for this Yuanhan,

Adding Shahaf Shuler who will continue this work.

> 
> I'm currently in the middle of reviewing and testing the series. I have
> some feedback on patch 1 and 2 of the series so decided to provide it now
> rather than wait.
> 
> Just a general comment, I'm seeing compilation failures for sparse, clang
> and gcc etc. Maybe you already run these (I know sparse can be a bit hit
> and miss) but if you don't it's worth giving them a go before submitting,
> it will catch smaller bugs that break the OVS travis build and avoid
> version re-spins.
> 
> I'll have more comments on the rest of the series this week.
> 
> Thanks
> Ian
> 
> > The basic idea is to associate the flow with a mark id (a uint32_t
> > number). Later, we then get the flow directly from the mark id, which
> > can bypass some heavy CPU operations, including but not limited to
> > miniflow extract, EMC lookup, dpcls lookup, etc.
> >
> > The association is done with CMAP in patch 1. The CPU workload
> > bypassing is done in patch 2. The flow offload is done in patch 3,
> > which mainly does two things:
> >
> > - translate the ovs match to DPDK rte flow patterns
> > - bind those patterns with a RSS + MARK action.
> >
> > Patch 5 makes the offload work happen in another thread, for leaving
> > the datapath as light as possible.
> >
> > A PHY-PHY forwarding with 1000 megaflows (udp,tp_src=1000-1999) and 1
> > million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than a
> > 260% performance boost.
> >
> > Note that it's disabled by default, which can be enabled by:
> >
> >     $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> >
> > v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
> >     - set the rss_conf for rss action to NULL, to workaround a mlx5
> >       change in DPDK v17.11. Note that it will obey the rss settings
> >       OVS-DPDK has set in the beginning. Thus, nothing should be
> >       affected.
> >
> > v6: - fixed a sparse warning
> >     - added documentation
> >     - used hash_int to compute mark to flow hash
> >     - added more comments
> >     - added lock for pot lookup
> >     - rebased on top of the latest code
> >
> > v5: - fixed an issue that it took too long if we do flow add/remove
> >       repeatedly.
> >     - removed an unused mutex lock
> >     - turned most of the log level to DBG
> >     - rebased on top of the latest code
> >
> > v4: - use RSS action instead of QUEUE action with MARK
> >     - make it work with multiple queue (see patch 1)
> >     - rebased on top of latest code
> >
> > v3: - The mark and id association is done with array instead of CMAP.
> >     - Added a thread to do hw offload operations
> >     - Removed macros completely
> >     - dropped the patch to set FDIR_CONF, which is a workaround for
> >       some Intel NICs.
> >     - Added a debug patch to show all flow patterns we have created.
> >     - Misc fixes
> >
> > v2: - workaround the queue action issue
> >     - fixed the tcp_flags being skipped issue, which also fixed the
> >       build warnings
> >     - fixed l2 patterns for Intel nic
> >     - Converted some macros to functions
> >     - did not hardcode the max number of flow/action
> >     - rebased on top of the latest code
> >
> > Thanks.
> >
> >     --yliu
> >
> > ---
> > Finn Christensen (1):
> >   netdev-dpdk: implement flow offload with rte flow
> >
> > Yuanhan Liu (5):
> >   dpif-netdev: associate flow with a mark id
> >   dpif-netdev: retrieve flow directly from the flow mark
> >   netdev-dpdk: add debug for rte flow patterns
> >   dpif-netdev: do hw flow offload in a thread
> >   Documentation: document ovs-dpdk flow offload
> >
> >  Documentation/howto/dpdk.rst |  17 +
> >  NEWS                         |   1 +
> >  lib/dp-packet.h              |  13 +
> >  lib/dpif-netdev.c            | 495 ++++++++++++++++++++++++++++-
> >  lib/flow.c                   | 155 +++++++--
> >  lib/flow.h                   |   1 +
> >  lib/netdev-dpdk.c            | 736 ++++++++++++++++++++++++++++++++++++++++++-
> >  lib/netdev.h                 |   6 +
> >  8 files changed, 1385 insertions(+), 39 deletions(-)
> >
> > --
> > 2.7.4