[ovs-dev,net-next,1/6] netdev-dpdk: Allow vswitchd to parse devargs as dpdk-bond args

Message ID 1523537572-2595-2-git-send-email-xiangxia.m.yue@gmail.com
State Rejected
Delegated to: Ian Stokes
Series
  • Add dpdk-bond support

Commit Message

Tonghao Zhang April 12, 2018, 12:52 p.m.
From: Tonghao Zhang <xiangxia.m.yue@gmail.com>

If users set the interface options to multiple PCI addresses or device
names separated by ',', we try to parse them as dpdk-bond args.
For example, set an interface as:

    ovs-vsctl add-port br0 dpdk0 -- \
	    set Interface dpdk0 type=dpdk \
	    options:dpdk-devargs=0000:06:00.0,0000:06:00.1

This patch allows vswitchd to parse such devargs; the result will be
used in the next patch.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 lib/netdev-dpdk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ilya Maximets April 12, 2018, 1:13 p.m. | #1
> From: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> 
> The bonding in Open vSwitch does not have good performance.

Any examples?

> In some
> cases we would recommend that you use Linux bonds instead
> of Open vSwitch bonds. In the userspace datapath, we want to use
> bonding to improve bandwidth. DPDK has implemented it as a library.

You can use OVS bonding with the userspace datapath, and it has
good performance, especially after the TX batching patch set.

DPDK bonding has a variety of limitations, such as the requirement
to call rte_eth_tx_burst() and rte_eth_rx_burst() at intervals
of less than 100 ms for the link aggregation modes.
OVS cannot guarantee that.

> 
> These patches build on DPDK bonding to implement the dpdk-bond
> device as a vswitchd interface.
> 
> If users set the interface options to multiple PCI addresses or device
> names separated by ',', we try to parse them as dpdk-bond args.
> For example, set an interface as:
> 
>     ovs-vsctl add-port br0 dpdk0 -- \
>         set Interface dpdk0 type=dpdk \
>         options:dpdk-devargs=0000:06:00.0,0000:06:00.1
> 
> And now these patches support setting the bond mode, such as
> round-robin, active-backup, balance, and so on. More bond features
> will be supported later.

Hmm, but you already have the ability to add any virtual DPDK device,
including bond devices, like this:

    ovs-vsctl add-port br0 bond0 -- \
        set Interface bond0 type=dpdk \
        options:dpdk-devargs="eth_bond0,mode=2,slave=0000:05:00.0,slave=0000:05:00.1,xmit_policy=l34"

So, what is the benefit of this patch set?

>
> These patches are an RFC; any proposals are welcome. Ignore the doc
> for now; if these patches are OK for Open vSwitch, the doc will be
> posted.
>
> There are some shell scripts that can help test the patches:
> https://github.com/nickcooper-zhangtonghao/ovs-bond-tests
> 
> Tonghao Zhang (6):
>   netdev-dpdk: Allow vswitchd to parse devargs as dpdk-bond args
>   netdev-dpdk: Allow dpdk-ethdev not support setting mtu
>   netdev-dpdk: Add netdev_dpdk_bond struct
>   netdev-dpdk: Add dpdk-bond support
>   netdev-dpdk: Add check whether dpdk-port is used
>   netdev-dpdk: Add dpdk-bond mode setting
> 
>  lib/netdev-dpdk.c | 304 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 299 insertions(+), 5 deletions(-)
> 
> -- 
> 1.8.3.1
Jan Scheurich April 12, 2018, 2:18 p.m. | #2
> > The bonding in Open vSwitch does not have good performance.
> 
> Any examples?

For example, balance-tcp bond mode for L34 load sharing still requires a recirculation after dp_hash.

I believe that it would definitely be interesting to compare bond performance between DPDK bonding and OVS bonding with DPDK datapath for various bond modes and traffic patterns.

Another interesting performance metric would be link failover times and packet drop (at link down and link up) in static and dynamic (LACP) bond configurations. That is an area where we have repeatedly seen problems with OVS bonding.

> 
> > In some
> > cases we would recommend that you use Linux bonds instead
> > of Open vSwitch bonds. In the userspace datapath, we want to use
> > bonding to improve bandwidth. DPDK has implemented it as a library.
> 
> You can use OVS bonding with the userspace datapath, and it has
> good performance, especially after the TX batching patch set.
> 
> DPDK bonding has a variety of limitations, such as the requirement
> to call rte_eth_tx_burst() and rte_eth_rx_burst() at intervals
> of less than 100 ms for the link aggregation modes.
> OVS cannot guarantee that.

A periodic dummy tx burst every 100 ms is something that could easily be added to dpif-netdev PMD for bonded dpdk netdevs.

> 
> >
> > These patches build on DPDK bonding to implement the dpdk-bond
> > device as a vswitchd interface.
> >
> > If users set the interface options to multiple PCI addresses or device
> > names separated by ',', we try to parse them as dpdk-bond args.
> > For example, set an interface as:
> >
> >     ovs-vsctl add-port br0 dpdk0 -- \
> >         set Interface dpdk0 type=dpdk \
> >         options:dpdk-devargs=0000:06:00.0,0000:06:00.1
> >
> > And now these patches support setting the bond mode, such as
> > round-robin, active-backup, balance, and so on. More bond features
> > will be supported later.
> 
> Hmm, but you already have the ability to add any virtual DPDK device,
> including bond devices, like this:
> 
>     ovs-vsctl add-port br0 bond0 -- \
>         set Interface bond0 type=dpdk \
>         options:dpdk-devargs="eth_bond0,mode=2,slave=0000:05:00.0,slave=0000:05:00.1,xmit_policy=l34"
>
> So, what is the benefit of this patch set?

Thanks for the pointer. That is a valid question. 

I guess special handling like periodic dummy tx burst might have to be enabled based on dpdk-devargs bond configuration.

BR, Jan
Tonghao Zhang April 12, 2018, 2:37 p.m. | #3
On Thu, Apr 12, 2018 at 10:18 PM, Jan Scheurich
<jan.scheurich@ericsson.com> wrote:
>> > The bonding in Open vSwitch does not have good performance.
>>
>> Any examples?
>
> For example, balance-tcp bond mode for L34 load sharing still requires a recirculation after dp_hash.
Yes, we need more bond modes.

> I believe that it would definitely be interesting to compare bond performance between DPDK bonding and OVS bonding with DPDK datapath for various bond modes and traffic patterns.
>
> Another interesting performance metric would be link failover times and packet drop (at link down and link up) in static and dynamic (LACP) bond configurations. That is an area where we have repeatedly seen problems with OVS bonding.
>
>>
>> > In some
>> > cases we would recommend that you use Linux bonds instead
>> > of Open vSwitch bonds. In the userspace datapath, we want to use
>> > bonding to improve bandwidth. DPDK has implemented it as a library.
>>
>> You can use OVS bonding with the userspace datapath, and it has
>> good performance, especially after the TX batching patch set.
>>
>> DPDK bonding has a variety of limitations, such as the requirement
>> to call rte_eth_tx_burst() and rte_eth_rx_burst() at intervals
>> of less than 100 ms for the link aggregation modes.
>> OVS cannot guarantee that.
>
> A periodic dummy tx burst every 100 ms is something that could easily be added to dpif-netdev PMD for bonded dpdk netdevs.
>
>>
>> >
>> > These patches build on DPDK bonding to implement the dpdk-bond
>> > device as a vswitchd interface.
>> >
>> > If users set the interface options to multiple PCI addresses or device
>> > names separated by ',', we try to parse them as dpdk-bond args.
>> > For example, set an interface as:
>> >
>> >     ovs-vsctl add-port br0 dpdk0 -- \
>> >         set Interface dpdk0 type=dpdk \
>> >         options:dpdk-devargs=0000:06:00.0,0000:06:00.1
>> >
>> > And now these patches support setting the bond mode, such as
>> > round-robin, active-backup, balance, and so on. More bond features
>> > will be supported later.
>>
>> Hmm, but you already have the ability to add any virtual DPDK device,
>> including bond devices, like this:
>>
>>     ovs-vsctl add-port br0 bond0 -- \
>>         set Interface bond0 type=dpdk \
>>         options:dpdk-devargs="eth_bond0,mode=2,slave=0000:05:00.0,slave=0000:05:00.1,xmit_policy=l34"
>>
>> So, what is the benefit of this patch set?
>
> Thanks for the pointer. That is a valid question.
Thanks for the review. At least with these patches we can dynamically
change the dpdk-bond mode and xmit_policy, and add/delete slave ports,
even though v1 only changes the mode dynamically.

> I guess special handling like periodic dummy tx burst might have to be enabled based on dpdk-devargs bond configuration.
>
> BR, Jan
Ilya Maximets April 12, 2018, 2:46 p.m. | #4
On 12.04.2018 17:18, Jan Scheurich wrote:
>>> The bonding in Open vSwitch does not have good performance.
>>
>> Any examples?
> 
> For example, balance-tcp bond mode for L34 load sharing still requires a recirculation after dp_hash.

dp_hash is a lightweight action now, since we use the RSS hash for it.
Recirculation itself is not a very big issue. The last time we measured
DPDK bonding in XOR balancing mode with OVS, it had more than 30%
performance overhead compared with a simple DPDK port in an L2
"receive -> send back" scenario with a single PMD thread.

> 
> I believe that it would definitely be interesting to compare bond performance between DPDK bonding and OVS bonding with DPDK datapath for various bond modes and traffic patterns.

Maybe, but I would not expect much profit from using DPDK bonding.

> 
> Another interesting performance metric would be link failover times and packet drop (at link down and link up) in static and dynamic (LACP) bond configurations. That is an area where we have repeatedly seen problems with OVS bonding.
> 
>>
>>> In some
>>> cases we would recommend that you use Linux bonds instead
>>> of Open vSwitch bonds. In the userspace datapath, we want to use
>>> bonding to improve bandwidth. DPDK has implemented it as a library.
>>
>> You can use OVS bonding with the userspace datapath, and it has
>> good performance, especially after the TX batching patch set.
>>
>> DPDK bonding has a variety of limitations, such as the requirement
>> to call rte_eth_tx_burst() and rte_eth_rx_burst() at intervals
>> of less than 100 ms for the link aggregation modes.
>> OVS cannot guarantee that.
> 
> A periodic dummy tx burst every 100 ms is something that could easily be added to dpif-netdev PMD for bonded dpdk netdevs.

I don't think it could be done easily, because a port may never be used
for sending, or may not even be polled by any PMD thread.
If these dummy operations were called from a separate thread, we would
have to protect everything with mutexes. Ensuring the required
periodicity would be even harder.

Anyway, this would be one more dirty crutch in the generic code.
I'd prefer not to do that.

> 
>>
>>>
>>> These patches build on DPDK bonding to implement the dpdk-bond
>>> device as a vswitchd interface.
>>>
>>> If users set the interface options to multiple PCI addresses or device
>>> names separated by ',', we try to parse them as dpdk-bond args.
>>> For example, set an interface as:
>>>
>>>     ovs-vsctl add-port br0 dpdk0 -- \
>>>         set Interface dpdk0 type=dpdk \
>>>         options:dpdk-devargs=0000:06:00.0,0000:06:00.1
>>>
>>> And now these patches support setting the bond mode, such as
>>> round-robin, active-backup, balance, and so on. More bond features
>>> will be supported later.
>>
>> Hmm, but you already have the ability to add any virtual DPDK device,
>> including bond devices, like this:
>>
>>     ovs-vsctl add-port br0 bond0 -- \
>>         set Interface bond0 type=dpdk \
>>         options:dpdk-devargs="eth_bond0,mode=2,slave=0000:05:00.0,slave=0000:05:00.1,xmit_policy=l34"
>>
>> So, what is the benefit of this patch set?
> 
> Thanks for the pointer. That is a valid question. 
> 
> I guess special handling like periodic dummy tx burst might have to be enabled based on dpdk-devargs bond configuration.
> 
> BR, Jan

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ee39cbe..30dc76d 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1389,7 +1389,7 @@  netdev_dpdk_process_devargs(struct netdev_dpdk *dev,
     if (strncmp(devargs, "class=eth,mac=", 14) == 0) {
         new_port_id = netdev_dpdk_get_port_by_mac(&devargs[14]);
     } else {
-        name = xmemdup0(devargs, strcspn(devargs, ","));
+        name = xmemdup0(devargs, strlen(devargs));
         if (rte_eth_dev_get_port_by_name(name, &new_port_id)
                 || !rte_eth_dev_is_valid_port(new_port_id)) {
             /* Device not found in DPDK, attempt to attach it */