
[ovs-dev,v1] netdev-linux: Refactoring the netdev_linux_send in forwarding batch packets

Message ID 20170529040804.21264-1-sysugaozhenyu@gmail.com
State Changes Requested

Commit Message

Gao Zhenyu May 29, 2017, 4:08 a.m. UTC
1. We don't need to initialize sock and msg before every sendmsg call.
Initializing them once, before the loop, saves CPU cycles.

Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com>
---
 lib/netdev-linux.c | 62 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 34 insertions(+), 28 deletions(-)
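The refactoring hoists the loop-invariant parts of the msghdr/sockaddr_ll setup out of the per-packet loop, leaving only the iovec to be updated per packet. A minimal sketch of the pattern (illustrative only: it uses a plain connected socket rather than the OVS AF_PACKET path, and `send_all` with its parameters is a hypothetical stand-in):

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send each buffer in 'pkts' with sendmsg(), setting up the invariant
 * parts of 'msg' once before the loop -- the same hoisting the patch
 * applies to msg_name, msg_namelen, msg_iovlen, msg_control,
 * msg_controllen, and msg_flags.  Returns the number of packets sent,
 * or -1 on the first failure. */
static int send_all(int sock, const char *pkts[], const size_t lens[],
                    int n)
{
    struct msghdr msg;
    struct iovec iov;

    /* Invariant per batch: initialize once. */
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    for (int i = 0; i < n; i++) {
        /* Only the payload changes per packet. */
        iov.iov_base = (void *) pkts[i];
        iov.iov_len = lens[i];
        if (sendmsg(sock, &msg, 0) < 0) {
            return -1;
        }
    }
    return n;
}
```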

Comments

Ben Pfaff May 30, 2017, 4:43 p.m. UTC | #1
On Mon, May 29, 2017 at 04:08:03AM +0000, Zhenyu Gao wrote:
> 1. We don't need to initialize sock,msg in before calling sendmsg each time
> Just initialize them before the loop, it can reduce cpu cycles.
> 
> Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com>

Thanks for working on OVS.

GCC reports the following, and I think that it is right.  Can you take a
look?

../lib/netdev-linux.c:1247:20: error: 'sock' may be used uninitialized in this function [-Werror=maybe-uninitialized]
             retval = sendmsg(sock, &msg, 0);
             ~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
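The warning arises because 'sock' is assigned under one `!is_tap_netdev()` check but used under a second, textually identical check inside the loop; GCC cannot prove the two calls return the same value. A tiny stand-in for the pattern (all names hypothetical), showing that initializing the variable is the simplest way to silence it:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for is_tap_netdev(): an opaque predicate.  GCC must assume
 * successive calls could disagree, so 'sock' looks maybe-uninitialized
 * at its use site unless it is initialized up front. */
static bool is_tap(bool flag) { return flag; }

static int demo(bool tap)
{
    int sock = -1;          /* initializing here silences the warning */

    if (!is_tap(tap)) {
        sock = 42;          /* stands in for af_packet_sock() */
    }

    /* ... per-packet loop ... */
    if (!is_tap(tap)) {
        return sock;        /* without '= -1': -Wmaybe-uninitialized */
    }
    return 0;
}
```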
Gao Zhenyu May 31, 2017, 1:50 a.m. UTC | #2
Hi Ben,

Thanks for catching it. I will add "--enable-Werror" next time.

BTW, I would like to submit another patch that replaces sendmsg with
sendmmsg. sendmmsg gives a big throughput benefit (my draft testing shows
a 100% improvement). Do you think it is doable?

Thanks
Zhenyu Gao

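For reference, a minimal sketch of what batching with sendmmsg(2) might look like (illustrative only: a plain connected socket stands in for the OVS AF_PACKET path, `send_batch` is a hypothetical name, and sendmmsg is Linux-specific, so a sendmsg fallback would be needed for the portability Ben mentions):

```c
#define _GNU_SOURCE             /* for sendmmsg() */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Submit up to 16 packets with a single sendmmsg() system call instead
 * of one sendmsg() call per packet.  Returns the number of messages
 * actually sent, or -1 on error. */
static int send_batch(int sock, const char *pkts[], const size_t lens[],
                      unsigned int n)
{
    struct mmsghdr msgs[16];
    struct iovec iovs[16];

    if (n > 16) {
        n = 16;
    }
    memset(msgs, 0, sizeof msgs);
    for (unsigned int i = 0; i < n; i++) {
        iovs[i].iov_base = (void *) pkts[i];
        iovs[i].iov_len = lens[i];
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }
    return sendmmsg(sock, msgs, n, 0);
}
```

The win comes from amortizing syscall overhead across the batch, which is why it helps most when the per-packet send path dominates CPU usage.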
Ben Pfaff May 31, 2017, 3:41 p.m. UTC | #3
On Wed, May 31, 2017 at 09:50:09AM +0800, Gao Zhenyu wrote:
> BTW, I would like to submit another patch to use sendmmsg to replace
> sendmsg. Sendmmsg get more benefit on throughput(my draft testing show 100%
> improvment). Do you think it is doable?

I'm surprised that it makes a big difference, because I tested a similar
change years ago and it did not.  However, let's assume that it does.
In that case, of course we'd accept a change.  It would be important to
retain support for older kernels and non-Linux kernels.
Gao Zhenyu June 1, 2017, 1:50 a.m. UTC | #4
Here is the background:

I tried to use ovs-dpdk (ovs-2.7.0, dpdk-16.11) for a container
application, and here is the topology:


     netserver                          netperf
|----------------|               |----------------|
|   container    |               |   bare-metal   |
|------veth------|               |                |
        |                        |                |
|------veth------|               |                |
|    dpdk-ovs    |               |                |
|------dpdk------|               |                |
        |                        |                |
  physical-nic ----------------- physical-nic

But the performance was worse than with regular OVS, and I found that
sendmsg consumed 96% of the CPU cycles; some packets were dropped due to
insufficient CPU. So I replaced sendmsg with sendmmsg, which shows a clear
performance improvement:

netperf -H 10.100.85.242 -t TCP_STREAM -l 60
335.98 Mb/s (sendmsg + ovs-dpdk)   ---->   663 Mb/s (sendmmsg + ovs-dpdk)

(I turned off the veth's tx offloading with "ethtool -K eth1 tx off",
because DPDK does not do tx checksumming, which introduces TCP checksum
errors.)


Thanks
Zhenyu Gao


Ben Pfaff June 1, 2017, 2:16 a.m. UTC | #5
Are you sure that this is the fastest way to interface OVS-DPDK to a
container?  But, even if it is not, optimizations are welcome.

Gao Zhenyu June 1, 2017, 2:29 a.m. UTC | #6
Well, ovs-dpdk + userspace TCP + app should be a better way. I just ran
this test so we can conclude that ovs-dpdk + veth is not a good choice;
it is worse than regular OVS.

I will do more performance testing on the latest Open vSwitch and send a
patch out ASAP.


Thanks
Zhenyu Gao


Patch

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 3ad3d45..b1de3df 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1194,6 +1194,37 @@  netdev_linux_send(struct netdev *netdev_, int qid OVS_UNUSED,
 {
     int i;
     int error = 0;
+    int ifindex;
+    int sock;
+    struct sockaddr_ll sll;
+    struct msghdr msg;
+
+    if (!is_tap_netdev(netdev_)) {
+        sock = af_packet_sock();
+        if (sock < 0) {
+            error = -sock;
+            goto free_batch;
+        }
+
+        ifindex = netdev_get_ifindex(netdev_);
+        if (ifindex < 0) {
+            error = -ifindex;
+            goto free_batch;
+        }
+
+        /* We don't bother setting most fields in sockaddr_ll because the
+         * kernel ignores them for SOCK_RAW. */
+        memset(&sll, 0, sizeof sll);
+        sll.sll_family = AF_PACKET;
+        sll.sll_ifindex = ifindex;
+
+        msg.msg_name = &sll;
+        msg.msg_namelen = sizeof sll;
+        msg.msg_iovlen = 1;
+        msg.msg_control = NULL;
+        msg.msg_controllen = 0;
+        msg.msg_flags = 0;
+    }
 
     /* 'i' is incremented only if there's no error */
     for (i = 0; i < batch->count;) {
@@ -1206,38 +1237,12 @@  netdev_linux_send(struct netdev *netdev_, int qid OVS_UNUSED,
 
         if (!is_tap_netdev(netdev_)) {
             /* Use our AF_PACKET socket to send to this device. */
-            struct sockaddr_ll sll;
-            struct msghdr msg;
             struct iovec iov;
-            int ifindex;
-            int sock;
-
-            sock = af_packet_sock();
-            if (sock < 0) {
-                return -sock;
-            }
-
-            ifindex = netdev_get_ifindex(netdev_);
-            if (ifindex < 0) {
-                return -ifindex;
-            }
-
-            /* We don't bother setting most fields in sockaddr_ll because the
-             * kernel ignores them for SOCK_RAW. */
-            memset(&sll, 0, sizeof sll);
-            sll.sll_family = AF_PACKET;
-            sll.sll_ifindex = ifindex;
 
             iov.iov_base = CONST_CAST(void *, data);
             iov.iov_len = size;
 
-            msg.msg_name = &sll;
-            msg.msg_namelen = sizeof sll;
             msg.msg_iov = &iov;
-            msg.msg_iovlen = 1;
-            msg.msg_control = NULL;
-            msg.msg_controllen = 0;
-            msg.msg_flags = 0;
 
             retval = sendmsg(sock, &msg, 0);
         } else {
@@ -1278,13 +1283,14 @@  netdev_linux_send(struct netdev *netdev_, int qid OVS_UNUSED,
         i++;
     }
 
-    dp_packet_delete_batch(batch, may_steal);
-
     if (error && error != EAGAIN) {
             VLOG_WARN_RL(&rl, "error sending Ethernet packet on %s: %s",
                          netdev_get_name(netdev_), ovs_strerror(error));
     }
 
+free_batch:
+    dp_packet_delete_batch(batch, may_steal);
+
     return error;
 
 }