From patchwork Tue Jun 26 05:17:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tonghao Zhang X-Patchwork-Id: 934673 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Xzc/EnzU"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 41FDq50Gx8z9ryk for ; Tue, 26 Jun 2018 15:18:56 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751286AbeFZFSx (ORCPT ); Tue, 26 Jun 2018 01:18:53 -0400 Received: from mail-pg0-f66.google.com ([74.125.83.66]:42066 "EHLO mail-pg0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750994AbeFZFSw (ORCPT ); Tue, 26 Jun 2018 01:18:52 -0400 Received: by mail-pg0-f66.google.com with SMTP id c10-v6so7106117pgu.9 for ; Mon, 25 Jun 2018 22:18:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=LZHzM59YWudiQsK7Q+2UxZiILnipraC/ro99YkzA3PY=; b=Xzc/EnzUvpivnAmJJrsD1qZ6DTeLfIJmNLbOFo9+whmFS0L5AhMH+Np6jsGcLOf/fy x1U095Tknudzk0cfW3GPso0eJSHFSv3Dtz4fWMDDkX4lBrsBSt3/Q5qTYnZkad8vu+/u RfZqYG3mZ+lMoIF9ebQ8/xMfkFXwZnYsAf3qPEXEBfsrm3SRCz4jsLMFGcNCmkLi2v15 LJ4sGjvPqb9u9XgCVOAqsLqg/Qh1oZ8d4ky8MBQGXF0tKbmoi7lJcrCwtaBQwZwvama/ hCmyJlV+Qb1xwPc5zjuFhradM0gcG+IrafBkNnITorC2rxi5hjaGlPIGYX2CugY8Lyiq X1gA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=LZHzM59YWudiQsK7Q+2UxZiILnipraC/ro99YkzA3PY=; b=kM0GuoS/o7hiVhPRW5ubbEbVyHyBwSnTNtxA0nNe7IXinZMhVpkaYWg1/d8YEwrqom WyrioTmYen7TWJgtIEmOnJaP6WMkUWaprc2jia8NyPcrbFWRriOz4mTJrgyIgzjtA8uc xl6I/pXcSFPmipAdUKyTQhEkymeSUffd/cr6UC5PulO18blXst20cCBRMM3omAevABA2 1hueYfKzg6NsShxdBSDXr+G8dThNFCSUAK9ezc0vveGReGTSF6QczyBfTnSa7KtyDDEB 3oSYQQAjZRnjrbBaC+V/u8D6wLG6+xNm2TfBh80wjlHiGeDR4VW91M2gMJunLqp/94Tf chog== X-Gm-Message-State: APt69E28H0kgfiW0UCdNrXxw1jVTZvOYiv7UBYMgortD9pUfUisijSkS nvCInDXWvYz5lYO+H0dVFwA= X-Google-Smtp-Source: ADUXVKI5yrYMn14TlJsKCn0nEEkACEJhEkwQhgsDC68p7M5bEhH6N/nVk7bb0ZYwTzU9kX2X2d3E4w== X-Received: by 2002:a63:62c4:: with SMTP id w187-v6mr23649pgb.55.1529990331838; Mon, 25 Jun 2018 22:18:51 -0700 (PDT) Received: from local.opencloud.tech.localdomain ([111.205.210.1]) by smtp.gmail.com with ESMTPSA id k12-v6sm999337pff.31.2018.06.25.22.18.49 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 25 Jun 2018 22:18:51 -0700 (PDT) From: xiangxia.m.yue@gmail.com To: jasowang@redhat.com Cc: virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, Tonghao Zhang , Tonghao Zhang Subject: [PATCH net-next v2] net: vhost: improve performance when enable busyloop Date: Mon, 25 Jun 2018 22:17:56 -0700 Message-Id: <1529990276-61157-1-git-send-email-xiangxia.m.yue@gmail.com> X-Mailer: git-send-email 1.8.3.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Tonghao Zhang This patch improves the guest receive performance from host. On the handle_tx side, we poll the sock receive queue at the same time. handle_rx do that in the same way. For avoiding deadlock, change the code to lock the vq one by one and use the VHOST_NET_VQ_XX as a subclass for mutex_lock_nested. With the patch, qemu can set differently the busyloop_timeout for rx or tx queue. We set the poll-us=100us and use the iperf3 to test its throughput. The iperf3 command is shown as below. on the guest: iperf3 -s -D on the host: iperf3 -c 192.168.1.100 -i 1 -P 10 -t 10 -M 1400 * With the patch: 23.1 Gbits/sec * Without the patch: 12.7 Gbits/sec Signed-off-by: Tonghao Zhang --- drivers/vhost/net.c | 106 +++++++++++++++++++++++++++----------------------- drivers/vhost/vhost.c | 24 ++++-------- 2 files changed, 66 insertions(+), 64 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index e7cf7d2..38e9adb 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -429,22 +429,62 @@ static int vhost_net_enable_vq(struct vhost_net *n, return vhost_poll_start(poll, sock->file); } +static int sk_has_rx_data(struct sock *sk) +{ + struct socket *sock = sk->sk_socket; + + if (sock->ops->peek_len) + return sock->ops->peek_len(sock); + + return skb_queue_empty(&sk->sk_receive_queue); +} + +static void vhost_net_busy_poll(struct vhost_net *net, + struct vhost_virtqueue *rvq, + struct vhost_virtqueue *tvq, + bool rx) +{ + unsigned long uninitialized_var(endtime); + struct socket *sock = rvq->private_data; + struct vhost_virtqueue *vq = rx ? tvq : rvq; + unsigned long busyloop_timeout = rx ? rvq->busyloop_timeout : + tvq->busyloop_timeout; + + mutex_lock_nested(&vq->mutex, rx ? VHOST_NET_VQ_TX: VHOST_NET_VQ_RX); + vhost_disable_notify(&net->dev, vq); + + preempt_disable(); + endtime = busy_clock() + busyloop_timeout; + while (vhost_can_busy_poll(tvq->dev, endtime) && + !(sock && sk_has_rx_data(sock->sk)) && + vhost_vq_avail_empty(tvq->dev, tvq)) + cpu_relax(); + preempt_enable(); + + if ((rx && !vhost_vq_avail_empty(&net->dev, vq)) || + (!rx && (sock && sk_has_rx_data(sock->sk)))) { + vhost_poll_queue(&vq->poll); + } else if (unlikely(vhost_enable_notify(&net->dev, vq))) { + vhost_disable_notify(&net->dev, vq); + vhost_poll_queue(&vq->poll); + } + + mutex_unlock(&vq->mutex); +} + static int vhost_net_tx_get_vq_desc(struct vhost_net *net, struct vhost_virtqueue *vq, struct iovec iov[], unsigned int iov_size, unsigned int *out_num, unsigned int *in_num) { - unsigned long uninitialized_var(endtime); + struct vhost_net_virtqueue *nvq_rx = &net->vqs[VHOST_NET_VQ_RX]; + int r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), out_num, in_num, NULL, NULL); if (r == vq->num && vq->busyloop_timeout) { - preempt_disable(); - endtime = busy_clock() + vq->busyloop_timeout; - while (vhost_can_busy_poll(vq->dev, endtime) && - vhost_vq_avail_empty(vq->dev, vq)) - cpu_relax(); - preempt_enable(); + vhost_net_busy_poll(net, &nvq_rx->vq, vq, false); + r = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov), out_num, in_num, NULL, NULL); } @@ -484,7 +524,7 @@ static void handle_tx(struct vhost_net *net) bool zcopy, zcopy_used; int sent_pkts = 0; - mutex_lock(&vq->mutex); + mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_TX); sock = vq->private_data; if (!sock) goto out; @@ -621,16 +661,6 @@ static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk) return len; } -static int sk_has_rx_data(struct sock *sk) -{ - struct socket *sock = sk->sk_socket; - - if (sock->ops->peek_len) - return sock->ops->peek_len(sock); - - return skb_queue_empty(&sk->sk_receive_queue); -} - static void vhost_rx_signal_used(struct vhost_net_virtqueue *nvq) { struct vhost_virtqueue *vq = &nvq->vq; @@ -645,39 +675,19 @@ static void vhost_rx_signal_used(struct vhost_net_virtqueue *nvq) static int vhost_net_rx_peek_head_len(struct vhost_net *net, struct sock *sk) { - struct vhost_net_virtqueue *rvq = &net->vqs[VHOST_NET_VQ_RX]; - struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX]; - struct vhost_virtqueue *vq = &nvq->vq; - unsigned long uninitialized_var(endtime); - int len = peek_head_len(rvq, sk); - - if (!len && vq->busyloop_timeout) { - /* Flush batched heads first */ - vhost_rx_signal_used(rvq); - /* Both tx vq and rx socket were polled here */ - mutex_lock_nested(&vq->mutex, 1); - vhost_disable_notify(&net->dev, vq); - - preempt_disable(); - endtime = busy_clock() + vq->busyloop_timeout; - - while (vhost_can_busy_poll(&net->dev, endtime) && - !sk_has_rx_data(sk) && - vhost_vq_avail_empty(&net->dev, vq)) - cpu_relax(); + struct vhost_net_virtqueue *nvq_rx = &net->vqs[VHOST_NET_VQ_RX]; + struct vhost_net_virtqueue *nvq_tx = &net->vqs[VHOST_NET_VQ_TX]; - preempt_enable(); + int len = peek_head_len(nvq_rx, sk); - if (!vhost_vq_avail_empty(&net->dev, vq)) - vhost_poll_queue(&vq->poll); - else if (unlikely(vhost_enable_notify(&net->dev, vq))) { - vhost_disable_notify(&net->dev, vq); - vhost_poll_queue(&vq->poll); - } + if (!len && nvq_rx->vq.busyloop_timeout) { + /* Flush batched heads first */ + vhost_rx_signal_used(nvq_rx); - mutex_unlock(&vq->mutex); + /* Both tx vq and rx socket were polled here */ + vhost_net_busy_poll(net, &nvq_rx->vq, &nvq_tx->vq, true); - len = peek_head_len(rvq, sk); + len = peek_head_len(nvq_rx, sk); } return len; @@ -789,7 +799,7 @@ static void handle_rx(struct vhost_net *net) __virtio16 num_buffers; int recv_pkts = 0; - mutex_lock_nested(&vq->mutex, 0); + mutex_lock_nested(&vq->mutex, VHOST_NET_VQ_RX); sock = vq->private_data; if (!sock) goto out; diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 895eaa2..1716b10 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -294,8 +294,11 @@ static void vhost_vq_meta_reset(struct vhost_dev *d) { int i; - for (i = 0; i < d->nvqs; ++i) + for (i = 0; i < d->nvqs; ++i) { + mutex_lock(&d->vqs[i]->mutex); __vhost_vq_meta_reset(d->vqs[i]); + mutex_unlock(&d->vqs[i]->mutex); + } } static void vhost_vq_reset(struct vhost_dev *dev, @@ -887,19 +890,6 @@ static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq, #define vhost_get_used(vq, x, ptr) \ vhost_get_user(vq, x, ptr, VHOST_ADDR_USED) -static void vhost_dev_lock_vqs(struct vhost_dev *d) -{ - int i = 0; - for (i = 0; i < d->nvqs; ++i) - mutex_lock_nested(&d->vqs[i]->mutex, i); -} - -static void vhost_dev_unlock_vqs(struct vhost_dev *d) -{ - int i = 0; - for (i = 0; i < d->nvqs; ++i) - mutex_unlock(&d->vqs[i]->mutex); -} static int vhost_new_umem_range(struct vhost_umem *umem, u64 start, u64 size, u64 end, @@ -950,7 +940,11 @@ static void vhost_iotlb_notify_vq(struct vhost_dev *d, if (msg->iova <= vq_msg->iova && msg->iova + msg->size - 1 > vq_msg->iova && vq_msg->type == VHOST_IOTLB_MISS) { + + mutex_lock(&node->vq->mutex); vhost_poll_queue(&node->vq->poll); + mutex_unlock(&node->vq->mutex); + list_del(&node->node); kfree(node); } @@ -982,7 +976,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev, int ret = 0; mutex_lock(&dev->mutex); - vhost_dev_lock_vqs(dev); switch (msg->type) { case VHOST_IOTLB_UPDATE: if (!dev->iotlb) { @@ -1016,7 +1009,6 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev, break; } - vhost_dev_unlock_vqs(dev); mutex_unlock(&dev->mutex); return ret;