
[net-next-2.6] vhost: Restart tx poll when socket send queue is full

Message ID 1266526751.15681.71.camel@w-sridhar.beaverton.ibm.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Sridhar Samudrala Feb. 18, 2010, 8:59 p.m. UTC
When running a guest-to-remote-host TCP stream test using vhost-net
via tap/macvtap, I am seeing network transmit hangs. This happens
when handle_tx() returns because of the socket send queue full
condition.
This patch fixes this by restarting tx poll when hitting this
condition.

Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael S. Tsirkin Feb. 18, 2010, 10:30 p.m. UTC | #1
On Thu, Feb 18, 2010 at 12:59:11PM -0800, Sridhar Samudrala wrote:
> When running a guest-to-remote-host TCP stream test using vhost-net
> via tap/macvtap, I am seeing network transmit hangs. This happens
> when handle_tx() returns because of the socket send queue full
> condition.
> This patch fixes this by restarting tx poll when hitting this
> condition.


Thanks! I would like to better understand what happens exactly.
Some questions below:

> 
> Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 91a324c..82d4bbe 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -113,12 +113,16 @@ static void handle_tx(struct vhost_net *net)
>  	if (!sock)
>  		return;
>  
> -	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> -	if (wmem >= sock->sk->sk_sndbuf)
> -		return;
> -

The disadvantage here is that a spurious wakeup
when queue is still full becomes more expensive.

>  	use_mm(net->dev.mm);
>  	mutex_lock(&vq->mutex);
> +
> +	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
> +	if (wmem >= sock->sk->sk_sndbuf) {
> +		tx_poll_start(net, sock);

Hmm. We already do
                       if (wmem >= sock->sk->sk_sndbuf * 3 / 4) {
                                tx_poll_start(net, sock);
                                set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
                                break;
                        }
Why doesn't this code trigger here?


> +		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
              
Isn't the bit already set? If not, why?

> +		goto unlock;
> +	}
> +
>  	vhost_disable_notify(vq);
>  
>  	if (wmem < sock->sk->sk_sndbuf * 2)
> @@ -178,6 +182,7 @@ static void handle_tx(struct vhost_net *net)
>  		}
>  	}
>  
> +unlock:
>  	mutex_unlock(&vq->mutex);
>  	unuse_mm(net->dev.mm);
>  }
> 

Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 91a324c..82d4bbe 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -113,12 +113,16 @@ static void handle_tx(struct vhost_net *net)
 	if (!sock)
 		return;
 
-	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
-	if (wmem >= sock->sk->sk_sndbuf)
-		return;
-
 	use_mm(net->dev.mm);
 	mutex_lock(&vq->mutex);
+
+	wmem = atomic_read(&sock->sk->sk_wmem_alloc);
+	if (wmem >= sock->sk->sk_sndbuf) {
+		tx_poll_start(net, sock);
+		set_bit(SOCK_ASYNC_NOSPACE, &sock->flags);
+		goto unlock;
+	}
+
 	vhost_disable_notify(vq);
 
 	if (wmem < sock->sk->sk_sndbuf * 2)
@@ -178,6 +182,7 @@ static void handle_tx(struct vhost_net *net)
 		}
 	}
 
+unlock:
 	mutex_unlock(&vq->mutex);
 	unuse_mm(net->dev.mm);
 }