
[3/3] vhost-net: use lock_sock_fast() in peek_head_len()

Message ID 20110313150646.GA30494@redhat.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Michael S. Tsirkin March 13, 2011, 3:06 p.m. UTC
On Mon, Jan 17, 2011 at 04:11:17PM +0800, Jason Wang wrote:
> We can use lock_sock_fast() instead of lock_sock() in order to get
> speedup in peek_head_len().
> 
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/vhost/net.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index c32a2e4..50b622a 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -211,12 +211,12 @@ static int peek_head_len(struct sock *sk)
>  {
>  	struct sk_buff *head;
>  	int len = 0;
> +	bool slow = lock_sock_fast(sk);
>  
> -	lock_sock(sk);
>  	head = skb_peek(&sk->sk_receive_queue);
>  	if (head)
>  		len = head->len;
> -	release_sock(sk);
> +	unlock_sock_fast(sk, slow);
>  	return len;
>  }
>  

Wanted to apply this, but looking at the code I think the lock_sock here
is wrong. What we really need is to handle the case where the skb is
pulled from the receive queue after skb_peek.  However this is not the
right lock to use for that, sk_receive_queue.lock is.
So I expect the following is the right way to handle this.
Comments?

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet March 13, 2011, 3:52 p.m. UTC | #1
On Sunday, March 13, 2011, at 17:06 +0200, Michael S. Tsirkin wrote:
> On Mon, Jan 17, 2011 at 04:11:17PM +0800, Jason Wang wrote:
> > We can use lock_sock_fast() instead of lock_sock() in order to get
> > speedup in peek_head_len().
> > 
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/vhost/net.c |    4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index c32a2e4..50b622a 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -211,12 +211,12 @@ static int peek_head_len(struct sock *sk)
> >  {
> >  	struct sk_buff *head;
> >  	int len = 0;
> > +	bool slow = lock_sock_fast(sk);
> >  
> > -	lock_sock(sk);
> >  	head = skb_peek(&sk->sk_receive_queue);
> >  	if (head)
> >  		len = head->len;
> > -	release_sock(sk);
> > +	unlock_sock_fast(sk, slow);
> >  	return len;
> >  }
> >  
> 
> Wanted to apply this, but looking at the code I think the lock_sock here
> is wrong. What we really need is to handle the case where the skb is
> pulled from the receive queue after skb_peek.  However this is not the
> right lock to use for that, sk_receive_queue.lock is.
> So I expect the following is the right way to handle this.
> Comments?
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 0329c41..5720301 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -213,12 +213,13 @@ static int peek_head_len(struct sock *sk)
>  {
>  	struct sk_buff *head;
>  	int len = 0;
> +	unsigned long flags;
>  
> -	lock_sock(sk);
> +	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
>  	head = skb_peek(&sk->sk_receive_queue);
> -	if (head)
> +	if (likely(head))
>  		len = head->len;
> -	release_sock(sk);
> +	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
>  	return len;
>  }
>  

You may be right, only way to be sure is to check the other side.

If it uses skb_queue_tail(), then yes, your patch is fine.

If other side did not lock socket, then your patch is a bug fix.



Michael S. Tsirkin March 13, 2011, 4:19 p.m. UTC | #2
On Sun, Mar 13, 2011 at 04:52:50PM +0100, Eric Dumazet wrote:
> On Sunday, March 13, 2011, at 17:06 +0200, Michael S. Tsirkin wrote:
> > On Mon, Jan 17, 2011 at 04:11:17PM +0800, Jason Wang wrote:
> > > We can use lock_sock_fast() instead of lock_sock() in order to get
> > > speedup in peek_head_len().
> > > 
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > > ---
> > >  drivers/vhost/net.c |    4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > > index c32a2e4..50b622a 100644
> > > --- a/drivers/vhost/net.c
> > > +++ b/drivers/vhost/net.c
> > > @@ -211,12 +211,12 @@ static int peek_head_len(struct sock *sk)
> > >  {
> > >  	struct sk_buff *head;
> > >  	int len = 0;
> > > +	bool slow = lock_sock_fast(sk);
> > >  
> > > -	lock_sock(sk);
> > >  	head = skb_peek(&sk->sk_receive_queue);
> > >  	if (head)
> > >  		len = head->len;
> > > -	release_sock(sk);
> > > +	unlock_sock_fast(sk, slow);
> > >  	return len;
> > >  }
> > >  
> > 
> > Wanted to apply this, but looking at the code I think the lock_sock here
> > is wrong. What we really need is to handle the case where the skb is
> > pulled from the receive queue after skb_peek.  However this is not the
> > right lock to use for that, sk_receive_queue.lock is.
> > So I expect the following is the right way to handle this.
> > Comments?
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index 0329c41..5720301 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -213,12 +213,13 @@ static int peek_head_len(struct sock *sk)
> >  {
> >  	struct sk_buff *head;
> >  	int len = 0;
> > +	unsigned long flags;
> >  
> > -	lock_sock(sk);
> > +	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
> >  	head = skb_peek(&sk->sk_receive_queue);
> > -	if (head)
> > +	if (likely(head))
> >  		len = head->len;
> > -	release_sock(sk);
> > +	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
> >  	return len;
> >  }
> >  
> 
> You may be right, only way to be sure is to check the other side.
> 
> If it uses skb_queue_tail(), then yes, your patch is fine.
> 
> If other side did not lock socket, then your patch is a bug fix.
> 
> 

Other side is in drivers/net/tun.c and net/packet/af_packet.c
At least wrt tun it seems clear socket is not locked.
Besides queue, dequeue seems to be done without socket locked.
Eric Dumazet March 13, 2011, 4:32 p.m. UTC | #3
On Sunday, March 13, 2011, at 18:19 +0200, Michael S. Tsirkin wrote:

> Other side is in drivers/net/tun.c and net/packet/af_packet.c
> At least wrt tun it seems clear socket is not locked.

Yes (assuming you refer to tun_net_xmit())

> Besides queue, dequeue seems to be done without socket locked.
> 

It seems this code (assuming you speak of drivers/vhost/net.c ?) has
some races indeed.



Michael S. Tsirkin March 13, 2011, 4:43 p.m. UTC | #4
On Sun, Mar 13, 2011 at 05:32:07PM +0100, Eric Dumazet wrote:
> On Sunday, March 13, 2011, at 18:19 +0200, Michael S. Tsirkin wrote:
> 
> > Other side is in drivers/net/tun.c and net/packet/af_packet.c
> > At least wrt tun it seems clear socket is not locked.
> 
> Yes (assuming you refer to tun_net_xmit())
> 
> > Besides queue, dequeue seems to be done without socket locked.
> > 
> 
> It seems this code (assuming you speak of drivers/vhost/net.c ?) has
> some races indeed.
> 

Hmm. Any more besides the one fixed here?
Eric Dumazet March 13, 2011, 5:41 p.m. UTC | #5
On Sunday, March 13, 2011, at 18:43 +0200, Michael S. Tsirkin wrote:
> On Sun, Mar 13, 2011 at 05:32:07PM +0100, Eric Dumazet wrote:
> > On Sunday, March 13, 2011, at 18:19 +0200, Michael S. Tsirkin wrote:
> > 
> > > Other side is in drivers/net/tun.c and net/packet/af_packet.c
> > > At least wrt tun it seems clear socket is not locked.
> > 
> > Yes (assuming you refer to tun_net_xmit())
> > 
> > > Besides queue, dequeue seems to be done without socket locked.
> > > 
> > 
> > It seems this code (assuming you speak of drivers/vhost/net.c ?) has
> > some races indeed.
> > 
> 
> Hmm. Any more besides the one fixed here?
> 

If writers and readers don't share a common lock, how can they reliably
synchronize states?

For example, the check at line 420 seems unsafe or useless.

skb_queue_empty(&sock->sk->sk_receive_queue)




Patch

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 0329c41..5720301 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -213,12 +213,13 @@ static int peek_head_len(struct sock *sk)
 {
 	struct sk_buff *head;
 	int len = 0;
+	unsigned long flags;
 
-	lock_sock(sk);
+	spin_lock_irqsave(&sk->sk_receive_queue.lock, flags);
 	head = skb_peek(&sk->sk_receive_queue);
-	if (head)
+	if (likely(head))
 		len = head->len;
-	release_sock(sk);
+	spin_unlock_irqrestore(&sk->sk_receive_queue.lock, flags);
 	return len;
 }