[net-next] tun: fix multiqueue rx

Message ID 20181116070015.1759-1-matthew.cover@stackpath.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Matt Cover Nov. 16, 2018, 7 a.m. UTC
When writing packets to a descriptor associated with a combined queue, the
packets should end up on that queue.

Before this change all packets written to any descriptor associated with a
tap interface end up on rx-0, even when the descriptor is associated with a
different queue.

The rx traffic can be generated by either of the following.
  1. a simple tap program which spins up multiple queues and writes packets
     to each of the file descriptors
  2. tx from a qemu vm with a tap multiqueue netdev
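
For option 1, a minimal sketch of how such a tap program might open its
per-queue descriptors, loosely following the multiqueue example in
Documentation/networking/tuntap.txt. The helper names tap_mq_fill_ifreq()
and tap_mq_open() are hypothetical, and IFF_NO_PI is an assumption made so
that raw Ethernet frames can be written without a packet-info header.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: build the TUNSETIFF request for one tap queue. */
static void tap_mq_fill_ifreq(struct ifreq *ifr, const char *name)
{
	memset(ifr, 0, sizeof(*ifr));
	/* IFF_MULTI_QUEUE lets several descriptors attach to one interface. */
	ifr->ifr_flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE;
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/*
 * Hypothetical helper: open `queues` descriptors on the tap interface
 * `name`, one per queue. Returns 0 on success and -1 on failure, in which
 * case every descriptor opened so far has been closed again.
 */
static int tap_mq_open(const char *name, int *fds, int queues)
{
	struct ifreq ifr;
	int i;

	assert(name && fds && queues >= 0);
	tap_mq_fill_ifreq(&ifr, name);

	for (i = 0; i < queues; i++) {
		fds[i] = open("/dev/net/tun", O_RDWR);
		if (fds[i] < 0)
			goto err;
		/* Each successful TUNSETIFF attaches this fd as a new queue. */
		if (ioctl(fds[i], TUNSETIFF, &ifr) < 0) {
			close(fds[i]);
			goto err;
		}
	}
	return 0;

err:
	while (--i >= 0)
		close(fds[i]);
	return -1;
}
```

After tap_mq_open("tap0", fds, n) succeeds (this needs CAP_NET_ADMIN), a
write() of an Ethernet frame to fds[i] generates rx traffic that, with this
patch applied, should be attributed to rx queue i.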

The queue for rx traffic can be observed by either of the following (done
on the hypervisor in the qemu case).
  1. a simple netmap program which opens and reads from per-queue
     descriptors
  2. configuring RPS and doing per-cpu captures with rxtxcpu

Alternatively, if you printk() the return value of skb_get_rx_queue() just
before each instance of netif_receive_skb() in tun.c, you will get 65535
for every skb.

Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
the association between descriptor and rx queue.
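
To illustrate where the 65535 comes from, here is a small userspace model
of the rx queue helpers in include/linux/skbuff.h. It is a sketch, not the
kernel code: struct model_skb and the model_*() names are hypothetical
stand-ins, and the only assumption carried over is that the kernel stores
the rx queue as queue_mapping = rx_queue + 1, with 0 meaning "not recorded".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the single sk_buff field that matters here. */
struct model_skb {
	uint16_t queue_mapping;	/* 0 means no rx queue has been recorded */
};

/* Models skb_record_rx_queue(): store the queue index offset by one. */
static void model_record_rx_queue(struct model_skb *skb, uint16_t rx_queue)
{
	skb->queue_mapping = rx_queue + 1;
}

/* Models skb_get_rx_queue(): undo the offset, truncated to 16 bits. */
static uint16_t model_get_rx_queue(const struct model_skb *skb)
{
	return skb->queue_mapping - 1;
}

/* Models skb_rx_queue_recorded(). */
static bool model_rx_queue_recorded(const struct model_skb *skb)
{
	return skb->queue_mapping != 0;
}

/* Walks through the before and after states described above. */
static void model_demo(void)
{
	struct model_skb skb = { .queue_mapping = 0 };

	/* Before the patch: nothing recorded, so 0 - 1 wraps to 65535. */
	assert(!model_rx_queue_recorded(&skb));
	assert(model_get_rx_queue(&skb) == 65535);

	/* After the patch: tfile->queue_index (3 here) is recorded. */
	model_record_rx_queue(&skb, 3);
	assert(model_rx_queue_recorded(&skb));
	assert(model_get_rx_queue(&skb) == 3);
}
```

If I read the receive path correctly, consumers such as RPS only use the
stored index when a queue has been recorded and otherwise fall back to the
first rx queue, which would explain why everything lands on rx-0 without
the skb_record_rx_queue() calls.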

Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
---
 drivers/net/tun.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Jason Wang Nov. 16, 2018, 7:11 a.m. UTC | #1
On 2018/11/16 3:00 PM, Matthew Cover wrote:
> When writing packets to a descriptor associated with a combined queue, the
> packets should end up on that queue.
>
> Before this change all packets written to any descriptor associated with a
> tap interface end up on rx-0, even when the descriptor is associated with a
> different queue.
>
> The rx traffic can be generated by either of the following.
>    1. a simple tap program which spins up multiple queues and writes packets
>       to each of the file descriptors
>    2. tx from a qemu vm with a tap multiqueue netdev
>
> The queue for rx traffic can be observed by either of the following (done
> on the hypervisor in the qemu case).
>    1. a simple netmap program which opens and reads from per-queue
>       descriptors
>    2. configuring RPS and doing per-cpu captures with rxtxcpu
>
> Alternatively, if you printk() the return value of skb_get_rx_queue() just
> before each instance of netif_receive_skb() in tun.c, you will get 65535
> for every skb.
>
> Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
> the association between descriptor and rx queue.
>
> Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
> ---
>   drivers/net/tun.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index a65779c6d72f..ce8620f3ea5e 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
>   
>   	if (!rx_batched || (!more && skb_queue_empty(queue))) {
>   		local_bh_disable();
> +		skb_record_rx_queue(skb, tfile->queue_index);
>   		netif_receive_skb(skb);
>   		local_bh_enable();
>   		return;
> @@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
>   		struct sk_buff *nskb;
>   
>   		local_bh_disable();
> -		while ((nskb = __skb_dequeue(&process_queue)))
> +		while ((nskb = __skb_dequeue(&process_queue))) {
> +			skb_record_rx_queue(nskb, tfile->queue_index);
>   			netif_receive_skb(nskb);
> +		}
> +		skb_record_rx_queue(skb, tfile->queue_index);
>   		netif_receive_skb(skb);
>   		local_bh_enable();
>   	}
> @@ -2452,6 +2456,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>   	    !tfile->detached)
>   		rxhash = __skb_get_hash_symmetric(skb);
>   
> +	skb_record_rx_queue(skb, tfile->queue_index);
>   	netif_receive_skb(skb);
>   
>   	stats = get_cpu_ptr(tun->pcpu_stats);


Acked-by: Jason Wang <jasowang@redhat.com>

Michael S. Tsirkin Nov. 16, 2018, 8:10 p.m. UTC | #2
On Fri, Nov 16, 2018 at 12:00:15AM -0700, Matthew Cover wrote:
> When writing packets to a descriptor associated with a combined queue, the
> packets should end up on that queue.
> 
> Before this change all packets written to any descriptor associated with a
> tap interface end up on rx-0, even when the descriptor is associated with a
> different queue.
> 
> The rx traffic can be generated by either of the following.
>   1. a simple tap program which spins up multiple queues and writes packets
>      to each of the file descriptors
>   2. tx from a qemu vm with a tap multiqueue netdev
> 
> The queue for rx traffic can be observed by either of the following (done
> on the hypervisor in the qemu case).
>   1. a simple netmap program which opens and reads from per-queue
>      descriptors
>   2. configuring RPS and doing per-cpu captures with rxtxcpu
> 
> Alternatively, if you printk() the return value of skb_get_rx_queue() just
> before each instance of netif_receive_skb() in tun.c, you will get 65535
> for every skb.
> 
> Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
> the association between descriptor and rx queue.
> 
> Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

stable material?

> ---
>  drivers/net/tun.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index a65779c6d72f..ce8620f3ea5e 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
>  
>  	if (!rx_batched || (!more && skb_queue_empty(queue))) {
>  		local_bh_disable();
> +		skb_record_rx_queue(skb, tfile->queue_index);
>  		netif_receive_skb(skb);
>  		local_bh_enable();
>  		return;
> @@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
>  		struct sk_buff *nskb;
>  
>  		local_bh_disable();
> -		while ((nskb = __skb_dequeue(&process_queue)))
> +		while ((nskb = __skb_dequeue(&process_queue))) {
> +			skb_record_rx_queue(nskb, tfile->queue_index);
>  			netif_receive_skb(nskb);
> +		}
> +		skb_record_rx_queue(skb, tfile->queue_index);
>  		netif_receive_skb(skb);
>  		local_bh_enable();
>  	}
> @@ -2452,6 +2456,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	    !tfile->detached)
>  		rxhash = __skb_get_hash_symmetric(skb);
>  
> +	skb_record_rx_queue(skb, tfile->queue_index);
>  	netif_receive_skb(skb);
>  
>  	stats = get_cpu_ptr(tun->pcpu_stats);
> -- 
> 2.15.2 (Apple Git-101.1)

Matt Cover Nov. 16, 2018, 8:45 p.m. UTC | #3
On Fri, Nov 16, 2018 at 1:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Fri, Nov 16, 2018 at 12:00:15AM -0700, Matthew Cover wrote:
> > When writing packets to a descriptor associated with a combined queue, the
> > packets should end up on that queue.
> >
> > Before this change all packets written to any descriptor associated with a
> > tap interface end up on rx-0, even when the descriptor is associated with a
> > different queue.
> >
> > The rx traffic can be generated by either of the following.
> >   1. a simple tap program which spins up multiple queues and writes packets
> >      to each of the file descriptors
> >   2. tx from a qemu vm with a tap multiqueue netdev
> >
> > The queue for rx traffic can be observed by either of the following (done
> > on the hypervisor in the qemu case).
> >   1. a simple netmap program which opens and reads from per-queue
> >      descriptors
> >   2. configuring RPS and doing per-cpu captures with rxtxcpu
> >
> > Alternatively, if you printk() the return value of skb_get_rx_queue() just
> > before each instance of netif_receive_skb() in tun.c, you will get 65535
> > for every skb.
> >
> > Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
> > the association between descriptor and rx queue.
> >
> > Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> stable material?
>

Yes, I believe so.

The documentation below I think justifies classifying this as a fix.
https://github.com/torvalds/linux/blob/v4.19/Documentation/networking/tuntap.txt#L111

> > ---
> >  drivers/net/tun.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> > index a65779c6d72f..ce8620f3ea5e 100644
> > --- a/drivers/net/tun.c
> > +++ b/drivers/net/tun.c
> > @@ -1536,6 +1536,7 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
> >
> >       if (!rx_batched || (!more && skb_queue_empty(queue))) {
> >               local_bh_disable();
> > +             skb_record_rx_queue(skb, tfile->queue_index);
> >               netif_receive_skb(skb);
> >               local_bh_enable();
> >               return;
> > @@ -1555,8 +1556,11 @@ static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
> >               struct sk_buff *nskb;
> >
> >               local_bh_disable();
> > -             while ((nskb = __skb_dequeue(&process_queue)))
> > +             while ((nskb = __skb_dequeue(&process_queue))) {
> > +                     skb_record_rx_queue(nskb, tfile->queue_index);
> >                       netif_receive_skb(nskb);
> > +             }
> > +             skb_record_rx_queue(skb, tfile->queue_index);
> >               netif_receive_skb(skb);
> >               local_bh_enable();
> >       }
> > @@ -2452,6 +2456,7 @@ static int tun_xdp_one(struct tun_struct *tun,
> >           !tfile->detached)
> >               rxhash = __skb_get_hash_symmetric(skb);
> >
> > +     skb_record_rx_queue(skb, tfile->queue_index);
> >       netif_receive_skb(skb);
> >
> >       stats = get_cpu_ptr(tun->pcpu_stats);
> > --
> > 2.15.2 (Apple Git-101.1)

David Miller Nov. 18, 2018, 5:11 a.m. UTC | #4
From: Matthew Cover <werekraken@gmail.com>
Date: Fri, 16 Nov 2018 00:00:15 -0700

> When writing packets to a descriptor associated with a combined queue, the
> packets should end up on that queue.
> 
> Before this change all packets written to any descriptor associated with a
> tap interface end up on rx-0, even when the descriptor is associated with a
> different queue.
> 
> The rx traffic can be generated by either of the following.
>   1. a simple tap program which spins up multiple queues and writes packets
>      to each of the file descriptors
>   2. tx from a qemu vm with a tap multiqueue netdev
> 
> The queue for rx traffic can be observed by either of the following (done
> on the hypervisor in the qemu case).
>   1. a simple netmap program which opens and reads from per-queue
>      descriptors
>   2. configuring RPS and doing per-cpu captures with rxtxcpu
> 
> Alternatively, if you printk() the return value of skb_get_rx_queue() just
> before each instance of netif_receive_skb() in tun.c, you will get 65535
> for every skb.
> 
> Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
> the association between descriptor and rx queue.
> 
> Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>

If this is intended to target -stable as well, which some responses seem to
indicate, you need to respin and submit this against 'net'.

Thanks.

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a65779c6d72f..ce8620f3ea5e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1536,6 +1536,7 @@  static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 
 	if (!rx_batched || (!more && skb_queue_empty(queue))) {
 		local_bh_disable();
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 		return;
@@ -1555,8 +1556,11 @@  static void tun_rx_batched(struct tun_struct *tun, struct tun_file *tfile,
 		struct sk_buff *nskb;
 
 		local_bh_disable();
-		while ((nskb = __skb_dequeue(&process_queue)))
+		while ((nskb = __skb_dequeue(&process_queue))) {
+			skb_record_rx_queue(nskb, tfile->queue_index);
 			netif_receive_skb(nskb);
+		}
+		skb_record_rx_queue(skb, tfile->queue_index);
 		netif_receive_skb(skb);
 		local_bh_enable();
 	}
@@ -2452,6 +2456,7 @@  static int tun_xdp_one(struct tun_struct *tun,
 	    !tfile->detached)
 		rxhash = __skb_get_hash_symmetric(skb);
 
+	skb_record_rx_queue(skb, tfile->queue_index);
 	netif_receive_skb(skb);
 
 	stats = get_cpu_ptr(tun->pcpu_stats);