Message ID | 534B3A33.9090401@adjacentlink.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
From: Steven Galgano <sgalgano@adjacentlink.com> Date: Sun, 13 Apr 2014 21:30:27 -0400 > Added optional per queue flow control support using IFF_FLOW_CONTROL. When the IFF_FLOW_CONTROL TUNSETIFF flag is specified it will set a per queue flag to indicate that the queue should be stopped using netif_tx_stop_queue(), rather than discarding frames once full. After reading a frame from the respective stopped queue, a netif_tx_wake_queue() is issued to signal resource availability. > > The per queue TUN_FLOW_CONTROL flag is stored in struct tun_file. This provides the flexibility to enable flow control on all, none or some queues when using IFF_MULTI_QUEUE. When not using IFF_MULTI_QUEUE, IFF_FLOW_CONTROL will apply to the single queue. No changes were made to the default drop frame policy. > > This change adds support for back pressure use cases. > > Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil> > Tested-by: Joseph Giovatto <jgiovatto@djacentlink.com> > Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com> Please format your commit messages to ~80 columns of text. It won't be automatically formatted by GIT and in fact it looks ugly with all the wrapping in text based tools. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 04/13/2014 09:40 PM, David Miller wrote: > From: Steven Galgano <sgalgano@adjacentlink.com> > Date: Sun, 13 Apr 2014 21:30:27 -0400 > >> Added optional per queue flow control support using IFF_FLOW_CONTROL. When the IFF_FLOW_CONTROL TUNSETIFF flag is specified it will set a per queue flag to indicate that the queue should be stopped using netif_tx_stop_queue(), rather than discarding frames once full. After reading a frame from the respective stopped queue, a netif_tx_wake_queue() is issued to signal resource availability. >> >> The per queue TUN_FLOW_CONTROL flag is stored in struct tun_file. This provides the flexibility to enable flow control on all, none or some queues when using IFF_MULTI_QUEUE. When not using IFF_MULTI_QUEUE, IFF_FLOW_CONTROL will apply to the single queue. No changes were made to the default drop frame policy. >> >> This change adds support for back pressure use cases. >> >> Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil> >> Tested-by: Joseph Giovatto <jgiovatto@djacentlink.com> >> Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com> > > Please format your commit messages to ~80 columns of text. > > It won't be automatically formatted by GIT and in fact it looks ugly > with all the wrapping in text based tools. > Added optional per queue flow control support using IFF_FLOW_CONTROL. When the IFF_FLOW_CONTROL TUNSETIFF flag is specified it will set a per queue flag to indicate that the queue should be stopped using netif_tx_stop_queue(), rather than discarding frames once full. After reading a frame from the respective stopped queue, a netif_tx_wake_queue() is issued to signal resource availability. The per queue TUN_FLOW_CONTROL flag is stored in struct tun_file. This provides the flexibility to enable flow control on all, none or some queues when using IFF_MULTI_QUEUE. When not using IFF_MULTI_QUEUE, IFF_FLOW_CONTROL will apply to the single queue. No changes were made to the default drop frame policy. This change adds support for back pressure use cases. Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil> Tested-by: Joseph Giovatto <jgiovatto@djacentlink.com> Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Please do not post new versions of a patch as a reply to an existing discussion. Instead, post a fresh new email for the patch. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Apr 14, 2014 at 12:19:14AM -0400, Steven Galgano wrote: > On 04/13/2014 09:40 PM, David Miller wrote: > > From: Steven Galgano <sgalgano@adjacentlink.com> > > Date: Sun, 13 Apr 2014 21:30:27 -0400 > > > >> Added optional per queue flow control support using IFF_FLOW_CONTROL. When the IFF_FLOW_CONTROL TUNSETIFF flag is specified it will set a per queue flag to indicate that the queue should be stopped using netif_tx_stop_queue(), rather than discarding frames once full. After reading a frame from the respective stopped queue, a netif_tx_wake_queue() is issued to signal resource availability. > >> > >> The per queue TUN_FLOW_CONTROL flag is stored in struct tun_file. This provides the flexibility to enable flow control on all, none or some queues when using IFF_MULTI_QUEUE. When not using IFF_MULTI_QUEUE, IFF_FLOW_CONTROL will apply to the single queue. No changes were made to the default drop frame policy. > >> > >> This change adds support for back pressure use cases. > >> > >> Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil> > >> Tested-by: Joseph Giovatto <jgiovatto@djacentlink.com> > >> Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com> > > > > Please format your commit messages to ~80 columns of text. > > > > It won't be automatically formatted by GIT and in fact it looks ugly > > with all the wrapping in text based tools. > > > Added optional per queue flow control support using IFF_FLOW_CONTROL. When the > IFF_FLOW_CONTROL TUNSETIFF flag is specified it will set a per queue flag to > indicate that the queue should be stopped using netif_tx_stop_queue(), rather > than discarding frames once full. After reading a frame from the respective > stopped queue, a netif_tx_wake_queue() is issued to signal resource > availability. > > The per queue TUN_FLOW_CONTROL flag is stored in struct tun_file. This provides > the flexibility to enable flow control on all, none or some queues when using > IFF_MULTI_QUEUE. When not using IFF_MULTI_QUEUE, IFF_FLOW_CONTROL will apply to > the single queue. No changes were made to the default drop frame policy. > > This change adds support for back pressure use cases. > > Reported-by: Brian Adamson <brian.adamson@nrl.navy.mil> > Tested-by: Joseph Giovatto <jgiovatto@djacentlink.com> > Signed-off-by: Steven Galgano <sgalgano@adjacentlink.com> > Yes, doing this per-queue is an improvement - but still not ideal. The issue here is still that a daemon would set the flow control because it deals with a specific protocol - and affect behaviour of all potentially unrelated applications - they might deadlock simply because a packet made its way to tun's qdisc. I do understand this simple patch works for you for simple setups where there's more or less a single tun device in the system, but I think it's worth it to try to address a slightly bigger issue: both to make the feature more generally applicable, and to prevent potential problems in case where it's not. I've been thinking about ways to address it issue, and I think I see a way, though still a bit vaguely. Basically, I think the problem would be at least partially solved if we use a queue that limits packet delay (if I understand correctly e.g. codel doesn't do this exactly in that it won't drop packets if queues never fill up). So instead of QUEUE_STATE_DRV_XOFF tun would set some other bit, that means "we want to stop transmit queue but can't guarantee we'll restart it within reasonable time". This bit would also have to be rate-limited, to prevent not just full deadlock but also performance degradation. This could then be done unconditionally: instead of setting the FLOW_CONTROL bit user would configure this "flow control" queue. Sorry about posting such rough thoughts, no patch yet but I thought this might be preferable to silence as I'm going offline for a couple of days.
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index ee328ba..3d09f5a 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -137,7 +137,7 @@ struct tun_file { struct tun_struct __rcu *tun; struct net *net; struct fasync_struct *fasync; - /* only used for fasnyc */ + /* used for fasnyc and flow control */ unsigned int flags; union { u16 queue_index; @@ -783,8 +783,19 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev) * number of queues. */ if (skb_queue_len(&tfile->socket.sk->sk_receive_queue) * numqueues - >= dev->tx_queue_len) - goto drop; + >= dev->tx_queue_len) { + if (tfile->flags & TUN_FLOW_CONTROL) { + /* Resources unavailable stop queue */ + netif_tx_stop_queue(netdev_get_tx_queue(dev, txq)); + + /* We won't see all dropped packets individually, so + * over run error is more appropriate. + */ + dev->stats.tx_fifo_errors++; + } else { + goto drop; + } + } if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC))) goto drop; @@ -1333,6 +1344,7 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile, DECLARE_WAITQUEUE(wait, current); struct sk_buff *skb; ssize_t ret = 0; + struct netdev_queue *ntxq; tun_debug(KERN_INFO, tun, "tun_do_read\n"); @@ -1362,6 +1374,12 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile, continue; } + ntxq = netdev_get_tx_queue(tun->dev, tfile->queue_index); + + if (tfile->flags & TUN_FLOW_CONTROL && + netif_tx_queue_stopped(ntxq)) + netif_tx_wake_queue(ntxq); + ret = tun_put_user(tun, tfile, skb, iv, len); kfree_skb(skb); break; @@ -1732,6 +1750,11 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr) else tun->flags &= ~TUN_TAP_MQ; + if (ifr->ifr_flags & IFF_FLOW_CONTROL) + tfile->flags |= TUN_FLOW_CONTROL; + else + tfile->flags &= ~TUN_FLOW_CONTROL; + /* Make sure persistent devices do not get stuck in * xoff state. */ @@ -1900,7 +1923,8 @@ static long __tun_chr_ioctl(struct file *file, unsigned int cmd, * This is needed because we never checked for invalid flags on * TUNSETIFF. */ return put_user(IFF_TUN | IFF_TAP | IFF_NO_PI | IFF_ONE_QUEUE | - IFF_VNET_HDR | IFF_MULTI_QUEUE, + IFF_VNET_HDR | IFF_MULTI_QUEUE | + IFF_FLOW_CONTROL, (unsigned int __user*)argp); } else if (cmd == TUNSETQUEUE) return tun_set_queue(file, &ifr); diff --git a/include/uapi/linux/if_tun.h b/include/uapi/linux/if_tun.h index e9502dd..bcf2790 100644 --- a/include/uapi/linux/if_tun.h +++ b/include/uapi/linux/if_tun.h @@ -36,6 +36,7 @@ #define TUN_PERSIST 0x0100 #define TUN_VNET_HDR 0x0200 #define TUN_TAP_MQ 0x0400 +#define TUN_FLOW_CONTROL 0x0800 /* Ioctl defines */ #define TUNSETNOCSUM _IOW('T', 200, int) @@ -70,6 +71,7 @@ #define IFF_MULTI_QUEUE 0x0100 #define IFF_ATTACH_QUEUE 0x0200 #define IFF_DETACH_QUEUE 0x0400 +#define IFF_FLOW_CONTROL 0x0010 /* read-only flag */ #define IFF_PERSIST 0x0800 #define IFF_NOFILTER 0x1000