diff mbox

netdev: handle setting transmit queue length on active device.

Message ID 20090318100237.06e974c9@nehalam
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger March 18, 2009, 5:02 p.m. UTC
Some virtual devices like VLAN's or bridges by default have not transmit queue.
In order to do queuing disciplines on these devices a queue must be added
by setting transmit queue length. The kernel allows doing this at any time, but
setting the queue length value is not enough. Setting the queue length does
not create the transmit queue (going from 0 to non-zero) or destroy it
(setting to zero); the queue is only created (or destroyed) when device
comes up (or goes down).

This patch handles this be doing the necessary activations.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 include/linux/netdevice.h |    2 ++
 net/core/dev.c            |   19 +++++++++++++++++--
 net/core/net-sysfs.c      |    5 ++---
 net/core/rtnetlink.c      |    2 +-
 4 files changed, 22 insertions(+), 6 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Patrick McHardy March 18, 2009, 5:15 p.m. UTC | #1
Stephen Hemminger wrote:
> Some virtual devices like VLAN's or bridges by default have not transmit queue.
> In order to do queuing disciplines on these devices a queue must be added
> by setting transmit queue length. The kernel allows doing this at any time, but
> setting the queue length value is not enough. Setting the queue length does
> not create the transmit queue (going from 0 to non-zero) or destroy it
> (setting to zero); the queue is only created (or destroyed) when device
> comes up (or goes down).
> 
> This patch handles this be doing the necessary activations.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> +/**
> + *
> + *	@dev: device
> + *	@len: limit of frames in transmit queue
> + *
> + *	Set ifalias for a device,
> + */
> +int dev_set_tx_queue_len(struct net_device *dev, unsigned long len)
> +{
> +	if (dev->flags & IFF_UP)
> +		dev_deactivate(dev);
> +	dev->tx_queue_len = len;
> +	if (dev->flags & IFF_UP)
> +		dev_activate(dev);
> +	return 0;
> +}


That seems quite heavy-weight in case of hierarchical qdiscs, it will
walk and reset the entire tree and looses all state, including packets
as a side effect. Its also a bit inconsistent since it only affects
the default pfifos at the root, not the inner default qdiscs if I'm
not mistaken.

A less intrusive way would be to simply use dev->tx_queue_len directly
in pfifo when no limit is configured. That would require to make a
runtime decision for each packet though. Two more alternatives:

- add a special pfifo variant that always uses tx_queue_len. Duplication
   comes down to pfifo_enqueue (a few lines) and a new struct Qdisc_ops

- if you really only care about the root, reconfigure it explicitly
   using fifo_set_limit()




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet March 18, 2009, 5:18 p.m. UTC | #2
Stephen Hemminger a écrit :
> Some virtual devices like VLAN's or bridges by default have not transmit queue.
> In order to do queuing disciplines on these devices a queue must be added
> by setting transmit queue length. The kernel allows doing this at any time, but
> setting the queue length value is not enough. Setting the queue length does
> not create the transmit queue (going from 0 to non-zero) or destroy it
> (setting to zero); the queue is only created (or destroyed) when device
> comes up (or goes down).
> 
> This patch handles this be doing the necessary activations.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
>  include/linux/netdevice.h |    2 ++
>  net/core/dev.c            |   19 +++++++++++++++++--
>  net/core/net-sysfs.c      |    5 ++---
>  net/core/rtnetlink.c      |    2 +-
>  4 files changed, 22 insertions(+), 6 deletions(-)
> 
> --- a/include/linux/netdevice.h	2009-03-17 11:06:40.358479573 -0700
> +++ b/include/linux/netdevice.h	2009-03-17 11:13:55.243604979 -0700
> @@ -1436,6 +1436,8 @@ extern int		dev_change_net_namespace(str
>  extern int		dev_set_mtu(struct net_device *, int);
>  extern int		dev_set_mac_address(struct net_device *,
>  					    struct sockaddr *);
> +extern int		dev_set_tx_queue_len(struct net_device *dev,
> +					     unsigned long len);
>  extern int		dev_hard_start_xmit(struct sk_buff *skb,
>  					    struct net_device *dev,
>  					    struct netdev_queue *txq);
> --- a/net/core/dev.c	2009-03-17 11:05:54.919541341 -0700
> +++ b/net/core/dev.c	2009-03-17 11:16:25.717479320 -0700
> @@ -983,6 +983,22 @@ int dev_set_alias(struct net_device *dev
>  	return len;
>  }
>  
> +/**
> + *
> + *	@dev: device
> + *	@len: limit of frames in transmit queue
> + *
> + *	Set ifalias for a device,

copy/pasted from "dev_set_alias - change ifalias of a device" ? ;)

> + */
> +int dev_set_tx_queue_len(struct net_device *dev, unsigned long len)
> +{

if (dev->tx_queue_len == len)
	return 0;

> +	if (dev->flags & IFF_UP)
> +		dev_deactivate(dev);

So changing txqueuelen will drop frames ?


> +	dev->tx_queue_len = len;
> +	if (dev->flags & IFF_UP)
> +		dev_activate(dev);
> +	return 0;
> +}
>  
> 



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger March 18, 2009, 6 p.m. UTC | #3
On Wed, 18 Mar 2009 18:18:44 +0100
Eric Dumazet <dada1@cosmosbay.com> wrote:

> Stephen Hemminger a écrit :
> > Some virtual devices like VLAN's or bridges by default have not transmit queue.
> > In order to do queuing disciplines on these devices a queue must be added
> > by setting transmit queue length. The kernel allows doing this at any time, but
> > setting the queue length value is not enough. Setting the queue length does
> > not create the transmit queue (going from 0 to non-zero) or destroy it
> > (setting to zero); the queue is only created (or destroyed) when device
> > comes up (or goes down).
> > 
> > This patch handles this be doing the necessary activations.
> > 
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > ---
> >  include/linux/netdevice.h |    2 ++
> >  net/core/dev.c            |   19 +++++++++++++++++--
> >  net/core/net-sysfs.c      |    5 ++---
> >  net/core/rtnetlink.c      |    2 +-
> >  4 files changed, 22 insertions(+), 6 deletions(-)
> > 
> > --- a/include/linux/netdevice.h	2009-03-17 11:06:40.358479573 -0700
> > +++ b/include/linux/netdevice.h	2009-03-17 11:13:55.243604979 -0700
> > @@ -1436,6 +1436,8 @@ extern int		dev_change_net_namespace(str
> >  extern int		dev_set_mtu(struct net_device *, int);
> >  extern int		dev_set_mac_address(struct net_device *,
> >  					    struct sockaddr *);
> > +extern int		dev_set_tx_queue_len(struct net_device *dev,
> > +					     unsigned long len);
> >  extern int		dev_hard_start_xmit(struct sk_buff *skb,
> >  					    struct net_device *dev,
> >  					    struct netdev_queue *txq);
> > --- a/net/core/dev.c	2009-03-17 11:05:54.919541341 -0700
> > +++ b/net/core/dev.c	2009-03-17 11:16:25.717479320 -0700
> > @@ -983,6 +983,22 @@ int dev_set_alias(struct net_device *dev
> >  	return len;
> >  }
> >  
> > +/**
> > + *
> > + *	@dev: device
> > + *	@len: limit of frames in transmit queue
> > + *
> > + *	Set ifalias for a device,
> 
> copy/pasted from "dev_set_alias - change ifalias of a device" ? ;)
> 
> > + */
> > +int dev_set_tx_queue_len(struct net_device *dev, unsigned long len)
> > +{
> 
> if (dev->tx_queue_len == len)
> 	return 0;

Also could do:
         
if ((dev->tx_queue_len != 0) && (len != 0)) {
      dev->tx_queue_len = len;
      return 0;
}
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger March 18, 2009, 6:02 p.m. UTC | #4
On Wed, 18 Mar 2009 18:15:38 +0100
Patrick McHardy <kaber@trash.net> wrote:

> Stephen Hemminger wrote:
> > Some virtual devices like VLAN's or bridges by default have not transmit queue.
> > In order to do queuing disciplines on these devices a queue must be added
> > by setting transmit queue length. The kernel allows doing this at any time, but
> > setting the queue length value is not enough. Setting the queue length does
> > not create the transmit queue (going from 0 to non-zero) or destroy it
> > (setting to zero); the queue is only created (or destroyed) when device
> > comes up (or goes down).
> > 
> > This patch handles this be doing the necessary activations.
> > 
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > +/**
> > + *
> > + *	@dev: device
> > + *	@len: limit of frames in transmit queue
> > + *
> > + *	Set ifalias for a device,
> > + */
> > +int dev_set_tx_queue_len(struct net_device *dev, unsigned long len)
> > +{
> > +	if (dev->flags & IFF_UP)
> > +		dev_deactivate(dev);
> > +	dev->tx_queue_len = len;
> > +	if (dev->flags & IFF_UP)
> > +		dev_activate(dev);
> > +	return 0;
> > +}
> 
> 
> That seems quite heavy-weight in case of hierarchical qdiscs, it will
> walk and reset the entire tree and looses all state, including packets
> as a side effect. Its also a bit inconsistent since it only affects
> the default pfifos at the root, not the inner default qdiscs if I'm
> not mistaken.
> 
> A less intrusive way would be to simply use dev->tx_queue_len directly
> in pfifo when no limit is configured. That would require to make a
> runtime decision for each packet though. Two more alternatives:
> 
> - add a special pfifo variant that always uses tx_queue_len. Duplication
>    comes down to pfifo_enqueue (a few lines) and a new struct Qdisc_ops
> 
> - if you really only care about the root, reconfigure it explicitly
>    using fifo_set_limit()
> 

The problem is when setting limit to 0 to force noop and setting it
to non-zero to force pfifo_fast back again.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy March 18, 2009, 6:08 p.m. UTC | #5
Stephen Hemminger wrote:
> On Wed, 18 Mar 2009 18:15:38 +0100
> Patrick McHardy <kaber@trash.net> wrote:
> 
>> Stephen Hemminger wrote:
>>> Some virtual devices like VLAN's or bridges by default have not transmit queue.
>>> In order to do queuing disciplines on these devices a queue must be added
>>> by setting transmit queue length. The kernel allows doing this at any time, but
>>> setting the queue length value is not enough. Setting the queue length does
>>> not create the transmit queue (going from 0 to non-zero) or destroy it
>>> (setting to zero); the queue is only created (or destroyed) when device
>>> comes up (or goes down).
>>>
>>> This patch handles this be doing the necessary activations.
>>>
>> That seems quite heavy-weight in case of hierarchical qdiscs, it will
>> walk and reset the entire tree and looses all state, including packets
>> as a side effect. Its also a bit inconsistent since it only affects
>> the default pfifos at the root, not the inner default qdiscs if I'm
>> not mistaken.
>>
>> A less intrusive way would be to simply use dev->tx_queue_len directly
>> in pfifo when no limit is configured. That would require to make a
>> runtime decision for each packet though. Two more alternatives:
>>
>> - add a special pfifo variant that always uses tx_queue_len. Duplication
>>    comes down to pfifo_enqueue (a few lines) and a new struct Qdisc_ops
>>
>> - if you really only care about the root, reconfigure it explicitly
>>    using fifo_set_limit()
>>
> 
> The problem is when setting limit to 0 to force noop and setting it
> to non-zero to force pfifo_fast back again.


I'd suggest to simply attach a pfifo qdisc with an explicitly
configured limit to activate queueing on virtual devices.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger March 18, 2009, 6:16 p.m. UTC | #6
On Wed, 18 Mar 2009 19:08:07 +0100
Patrick McHardy <kaber@trash.net> wrote:

> Stephen Hemminger wrote:
> > On Wed, 18 Mar 2009 18:15:38 +0100
> > Patrick McHardy <kaber@trash.net> wrote:
> > 
> >> Stephen Hemminger wrote:
> >>> Some virtual devices like VLAN's or bridges by default have not transmit queue.
> >>> In order to do queuing disciplines on these devices a queue must be added
> >>> by setting transmit queue length. The kernel allows doing this at any time, but
> >>> setting the queue length value is not enough. Setting the queue length does
> >>> not create the transmit queue (going from 0 to non-zero) or destroy it
> >>> (setting to zero); the queue is only created (or destroyed) when device
> >>> comes up (or goes down).
> >>>
> >>> This patch handles this be doing the necessary activations.
> >>>
> >> That seems quite heavy-weight in case of hierarchical qdiscs, it will
> >> walk and reset the entire tree and looses all state, including packets
> >> as a side effect. Its also a bit inconsistent since it only affects
> >> the default pfifos at the root, not the inner default qdiscs if I'm
> >> not mistaken.
> >>
> >> A less intrusive way would be to simply use dev->tx_queue_len directly
> >> in pfifo when no limit is configured. That would require to make a
> >> runtime decision for each packet though. Two more alternatives:
> >>
> >> - add a special pfifo variant that always uses tx_queue_len. Duplication
> >>    comes down to pfifo_enqueue (a few lines) and a new struct Qdisc_ops
> >>
> >> - if you really only care about the root, reconfigure it explicitly
> >>    using fifo_set_limit()
> >>
> > 
> > The problem is when setting limit to 0 to force noop and setting it
> > to non-zero to force pfifo_fast back again.
> 
> 
> I'd suggest to simply attach a pfifo qdisc with an explicitly
> configured limit to activate queueing on virtual devices.

The existing code is broken if user does:

# ip li set dev eth0.200 txq 200
# tc qdisc set dev eth0.200 root sfq
# tc qdisc del dev eth0.200 root
# ip li set dev eth0.200 txq 0

The vlan is now dead because it is impossible to go back
to the no queueing (noop) qdisc.  Also, when doing performance tests
it is useful to be able to turn off transmit queue entirely.

Other alternative would be to reject changes to txq when device is
up, but that probably would break existing users. And would be a pain
in the ass when trying to configure PPP and tunnel endpoints.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy March 18, 2009, 6:31 p.m. UTC | #7
Stephen Hemminger wrote:
> On Wed, 18 Mar 2009 19:08:07 +0100
> Patrick McHardy <kaber@trash.net> wrote:
> 
>> I'd suggest to simply attach a pfifo qdisc with an explicitly
>> configured limit to activate queueing on virtual devices.
> 
> The existing code is broken if user does:
> 
> # ip li set dev eth0.200 txq 200
> # tc qdisc set dev eth0.200 root sfq
> # tc qdisc del dev eth0.200 root
> # ip li set dev eth0.200 txq 0
> 
> The vlan is now dead because it is impossible to go back
> to the no queueing (noop) qdisc.  Also, when doing performance tests
> it is useful to be able to turn off transmit queue entirely.

What I meant is:

# ip l l dummy0
109: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state 
UNKNOWN
     link/ether 0a:94:90:3c:e3:45 brd ff:ff:ff:ff:ff:ff

# tc qdisc add dev dummy0 root pfifo limit 1000
# ip l l dummy0
109: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo state 
UNKNOWN
     link/ether 0a:94:90:3c:e3:45 brd ff:ff:ff:ff:ff:ff

# tc qdisc del dev dummy0 root
# ip l l dummy0
109: dummy0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state 
UNKNOWN
     link/ether 0a:94:90:3c:e3:45 brd ff:ff:ff:ff:ff:ff

It should work your way as well if you set the transmit queue length
to zero before removing the root qdisc.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/include/linux/netdevice.h	2009-03-17 11:06:40.358479573 -0700
+++ b/include/linux/netdevice.h	2009-03-17 11:13:55.243604979 -0700
@@ -1436,6 +1436,8 @@  extern int		dev_change_net_namespace(str
 extern int		dev_set_mtu(struct net_device *, int);
 extern int		dev_set_mac_address(struct net_device *,
 					    struct sockaddr *);
+extern int		dev_set_tx_queue_len(struct net_device *dev,
+					     unsigned long len);
 extern int		dev_hard_start_xmit(struct sk_buff *skb,
 					    struct net_device *dev,
 					    struct netdev_queue *txq);
--- a/net/core/dev.c	2009-03-17 11:05:54.919541341 -0700
+++ b/net/core/dev.c	2009-03-17 11:16:25.717479320 -0700
@@ -983,6 +983,22 @@  int dev_set_alias(struct net_device *dev
 	return len;
 }
 
+/**
+ *
+ *	@dev: device
+ *	@len: limit of frames in transmit queue
+ *
+ *	Set ifalias for a device,
+ */
+int dev_set_tx_queue_len(struct net_device *dev, unsigned long len)
+{
+	if (dev->flags & IFF_UP)
+		dev_deactivate(dev);
+	dev->tx_queue_len = len;
+	if (dev->flags & IFF_UP)
+		dev_activate(dev);
+	return 0;
+}
 
 /**
  *	netdev_features_change - device changes features
@@ -3975,8 +3991,7 @@  static int dev_ifsioc(struct net *net, s
 		case SIOCSIFTXQLEN:
 			if (ifr->ifr_qlen < 0)
 				return -EINVAL;
-			dev->tx_queue_len = ifr->ifr_qlen;
-			return 0;
+			return dev_set_tx_queue_len(dev, ifr->ifr_qlen);
 
 		case SIOCSIFNAME:
 			ifr->ifr_newname[IFNAMSIZ-1] = '\0';
--- a/net/core/net-sysfs.c	2009-03-17 11:05:54.896453584 -0700
+++ b/net/core/net-sysfs.c	2009-03-17 11:10:10.877230558 -0700
@@ -198,10 +198,9 @@  static ssize_t store_flags(struct device
 
 NETDEVICE_SHOW(tx_queue_len, fmt_ulong);
 
-static int change_tx_queue_len(struct net_device *net, unsigned long new_len)
+static int change_tx_queue_len(struct net_device *netdev, unsigned long new_len)
 {
-	net->tx_queue_len = new_len;
-	return 0;
+	return dev_set_tx_queue_len(netdev, new_len);
 }
 
 static ssize_t store_tx_queue_len(struct device *dev,
--- a/net/core/rtnetlink.c	2009-03-17 11:10:47.540541648 -0700
+++ b/net/core/rtnetlink.c	2009-03-17 11:11:28.722542409 -0700
@@ -885,7 +885,7 @@  static int do_setlink(struct net_device 
 	}
 
 	if (tb[IFLA_TXQLEN])
-		dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]);
+		dev_set_tx_queue_len(dev, nla_get_u32(tb[IFLA_TXQLEN]));
 
 	if (tb[IFLA_OPERSTATE])
 		set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE]));