Message ID: 1438203103-27013-2-git-send-email-phil@nwl.cc
State: Changes Requested, archived
Delegated to: David Miller
Phil Sutter <phil@nwl.cc> wrote:
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  net/Kconfig        | 12 ++++++++++++
>  net/ethernet/eth.c |  2 +-
>  2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/net/Kconfig b/net/Kconfig
> index 7021c1b..21c164f 100644
> --- a/net/Kconfig
> +++ b/net/Kconfig
> @@ -48,6 +48,18 @@ config COMPAT_NETLINK_MESSAGES
>  config NET_INGRESS
>  	bool
>
> +config DEFAULT_TX_QUEUE_LEN
> +	prompt "Default TX queue length (in packets)" if EXPERT
> +	int
> +	default 1000 # Ethernet wants good queues
> +	help
> +	  Set the default value of tx_queue_len for newly created network
> +	  interfaces. It is used by queueing disciplines to determine how
> +	  many packets to keep in backlog before starting to drop new ones.
> +
> +	  The default value of 1000 packets has been in place for a very
> +	  long time and, in combination with GSO, is way too big.
> +

I can't see how this could be used in a meaningful way.

No distro is going to touch this.

I don't think a sysctl value would help either.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 29, 2015 at 11:06:18PM +0200, Florian Westphal wrote:
> Phil Sutter <phil@nwl.cc> wrote:
[...]
> > +config DEFAULT_TX_QUEUE_LEN
> > +	prompt "Default TX queue length (in packets)" if EXPERT
> > +	int
> > +	default 1000 # Ethernet wants good queues
> > +	help
> > +	  Set the default value of tx_queue_len for newly created network
> > +	  interfaces. It is used by queueing disciplines to determine how
> > +	  many packets to keep in backlog before starting to drop new ones.
> > +
> > +	  The default value of 1000 packets has been in place for a very
> > +	  long time and, in combination with GSO, is way too big.
> > +
>
> I can't see how this could be used in a meaningful way.
>
> No distro is going to touch this.
>
> I don't think a sysctl value would help either.

I just didn't want to introduce yet another magic value assignment. It's
merely a #define with a little flexibility and a subtle note attached
that the default should be changed.

Cheers, Phil
From: Phil Sutter <phil@nwl.cc>
Date: Wed, 29 Jul 2015 23:34:28 +0200

> On Wed, Jul 29, 2015 at 11:06:18PM +0200, Florian Westphal wrote:
>> Phil Sutter <phil@nwl.cc> wrote:
> [...]
>> > +config DEFAULT_TX_QUEUE_LEN
>> > +	prompt "Default TX queue length (in packets)" if EXPERT
>> > +	int
>> > +	default 1000 # Ethernet wants good queues
>> > +	help
>> > +	  Set the default value of tx_queue_len for newly created network
>> > +	  interfaces. It is used by queueing disciplines to determine how
>> > +	  many packets to keep in backlog before starting to drop new ones.
>> > +
>> > +	  The default value of 1000 packets has been in place for a very
>> > +	  long time and, in combination with GSO, is way too big.
>> > +
>>
>> I can't see how this could be used in a meaningful way.
>>
>> No distro is going to touch this.
>>
>> I don't think a sysctl value would help either.
>
> I just didn't want to introduce yet another magic value assignment. It's
> merely a #define with a little flexibility and a subtle note attached
> that the default should be changed.

Like others have mentioned, fix the _REAL_ issue.

Which is that there are devices (virtual or whatever) which don't want a
qdisc attached no matter what. Flag those devices as such and adjust the
qdisc attachment logic to check that new flag.

Anything is better than hacking the queue len.
On Wed, 29 Jul 2015 14:37:31 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
[...]
> Which is that there are devices (virtual or whatever) which don't want
> a qdisc attached no matter what. Flag those devices as such and
> adjust the qdisc attachment logic to check that new flag.

I agree with the approach DaveM is suggesting.

But virtual devices must support getting a qdisc attached. I know of
many companies depending on this behavior. Sometimes people just get hit
by this "strange" zero-len issue when they happen to use pfifo_fast as a
leaf node.

> Anything is better than hacking the queue len.

The whole problem comes from the double meaning of the queue len, e.g.
that the value 0 has special meaning, but only while assigning the
default qdisc. And pfifo_fast will use queue len zero if assigned.

(proposed solution:)

As DaveM also suggested, I would likely use a device flag to indicate
that the device does not require any qdisc, and not assign any qdisc
(actually "noqueue") in case the default qdisc is chosen for this
device.

This should solve the problem for veth. And then we should clean up all
the virtual devices, adding this flag and changing dev->tx_queue_len to
the default value (e.g. remove setting it to zero).
On Tue, Aug 11, 2015 at 05:48:07PM +0200, Jesper Dangaard Brouer wrote:
>
> On Wed, 29 Jul 2015 14:37:31 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
> [...]
> > Which is that there are devices (virtual or whatever) which don't want
> > a qdisc attached no matter what. Flag those devices as such and
> > adjust the qdisc attachment logic to check that new flag.
>
> I agree with the approach DaveM is suggesting.
>
> But virtual devices must support getting a qdisc attached. I know of
> many companies depending on this behavior. Sometimes people just get
> hit by this "strange" zero-len issue when they happen to use
> pfifo_fast as a leaf node.

That was just how I (purposely mis-)interpreted David's suggestion.

> > Anything is better than hacking the queue len.
>
> The whole problem comes from the double meaning of the queue len, e.g.
> that the value 0 has special meaning, but only while assigning the
> default qdisc. And pfifo_fast will use queue len zero if assigned.
>
> (proposed solution:)
>
> As DaveM also suggested, I would likely use a device flag to indicate
> that the device does not require any qdisc, and not assign any qdisc
> (actually "noqueue") in case the default qdisc is chosen for this
> device.
>
> This should solve the problem for veth. And then we should clean up
> all the virtual devices, adding this flag and changing
> dev->tx_queue_len to the default value (e.g. remove setting it to
> zero).

I have an unfinished solution in the oven, but am being kept busy with
other things for now. The action plan is as follows:

1) Introduce an IFF_NO_QUEUE net_device->priv_flag.
2) Have attach_default_qdiscs() and attach_one_default_qdisc() treat
   IFF_NO_QUEUE as an alternative to tx_queue_len == 0.
3) Add a warning to register_netdevice() if tx_queue_len == 0.
4) Change virtual NIC drivers to set IFF_NO_QUEUE and leave
   tx_queue_len alone.
5) Eventually drop all special handling for tx_queue_len == 0.

I am currently somewhere in 2) and need to implement 4) for veth as a
PoC to check whether 2) suffices in all situations we want. Not sure if
3) is desirable at all or if there are valid cases for a literally
zero-length TX queue.

Cheers, Phil
On Tue, Aug 11, 2015 at 06:23:35PM +0200, Phil Sutter wrote:
>
> I have an unfinished solution in the oven, but am being kept busy with
> other things for now. The action plan is as follows:
>
> 1) Introduce an IFF_NO_QUEUE net_device->priv_flag.
> 2) Have attach_default_qdiscs() and attach_one_default_qdisc() treat
>    IFF_NO_QUEUE as an alternative to tx_queue_len == 0.
> 3) Add a warning to register_netdevice() if tx_queue_len == 0.
> 4) Change virtual NIC drivers to set IFF_NO_QUEUE and leave
>    tx_queue_len alone.
> 5) Eventually drop all special handling for tx_queue_len == 0.
>
> I am currently somewhere in 2) and need to implement 4) for veth as a
> PoC to check whether 2) suffices in all situations we want. Not sure if
> 3) is desirable at all or if there are valid cases for a literally
> zero-length TX queue.

Sounds like you want to change the default qdisc from pfifo_fast to
noqueue for veth, right?

In general 'changing the default' may be an acceptable thing, but then
it needs to be strongly justified. How much performance does it bring?

Also, why introduce the flag? Why not just add 'tx_queue_len = 0;' to
veth_setup() like a whole bunch of devices do?
On Tue, 2015-08-11 at 18:13 -0700, Alexei Starovoitov wrote:
> Also, why introduce the flag? Why not just add 'tx_queue_len = 0;' to
> veth_setup() like a whole bunch of devices do?

Sigh. Because some people install htb or pfifo on veth, leaving
tx_queue_len unchanged.

If you install htb while tx_queue_len is 0, a pfifo created on an htb
class can only queue one packet.

static int fifo_init(struct Qdisc *sch, struct nlattr *opt)
{
	bool bypass;
	bool is_bfifo = sch->ops == &bfifo_qdisc_ops;

	if (opt == NULL) {
		u32 limit = qdisc_dev(sch)->tx_queue_len ? : 1;

		if (is_bfifo)
			limit *= psched_mtu(qdisc_dev(sch));

		sch->limit = limit;

Changing veth txqueuelen is too late.
On Tue, Aug 11, 2015 at 06:13:49PM -0700, Alexei Starovoitov wrote:
> On Tue, Aug 11, 2015 at 06:23:35PM +0200, Phil Sutter wrote:
> >
> > I have an unfinished solution in the oven, but am being kept busy with
> > other things for now. The action plan is as follows:
> >
> > 1) Introduce an IFF_NO_QUEUE net_device->priv_flag.
> > 2) Have attach_default_qdiscs() and attach_one_default_qdisc() treat
> >    IFF_NO_QUEUE as an alternative to tx_queue_len == 0.
> > 3) Add a warning to register_netdevice() if tx_queue_len == 0.
> > 4) Change virtual NIC drivers to set IFF_NO_QUEUE and leave
> >    tx_queue_len alone.
> > 5) Eventually drop all special handling for tx_queue_len == 0.
> >
> > I am currently somewhere in 2) and need to implement 4) for veth as a
> > PoC to check whether 2) suffices in all situations we want. Not sure
> > if 3) is desirable at all or if there are valid cases for a literally
> > zero-length TX queue.
>
> Sounds like you want to change the default qdisc from pfifo_fast to
> noqueue for veth, right?
>
> In general 'changing the default' may be an acceptable thing, but then
> it needs to be strongly justified. How much performance does it bring?
>
> Also, why introduce the flag? Why not just add 'tx_queue_len = 0;' to
> veth_setup() like a whole bunch of devices do?

A quick test on my local VM with veth and netperf (netserver and veth
peer in different netns) shows an increase of about 5% in throughput
when using noqueue instead of the default pfifo_fast.

Cheers, Phil
On Thu, 13 Aug 2015 03:13:40 +0200 Phil Sutter <phil@nwl.cc> wrote:

> On Tue, Aug 11, 2015 at 06:13:49PM -0700, Alexei Starovoitov wrote:
> > In general 'changing the default' may be an acceptable thing, but then
> > it needs to be strongly justified. How much performance does it bring?
>
> A quick test on my local VM with veth and netperf (netserver and veth
> peer in different netns) shows an increase of about 5% in throughput
> when using noqueue instead of the default pfifo_fast.

Good that you can show a 5% improvement with a single netperf flow. We
are saving approx 6 atomic operations by avoiding the qdisc code path.

This fixes a scalability issue with veth. Thus, the real performance
boost will happen with multiple flows and multiple CPU cores in action.
You can try with a multi-core VM and use super_netperf.

https://github.com/borkmann/stuff/blob/master/super_netperf
On Thu, Aug 13, 2015 at 03:10:33PM +0200, Jesper Dangaard Brouer wrote:
>
> On Thu, 13 Aug 2015 03:13:40 +0200 Phil Sutter <phil@nwl.cc> wrote:
>
> > On Tue, Aug 11, 2015 at 06:13:49PM -0700, Alexei Starovoitov wrote:
> > > In general 'changing the default' may be an acceptable thing, but
> > > then it needs to be strongly justified. How much performance does
> > > it bring?
> >
> > A quick test on my local VM with veth and netperf (netserver and veth
> > peer in different netns) shows an increase of about 5% in throughput
> > when using noqueue instead of the default pfifo_fast.
>
> Good that you can show a 5% improvement with a single netperf flow. We
> are saving approx 6 atomic operations by avoiding the qdisc code path.
>
> This fixes a scalability issue with veth. Thus, the real performance
> boost will happen with multiple flows and multiple CPU cores in
> action. You can try with a multi-core VM and use super_netperf.
>
> https://github.com/borkmann/stuff/blob/master/super_netperf

I actually used that on my VM as well, but the difference between a
single stream and ten streams in parallel was negligible.

In order to avoid skewing the results, I tested again on a physical
system with four cores, ran each benchmark ten times and averaged the
results. This showed an increase in throughput of about 35% with a
single stream and about 10% with ten streams in parallel.

Not sure though why the improvement is bigger in the first case if
there really is a scalability problem, as you say.

Cheers, Phil
diff --git a/net/Kconfig b/net/Kconfig
index 7021c1b..21c164f 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -48,6 +48,18 @@ config COMPAT_NETLINK_MESSAGES
 config NET_INGRESS
 	bool
 
+config DEFAULT_TX_QUEUE_LEN
+	prompt "Default TX queue length (in packets)" if EXPERT
+	int
+	default 1000 # Ethernet wants good queues
+	help
+	  Set the default value of tx_queue_len for newly created network
+	  interfaces. It is used by queueing disciplines to determine how
+	  many packets to keep in backlog before starting to drop new ones.
+
+	  The default value of 1000 packets has been in place for a very
+	  long time and, in combination with GSO, is way too big.
+
 menu "Networking options"
 
 source "net/packet/Kconfig"

diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 77e0f0e..b778586 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -355,7 +355,7 @@ void ether_setup(struct net_device *dev)
 	dev->hard_header_len = ETH_HLEN;
 	dev->mtu = ETH_DATA_LEN;
 	dev->addr_len = ETH_ALEN;
-	dev->tx_queue_len = 1000;	/* Ethernet wants good queues */
+	dev->tx_queue_len = CONFIG_DEFAULT_TX_QUEUE_LEN;
 	dev->flags = IFF_BROADCAST|IFF_MULTICAST;
 	dev->priv_flags |= IFF_TX_SKB_SHARING;
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 net/Kconfig        | 12 ++++++++++++
 net/ethernet/eth.c |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)