
BQL + Basic Latency under load results - 100Mbit, GSO/TSO off, pfifo_fast vs SFQ vs QFQ

Message ID 1325478811.2526.10.camel@edumazet-laptop
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Jan. 2, 2012, 4:33 a.m. UTC
On Monday 02 January 2012 at 00:17 +0100, Dave Taht wrote:
> QFQ wins even bigger vs SFQ at 50 iperfs
> 
> http://www.teklibre.com/~d/bloat/pfifo_sfq_vs_qfq_linear50.png
> 
> And I think it's going to win even bigger at 10 Mbit.
> 

Happy new year !

This makes no sense to me for such a small number of flows; SFQ should
perform the same as QFQ :)

You didn't find out why this is so.

Please try the following patch:

[PATCH net-next] sch_sfq: dont put new flow at the end of flows

SFQ enqueue algo puts a new flow _behind_ all pre-existing flows in the
circular list. In fact this is probably an old SFQ implementation bug.

100 Mbit/s = ~8333 full frames per second, or ~8 frames per ms.

With 50 flows, it means your "new flow" will have to wait for 50 packets
to be sent before its own packet. That's the ~6 ms.

We certainly can change SFQ to give a priority advantage to new flows,
so that next dequeued packet is taken from a new flow, not an old one.

Reported-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
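
The arithmetic in the commit message can be checked with a small sketch.
It is only an illustration: the function names are hypothetical, and it
assumes every frame is a full 1500-byte frame (the size that yields the
~8333 frames per second quoted above) and ignores any link-level overhead.

```c
#include <assert.h>

/*
 * Frames per second on a link of rate_bps bits per second when every
 * frame is frame_bytes long. (Hypothetical helper, not kernel code.)
 */
double frames_per_second(double rate_bps, double frame_bytes)
{
	assert(rate_bps > 0.0 && frame_bytes > 0.0);
	return rate_bps / (frame_bytes * 8.0);
}

/*
 * Milliseconds a packet waits when packets_ahead full frames must be
 * sent before it, one frame per pre-existing flow in the SFQ case.
 */
double wait_ms(double rate_bps, double frame_bytes, unsigned int packets_ahead)
{
	return packets_ahead * 1000.0 / frames_per_second(rate_bps, frame_bytes);
}
```

Under these assumptions, wait_ms(100e6, 1500.0, 50) comes out at about
6 ms, and the same 50 packets on a 10 Mbit/s link would take about 60 ms.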


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Jan. 2, 2012, 4:55 a.m. UTC | #1
I tested this patch with a workload of 50 concurrent netperf flows, and
it does indeed fix the problem for me.

# ping 192.168.20.108
PING 192.168.20.108 (192.168.20.108) 56(84) bytes of data.
64 bytes from 192.168.20.108: icmp_req=1 ttl=64 time=0.021 ms
64 bytes from 192.168.20.108: icmp_req=2 ttl=64 time=0.011 ms
64 bytes from 192.168.20.108: icmp_req=3 ttl=64 time=0.011 ms
64 bytes from 192.168.20.108: icmp_req=4 ttl=64 time=0.010 ms
64 bytes from 192.168.20.108: icmp_req=5 ttl=64 time=0.010 ms
64 bytes from 192.168.20.108: icmp_req=6 ttl=64 time=0.010 ms


# tc -s -d qdisc show dev eth3
qdisc htb 1: root refcnt 18 r2q 10 default 1 direct_packets_stat 0 ver 3.17
 Sent 1178661043 bytes 806848 pkt (dropped 9068, overlimits 382834 requeues 3) 
 rate 97748Kbit 8355pps backlog 0b 122p requeues 3 
qdisc sfq 10: parent 1:1 limit 127p quantum 1514b flows 127/1024 divisor 1024 
 Sent 1178661043 bytes 806848 pkt (dropped 17568, overlimits 0 requeues 0) 
 rate 97748Kbit 8355pps backlog 962708b 122p requeues 0 


# tc -s -d cl show dev eth3
class htb 1:1 root leaf 10: prio 0 quantum 80000 rate 100000Kbit ceil 100000Kbit 
burst 40000b/256 mpu 0b overhead 0b cburst 40000b/256 mpu 0b overhead 0b level 0 
 Sent 1367302331 bytes 935622 pkt (dropped 10494, overlimits 0 requeues 0) 
 rate 100156Kbit 8560pps backlog 0b 125p requeues 0 
 lended: 207922 borrowed: 0 giants: 0
 tokens: -14605 ctokens: -14605

class sfq 10:15 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14678b 3p requeues 0 
 allot -3264 

class sfq 10:17 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20Kb 3p requeues 0 
 allot -6912 

class sfq 10:22 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 17508b 2p requeues 0 
 allot -808 

class sfq 10:3c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 17508b 2p requeues 0 
 allot -4208 

class sfq 10:5e parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 27Kb 2p requeues 0 
 allot -6024 

class sfq 10:66 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 11716b 2p requeues 0 
 allot -4856 

class sfq 10:8d parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14612b 2p requeues 0 
 allot 1296 

class sfq 10:9b parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 7504b 4p requeues 0 
 allot 1184 

class sfq 10:9c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14612b 2p requeues 0 
 allot -104 

class sfq 10:a9 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 18956b 2p requeues 0 
 allot -5456 

class sfq 10:ab parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14678b 3p requeues 0 
 allot -88 

class sfq 10:ba parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot -624 

class sfq 10:c3 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 27710b 3p requeues 0 
 allot -8440 

class sfq 10:ce parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 18956b 2p requeues 0 
 allot -3160 

class sfq 10:f2 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20Kb 3p requeues 0 
 allot -672 

class sfq 10:11c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 5990b 3p requeues 0 
 allot -712 

class sfq 10:17c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 17508b 2p requeues 0 
 allot -4000 

class sfq 10:188 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 16060b 2p requeues 0 
 allot -6640 

class sfq 10:1b9 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 10202b 1p requeues 0 
 allot 120 

class sfq 10:1c2 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 5924b 2p requeues 0 
 allot 264 

class sfq 10:1cb parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 19022b 3p requeues 0 
 allot -2904 

class sfq 10:1e5 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 23366b 3p requeues 0 
 allot -7800 

class sfq 10:1f9 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 8984b 4p requeues 0 
 allot 560 

class sfq 10:210 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 17574b 3p requeues 0 
 allot -384 

class sfq 10:231 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 19022b 3p requeues 0 
 allot 1416 

class sfq 10:27c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20404b 2p requeues 0 
 allot 752 

class sfq 10:289 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 19022b 3p requeues 0 
 allot 1208 

class sfq 10:2a3 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14612b 2p requeues 0 
 allot -2592 

class sfq 10:2a6 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 19022b 3p requeues 0 
 allot -5008 

class sfq 10:2c7 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 7570b 5p requeues 0 
 allot -1168 

class sfq 10:2d0 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 8820b 2p requeues 0 
 allot -1088 

class sfq 10:2e5 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20404b 2p requeues 0 
 allot -10648 

class sfq 10:2ec parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 13164b 2p requeues 0 
 allot -1800 

class sfq 10:2ee parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20Kb 3p requeues 0 
 allot -9016 

class sfq 10:305 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 10400b 4p requeues 0 
 allot -872 

class sfq 10:318 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 504 

class sfq 10:31f parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 10400b 4p requeues 0 
 allot -4024 

class sfq 10:328 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 16060b 2p requeues 0 
 allot -16 

class sfq 10:32d parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 21918b 3p requeues 0 
 allot -840 

class sfq 10:37d parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 14612b 2p requeues 0 
 allot 432 

class sfq 10:3ac parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1224 

class sfq 10:3d0 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 20Kb 3p requeues 0 
 allot -11232 

class sfq 10:3d1 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 26262b 3p requeues 0 
 allot -1832 

class sfq 10:3d8 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 5924b 2p requeues 0 
 allot -512 

class sfq 10:3e7 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 29158b 3p requeues 0 
 allot -10976 

class sfq 10:3f0 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 32054b 3p requeues 0 
 allot -2080 

class sfq 10:3f9 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 18956b 2p requeues 0 
 allot -1704 


Eric Dumazet Jan. 2, 2012, 5:07 a.m. UTC | #2
On Monday 02 January 2012 at 05:55 +0100, Eric Dumazet wrote:

> I tested this patch with a 50 concurrent netperf workload, and indeed
> this fixes the problem for me.
> 
> # ping 192.168.20.108
> PING 192.168.20.108 (192.168.20.108) 56(84) bytes of data.
> 64 bytes from 192.168.20.108: icmp_req=1 ttl=64 time=0.021 ms
> 64 bytes from 192.168.20.108: icmp_req=2 ttl=64 time=0.011 ms
> 64 bytes from 192.168.20.108: icmp_req=3 ttl=64 time=0.011 ms
> 64 bytes from 192.168.20.108: icmp_req=4 ttl=64 time=0.010 ms
> 64 bytes from 192.168.20.108: icmp_req=5 ttl=64 time=0.010 ms
> 64 bytes from 192.168.20.108: icmp_req=6 ttl=64 time=0.010 ms
> 

Oops, when pinging a real machine rather than myself, I get more realistic numbers :)

# ping -c 20 192.168.20.112
PING 192.168.20.112 (192.168.20.112) 56(84) bytes of data.
64 bytes from 192.168.20.112: icmp_req=1 ttl=64 time=0.488 ms
64 bytes from 192.168.20.112: icmp_req=2 ttl=64 time=0.214 ms
64 bytes from 192.168.20.112: icmp_req=3 ttl=64 time=0.696 ms
64 bytes from 192.168.20.112: icmp_req=4 ttl=64 time=0.135 ms
64 bytes from 192.168.20.112: icmp_req=5 ttl=64 time=0.110 ms
64 bytes from 192.168.20.112: icmp_req=6 ttl=64 time=0.401 ms
64 bytes from 192.168.20.112: icmp_req=7 ttl=64 time=0.378 ms
64 bytes from 192.168.20.112: icmp_req=8 ttl=64 time=0.384 ms
64 bytes from 192.168.20.112: icmp_req=9 ttl=64 time=1.03 ms
64 bytes from 192.168.20.112: icmp_req=10 ttl=64 time=0.439 ms
64 bytes from 192.168.20.112: icmp_req=11 ttl=64 time=0.126 ms
64 bytes from 192.168.20.112: icmp_req=12 ttl=64 time=0.093 ms
64 bytes from 192.168.20.112: icmp_req=13 ttl=64 time=0.834 ms
64 bytes from 192.168.20.112: icmp_req=14 ttl=64 time=0.696 ms
64 bytes from 192.168.20.112: icmp_req=15 ttl=64 time=0.776 ms
64 bytes from 192.168.20.112: icmp_req=16 ttl=64 time=0.215 ms
64 bytes from 192.168.20.112: icmp_req=17 ttl=64 time=0.262 ms
64 bytes from 192.168.20.112: icmp_req=18 ttl=64 time=0.554 ms
64 bytes from 192.168.20.112: icmp_req=19 ttl=64 time=0.373 ms
64 bytes from 192.168.20.112: icmp_req=20 ttl=64 time=0.666 ms

--- 192.168.20.112 ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 19000ms
rtt min/avg/max/mdev = 0.093/0.443/1.035/0.264 ms


Eric Dumazet Jan. 2, 2012, 5:27 a.m. UTC | #3
And after disabling TSO (as you did in your tests), I get this.
(Note that my link is Gigabit, so I had to install an HTB shaper at 100 Mbit to
mimic your workload.)

# ping -c 20 192.168.20.112
PING 192.168.20.112 (192.168.20.112) 56(84) bytes of data.
64 bytes from 192.168.20.112: icmp_req=1 ttl=64 time=0.113 ms
64 bytes from 192.168.20.112: icmp_req=2 ttl=64 time=0.153 ms
64 bytes from 192.168.20.112: icmp_req=3 ttl=64 time=0.092 ms
64 bytes from 192.168.20.112: icmp_req=4 ttl=64 time=0.095 ms
64 bytes from 192.168.20.112: icmp_req=5 ttl=64 time=0.176 ms
64 bytes from 192.168.20.112: icmp_req=6 ttl=64 time=0.159 ms
64 bytes from 192.168.20.112: icmp_req=7 ttl=64 time=0.169 ms
64 bytes from 192.168.20.112: icmp_req=8 ttl=64 time=0.122 ms
64 bytes from 192.168.20.112: icmp_req=9 ttl=64 time=0.148 ms
64 bytes from 192.168.20.112: icmp_req=10 ttl=64 time=0.123 ms
64 bytes from 192.168.20.112: icmp_req=11 ttl=64 time=0.186 ms
64 bytes from 192.168.20.112: icmp_req=12 ttl=64 time=0.210 ms
64 bytes from 192.168.20.112: icmp_req=13 ttl=64 time=0.142 ms
64 bytes from 192.168.20.112: icmp_req=14 ttl=64 time=0.134 ms
64 bytes from 192.168.20.112: icmp_req=15 ttl=64 time=0.092 ms
64 bytes from 192.168.20.112: icmp_req=16 ttl=64 time=0.187 ms
64 bytes from 192.168.20.112: icmp_req=17 ttl=64 time=0.123 ms
64 bytes from 192.168.20.112: icmp_req=18 ttl=64 time=0.159 ms
64 bytes from 192.168.20.112: icmp_req=19 ttl=64 time=0.142 ms
64 bytes from 192.168.20.112: icmp_req=20 ttl=64 time=0.207 ms

--- 192.168.20.112 ping statistics ---
20 packets transmitted, 20 received, 0% packet loss, time 18999ms
rtt min/avg/max/mdev = 0.092/0.146/0.210/0.037 ms

# tc -s -d cl show dev eth3
class htb 1:1 root leaf 10: prio 0 quantum 80000 rate 100000Kbit ceil 100000Kbit 
burst 40000b/256 mpu 0b overhead 0b cburst 40000b/256 mpu 0b overhead 0b level 0 
 Sent 4569893772 bytes 3107171 pkt (dropped 43457, overlimits 0 requeues 0) 
 rate 75836Kbit 6315pps backlog 0b 127p requeues 0 
 lended: 1204316 borrowed: 0 giants: 0
 tokens: -1892 ctokens: -1892

class sfq 10:1b parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:25 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:34 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:48 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:4b parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot -448 

class sfq 10:4c parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:68 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:96 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:a9 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:c8 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:ca parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:112 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:11a parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:149 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:150 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:16b parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:171 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:185 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:19f parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:1ac parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:1bb parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:1bf parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:1dc parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:1e7 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:1f1 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:201 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:20f parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:216 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:21d parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:22a parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:22e parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:263 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:2d4 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:306 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:307 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:344 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:346 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:34f parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:359 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:36f parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:385 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:3a0 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:3a4 parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 1514b 1p requeues 0 
 allot 0 

class sfq 10:3ac parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 3028b 2p requeues 0 
 allot 1520 

class sfq 10:3ee parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 

class sfq 10:3fe parent 10: 
 (dropped 0, overlimits 0 requeues 0) 
 backlog 4542b 3p requeues 0 
 allot 1520 



David Miller Jan. 3, 2012, 5:52 p.m. UTC | #4
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jan 2012 05:33:31 +0100

> [PATCH net-next] sch_sfq: dont put new flow at the end of flows

Applied, thanks Eric.

Patch

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index c23b957..f7f62a5 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -366,11 +366,11 @@  sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 	if (slot->qlen == 1) {		/* The flow is new */
 		if (q->tail == NULL) {	/* It is the first flow */
 			slot->next = x;
+			q->tail = slot;
 		} else {
 			slot->next = q->tail->next;
 			q->tail->next = x;
 		}
-		q->tail = slot;
 		slot->allot = q->scaled_quantum;
 	}
 	if (++sch->q.qlen <= q->limit)
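
The effect of moving the q->tail assignment can be illustrated with a
small user-space model of the circular list. This is a sketch under
stated assumptions, not the kernel code: struct ring, ring_add_flow()
and flows_served_before() are hypothetical names, flows are plain
integer indices, and allot/quantum handling is ignored, so each active
flow is assumed to send exactly one packet per turn. As in the patch
context, the flow served next is the one that tail's next link points to.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FLOWS 64
#define NONE (-1)

/* Simplified model of the ring of active flows (hypothetical type). */
struct ring {
	int next[MAX_FLOWS];	/* next[i]: flow served after flow i */
	int tail;		/* last flow in service order, NONE if empty */
};

void ring_init(struct ring *r)
{
	r->tail = NONE;
}

/*
 * Link new flow x into the ring. With patched == false the new flow
 * becomes the tail (behaviour before the patch); with patched == true
 * the tail is left alone, so x sits right behind it and is served next.
 */
void ring_add_flow(struct ring *r, int x, bool patched)
{
	assert(x >= 0 && x < MAX_FLOWS);
	if (r->tail == NONE) {		/* first flow: points to itself */
		r->next[x] = x;
		r->tail = x;
	} else {
		r->next[x] = r->next[r->tail];
		r->next[r->tail] = x;
		if (!patched)
			r->tail = x;
	}
}

/* Count the flows that get a turn before flow x; -1 if x is not linked. */
int flows_served_before(const struct ring *r, int x)
{
	assert(r->tail != NONE);
	int i = r->next[r->tail];	/* flow that is served next */

	for (int n = 0; n < MAX_FLOWS; n++, i = r->next[i])
		if (i == x)
			return n;
	return -1;
}
```

In this model, adding a 51st flow to a ring of 50 active flows leaves 50
flows ahead of it with the old placement and none with the patched
placement, which matches the "wait 50 packets" description in the commit
message.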