From patchwork Tue Dec 21 13:04:59 2010
X-Patchwork-Submitter: Eric Dumazet
X-Patchwork-Id: 76291
X-Patchwork-Delegate: davem@davemloft.net
Subject: [PATCH v2 net-next-2.6] sch_sfq: allow big packets and be fair
From: Eric Dumazet
To: Jarek Poplawski
Cc: David Miller, Patrick McHardy, netdev
In-Reply-To: <20101221121706.GC8813@ff.dom.local>
References: <20101221101506.GA8149@ff.dom.local>
 <1292929037.2720.12.camel@edumazet-laptop>
 <20101221113920.GB8813@ff.dom.local>
 <20101221121706.GC8813@ff.dom.local>
Date: Tue, 21 Dec 2010 14:04:59 +0100
Message-ID: <1292936699.2720.23.camel@edumazet-laptop>
Sender: netdev-owner@vger.kernel.org
X-Mailing-List: netdev@vger.kernel.org

On Tuesday, 21 December 2010 at 12:17 +0000, Jarek Poplawski wrote:

> Oops! You're right yet ;-) This skipping shouldn't happen with quantum
> bigger than max packet size, so this patch is OK.

Thanks Jarek, here is a v2 with the scale you suggested.

[PATCH v2 net-next-2.6] sch_sfq: allow big packets and be fair

SFQ is currently 'limited' to small packets, because it uses a 15bit
allotment number per flow. Introduce a scale by 8, so that we can handle
full size TSO/GRO packets.

Use appropriate handling to make sure allot is positive before a new
packet is dequeued, so that fairness is respected.

Signed-off-by: Eric Dumazet
Cc: Jarek Poplawski
Cc: Patrick McHardy
---
v2: Use a scale of 8 as Jarek suggested, instead of 18bit fields

 net/sched/sch_sfq.c |   28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index c474b4b..f3a9fd7 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -67,7 +67,7 @@
 
 	IMPLEMENTATION:
 	This implementation limits maximal queue length to 128;
-	maximal mtu to 2^15-1; max 128 flows, number of hash buckets to 1024.
+	max mtu to 2^18-1; max 128 flows, number of hash buckets to 1024.
 	The only goal of this restrictions was that all data
 	fit into one 4K page on 32bit arches.
 
@@ -77,6 +77,11 @@
 #define SFQ_SLOTS		128 /* max number of flows */
 #define SFQ_EMPTY_SLOT		255
 #define SFQ_HASH_DIVISOR	1024
+/* We use 15+1 bits to store allot, and want to handle packets up to 64K
+ * Scale allot by 8 (1<<3) so that no overflow occurs.
+ */
+#define SFQ_ALLOT_SHIFT		3
+#define SFQ_ALLOT_SIZE(X)	DIV_ROUND_UP(X, 1 << SFQ_ALLOT_SHIFT)
 
 /* This type should contain at least SFQ_DEPTH + SFQ_SLOTS values */
 typedef unsigned char sfq_index;
@@ -115,7 +120,7 @@ struct sfq_sched_data
 	struct timer_list perturb_timer;
 	u32		perturbation;
 	sfq_index	cur_depth;	/* depth of longest slot */
-
+	unsigned short  scaled_quantum; /* SFQ_ALLOT_SIZE(quantum) */
 	struct sfq_slot *tail;		/* current slot in round */
 	sfq_index	ht[SFQ_HASH_DIVISOR];	/* Hash table */
 	struct sfq_slot	slots[SFQ_SLOTS];
@@ -394,7 +399,7 @@ sfq_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 			q->tail->next = x;
 		}
 		q->tail = slot;
-		slot->allot = q->quantum;
+		slot->allot = q->scaled_quantum;
 	}
 	if (++sch->q.qlen <= q->limit) {
 		sch->bstats.bytes += qdisc_pkt_len(skb);
@@ -430,8 +435,14 @@ sfq_dequeue(struct Qdisc *sch)
 	if (q->tail == NULL)
 		return NULL;
 
+next_slot:
 	a = q->tail->next;
 	slot = &q->slots[a];
+	if (slot->allot <= 0) {
+		q->tail = slot;
+		slot->allot += q->scaled_quantum;
+		goto next_slot;
+	}
 	skb = slot_dequeue_head(slot);
 	sfq_dec(q, a);
 	sch->q.qlen--;
@@ -446,9 +457,8 @@ sfq_dequeue(struct Qdisc *sch)
 			return skb;
 		}
 		q->tail->next = next_a;
-	} else if ((slot->allot -= qdisc_pkt_len(skb)) <= 0) {
-		q->tail = slot;
-		slot->allot += q->quantum;
+	} else {
+		slot->allot -= SFQ_ALLOT_SIZE(qdisc_pkt_len(skb));
 	}
 	return skb;
 }
@@ -484,6 +494,7 @@ static int sfq_change(struct Qdisc *sch, struct nlattr *opt)
 
 	sch_tree_lock(sch);
 	q->quantum = ctl->quantum ? : psched_mtu(qdisc_dev(sch));
+	q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum);
 	q->perturb_period = ctl->perturb_period * HZ;
 	if (ctl->limit)
 		q->limit = min_t(u32, ctl->limit, SFQ_DEPTH - 1);
@@ -524,6 +535,7 @@ static int sfq_init(struct Qdisc *sch, struct nlattr *opt)
 	q->tail = NULL;
 	if (opt == NULL) {
 		q->quantum = psched_mtu(qdisc_dev(sch));
+		q->scaled_quantum = SFQ_ALLOT_SIZE(q->quantum);
 		q->perturb_period = 0;
 		q->perturbation = net_random();
 	} else {
@@ -610,7 +622,9 @@ static int sfq_dump_class_stats(struct Qdisc *sch, unsigned long cl,
 	struct sfq_sched_data *q = qdisc_priv(sch);
 	const struct sfq_slot *slot = &q->slots[q->ht[cl - 1]];
 	struct gnet_stats_queue qs = { .qlen = slot->qlen };
-	struct tc_sfq_xstats xstats = { .allot = slot->allot };
+	struct tc_sfq_xstats xstats = {
+		.allot = slot->allot << SFQ_ALLOT_SHIFT
+	};
 	struct sk_buff *skb;
 
 	slot_queue_walk(slot, skb)
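
For readers who want to check the arithmetic outside the kernel, here is a
minimal userspace sketch of the scaled allot accounting. It is not part of
the patch. SFQ_ALLOT_SHIFT and SFQ_ALLOT_SIZE() are copied from the diff;
DIV_ROUND_UP() is redefined locally as a stand-in for the kernel macro, and
the helpers sfq_charge() and sfq_rounds_skipped() are hypothetical names
introduced only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Same definitions as in the patch: allot is counted in 8-byte units. */
#define SFQ_ALLOT_SHIFT		3
#define SFQ_ALLOT_SIZE(X)	DIV_ROUND_UP(X, 1 << SFQ_ALLOT_SHIFT)

/* Hypothetical helper: charge one dequeued packet to a flow's allot. */
static inline short sfq_charge(short allot, unsigned int pkt_len)
{
	int v = allot - (int)SFQ_ALLOT_SIZE(pkt_len);

	/* The scaled value must still fit the 16-bit signed field. */
	assert(v >= SHRT_MIN && v <= SHRT_MAX);
	return (short)v;
}

/*
 * Hypothetical helper: number of rounds a flow is skipped before its
 * allot becomes positive again. Assumes scaled_quantum > 0.
 */
static inline int sfq_rounds_skipped(short allot, unsigned short scaled_quantum)
{
	int v = allot;
	int rounds = 0;

	assert(scaled_quantum > 0);
	while (v <= 0) {	/* mirrors the "goto next_slot" refill */
		v += scaled_quantum;
		rounds++;
	}
	return rounds;
}
```

Assuming a quantum of 1514 bytes (an illustrative Ethernet value, not taken
from the patch), the scaled quantum is 190. A 65536-byte packet then costs
8192 units and leaves allot at -8002, which fits in a short. The unscaled
value, 1514 - 65536, would not fit. The flow is then skipped for 43 rounds
before it may send again.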