Patchwork CHOKe flow scheduler (0.9)

login
register
mail settings
Submitter Eric Dumazet
Date Jan. 15, 2011, 7:45 a.m.
Message ID <1295077542.3977.20.camel@edumazet-laptop>
Download mbox | patch
Permalink /patch/79032/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - Jan. 15, 2011, 7:45 a.m.
Le vendredi 14 janvier 2011 à 15:45 -0800, Stephen Hemminger a écrit :
> CHOKe ("CHOose and Kill" or "CHOose and Keep") is an alternative
> packet scheduler based on the Random Exponential Drop (RED) algorithm.
> 
> The core idea is:
>   For every packet arrival:
>   	Calculate Qave
> 	if (Qave < minth) 
> 	     Queue the new packet
> 	else 
> 	     Select randomly a packet from the queue 
> 	     if (both packets from same flow)
> 	     then Drop both the packets
> 	     else if (Qave > maxth)
> 	          Drop packet
> 	     else
> 	       	  Admit packet with proability p (same as RED)
> 
> See also:
>   Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
>    queue management scheme for approximating fair bandwidth allocation", 
>   Proceeding of INFOCOM'2000, March 2000.
> 
> Help from:
>      Eric Dumazet <eric.dumazet@gmail.com>
>      Patrick McHardy <kaber@trash.net>
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> ---
> This version is based on net-next, and assumes Eric's patch for
> corrected bstats is already applied.
> 
> 0.9 incorporate patches from Patrick/Eric
>     rework the peek_random and drop code to simplify and fix bug where
>     random_N needs to called with full length (including holes).

Nice catch, I now have more "matched" counts after my test :

qdisc choke 11: parent 1:11 limit 130000b min 10833b max 32500b ewma 13 Plog 21 Scell_log 30
 Sent 93944198 bytes 170889 pkt (dropped 829140, overlimits 436686 requeues 0) 
 rate 48bit 0pps backlog 0b 0p requeues 0 
  marked 0 early 436686 pdrop 0 other 0 matched 196227

You missed the qdisc_bstats_update() move from enqueue() to dequeue()

And some minor CodingStyle / checkpatch.pl changes, here is my
latest diff on top of 0.9

I believe you can release v1 :)

Thanks !

 net/sched/sch_choke.c |   35 ++++++++++++++++++-----------------
 1 files changed, 18 insertions(+), 17 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger - Jan. 17, 2011, 5:25 p.m.
On Sat, 15 Jan 2011 08:45:42 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le vendredi 14 janvier 2011 à 15:45 -0800, Stephen Hemminger a écrit :
> > CHOKe ("CHOose and Kill" or "CHOose and Keep") is an alternative
> > packet scheduler based on the Random Exponential Drop (RED) algorithm.
> > 
> > The core idea is:
> >   For every packet arrival:
> >   	Calculate Qave
> > 	if (Qave < minth) 
> > 	     Queue the new packet
> > 	else 
> > 	     Select randomly a packet from the queue 
> > 	     if (both packets from same flow)
> > 	     then Drop both the packets
> > 	     else if (Qave > maxth)
> > 	          Drop packet
> > 	     else
> > 	       	  Admit packet with proability p (same as RED)
> > 
> > See also:
> >   Rong Pan, Balaji Prabhakar, Konstantinos Psounis, "CHOKe: a stateless active
> >    queue management scheme for approximating fair bandwidth allocation", 
> >   Proceeding of INFOCOM'2000, March 2000.
> > 
> > Help from:
> >      Eric Dumazet <eric.dumazet@gmail.com>
> >      Patrick McHardy <kaber@trash.net>
> > 
> > Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> > 
> > ---
> > This version is based on net-next, and assumes Eric's patch for
> > corrected bstats is already applied.
> > 
> > 0.9 incorporate patches from Patrick/Eric
> >     rework the peek_random and drop code to simplify and fix bug where
> >     random_N needs to called with full length (including holes).
> 
> Nice catch, I now have more "matched" counts after my test :
> 
> qdisc choke 11: parent 1:11 limit 130000b min 10833b max 32500b ewma 13 Plog 21 Scell_log 30
>  Sent 93944198 bytes 170889 pkt (dropped 829140, overlimits 436686 requeues 0) 
>  rate 48bit 0pps backlog 0b 0p requeues 0 
>   marked 0 early 436686 pdrop 0 other 0 matched 196227
> 
> You missed the qdisc_bstats_update() move from enqueue() to dequeue()
> 
> And some minor CodingStyle / checkpatch.pl changes, here is my
> latest diff on top of 0.9
> 
> I believe you can release v1 :)
> 
> Thanks !

I rolled in your changes. But there is one more change I want to make.
The existing flow match based on hash is vulnerable to side-channel DoS attack.
It is possible for a hostile flow to send packets that match the same
hash value which would effectively kill a targeted flow.

The solution is to match based on full source and destination, not hash value.
Still coding that up.
Eric Dumazet - Jan. 17, 2011, 5:54 p.m.
Le lundi 17 janvier 2011 à 09:25 -0800, Stephen Hemminger a écrit :

> I rolled in your changes. But there is one more change I want to make.
> The existing flow match based on hash is vulnerable to side-channel DoS attack.
> It is possible for a hostile flow to send packets that match the same
> hash value which would effectively kill a targeted flow.
> 
> The solution is to match based on full source and destination, not hash value.
> Still coding that up.

I see, but you only want to make this full test if (!q->filter_list)  ?

(or precisely only if skb_get_rxhash() was used to get the cookie )





--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/sched/sch_choke.c b/net/sched/sch_choke.c
index 55eeb7d..a7605a0 100644
--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -20,7 +20,7 @@ 
 #include <net/red.h>
 
 /*	CHOKe stateless AQM for fair bandwidth allocation
-        =================================================
+	=================================================
 
    CHOKe (CHOose and Keep for responsive flows, CHOose and Kill for
    unresponsive flows) is a variant of RED that penalizes misbehaving flows but
@@ -137,10 +137,10 @@  static void choke_drop_by_idx(struct Qdisc *sch, unsigned int idx)
 }
 
 /* Classify flow using either:
-   1. pre-existing classification result in skb
-   2. fast internal classification
-   3. use TC filter based classification
-*/
+ * 1. pre-existing classification result in skb
+ * 2. fast internal classification (rxhash)
+ * 3. use TC filter based classification
+ */
 static unsigned int choke_classify(struct sk_buff *skb,
 				   struct Qdisc *sch, int *qerr)
 
@@ -176,10 +176,9 @@  static unsigned int choke_classify(struct sk_buff *skb,
 }
 
 /* Select a packet at random from the queue and return flow classification
-   returns 0 if queue empty
+ * returns 0 if queue empty
  */
-static unsigned int choke_peek_random(struct Qdisc *sch,
-				      unsigned int *pidx)
+static unsigned int choke_peek_random(struct Qdisc *sch, unsigned int *pidx)
 {
 	struct choke_sched_data *q = qdisc_priv(sch);
 	const struct sk_buff *skb;
@@ -229,9 +228,9 @@  static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		red_end_of_idle_period(p);
 
 	/* Is queue small? */
-	if (p->qavg <= p->qth_min)
+	if (p->qavg <= p->qth_min) {
 		p->qcount = -1;
-	else {
+	} else {
 		unsigned int idx;
 
 		/* Draw a packet at random from queue and compare flow */
@@ -266,8 +265,9 @@  static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 
 				q->stats.prob_mark++;
 			}
-		} else
+		} else {
 			p->qR = red_random(p);
+		}
 	}
 
 	/* Admit new packet */
@@ -276,7 +276,6 @@  static int choke_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		q->tail = (q->tail + 1) & q->tab_mask;
 		++sch->q.qlen;
 		sch->qstats.backlog += qdisc_pkt_len(skb);
-		qdisc_bstats_update(sch, skb);
 		return NET_XMIT_SUCCESS;
 	}
 
@@ -306,6 +305,7 @@  static struct sk_buff *choke_dequeue(struct Qdisc *sch)
 	choke_zap_head_holes(q);
 	--sch->q.qlen;
 	sch->qstats.backlog -= qdisc_pkt_len(skb);
+	qdisc_bstats_update(sch, skb);
 
 	return skb;
 }
@@ -316,9 +316,9 @@  static unsigned int choke_drop(struct Qdisc *sch)
 	unsigned int len;
 
 	len = qdisc_queue_drop(sch);
-	if (len > 0)
+	if (len > 0) {
 		q->stats.other++;
-	else {
+	} else {
 		if (!red_is_idling(&q->parms))
 			red_start_of_idle_period(&q->parms);
 	}
@@ -326,7 +326,7 @@  static unsigned int choke_drop(struct Qdisc *sch)
 	return len;
 }
 
-static void choke_reset(struct Qdisc* sch)
+static void choke_reset(struct Qdisc *sch)
 {
 	struct choke_sched_data *q = qdisc_priv(sch);
 
@@ -410,8 +410,9 @@  static int choke_change(struct Qdisc *sch, struct nlattr *opt)
 
 		q->tab_mask = mask;
 		q->tab = ntab;
-	} else
+	} else {
 		sch_tree_lock(sch);
+	}
 
 	q->flags = ctl->flags;
 	q->limit = ctl->limit;
@@ -428,7 +429,7 @@  static int choke_change(struct Qdisc *sch, struct nlattr *opt)
 	return 0;
 }
 
-static int choke_init(struct Qdisc* sch, struct nlattr *opt)
+static int choke_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	return choke_change(sch, opt);
 }