Message ID | 4AC4FE07.5070204@gmail.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 01 Oct 2009 21:07:51 +0200

> We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
> is running.
>
> # tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
>
> User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
> one (because no estimator is active)
>
> After this patch, tc command output is :
> $ tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
>  backlog 0b 0p requeues 0
>
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

I'm generally fine with this idea.

The new behavior is certainly more intuitive even to me :-)

Unless there are other objections I'm ok with this and I'll apply
your final version when I start taking changes for net-next-2.6
(which is probably right after -rc3 is released).
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote, On 10/01/2009 09:37 PM:

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 01 Oct 2009 21:07:51 +0200
>
>> We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
>> is running.
>>
>> # tc -s -d qdisc
>> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>>  Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
>>  rate 0bit 0pps backlog 0b 0p requeues 0
>>
>> User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
>> one (because no estimator is active)
>>
>> After this patch, tc command output is :
>> $ tc -s -d qdisc
>> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>>  Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
>>  backlog 0b 0p requeues 0
>>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
>
> I'm generally fine with this idea.
>
> The new behavior is certainly more intuitive even to me :-)
>
> Unless there are other objections I'm ok with this and I'll apply

Since you ask... I wonder about this whole int plus quite a bit of
struct unreadability for one flag only. Maybe it could be queried
on qdisc level (with a flag if necessary), and additional parameter
of gnet_stats_copy_rate_est()? (Qdiscs should have no problem with
setting this param for their classes too.)

Jarek P.
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Thu, 01 Oct 2009 23:05:53 +0200

> Since you ask... I wonder about this whole int plus quite a bit of
> struct unreadability for one flag only. Maybe it could be queried
> on qdisc level (with a flag if necessary), and additional parameter
> of gnet_stats_copy_rate_est()? (Qdiscs should have no problem with
> setting this param for their classes too.)

Certainly, that's another approach to this problem.

But logically, just like we wouldn't emit a block of RED scheduler
data to 'tc' unless RED is actually configured, it seems consistent to
not emit estimator data when no estimator is even there.
David Miller wrote, On 10/01/2009 11:14 PM:

> From: Jarek Poplawski <jarkao2@gmail.com>
> Date: Thu, 01 Oct 2009 23:05:53 +0200
>
>> Since you ask... I wonder about this whole int plus quite a bit of
>> struct unreadability for one flag only. Maybe it could be queried
>> on qdisc level (with a flag if necessary), and additional parameter
>> of gnet_stats_copy_rate_est()? (Qdiscs should have no problem with
>> setting this param for their classes too.)
>
> Certainly, that's another approach to this problem.
>
> But logically, just like we wouldn't emit a block of RED scheduler
> data to 'tc' unless RED is actually configured, it seems consistent to
> not emit estimator data when no estimator is even there.

Sure! I've exaggerated with this additional parameter. ;-)

Jarek P.
On 01-10-2009 23:21, Jarek Poplawski wrote:
> David Miller wrote, On 10/01/2009 11:14 PM:
>
>> From: Jarek Poplawski <jarkao2@gmail.com>
>> Date: Thu, 01 Oct 2009 23:05:53 +0200
>>
>>> Since you ask... I wonder about this whole int plus quite a bit of
>>> struct unreadability for one flag only. Maybe it could be queried
>>> on qdisc level (with a flag if necessary), and additional parameter
>>> of gnet_stats_copy_rate_est()? (Qdiscs should have no problem with
>>> setting this param for their classes too.)
>> Certainly, that's another approach to this problem.
>>
>> But logically, just like we wouldn't emit a block of RED scheduler
>> data to 'tc' unless RED is actually configured, it seems consistent to
>> not emit estimator data when no estimator is even there.
>
> Sure! I've exaggerated with this additional parameter. ;-)

To make my point clare: why not something like this?:

static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
			 u32 pid, u32 seq, u16 flags, int event)
{
...
	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
	     gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
		goto nla_put_failure;

BTW, I'm not sure we need to change the user-visible API for this.
(Is it really expected to work after updating gen_stats.h only in
iproute?)

Jarek P.
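The short-circuit idea in the message above can be tried out in user space. What follows is only a minimal sketch under stated assumptions: `struct dump`, `copy_basic()`, `copy_rate_est()`, `copy_queue()` and `fill_stats()` are hypothetical stand-ins for the kernel's `gnet_dump` machinery, and the `est_active` argument stands in for the result of `gen_estimator_active()`. It only shows that the rate element is skipped, without turning the whole dump into a failure, when no estimator is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct rate_est {
	unsigned int bps;
	unsigned int pps;
};

/* Counts how many elements of each kind were "emitted". */
struct dump {
	int n_basic;
	int n_rate;
	int n_queue;
};

static int copy_basic(struct dump *d)
{
	d->n_basic++;
	return 0;
}

static int copy_rate_est(struct dump *d, const struct rate_est *r)
{
	(void)r;	/* the values themselves do not matter for this sketch */
	d->n_rate++;
	return 0;
}

static int copy_queue(struct dump *d)
{
	d->n_queue++;
	return 0;
}

/*
 * Same shape as the proposed condition: the rate element is attempted
 * only when the estimator is active, and a negative return from any
 * attempted copy aborts the dump. Returns 0 on success, -1 on failure.
 */
static int fill_stats(struct dump *d, const struct rate_est *r, bool est_active)
{
	assert(d != NULL);

	if (copy_basic(d) < 0 ||
	    (est_active && copy_rate_est(d, r) < 0) ||
	    copy_queue(d) < 0)
		return -1;
	return 0;
}
```

With `est_active` false, `fill_stats()` still succeeds and the basic and queue elements are emitted, but the rate counter stays at zero; this is the behaviour the proposal aims for without touching the structure layout.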
Jarek Poplawski wrote:

> To make my point clare: why not something like this?:
>
> static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
> 			 u32 pid, u32 seq, u16 flags, int event)
> {
> ...
> 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
> 	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
> 	     gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
> 	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
> 		goto nla_put_failure;
>
> BTW, I'm not sure we need to change the user-visible API for this.
> (Is it really expected to work after updating gen_stats.h only in
> iproute?)
>

That would be better indeed. Do you want to work on it, or shall I do it?

Thanks
On Fri, Oct 02, 2009 at 09:12:57AM +0200, Eric Dumazet wrote:
> Jarek Poplawski wrote:
>
> > To make my point clare: why not something like this?:
> >
> > static int tc_fill_qdisc(struct sk_buff *skb, struct Qdisc *q, u32 clid,
> > 			 u32 pid, u32 seq, u16 flags, int event)
> > {
> > ...
> > 	if (gnet_stats_copy_basic(&d, &q->bstats) < 0 ||
> > 	    (gen_estimator_active(&q->bstats, &q->rate_est) &&
> > 	     gnet_stats_copy_rate_est(&d, &q->rate_est) < 0) ||
> > 	    gnet_stats_copy_queue(&d, &q->qstats) < 0)
> > 		goto nla_put_failure;
> >
> > BTW, I'm not sure we need to change the user-visible API for this.
> > (Is it really expected to work after updating gen_stats.h only in
> > iproute?)
> >
>
> That would be better indeed. Do you want to work on it, or shall I do it?

I'd like you to work on it.

Thanks,
Jarek P.
On Fri, Oct 02, 2009 at 07:08:19AM +0000, Jarek Poplawski wrote:
> On 01-10-2009 23:21, Jarek Poplawski wrote:
...
> To make my point clare:
[...]

Am I clair? ;-)

Jarek P.
diff --git a/include/linux/gen_stats.h b/include/linux/gen_stats.h
index 710e901..7678ded 100644
--- a/include/linux/gen_stats.h
+++ b/include/linux/gen_stats.h
@@ -30,17 +30,27 @@ struct gnet_stats_basic_packed
 } __attribute__ ((packed));
 
 /**
- * struct gnet_stats_rate_est - rate estimator
+ * struct gnet_stats_user_rate_est - rate estimator
  * @bps: current byte rate
  * @pps: current packet rate
  */
-struct gnet_stats_rate_est
-{
+struct gnet_stats_user_rate_est {
 	__u32	bps;
 	__u32	pps;
 };
 
 /**
+ * struct gnet_stats_rate_est - rate estimator with flags
+ * @est: current byte/packet rate
+ * @flags: set to one if estimation is valid
+ */
+struct gnet_stats_rate_est {
+	struct gnet_stats_user_rate_est	est;
+	int	flags;
+};
+#define RATE_EST_VALID	1
+
+/**
  * struct gnet_stats_queue - queuing statistics
  * @qlen: queue length
  * @backlog: backlog size of queue
diff --git a/net/core/gen_estimator.c b/net/core/gen_estimator.c
index 493775f..5ba9d90 100644
--- a/net/core/gen_estimator.c
+++ b/net/core/gen_estimator.c
@@ -129,12 +129,13 @@ static void est_timer(unsigned long arg)
 		brate = (nbytes - e->last_bytes)<<(7 - idx);
 		e->last_bytes = nbytes;
 		e->avbps += (brate >> e->ewma_log) - (e->avbps >> e->ewma_log);
-		e->rate_est->bps = (e->avbps+0xF)>>5;
+		e->rate_est->est.bps = (e->avbps+0xF)>>5;
 
 		rate = (npackets - e->last_packets)<<(12 - idx);
 		e->last_packets = npackets;
 		e->avpps += (rate >> e->ewma_log) - (e->avpps >> e->ewma_log);
-		e->rate_est->pps = (e->avpps+0x1FF)>>10;
+		e->rate_est->est.pps = (e->avpps+0x1FF)>>10;
+		e->rate_est->flags |= RATE_EST_VALID;
 skip:
 		read_unlock(&est_lock);
 		spin_unlock(e->stats_lock);
@@ -227,9 +228,9 @@ int gen_new_estimator(struct gnet_stats_basic_packed *bstats,
 	est->stats_lock = stats_lock;
 	est->ewma_log = parm->ewma_log;
 	est->last_bytes = bstats->bytes;
-	est->avbps = rate_est->bps<<5;
+	est->avbps = rate_est->est.bps<<5;
 	est->last_packets = bstats->packets;
-	est->avpps = rate_est->pps<<10;
+	est->avpps = rate_est->est.pps<<10;
 
 	if (!elist[idx].timer.function) {
 		INIT_LIST_HEAD(&elist[idx].list);
diff --git a/net/core/gen_stats.c b/net/core/gen_stats.c
index 8569310..b6f723c 100644
--- a/net/core/gen_stats.c
+++ b/net/core/gen_stats.c
@@ -138,13 +138,16 @@ gnet_stats_copy_basic(struct gnet_dump *d, struct gnet_stats_basic_packed *b)
 int
 gnet_stats_copy_rate_est(struct gnet_dump *d, struct gnet_stats_rate_est *r)
 {
+	if (!(r->flags & RATE_EST_VALID))
+		return 0;
+
 	if (d->compat_tc_stats) {
-		d->tc_stats.bps = r->bps;
-		d->tc_stats.pps = r->pps;
+		d->tc_stats.bps = r->est.bps;
+		d->tc_stats.pps = r->est.pps;
 	}
 
 	if (d->tail)
-		return gnet_stats_copy(d, TCA_STATS_RATE_EST, r, sizeof(*r));
+		return gnet_stats_copy(d, TCA_STATS_RATE_EST, &r->est, sizeof(r->est));
 
 	return 0;
 }
diff --git a/net/sched/act_police.c b/net/sched/act_police.c
index 723964c..ba01081 100644
--- a/net/sched/act_police.c
+++ b/net/sched/act_police.c
@@ -292,7 +292,7 @@ static int tcf_act_police(struct sk_buff *skb, struct tc_action *a,
 	police->tcf_bstats.packets++;
 
 	if (police->tcfp_ewma_rate &&
-	    police->tcf_rate_est.bps >= police->tcfp_ewma_rate) {
+	    police->tcf_rate_est.est.bps >= police->tcfp_ewma_rate) {
 		police->tcf_qstats.overlimits++;
 		if (police->tcf_action == TC_ACT_SHOT)
 			police->tcf_qstats.drops++;
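For readers who want to experiment with the flag-based approach taken in the diff, here is a minimal user-space sketch. It is not the kernel code: `struct user_rate_est`, `struct rate_est`, `struct out_buf` and `copy_rate_est()` are hypothetical simplifications (a flat byte buffer replaces the netlink message, and the compat `tc_stats` path is left out). Only the `RATE_EST_VALID` name and the "wrap the user-visible struct, copy only the inner part" layout are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RATE_EST_VALID 1

/* The part that is visible to user space (mirrors the patch's inner struct). */
struct user_rate_est {
	unsigned int bps;
	unsigned int pps;
};

/* Kernel-side wrapper: user-visible values plus a validity flag. */
struct rate_est {
	struct user_rate_est est;
	int flags;
};

/* Hypothetical output buffer standing in for the netlink message. */
struct out_buf {
	unsigned char data[64];
	size_t len;
};

/*
 * Emit the rate element only if an estimator has marked it valid.
 * Returns 0 on success (including "nothing to emit"), -1 if the
 * buffer is too small.
 */
static int copy_rate_est(struct out_buf *out, const struct rate_est *r)
{
	assert(out != NULL && r != NULL);

	if (!(r->flags & RATE_EST_VALID))
		return 0;	/* no estimator ran: emit nothing */

	if (out->len + sizeof(r->est) > sizeof(out->data))
		return -1;

	/* Copy only the user-visible part, never the flags field. */
	memcpy(out->data + out->len, &r->est, sizeof(r->est));
	out->len += sizeof(r->est);
	return 0;
}
```

Copying `sizeof(r->est)` rather than `sizeof(*r)` keeps the emitted element the same size as before the wrapper was introduced, which is the point of splitting the structure in two.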
We currently send TCA_STATS_RATE_EST elements to netlink users, even if no estimator
is running.

# tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 112833764978 bytes 1495081739 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0

User has no way to tell if the "rate 0bit 0pps" is a real estimation, or a fake
one (because no estimator is active)

After this patch, tc command output is :
$ tc -s -d qdisc
qdisc pfifo_fast 0: dev eth0 root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 561075 bytes 1196 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/linux/gen_stats.h |   16 +++++++++++++---
 net/core/gen_estimator.c  |    9 +++++----
 net/core/gen_stats.c      |    9 ++++++---
 net/sched/act_police.c    |    2 +-
 4 files changed, 25 insertions(+), 11 deletions(-)