diff mbox

[net-next-2.6] tc: report informations for multiqueue devices

Message ID 4A9E699B.7080400@gmail.com
State Superseded, archived
Delegated to: David Miller
Headers show

Commit Message

Eric Dumazet Sept. 2, 2009, 12:48 p.m. UTC
Eric Dumazet a écrit :
> David Miller a écrit :
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Wed, 02 Sep 2009 10:28:55 +0200
>>
>>> What naming convention should we choose for multiqueue devices ?
>> We could give an index field to multiple root qdiscs assigned
>> to a device.
> 
> Here is a patch then :)
> 
> Only point is that I am iterating from 0 to dev->real_num_tx_queues
> instead of dev->num_tx_queues. I hope it's fine, because there are
> allocated qdisc, but not really used.
> 
> Next patches to allow selective qdisc change/fetch (providing a TCA_QINDEX
> selector value to kernel)
> 
> Thanks
> 
> 
> [PATCH net-next-2.6] tc: report informations for multiqueue devices
> 
> qdisc and classes are not yet displayed by "tc -s -d {qdisc|class} show"
> for multiqueue devices.
> 
> We use a new TCA_QINDEX attribute, to report queue index to user space.
> iproute2 tc should be changed to eventually display this queue index as in :
> 
> $ tc -s -d qdisc
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 52498 bytes 465 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth0 qindex 1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> 


Here is the iproute2 patch as well, to display queue indexes

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

stephen hemminger Sept. 2, 2009, 4:23 p.m. UTC | #1
On Wed, 02 Sep 2009 14:48:27 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Eric Dumazet a écrit :
> > David Miller a écrit :
> >> From: Eric Dumazet <eric.dumazet@gmail.com>
> >> Date: Wed, 02 Sep 2009 10:28:55 +0200
> >>
> >>> What naming convention should we choose for multiqueue devices ?
> >> We could give an index field to multiple root qdiscs assigned
> >> to a device.
> > 
> > Here is a patch then :)
> > 
> > Only point is that I am iterating from 0 to dev->real_num_tx_queues
> > instead of dev->num_tx_queues. I hope it's fine, because there are
> > allocated qdisc, but not really used.
> > 
> > Next patches to allow selective qdisc change/fetch (providing a TCA_QINDEX
> > selector value to kernel)
> > 
> > Thanks
> > 
> > 
> > [PATCH net-next-2.6] tc: report informations for multiqueue devices
> > 
> > qdisc and classes are not yet displayed by "tc -s -d {qdisc|class} show"
> > for multiqueue devices.
> > 
> > We use a new TCA_QINDEX attribute, to report queue index to user space.
> > iproute2 tc should be changed to eventually display this queue index as in :
> > 
> > $ tc -s -d qdisc
> > qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> >  Sent 52498 bytes 465 pkt (dropped 0, overlimits 0 requeues 0)
> >  rate 0bit 0pps backlog 0b 0p requeues 0
> > qdisc pfifo_fast 0: dev eth0 qindex 1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> >  Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
> >  rate 0bit 0pps backlog 0b 0p requeues 0
> > 
> 
> 
> Here is the iproute2 patch as well, to display queue indexes
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> ---
> 
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index ba3254e..b80e0f6 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -490,6 +490,7 @@ enum
>  	TCA_FCNT,
>  	TCA_STATS2,
>  	TCA_STAB,
> +	TCA_QINDEX,
>  	__TCA_MAX
>  };
>  
> diff --git a/tc/tc_class.c b/tc/tc_class.c
> index 9d4eea5..1bc4bc6 100644
> --- a/tc/tc_class.c
> +++ b/tc/tc_class.c
> @@ -196,6 +196,13 @@ int print_class(const struct sockaddr_nl *who,
>  	if (filter_ifindex == 0)
>  		fprintf(fp, "dev %s ", ll_index_to_name(t->tcm_ifindex));
>  
> +	if (tb[TCA_QINDEX]) {
> +		int qindex = 0;
> +
> +		memcpy(&qindex, RTA_DATA(tb[TCA_QINDEX]), MIN(RTA_PAYLOAD(tb[TCA_QINDEX]), sizeof(int)));
> +		fprintf(fp, "qindex %d ", qindex);
> +	}
> +
>  	if (t->tcm_parent == TC_H_ROOT)
>  		fprintf(fp, "root ");
>  	else {

Even if the kernel keeps the value in something else for the API
QINDEX needs to be a fixed size unsigned type like u32.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Sept. 2, 2009, 4:30 p.m. UTC | #2
Stephen Hemminger wrote:
> Even if the kernel keeps the value in something else for the API
> QINDEX needs to be a fixed size unsigned type like u32.

Just to avoid duplicate work - I'm currently trying to put a patch
together that presents multiqueue qdiscs to userspace as regular
child qdiscs of a dummy qdisc, which also contains the aggregated
statistics. So far it actually looks pretty sane :)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Sept. 2, 2009, 4:50 p.m. UTC | #3
Patrick McHardy a écrit :
> Stephen Hemminger wrote:
>> Even if the kernel keeps the value in something else for the API
>> QINDEX needs to be a fixed size unsigned type like u32.
> 
> Just to avoid duplicate work - I'm currently trying to put a patch
> together that presents multiqueue qdiscs to userspace as regular
> child qdiscs of a dummy qdisc, which also contains the aggregated
> statistics. So far it actually looks pretty sane :)

Excellent !

I am cooking a patch to make vlan devices multiqueue aware
(ie having same tx_queue numbers than their real_dev)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Sept. 2, 2009, 5:31 p.m. UTC | #4
Patrick McHardy a écrit :
> Stephen Hemminger wrote:
>> Even if the kernel keeps the value in something else for the API
>> QINDEX needs to be a fixed size unsigned type like u32.
> 
> Just to avoid duplicate work - I'm currently trying to put a patch
> together that presents multiqueue qdiscs to userspace as regular
> child qdiscs of a dummy qdisc, which also contains the aggregated
> statistics. So far it actually looks pretty sane :)

There is one thing that bothers me with "tc -s -d qdisc", maybe
its the right time to speak

qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 54823 bytes 490 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 468 bytes 6 pkt (dropped 0, overlimits 0 requeues 0)
 rate 0bit 0pps backlog 0b 0p requeues 0


The rate estimation is given by kernel even if no rate estimation is performed.

User space doesnt have an indication saying this 0 numbers are real or fake
(rate 0bit 0pps).

Could we set a bit at Qdisc level when a gen_new_estimator() is really
done one a Qdisc, so that we do not call gnet_stats_copy_rate_est()
from tc_fill_qdisc() if this bit is not set ?

->

qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 54823 bytes 490 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
qdisc pfifo_fast 0: dev eth1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 468 bytes 6 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Patrick McHardy Sept. 2, 2009, 5:50 p.m. UTC | #5
Eric Dumazet wrote:
> Patrick McHardy a écrit :
>> Just to avoid duplicate work - I'm currently trying to put a patch
>> together that presents multiqueue qdiscs to userspace as regular
>> child qdiscs of a dummy qdisc, which also contains the aggregated
>> statistics. So far it actually looks pretty sane :)
> 
> There is one thing that bothers me with "tc -s -d qdisc", maybe
> its the right time to speak
> 
> qdisc pfifo_fast 0: dev eth0 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 54823 bytes 490 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> qdisc pfifo_fast 0: dev eth1 root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 468 bytes 6 pkt (dropped 0, overlimits 0 requeues 0)
>  rate 0bit 0pps backlog 0b 0p requeues 0
> 
> 
> The rate estimation is given by kernel even if no rate estimation is performed.
> 
> User space doesnt have an indication saying this 0 numbers are real or fake
> (rate 0bit 0pps).
> 
> Could we set a bit at Qdisc level when a gen_new_estimator() is really
> done one a Qdisc, so that we do not call gnet_stats_copy_rate_est()
> from tc_fill_qdisc() if this bit is not set ?

That should be possible, I'll look into it once I've finished my patch.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index ba3254e..b80e0f6 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -490,6 +490,7 @@  enum
 	TCA_FCNT,
 	TCA_STATS2,
 	TCA_STAB,
+	TCA_QINDEX,
 	__TCA_MAX
 };
 
diff --git a/tc/tc_class.c b/tc/tc_class.c
index 9d4eea5..1bc4bc6 100644
--- a/tc/tc_class.c
+++ b/tc/tc_class.c
@@ -196,6 +196,13 @@  int print_class(const struct sockaddr_nl *who,
 	if (filter_ifindex == 0)
 		fprintf(fp, "dev %s ", ll_index_to_name(t->tcm_ifindex));
 
+	if (tb[TCA_QINDEX]) {
+		int qindex = 0;
+
+		memcpy(&qindex, RTA_DATA(tb[TCA_QINDEX]), MIN(RTA_PAYLOAD(tb[TCA_QINDEX]), sizeof(int)));
+		fprintf(fp, "qindex %d ", qindex);
+	}
+
 	if (t->tcm_parent == TC_H_ROOT)
 		fprintf(fp, "root ");
 	else {
diff --git a/tc/tc_qdisc.c b/tc/tc_qdisc.c
index c7f2988..3ec2cf4 100644
--- a/tc/tc_qdisc.c
+++ b/tc/tc_qdisc.c
@@ -232,6 +232,14 @@  int print_qdisc(const struct sockaddr_nl *who,
 	fprintf(fp, "qdisc %s %x: ", (char*)RTA_DATA(tb[TCA_KIND]), t->tcm_handle>>16);
 	if (filter_ifindex == 0)
 		fprintf(fp, "dev %s ", ll_index_to_name(t->tcm_ifindex));
+
+	if (tb[TCA_QINDEX]) {
+		int qindex = 0;
+
+		memcpy(&qindex, RTA_DATA(tb[TCA_QINDEX]), MIN(RTA_PAYLOAD(tb[TCA_QINDEX]), sizeof(int)));
+		fprintf(fp, "qindex %d ", qindex);
+	}
+
 	if (t->tcm_parent == TC_H_ROOT)
 		fprintf(fp, "root ");
 	else if (t->tcm_parent) {