
[RFC] sched: Fix resource limiting in pfifo_fast

Message ID 20090830062344.6380.16713.sendpatchset@localhost.localdomain
State Accepted, archived
Delegated to: David Miller

Commit Message

Krishna Kumar Aug. 30, 2009, 6:23 a.m. UTC
From: Krishna Kumar <krkumar2@in.ibm.com>

pfifo_fast_enqueue has this check:
        if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {

which allows each band to enqueue up to tx_queue_len skbs, for a
total of 3*tx_queue_len skbs across all bands. I am not sure this
was the intended limit for the qdisc.

The patch compiled, and 32 simultaneous netperf runs completed fine. Also:
# tc -s qdisc show dev eth2
qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) 
 rate 0bit 0pps backlog 0b 0p requeues 25 

(I am taking next week off, so sorry for any delay in responding)

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---

 net/sched/sch_generic.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Aug. 30, 2009, 6:54 a.m. UTC | #1
Krishna Kumar wrote:
> From: Krishna Kumar <krkumar2@in.ibm.com>
> 
> pfifo_fast_enqueue has this check:
>         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> 
> which allows each band to enqueue up to tx_queue_len skbs, for a
> total of 3*tx_queue_len skbs across all bands. I am not sure this
> was the intended limit for the qdisc.

Yes, I noticed that and said to myself:
This was to let high-priority packets have their own limit,
independent of whether low-priority packets have filled their queue.

> 
> The patch compiled, and 32 simultaneous netperf runs completed fine. Also:
> # tc -s qdisc show dev eth2
> qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) 
>  rate 0bit 0pps backlog 0b 0p requeues 25 
> 
> (I am taking next week off, so sorry for any delay in responding)
> 
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> ---
> 
>  net/sched/sch_generic.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
> --- org/net/sched/sch_generic.c	2009-08-30 11:18:23.000000000 +0530
> +++ new/net/sched/sch_generic.c	2009-08-30 11:21:50.000000000 +0530
> @@ -432,11 +432,11 @@ static inline struct sk_buff_head *band2
>  
>  static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc* qdisc)
>  {
> -	int band = prio2band[skb->priority & TC_PRIO_MAX];
> -	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
> -	struct sk_buff_head *list = band2list(priv, band);
> +	if (skb_queue_len(&qdisc->q) < qdisc_dev(qdisc)->tx_queue_len) {
> +		int band = prio2band[skb->priority & TC_PRIO_MAX];
> +		struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
> +		struct sk_buff_head *list = band2list(priv, band);
>  
> -	if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
>  		priv->bitmap |= (1 << band);
>  		qdisc->q.qlen++;
>  		return __qdisc_enqueue_tail(skb, qdisc, list);

Jarek Poplawski Aug. 30, 2009, 7:47 a.m. UTC | #2
Eric Dumazet wrote, On 08/30/2009 08:54 AM:

> Krishna Kumar wrote:
>> From: Krishna Kumar <krkumar2@in.ibm.com>
>>
>> pfifo_fast_enqueue has this check:
>>         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
>>
>> which allows each band to enqueue up to tx_queue_len skbs, for a
>> total of 3*tx_queue_len skbs across all bands. I am not sure this
>> was the intended limit for the qdisc.
> 
> Yes, I noticed that and said to myself:
> This was to let high-priority packets have their own limit,
> independent of whether low-priority packets have filled their queue.


Good point, but then logically it could be something like:
  if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len / 3)

Of course, there is a backward compatibility question, plus
an sch_prio consistency question.

Jarek P.
Krishna Kumar Aug. 30, 2009, 8:22 a.m. UTC | #3
I had thought of this reason before submitting. But I felt that if we
are filling up the qdisc due to some problem at the driver/device, the
issue should be handled at a different level (increase tx_queue_len,
let packets drop and have TCP handle it, etc).

So I am not sure whether it was intentionally designed this way, or
whether it needs fixing to enforce an overall maximum limit.

Thanks,

- KK

> Eric Dumazet <eric.dumazet@gmail.com>
> Re: [RFC PATCH] sched: Fix resource limiting in pfifo_fast
>
> Krishna Kumar wrote:
> > From: Krishna Kumar <krkumar2@in.ibm.com>
> >
> > pfifo_fast_enqueue has this check:
> >         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> >
> > which allows each band to enqueue up to tx_queue_len skbs, for a
> > total of 3*tx_queue_len skbs across all bands. I am not sure this
> > was the intended limit for the qdisc.
>
> Yes, I noticed that and said to myself:
> This was to let high-priority packets have their own limit,
> independent of whether low-priority packets have filled their queue.
>
> >
> > The patch compiled, and 32 simultaneous netperf runs completed fine. Also:
> > # tc -s qdisc show dev eth2
> > qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
> >  Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25)
> >  rate 0bit 0pps backlog 0b 0p requeues 25
> >
> > (I am taking next week off, so sorry for any delay in responding)
> >
> > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
> > ---
> >
> >  net/sched/sch_generic.c |    8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
> > --- org/net/sched/sch_generic.c   2009-08-30 11:18:23.000000000 +0530
> > +++ new/net/sched/sch_generic.c   2009-08-30 11:21:50.000000000 +0530
> > @@ -432,11 +432,11 @@ static inline struct sk_buff_head *band2
> >
> >  static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc* qdisc)
> >  {
> > -   int band = prio2band[skb->priority & TC_PRIO_MAX];
> > -   struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
> > -   struct sk_buff_head *list = band2list(priv, band);
> > +   if (skb_queue_len(&qdisc->q) < qdisc_dev(qdisc)->tx_queue_len) {
> > +      int band = prio2band[skb->priority & TC_PRIO_MAX];
> > +      struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
> > +      struct sk_buff_head *list = band2list(priv, band);
> >
> > -   if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> >        priv->bitmap |= (1 << band);
> >        qdisc->q.qlen++;
> >        return __qdisc_enqueue_tail(skb, qdisc, list);
>

Krishna Kumar Aug. 30, 2009, 8:46 a.m. UTC | #4
> Jarek Poplawski <jarkao2@gmail.com>
> >> pfifo_fast_enqueue has this check:
> >>         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> >>
> >> which allows each band to enqueue up to tx_queue_len skbs, for a
> >> total of 3*tx_queue_len skbs across all bands. I am not sure this
> >> was the intended limit for the qdisc.
> >
> > Yes, I noticed that and said to myself:
> > This was to let high-priority packets have their own limit,
> > independent of whether low-priority packets have filled their queue.
>
>
> Good point, but then logically it could be something like:
>   if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len / 3)
>
> Of course, there is a backward compatibility question, plus
> an sch_prio consistency question.

Jarek, what is the consistency problem?

Thanks,

- KK

Jarek Poplawski Aug. 30, 2009, 9:57 a.m. UTC | #5
On Sun, Aug 30, 2009 at 02:16:13PM +0530, Krishna Kumar2 wrote:
> > Jarek Poplawski <jarkao2@gmail.com>
> > >> pfifo_fast_enqueue has this check:
> > >>         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> > >>
> > >> which allows each band to enqueue up to tx_queue_len skbs, for a
> > >> total of 3*tx_queue_len skbs across all bands. I am not sure this
> > >> was the intended limit for the qdisc.
> > >
> > > Yes, I noticed that and said to myself:
> > > This was to let high-priority packets have their own limit,
> > > independent of whether low-priority packets have filled their queue.
> >
> >
> > Good point, but then logically it could be something like:
> >   if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len / 3)
> >
> > Of course, there is a backward compatibility question, plus
> > an sch_prio consistency question.
> 
> Jarek, what is the consistency problem?

Currently pfifo_fast and sch_prio behave similarly wrt. tx_queue_len,
don't they?

Jarek P.
David Miller Aug. 31, 2009, 5:19 a.m. UTC | #6
From: Krishna Kumar <krkumar2@in.ibm.com>
Date: Sun, 30 Aug 2009 11:53:44 +0530

> From: Krishna Kumar <krkumar2@in.ibm.com>
> 
> pfifo_fast_enqueue has this check:
>         if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
> 
> which allows each band to enqueue up to tx_queue_len skbs, for a
> total of 3*tx_queue_len skbs across all bands. I am not sure this
> was the intended limit for the qdisc.
> 
> The patch compiled, and 32 simultaneous netperf runs completed fine. Also:
> # tc -s qdisc show dev eth2
> qdisc pfifo_fast 0: root bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
>  Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) 
>  rate 0bit 0pps backlog 0b 0p requeues 25 
> 
> (I am taking next week off, so sorry for any delay in responding)
> 
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>

This is probably just a thinko, nice catch.

I think I'll apply this to net-next-2.6, thanks!

Patch

diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
--- org/net/sched/sch_generic.c	2009-08-30 11:18:23.000000000 +0530
+++ new/net/sched/sch_generic.c	2009-08-30 11:21:50.000000000 +0530
@@ -432,11 +432,11 @@  static inline struct sk_buff_head *band2
 
 static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc* qdisc)
 {
-	int band = prio2band[skb->priority & TC_PRIO_MAX];
-	struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
-	struct sk_buff_head *list = band2list(priv, band);
+	if (skb_queue_len(&qdisc->q) < qdisc_dev(qdisc)->tx_queue_len) {
+		int band = prio2band[skb->priority & TC_PRIO_MAX];
+		struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
+		struct sk_buff_head *list = band2list(priv, band);
 
-	if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) {
 		priv->bitmap |= (1 << band);
 		qdisc->q.qlen++;
 		return __qdisc_enqueue_tail(skb, qdisc, list);