Patchwork ixgbe: [RFC] [PATCH] Fix return of invalid txq

login
register
mail settings
Submitter Krishna Kumar
Date Jan. 15, 2010, 5:31 a.m.
Message ID <20100115053117.31513.82775.sendpatchset@krkumar2.in.ibm.com>
Download mbox | patch
Permalink /patch/42941/
State Awaiting Upstream
Delegated to: David Miller
Headers show

Comments

Krishna Kumar - Jan. 15, 2010, 5:31 a.m.
A developer had complained of getting lots of warnings:

"eth16 selects TX queue 98, but real number of TX queues is 64"

http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg02200.html

As there was no follow up on that bug, I am submitting this
patch assuming that the other return points will not return
invalid txq's, and also that this fixes the bug (not tested).

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/ixgbe/ixgbe_main.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Waskiewicz Jr, Peter P - Jan. 15, 2010, 7:58 a.m.
>-----Original Message-----
>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
>On Behalf Of Krishna Kumar
>Sent: Thursday, January 14, 2010 9:31 PM
>To: davem@davemloft.net
>Cc: netdev@vger.kernel.org; Kirsher, Jeffrey T; Krishna Kumar
>Subject: ixgbe: [RFC] [PATCH] Fix return of invalid txq
>
>A developer had complained of getting lots of warnings:
>
>"eth16 selects TX queue 98, but real number of TX queues is 64"
>
>http://www.mail-archive.com/e1000-
>devel@lists.sourceforge.net/msg02200.html
>
>As there was no follow up on that bug, I am submitting this
>patch assuming that the other return points will not return
>invalid txq's, and also that this fixes the bug (not tested).
>
>Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>

This looks reasonable to me.  I've been trying to find time to add something like igb has, with a tiny Tx lookup table that maps CPUs into a smaller set of Tx queues, but I like this simple approach to fix the current bug.

We'll want to pull this in for a quick set of regression tests, but again I think this will work well.

Thanks Krishna,

-PJ

>---
> drivers/net/ixgbe/ixgbe_main.c |    5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
>diff -ruNp org/drivers/net/ixgbe/ixgbe_main.c
>new/drivers/net/ixgbe/ixgbe_main.c
>--- org/drivers/net/ixgbe/ixgbe_main.c	2010-01-12 11:50:24.000000000
>+0530
>+++ new/drivers/net/ixgbe/ixgbe_main.c	2010-01-12 11:50:44.000000000
>+0530
>@@ -5514,8 +5514,11 @@ static u16 ixgbe_select_queue(struct net
> 	struct ixgbe_adapter *adapter = netdev_priv(dev);
> 	int txq = smp_processor_id();
>
>-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
>+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
>+		while (unlikely(txq >= dev->real_num_tx_queues))
>+			txq -= dev->real_num_tx_queues;
> 		return txq;
>+	}
>
> #ifdef IXGBE_FCOE
> 	if ((adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&
>--
>To unsubscribe from this list: send the line "unsubscribe netdev" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 15, 2010, 8:43 a.m.
From: Krishna Kumar <krkumar2@in.ibm.com>
Date: Fri, 15 Jan 2010 11:01:17 +0530

> A developer had complained of getting lots of warnings:
> 
> "eth16 selects TX queue 98, but real number of TX queues is 64"
> 
> http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg02200.html
> 
> As there was no follow up on that bug, I am submitting this
> patch assuming that the other return points will not return
> invalid txq's, and also that this fixes the bug (not tested).
> 
> Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>

Guys I'm really sick of seeing this bug.  Stop assuming that the
platform has limitations that platforms you happen to work on have.

NR_CPUS can be 4096, so you must assume everywhere that
smp_processor_id() can return just about any number.

Modulo the damn thing, or similar.

Don't use tables or crap like that.  We do like Krishna's scheme
in net/core/dev.c:skb_tx_hash(), and that's what we should do
here too.

Krishna's patch is therefore correct, and I don't want to hear
anything about "tables", unless you're going to add "tables"
to net/core/dev.c :-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 15, 2010, 8:44 a.m.
From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
Date: Thu, 14 Jan 2010 23:58:17 -0800

> I've been trying to find time to add something like igb has, with a
> tiny Tx lookup table that maps CPUs into a smaller set of Tx queues.

Why do you need "tables"?  Just modulo the it, with whatever
optimizations you can come up with.

Or do we not have enough data references in the TX path already?
:-/

I would suggest getting rid of the table in IGB too.

Either "tables" are a good idea (I think they definitely are not)
or they are not.  And whatever the decision is we should do it
consistently.  net/core/dev.c doesn't use tables, it does the
subtraction modulo thing like Krishna does.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Waskiewicz Jr, Peter P - Jan. 15, 2010, 9 a.m.
On Fri, 2010-01-15 at 00:44 -0800, David Miller wrote:
> From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
> Date: Thu, 14 Jan 2010 23:58:17 -0800
> 
> > I've been trying to find time to add something like igb has, with a
> > tiny Tx lookup table that maps CPUs into a smaller set of Tx queues.
> 
> Why do you need "tables"?  Just modulo the it, with whatever
> optimizations you can come up with.
> 
> Or do we not have enough data references in the TX path already?
> :-/
> 
> I would suggest getting rid of the table in IGB too.
> 
> Either "tables" are a good idea (I think they definitely are not)
> or they are not.  And whatever the decision is we should do it
> consistently.  net/core/dev.c doesn't use tables, it does the
> subtraction modulo thing like Krishna does.

What I've been thinking of is more for the NUMA allocations per port.
If we have, say 2 sockets, 8 cores a piece, then we have 16 CPUs.  If we
assign a port to socket 0, I think the best use of resources is to
allocate 8 Rx/Tx queues, one per core in that socket.  If an application
comes from the other socket, we can have a table to map the other 8
cores from that socket into the 8 queues, instead of piling them all
into one of the Tx queues.

Cheers,
-PJ

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 15, 2010, 9:06 a.m.
From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Date: Fri, 15 Jan 2010 01:00:20 -0800

> What I've been thinking of is more for the NUMA allocations per port.
> If we have, say 2 sockets, 8 cores a piece, then we have 16 CPUs.  If we
> assign a port to socket 0, I think the best use of resources is to
> allocate 8 Rx/Tx queues, one per core in that socket.  If an application
> comes from the other socket, we can have a table to map the other 8
> cores from that socket into the 8 queues, instead of piling them all
> into one of the Tx queues.

I fail to see how this can act substantially better than simply
feeding traffic evenly amongst whatever group of queues have
been configured.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Waskiewicz Jr, Peter P - Jan. 16, 2010, 10:53 a.m.
>-----Original Message-----
>From: David Miller [mailto:davem@davemloft.net]
>Sent: Friday, January 15, 2010 1:06 AM
>To: Waskiewicz Jr, Peter P
>Cc: krkumar2@in.ibm.com; netdev@vger.kernel.org; Kirsher, Jeffrey T
>Subject: Re: ixgbe: [RFC] [PATCH] Fix return of invalid txq
>
>From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
>Date: Fri, 15 Jan 2010 01:00:20 -0800
>
>> What I've been thinking of is more for the NUMA allocations per port.
>> If we have, say 2 sockets, 8 cores a piece, then we have 16 CPUs.  If
>we
>> assign a port to socket 0, I think the best use of resources is to
>> allocate 8 Rx/Tx queues, one per core in that socket.  If an
>application
>> comes from the other socket, we can have a table to map the other 8
>> cores from that socket into the 8 queues, instead of piling them all
>> into one of the Tx queues.
>
>I fail to see how this can act substantially better than simply
>feeding traffic evenly amongst whatever group of queues have
>been configured.

On systems with a large number of cores, and depending on how a NIC is allocated to a set of processors, there is a difference.  If a port is allocated to socket three on a large NUMA system, then it could be CPUs 16-23 that it'd be using, and I'd have 8 Tx/Rx queues.  I can either use the easy approach Krishna has, which could then execute two minus operations, or I can setup a lookup table in the driver on open(), and then I make a single indexed lookup.  Using the lookup table makes the queue selection O(1) for whatever CPU/queue layout we come up with.

Either way works though.  I still think the table is the better way to go, because of the determinism for any system and NIC configuration/layout.  The overhead of configuring the table is taken during open(), so it's not in the hotpath at all.

Cheers,
-PJ
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Feb. 12, 2010, 7:55 p.m.
From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
Date: Sat, 16 Jan 2010 02:53:15 -0800

> Either way works though.  I still think the table is the better way
> to go, because of the determinism for any system and NIC
> configuration/layout.  The overhead of configuring the table is
> taken during open(), so it's not in the hotpath at all.

How many minus operations can your cpu perform in the same amount
of time it takes to access memory? :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Waskiewicz Jr, Peter P - Feb. 12, 2010, 8:12 p.m.
On Fri, 2010-02-12 at 12:55 -0700, David Miller wrote:
> From: "Waskiewicz Jr, Peter P" <peter.p.waskiewicz.jr@intel.com>
> Date: Sat, 16 Jan 2010 02:53:15 -0800
> 
> > Either way works though.  I still think the table is the better way
> > to go, because of the determinism for any system and NIC
> > configuration/layout.  The overhead of configuring the table is
> > taken during open(), so it's not in the hotpath at all.
> 
> How many minus operations can your cpu perform in the same amount
> of time it takes to access memory? :-)

Touche. :)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff -ruNp org/drivers/net/ixgbe/ixgbe_main.c new/drivers/net/ixgbe/ixgbe_main.c
--- org/drivers/net/ixgbe/ixgbe_main.c	2010-01-12 11:50:24.000000000 +0530
+++ new/drivers/net/ixgbe/ixgbe_main.c	2010-01-12 11:50:44.000000000 +0530
@@ -5514,8 +5514,11 @@  static u16 ixgbe_select_queue(struct net
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 	int txq = smp_processor_id();
 
-	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE)
+	if (adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) {
+		while (unlikely(txq >= dev->real_num_tx_queues))
+			txq -= dev->real_num_tx_queues;
 		return txq;
+	}
 
 #ifdef IXGBE_FCOE
 	if ((adapter->flags & IXGBE_FLAG_FCOE_ENABLED) &&