
[next-queue,v1] igb: Fix limiting the number of queues to number of cpus

Message ID 20190404215632.9881-1-vinicius.gomes@intel.com
State Awaiting Upstream
Delegated to: David Miller

Commit Message

Vinicius Costa Gomes April 4, 2019, 9:56 p.m. UTC
We have seen some reports[1] of users complaining that they aren't
able to use some queues when their machines have less than 4 cpus.
This affects some TSN workloads, as different traffic classes are
assigned different queues. The current behavior limits the number of
traffic classes that can be reliably handled.

In practice, what fails (returning an invalid parameter error) on hosts
with less than 4 cpus is something like this:

$ tc qdisc replace dev IFACE parent root mqprio	\
     	   num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2	\
	   queues 1@0 1@1 2@2 hw 0

Because changing the default logic of the allocation of queues could
bring other effects, we propose adding a module parameter so expert
users may override that decision.

[1] https://github.com/jeez/iproute2/issues/1

Reported-by: Bhagath Singh Karunakaran <bhagath@kalycito.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
---

A similar fix is probably also needed for igc. Even though I don't have the
hardware to test it, I can produce a patch if others are able to test.

I am not totally sure that using a module parameter is the best
solution, so suggestions are welcome.

 drivers/net/ethernet/intel/igb/igb_main.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
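
For reference, a usage sketch of the proposed parameter (num_queues is the
name introduced by the patch below; the value 4 and the module reload are
only an example):

$ modprobe -r igb
$ modprobe igb num_queues=4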

Comments

Kirsher, Jeffrey T April 5, 2019, 12:04 a.m. UTC | #1
On Thu, 2019-04-04 at 14:56 -0700, Vinicius Costa Gomes wrote:
> We have seen some reports[1] of users complaining that they aren't
> able to use some queues when their machines have less than 4 cpus.
> This affects some TSN workloads, as different traffic classes are
> assigned different queues. The current behavior limits the number of
> traffic classes that can be reliably handled.
> 
> In practice, what fails (returning an invalid parameter error) on hosts
> with less than 4 cpus is something like this:
> 
> $ tc qdisc replace dev IFACE parent root mqprio \
>            num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
>            queues 1@0 1@1 2@2 hw 0
> 
> Because changing the default logic of the allocation of queues could
> bring other effects, we propose adding a module parameter so expert
> users may override that decision.
> 
> [1] https://github.com/jeez/iproute2/issues/1
> 
> Reported-by: Bhagath Singh Karunakaran <bhagath@kalycito.com>
> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> ---
> 
> A similar fix is probably also needed for igc. Even though I don't have the
> hardware to test it, I can produce a patch if others are able to test.
> 
> I am not totally sure that using a module parameter is the best
> solution, so suggestions are welcome.
> 
>  drivers/net/ethernet/intel/igb/igb_main.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)

A module parameter may be fine for our out-of-tree driver, but not for the
kernel driver.

NACK on the basis that a new module parameter is being introduced for the
driver.  This is not acceptable to Dave Miller or myself.  As of now, I do
not have an alternative solution to propose, unfortunately.

I will discuss the issue with my fellow developers and hopefully we can
come up with a kernel interface that all drivers can use to handle this
issue.
Vinicius Costa Gomes April 5, 2019, 1:32 a.m. UTC | #2
Hi Jeff,

Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:

> A module parameter may be fine for our out-of-tree driver, but not for the
> kernel driver.
>
> NACK on the basis that a new module parameter is being introduced for the
> driver.  This is not acceptable to Dave Miller or myself.  As of now, I do
> not have an alternative solution to propose, unfortunately.

I understand completely. This patch already served its purpose :-)

>
> I will discuss the issue with my fellow developers and hopefully we can
> come up with a kernel interface that all drivers can use to handle this
> issue.

Thank you.

If it helps, the only other alternative I can think of is a sysctl knob,
something like:

net.core.netdev_max_num_queues

And the default value would be the number of cpus.
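
Purely as an illustration (this knob is hypothetical and does not exist
today), setting it would look something like:

$ sysctl -w net.core.netdev_max_num_queues=8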

Cheers,
--
Vinicius
Vinicius Costa Gomes April 18, 2019, 8:57 p.m. UTC | #3
Hi Jeff,

Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:

> I will discuss the issue with my fellow developers and hopefully we can
> come up with a kernel interface that all drivers can use to handle this
> issue.

Did you have the chance to discuss this issue?


Cheers,
--
Vinicius
Alexander H Duyck April 18, 2019, 9:28 p.m. UTC | #4
On Thu, Apr 18, 2019 at 1:57 PM Vinicius Costa Gomes
<vinicius.gomes@intel.com> wrote:
>
> Hi Jeff,
>
> Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:
>
> > I will discuss the issue with my fellow developers and hopefully we can
> > come up with a kernel interface that all drivers can use to handle this
> > issue.
>
> Did you have the chance to discuss this issue?
>
>
> Cheers,
> --
> Vinicius

Is there any reason why you couldn't just use the "ethtool -L" command
to change the number of queues after creating the interface instead of
having to use a module parameter? Just wondering since that would be a
way to change the number of queues, and it should support values
greater than the number of CPUs if I am not mistaken.

Thanks.

- Alex
Vinicius Costa Gomes April 18, 2019, 10:40 p.m. UTC | #5
Hi Alex,

Alexander Duyck <alexander.duyck@gmail.com> writes:

> On Thu, Apr 18, 2019 at 1:57 PM Vinicius Costa Gomes
> <vinicius.gomes@intel.com> wrote:
>>
>> Hi Jeff,
>>
>> Jeff Kirsher <jeffrey.t.kirsher@intel.com> writes:
>>
>> > I will discuss the issue with my fellow developers and hopefully we can
>> > come up with a kernel interface that all drivers can use to handle this
>> > issue.
>>
>> Did you have the chance to discuss this issue?
>>
>>
>> Cheers,
>> --
>> Vinicius
>
> Is there any reason why you couldn't just use the "ethtool -L" command
> to change the number of queues after creating the interface instead of
> having to use a module parameter? Just wondering since that would be a
> way to change the number of queues, and it should support values
> greater than the number of CPUs if I am not mistaken.

No reason at all. I just couldn't remember that ethtool option. It indeed
works even when the number of CPUs is less than the number of HW queues.
Thanks a lot.
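
For reference, the kind of sequence being discussed, with IFACE as a
placeholder and 4 as an example count supported by the hardware:

$ ethtool -L IFACE combined 4
$ tc qdisc replace dev IFACE parent root mqprio \
           num_tc 3 map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
           queues 1@0 1@1 2@2 hw 0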


Cheers,
--
Vinicius

Patch

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 32d61d5a2706..87072d47c305 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -247,6 +247,10 @@  static int debug = -1;
 module_param(debug, int, 0);
 MODULE_PARM_DESC(debug, "Debug level (0=none,...,16=all)");
 
+static unsigned int num_queues;
+module_param(num_queues, uint, 0);
+MODULE_PARM_DESC(num_queues, "Allocate at maximum this number of queues (0=num_cpus(), default)");
+
 struct igb_reg_info {
 	u32 ofs;
 	char *name;
@@ -3763,7 +3767,13 @@  static void igb_init_queue_configuration(struct igb_adapter *adapter)
 	u32 max_rss_queues;
 
 	max_rss_queues = igb_get_max_rss_queues(adapter);
-	adapter->rss_queues = min_t(u32, max_rss_queues, num_online_cpus());
+
+	if (num_queues > 0)
+		adapter->rss_queues = min_t(u32, max_rss_queues,
+					    num_queues);
+	else
+		adapter->rss_queues = min_t(u32, max_rss_queues,
+					    num_online_cpus());
 
 	igb_set_flag_queue_pairs(adapter, max_rss_queues);
 }