
[net-next,1/2] net: remove NET_LL_RX_POLL config menu

Message ID 20130611142428.17879.33582.stgit@ladj378.jer.intel.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Eliezer Tamir June 11, 2013, 2:24 p.m. UTC
Remove NET_LL_RX_POLL from the config menu.
Change default to y.
Busy polling still needs to be enabled at runtime via sysctl.

Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
---

 net/Kconfig |   11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)



Comments

David Miller June 12, 2013, 10:12 p.m. UTC | #1
From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Date: Tue, 11 Jun 2013 17:24:28 +0300

>  	depends on X86_TSC

Wait a second, I didn't notice this before.  There needs to be a better
way to test for the accuracy you need, or if the issue is lack of a proper
API for cycle counter reading, fix that rather than add ugly arch
specific dependencies to generic networking code.
Stephen Hemminger June 13, 2013, 2:01 a.m. UTC | #2
On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
> Date: Tue, 11 Jun 2013 17:24:28 +0300
> 
> >  	depends on X86_TSC
> 
> Wait a second, I didn't notice this before.  There needs to be a better
> way to test for the accuracy you need, or if the issue is lack of a proper
> API for cycle counter reading, fix that rather than add ugly arch
> specific dependencies to generic networking code.

This should be sched_clock(), rather than direct TSC access.
Also, any code using TSC or sched_clock() has to be carefully audited to deal
with clocks running at different rates on different CPUs. Basically, the value
is only meaningful on the same CPU.
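
For illustration, a sched_clock()-based cut-off might look roughly like
this (sketch only; the helper and the nanosecond budget are hypothetical,
not code from this series):

#include <linux/sched.h>	/* sched_clock() */

/* Sketch only: bound a busy-poll loop with sched_clock().  The start
 * and current timestamps are only comparable when read on the same CPU.
 */
static inline bool ll_poll_expired(u64 start_ns, u64 budget_ns)
{
	return sched_clock() - start_ns > budget_ns;
}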


Eliezer Tamir June 13, 2013, 2:13 a.m. UTC | #3
On 13/06/2013 05:01, Stephen Hemminger wrote:
> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>
>>>   	depends on X86_TSC
>>
>> Wait a second, I didn't notice this before.  There needs to be a better
>> way to test for the accuracy you need, or if the issue is lack of a proper
>> API for cycle counter reading, fix that rather than add ugly arch
>> specific dependencies to generic networking code.
>
> This should be sched_clock(), rather than direct TSC access.
> Also, any code using TSC or sched_clock() has to be carefully audited to deal
> with clocks running at different rates on different CPUs. Basically, the value
> is only meaningful on the same CPU.

OK,

If we convert to sched_clock(), would adding a define such as
HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high-precision
clock and a 64-bit cycles_t be a good solution?

(if not, any other suggestion?)
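
For illustration, the wiring might look like this (hypothetical Kconfig
sketch; HAVE_HIGH_PRECISION_CLOCK is the proposed symbol, not an existing
one):

# Each qualifying architecture would select this, e.g. in
# arch/x86/Kconfig: select HAVE_HIGH_PRECISION_CLOCK if X86_TSC
config HAVE_HIGH_PRECISION_CLOCK
	bool

config NET_LL_RX_POLL
	boolean
	depends on HAVE_HIGH_PRECISION_CLOCK
	default y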


Daniel Borkmann June 13, 2013, 8 a.m. UTC | #4
On 06/13/2013 04:13 AM, Eliezer Tamir wrote:
> On 13/06/2013 05:01, Stephen Hemminger wrote:
>> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
>> David Miller <davem@davemloft.net> wrote:
>>
>>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>>
>>>>       depends on X86_TSC
>>>
>>> Wait a second, I didn't notice this before.  There needs to be a better
>>> way to test for the accuracy you need, or if the issue is lack of a proper
>>> API for cycle counter reading, fix that rather than add ugly arch
>>> specific dependencies to generic networking code.
>>
>> This should be sched_clock(), rather than direct TSC access.
>> Also, any code using TSC or sched_clock() has to be carefully audited to deal
>> with clocks running at different rates on different CPUs. Basically, the value
>> is only meaningful on the same CPU.
>
> OK,
>
> If we convert to sched_clock(), would adding a define such as HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high-precision clock and a 64-bit cycles_t be a good solution?
>
> (if not, any other suggestion?)

Hm, cpu_clock() and similar would probably be better, since they use
sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
(meaning when sched_clock() provides a synchronized highres time source from
the architecture), and, quoting ....

  Otherwise it tries to create a semi stable clock from a mixture of other
  clocks, including:

   - GTOD (clock monotonic)
   - sched_clock()
   - explicit idle events

But yeah, it needs to be evaluated regarding the drift between CPUs in
general.
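
For reference, the cpu_clock() variant would be on the order of the
following (sketch only, not code from this series):

#include <linux/sched.h>	/* cpu_clock() */
#include <linux/smp.h>		/* raw_smp_processor_id() */

/* Sketch only: per-CPU timestamp for a busy-poll cut-off.  cpu_clock()
 * bounds the drift between CPUs, at the cost of extra work (including
 * IRQ disabling) on unstable-clock configurations.
 */
static inline u64 busy_poll_now(void)
{
	return cpu_clock(raw_smp_processor_id());
}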

Then, eventually, you could get rid of the entire NET_LL_RX_POLL config
option plus the related ifdefs in the code and have it built in unconditionally?
Eliezer Tamir June 13, 2013, 10:09 a.m. UTC | #5
On 13/06/2013 11:00, Daniel Borkmann wrote:
> On 06/13/2013 04:13 AM, Eliezer Tamir wrote:
>> On 13/06/2013 05:01, Stephen Hemminger wrote:
>>> On Wed, 12 Jun 2013 15:12:05 -0700 (PDT)
>>> David Miller <davem@davemloft.net> wrote:
>>>
>>>> From: Eliezer Tamir <eliezer.tamir@linux.intel.com>
>>>> Date: Tue, 11 Jun 2013 17:24:28 +0300
>>>>
>>>>>       depends on X86_TSC
>>>>
>>>> Wait a second, I didn't notice this before.  There needs to be a better
>>>> way to test for the accuracy you need, or if the issue is lack of a
>>>> proper
>>>> API for cycle counter reading, fix that rather than add ugly arch
>>>> specific dependencies to generic networking code.
>>>
>>> This should be sched_clock(), rather than direct TSC access.
>>> Also, any code using TSC or sched_clock() has to be carefully audited to
>>> deal with clocks running at different rates on different CPUs. Basically,
>>> the value is only meaningful on the same CPU.
>>
>> OK,
>>
>> If we convert to sched_clock(), would adding a define such as
>> HAVE_HIGH_PRECISION_CLOCK to architectures that have both a high-precision
>> clock and a 64-bit cycles_t be a good solution?
>>
>> (if not, any other suggestion?)
>
> Hm, cpu_clock() and similar would probably be better, since they use
> sched_clock() in the background when !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
> (meaning when sched_clock() provides a synchronized highres time source from
> the architecture), and, quoting ....

I don't think we want the overhead of disabling IRQs
that cpu_clock() adds.

We don't really care about a precise measurement.
All we need is a sane cut-off for busy polling.
It's no big deal if, on a rare occasion, we poll less,
or even twice as long.
As long as that's rare, it should not matter.

Maybe the answer is not to use cycle counting at all?
Maybe just wait the full sk_rcvtimeo?
(resched when appropriate, bail out if a signal is pending, etc.)

This could only be a safe/sane thing to do after we add
a socket option, because this can't be a global setting.

This would of course turn the option into a flag.
If it's set (and !nonblock), busy-wait up to sk_rcvtimeo.
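
Roughly, such a wait might look like the following (sketch only;
ll_poll_once() is a hypothetical stand-in for the driver poll hook):

#include <linux/jiffies.h>
#include <linux/sched.h>
#include <net/sock.h>

/* Sketch only: busy-wait for data, bounded by sk_rcvtimeo (jiffies). */
static int busy_wait_rcvtimeo(struct sock *sk)
{
	unsigned long end = jiffies + sk->sk_rcvtimeo;

	while (!ll_poll_once(sk)) {	/* hypothetical: poll the device once */
		if (time_after(jiffies, end))
			return -EAGAIN;
		if (signal_pending(current))
			return -EINTR;
		cond_resched();
	}
	return 0;
}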

Opinions?

Patch

diff --git a/net/Kconfig b/net/Kconfig
index d6a9ce6..8fe8845 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -244,16 +244,9 @@  config NETPRIO_CGROUP
 	  a per-interface basis
 
 config NET_LL_RX_POLL
-	bool "Low Latency Receive Poll"
+	boolean
 	depends on X86_TSC
-	default n
-	---help---
-	  Support Low Latency Receive Queue Poll.
-	  (For network card drivers which support this option.)
-	  When waiting for data in read or poll call directly into the the device driver
-	  to flush packets which may be pending on the device queues into the stack.
-
-	  If unsure, say N.
+	default y
 
 config BQL
 	boolean