diff mbox

[v2] tcp: tsq: restore minimal amount of queueing

Message ID 20131117231545.GA32350@electric-eye.fr.zoreil.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Francois Romieu Nov. 17, 2013, 11:15 p.m. UTC
Holger Hoffstaette <holger.hoffstaette@googlemail.com> :
[...]
> Since I saw this with r8169->r8169 and e1000e->r8169 it's probably
> everyone's favourite r8169 :)
> Unfortunately I can't be more help but if you can suggest/whip up a fix
> I'd be happy to help test.

The r8169 driver does not rely on a timer for Tx completion.

The patch below should not hurt.

Can you describe your system a bit more specifically ?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Holger Hoffstätte Nov. 18, 2013, 4:26 p.m. UTC | #1
On 11/18/13 00:15, Francois Romieu wrote:
> Holger Hoffstaette <holger.hoffstaette@googlemail.com> :
> [...]
>> Since I saw this with r8169->r8169 and e1000e->r8169 it's probably
>> everyone's favourite r8169 :)
>> Unfortunately I can't be more help but if you can suggest/whip up a fix
>> I'd be happy to help test.
> 
> The r8169 driver does not rely on a timer for Tx completion.

Thankx, that's good to hear.

> The patch below should not hurt.

It does not seem to hurt, but neither can I notice much of a change.
However that's probably because of some other side effects, see below.

Do I understand the diff correctly that it makes the driver perform
outstanding transmissions before budgeting reads? Just curious.

> Can you describe your system a bit more specifically ?

Server has r8169, client is either r8169 (Windows/linux) or Thinkpad
with e1000e. Clients use NFS & Samba. Since Eric's TSQ patch the erratic
3.12.0-vanilla behaviour has "stabilized" in the sense that latency &
throughout became relatively smooth and more or less as expected, both
for large copies and many small files.

However since then I found that increasing the tcp_limit_output_bytes to
262144 (twice the default of 128k) makes things really fly. Copying
large files (>1GB) over NFS from the e1000e now quickly reaches the full
1Gb line throughput. This was really surprising.

Apart from the laptop being relatively old and being difficult to
benchmark due to typical power state scaling, I suspect the e1000e
running with dynamic interrupt moderation is not completely innocent
either. I used to turn this off some years back and had great success,
but that was on Windows.

Long story short there's just too much up- and downscaling, buffering
and queueing involved in all parts to point to any single culprit, but
increasing the byte limit *has* helped everywhere and had no noticeable
impact on internet traffic. I understand the motivation for small queues
from a bufferbloat-fighting point of view (using fq_codel did wonders
for a friend without external router!), but apparently for LAN traffic
this doesn't seem to work in all cases.

Not sure if any of this is helpful. :)

cheers
Holger

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 18, 2013, 4:47 p.m. UTC | #2
On Mon, 2013-11-18 at 17:26 +0100, Holger Hoffstätte wrote:
> On 11/18/13 00:15, Francois Romieu wrote:
> > Holger Hoffstaette <holger.hoffstaette@googlemail.com> :
> > [...]
> >> Since I saw this with r8169->r8169 and e1000e->r8169 it's probably
> >> everyone's favourite r8169 :)
> >> Unfortunately I can't be more help but if you can suggest/whip up a fix
> >> I'd be happy to help test.
> > 
> > The r8169 driver does not rely on a timer for Tx completion.
> 
> Thankx, that's good to hear.
> 
> > The patch below should not hurt.
> 
> It does not seem to hurt, but neither can I notice much of a change.
> However that's probably because of some other side effects, see below.
> 
> Do I understand the diff correctly that it makes the driver perform
> outstanding transmissions before budgeting reads? Just curious.
> 
> > Can you describe your system a bit more specifically ?
> 
> Server has r8169, client is either r8169 (Windows/linux) or Thinkpad
> with e1000e. Clients use NFS & Samba. Since Eric's TSQ patch the erratic
> 3.12.0-vanilla behaviour has "stabilized" in the sense that latency &
> throughout became relatively smooth and more or less as expected, both
> for large copies and many small files.
> 
> However since then I found that increasing the tcp_limit_output_bytes to
> 262144 (twice the default of 128k) makes things really fly. Copying
> large files (>1GB) over NFS from the e1000e now quickly reaches the full
> 1Gb line throughput. This was really surprising.
> 
> Apart from the laptop being relatively old and being difficult to
> benchmark due to typical power state scaling, I suspect the e1000e
> running with dynamic interrupt moderation is not completely innocent
> either. I used to turn this off some years back and had great success,
> but that was on Windows.

I think it would make sense to instrument the delay between the
ndo_start_xmit() and kfree_skb() for transmitted skb.

We might have a surprise for some drivers, seeing delays in the order of
several ms ....



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 3397cee..7280d5d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6393,12 +6393,12 @@  static int rtl8169_poll(struct napi_struct *napi, int budget)
 	status = rtl_get_events(tp);
 	rtl_ack_events(tp, status & ~tp->event_slow);
 
-	if (status & RTL_EVENT_NAPI_RX)
-		work_done = rtl_rx(dev, tp, (u32) budget);
-
 	if (status & RTL_EVENT_NAPI_TX)
 		rtl_tx(dev, tp);
 
+	if (status & RTL_EVENT_NAPI_RX)
+		work_done = rtl_rx(dev, tp, (u32) budget);
+
 	if (status & tp->event_slow) {
 		enable_mask &= ~tp->event_slow;