Message ID | 20131117231545.GA32350@electric-eye.fr.zoreil.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On 11/18/13 00:15, Francois Romieu wrote: > Holger Hoffstaette <holger.hoffstaette@googlemail.com> : > [...] >> Since I saw this with r8169->r8169 and e1000e->r8169 it's probably >> everyone's favourite r8169 :) >> Unfortunately I can't be more help but if you can suggest/whip up a fix >> I'd be happy to help test. > > The r8169 driver does not rely on a timer for Tx completion. Thankx, that's good to hear. > The patch below should not hurt. It does not seem to hurt, but neither can I notice much of a change. However that's probably because of some other side effects, see below. Do I understand the diff correctly that it makes the driver perform outstanding transmissions before budgeting reads? Just curious. > Can you describe your system a bit more specifically ? Server has r8169, client is either r8169 (Windows/linux) or Thinkpad with e1000e. Clients use NFS & Samba. Since Eric's TSQ patch the erratic 3.12.0-vanilla behaviour has "stabilized" in the sense that latency & throughout became relatively smooth and more or less as expected, both for large copies and many small files. However since then I found that increasing the tcp_limit_output_bytes to 262144 (twice the default of 128k) makes things really fly. Copying large files (>1GB) over NFS from the e1000e now quickly reaches the full 1Gb line throughput. This was really surprising. Apart from the laptop being relatively old and being difficult to benchmark due to typical power state scaling, I suspect the e1000e running with dynamic interrupt moderation is not completely innocent either. I used to turn this off some years back and had great success, but that was on Windows. Long story short there's just too much up- and downscaling, buffering and queueing involved in all parts to point to any single culprit, but increasing the byte limit *has* helped everywhere and had no noticeable impact on internet traffic. I understand the motivation for small queues from a bufferbloat-fighting point of view (using fq_codel did wonders for a friend without external router!), but apparently for LAN traffic this doesn't seem to work in all cases. Not sure if any of this is helpful. :) cheers Holger -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-11-18 at 17:26 +0100, Holger Hoffstätte wrote: > On 11/18/13 00:15, Francois Romieu wrote: > > Holger Hoffstaette <holger.hoffstaette@googlemail.com> : > > [...] > >> Since I saw this with r8169->r8169 and e1000e->r8169 it's probably > >> everyone's favourite r8169 :) > >> Unfortunately I can't be more help but if you can suggest/whip up a fix > >> I'd be happy to help test. > > > > The r8169 driver does not rely on a timer for Tx completion. > > Thankx, that's good to hear. > > > The patch below should not hurt. > > It does not seem to hurt, but neither can I notice much of a change. > However that's probably because of some other side effects, see below. > > Do I understand the diff correctly that it makes the driver perform > outstanding transmissions before budgeting reads? Just curious. > > > Can you describe your system a bit more specifically ? > > Server has r8169, client is either r8169 (Windows/linux) or Thinkpad > with e1000e. Clients use NFS & Samba. Since Eric's TSQ patch the erratic > 3.12.0-vanilla behaviour has "stabilized" in the sense that latency & > throughout became relatively smooth and more or less as expected, both > for large copies and many small files. > > However since then I found that increasing the tcp_limit_output_bytes to > 262144 (twice the default of 128k) makes things really fly. Copying > large files (>1GB) over NFS from the e1000e now quickly reaches the full > 1Gb line throughput. This was really surprising. > > Apart from the laptop being relatively old and being difficult to > benchmark due to typical power state scaling, I suspect the e1000e > running with dynamic interrupt moderation is not completely innocent > either. I used to turn this off some years back and had great success, > but that was on Windows. I think it would make sense to instrument the delay between the ndo_start_xmit() and kfree_skb() for transmitted skb. We might have a surprise for some drivers, seeing delays in the order of several ms .... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c index 3397cee..7280d5d 100644 --- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -6393,12 +6393,12 @@ static int rtl8169_poll(struct napi_struct *napi, int budget) status = rtl_get_events(tp); rtl_ack_events(tp, status & ~tp->event_slow); - if (status & RTL_EVENT_NAPI_RX) - work_done = rtl_rx(dev, tp, (u32) budget); - if (status & RTL_EVENT_NAPI_TX) rtl_tx(dev, tp); + if (status & RTL_EVENT_NAPI_RX) + work_done = rtl_rx(dev, tp, (u32) budget); + if (status & tp->event_slow) { enable_mask &= ~tp->event_slow;