Message ID | 1363154258.13690.40.camel@edumazet-glaptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On Wed, 2013-03-13 at 06:57 +0100, Eric Dumazet wrote:
> On Tue, 2013-03-12 at 18:09 +0100, Michael Büsch wrote:
> > On Tue, 12 Mar 2013 16:45:44 +0100
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >
> > > On Tue, 2013-03-12 at 16:17 +0100, Michael Büsch wrote:
> > > > Hi,
> > > >
> > > > Starting with 3.8.x scp stalls the atl1c based interface on my Asus Eeepc 1011px.
> > > > iperf (for example) does not do that. But after scp stalled the interface,
> > > > iperf transfers fail, too.
> > >
> > > I am pretty sure David stable list contains the needed fix
> > >
> > > http://patchwork.ozlabs.org/bundle/davem/stable/?state=*
> >
> > No this didn't fix it.
> >
> > However, I tried to revert 69b08f62e17439ee3d436faf0b9a7ca6fffb78db again,
> > which already caused trouble for me in 3.7
> > and this fixed the issue.
> >
> > So it seems that this still is the same or a related issue that I reported
> > for 3.7. I just wrongly stated that the problem was fixed in 3.8, because my
> > simple ping test doesn't catch it on 3.8.
> >
> >

And it seems the possible fix is here :

http://patchwork.ozlabs.org/patch/227666/

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Thu, 14 Mar 2013 15:31:00 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> And it seems the possible fix is here :
>
> http://patchwork.ozlabs.org/patch/227666/

I can still reproduce with this fix applied.

However, I noticed that I cannot reproduce, if the wireless interface (ath9k) of
the netbook is down while testing the ethernet. The wireless does not carry any
test traffic. It's just idle.
I do not know if this always had been the case, because wireless was always up (and mostly
idle) in my previous ethernet tests.

On Thu, 2013-03-14 at 23:17 +0100, Michael Büsch wrote:

> I can still reproduce with this fix applied.
>
> However, I noticed that I cannot reproduce, if the wireless interface (ath9k) of
> the netbook is down while testing the ethernet. The wireless does not carry any
> test traffic. It's just idle.
> I do not know if this always had been the case, because wireless was always up (and mostly
> idle) in my previous ethernet tests.
>

OK, then it must be kind of corruption issue in ath9k, or whatever ?

You could try various DEBUGing stuff, like CONFIG_DEBUG_PAGEALLOC and
CONFIG_SLUB_DEBUG_ON

On Fri, 15 Mar 2013 00:06:02 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> You could try various DEBUGing stuff, like CONFIG_DEBUG_PAGEALLOC and
> CONFIG_SLUB_DEBUG_ON

This bug is so weird, so I did some double-checking. Just to minimize the
mistakes on my side.

I compiled a kernel without the revert of the original commit and without the
skb fix you suggested. It turns out that I am only able to reproduce the issue,
if the ath9k interface is up while testing the atl1c ethernet.

And I also double-checked that reverting the original commit fixes the issue.
No stalls with up or down ath9k then. So that confirms my previous results.

I tried to enable pagealloc debug and slub debug on a kernel with the suggested
skb fix, but without the revert of the commit. Nothing special appeared in the logs.

I'm currently building a kernel with almost all debugging options turned on.
I will test that tomorrow.

Thanks for your help.

On Fri, 15 Mar 2013 20:44:57 +0100
Michael Büsch <m@bues.ch> wrote:

> I'm currently building a kernel with almost all debugging options
> turned on. I will test that tomorrow.

It took me a little bit longer than expected, but running the tests on a kernel
with almost all debugging options enabled shows no additional kernel messages. :/

Any news on this?
Am I still the only one with this issue?

It's still 100% reproducible and I can workaround it by reverting
69b08f62e17439ee3d436faf0b9a7ca6fffb78db

It can't possibly be that I'm the only one on this planet seeing this...

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 821c7f4..769fdac 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1844,7 +1844,7 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
 		kfree_skb(skb);
 }
 
-#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
+#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(8192)
 #define NETDEV_FRAG_PAGE_MAX_SIZE  (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
 #define NETDEV_PAGECNT_MAX_BIAS	   NETDEV_FRAG_PAGE_MAX_SIZE