Message ID | 20090104151819.GA6590@basil.nowhere.org |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
On Sun, 2009-01-04 at 16:18 +0100, Andi Kleen wrote: > Fix up truesize after pskb_expand_head() in wireless stack > > When using a zd1211rw wireless usb stick I regularly got truesize > warnings in the kernel log. > > This patch fixes those up in the wireless layer. tx.c already > did that, but not rx.c. I think my messages only came from > the middle case, but I fixed up all three users in rx.c > > The underlying problem seems to be that pskb_expand_head() doesn't > manipulate truesize. Perhaps it should? I suspect more users of it have > the same problem. I didn't change the low level code because > I was afraid to break some callers, but perhaps it would be > better to do it this way. Anyways here's a patch that only > changes it in the wireless layer with minimal risk. > > Patch against 2.6.28, but I think linus git still has the same > issue. > > I believe this is a 2.6.28 stable candidate. I even saw > the same problem in 2.6.27. > > Signed-off-by: Andi Kleen <ak@linux.intel.com> Thanks, but I'll need to look at this in more detail, we need to make sure that we orphan the skb before etc. And then, we need to check whether it makes sense to do this in pskb_expand_head(). > --- > net/mac80211/rx.c | 19 ++++++++++++------- > 1 file changed, 12 insertions(+), 7 deletions(-) > > Index: linux-2.6.28-test/net/mac80211/rx.c > =================================================================== > --- linux-2.6.28-test.orig/net/mac80211/rx.c 2008-10-24 13:35:11.000000000 +0200 > +++ linux-2.6.28-test/net/mac80211/rx.c 2009-01-01 15:51:34.000000000 +0100 > @@ -263,10 +263,12 @@ > * probably export the length to drivers so that we can have > * them allocate enough headroom to start with. > */ > - if (skb_headroom(skb) < needed_headroom && > - pskb_expand_head(skb, needed_headroom, 0, GFP_ATOMIC)) { > - dev_kfree_skb(skb); > - return NULL; > + if (skb_headroom(skb) < needed_headroom) { > + if (pskb_expand_head(skb, needed_headroom, 0, GFP_ATOMIC)) { > + dev_kfree_skb(skb); > + return NULL; > + } > + skb->truesize += needed_headroom; > } > } else { > /* > @@ -945,6 +947,7 @@ > __skb_queue_purge(&entry->skb_list); > return RX_DROP_UNUSABLE; > } > + rx->skb->truesize += entry->extra_len; > } > while ((skb = __skb_dequeue(&entry->skb_list))) { > memcpy(skb_put(rx->skb, skb->len), skb->data, skb->len); > @@ -1691,9 +1694,11 @@ > if (rx->flags & IEEE80211_RX_CMNTR_REPORTED) > goto out_free_skb; > > - if (skb_headroom(skb) < sizeof(*rthdr) && > - pskb_expand_head(skb, sizeof(*rthdr), 0, GFP_ATOMIC)) > - goto out_free_skb; > + if (skb_headroom(skb) < sizeof(*rthdr)) { > + if (pskb_expand_head(skb, sizeof(*rthdr), 0, GFP_ATOMIC)) > + goto out_free_skb; > + skb->truesize += sizeof(*rthdr); > + } > > rthdr = (void *)skb_push(skb, sizeof(*rthdr)); > memset(rthdr, 0, sizeof(*rthdr)); > -- > To unsubscribe from this list: send the line "unsubscribe linux-wireless" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
> Thanks, but I'll need to look at this in more detail, we need to make > sure that we orphan the skb before What do you mean with orphaning the skb? etc. And then, we need to check > whether it makes sense to do this in pskb_expand_head(). Well whatever you do this short term patch is needed, there's no reason to delay it. -Andi -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2009-01-04 at 17:28 +0100, Andi Kleen wrote: > > Thanks, but I'll need to look at this in more detail, we need to make > > sure that we orphan the skb before > > What do you mean with orphaning the skb? Well, touching truesize is absolutely not allowed while the skb is charged to a socket. This is what causes the truesize warning. The thing we need to do is figure out is why the skb has a wrong truesize. > etc. And then, we need to check > > whether it makes sense to do this in pskb_expand_head(). > > Well whatever you do this short term patch is needed, there's no > reason to delay it. Given that we've had this problem for a very long time now I think there's no reason to rush a workaround now. I realise that we already have a workaround like this in the tx path which we added because I thought the tx path was the problem, but it still doesn't make much sense to work around it at all spots until we know why it is required. I think this patch similarly just papers over the problem with pskb_expand_head(). I haven't looked through all the code yet, but if anything then I think pskb_expand_head() should fix up truesize afterwards, and we should audit all other callers too. Similar problems exist in net/core/pktgen.c, drivers/net/wireless/libertas/rx.c, net/ipv4/netfilter.c and many more, though those seem to not run into trouble. Only a few users adjust truesize. Any proper fix should also verify that the skb isn't charged to a socket while it's being reallocated. johannes
On Sun, 2009-01-04 at 18:43 +0100, Andi Kleen wrote: > > we need to do is figure out is why the skb has a wrong truesize. > > Because it was expanded without truesize being adjusted (see my > original mail) So why is this not making trouble with all the other users? johannes
On Sun, Jan 04, 2009 at 05:41:28PM +0100, Johannes Berg wrote: > On Sun, 2009-01-04 at 17:28 +0100, Andi Kleen wrote: > > > Thanks, but I'll need to look at this in more detail, we need to make > > > sure that we orphan the skb before > > > > What do you mean with orphaning the skb? > > Well, touching truesize is absolutely not allowed while the skb is > charged to a socket. This is what causes the truesize warning. The thing The cause of the warnings was because the skb changed, but truesize didn't. > we need to do is figure out is why the skb has a wrong truesize. Because it was expanded without truesize being adjusted (see my original mail) -Andi -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jan 04, 2009 at 06:33:08PM +0100, Johannes Berg wrote: > On Sun, 2009-01-04 at 18:43 +0100, Andi Kleen wrote: > > > > we need to do is figure out is why the skb has a wrong truesize. > > > > Because it was expanded without truesize being adjusted (see my > > original mail) > > So why is this not making trouble with all the other users? I think most adjustments are too small to be noticed. Typically they are just for a few bytes in the header. truesize is already larger, so it can tolerate some slag. I also only see it occasionally (maybe 5-10 times/day) when the wireless stack appends a lot of data. My proposal would be to include this patch for 2.6.28/2.6.29 and investigate fixing pskb_expand_head for 2.6.30. -Andi
From: Andi Kleen <andi@firstfloor.org> Date: Sun, 4 Jan 2009 19:41:36 +0100 > On Sun, Jan 04, 2009 at 06:33:08PM +0100, Johannes Berg wrote: > > On Sun, 2009-01-04 at 18:43 +0100, Andi Kleen wrote: > > > > > > we need to do is figure out is why the skb has a wrong truesize. > > > > > > Because it was expanded without truesize being adjusted (see my > > > original mail) > > > > So why is this not making trouble with all the other users? > > I think most adjustments are too small to be noticed. Typically > they are just for a few bytes in the header. truesize > is already larger, so it can tolerate some slag. > > I also only see it occasionally (maybe 5-10 times/day) when > the wireless stack appends a lot of data. > > My proposal would be to include this patch for 2.6.28/2.6.29 > and investigate fixing pskb_expand_head for 2.6.30. At a minimum you need to add a skb->sk == NULL warn-on and abort path here, otherwise we will corrupt socket accounting and just explode somewhere else. Adding this patch as-is will just introduce a new bug in exchange for an existing one. There are cases where the Tx path of the wireless loops back packets back to the Rx path, and in such cases we certainly could see sockets attached to the SKB. And Johannes is right, the other alternative is to orphan the SKB, you absolutely cannot modify ->truesize blindly. You can't change the truesize value if a socket is attached. There is, as usual, an exception. TCP does this kind of adjustment, but it also fixes the socket memory accounting to match. Only it can do this because it happens to have the socket locked at the time and it's running in the proper context. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2009-01-04 at 19:41 +0100, Andi Kleen wrote: > I think most adjustments are too small to be noticed. Typically > they are just for a few bytes in the header. truesize > is already larger, so it can tolerate some slag. This statement is incompatible with your patch when you think about the exact definition of truesize and the (unconditional!) adjustments your patch makes. > I also only see it occasionally (maybe 5-10 times/day) when > the wireless stack appends a lot of data. Except the data it appends should generally be of the same or very similar size under unchanging conditions, so that doesn't make a lot of sense either. > My proposal would be to include this patch for 2.6.28/2.6.29 > and investigate fixing pskb_expand_head for 2.6.30. I disagree, obviously. I knew there was some truesize corruption, and I think you for tracing down where it occurs. I'll investigate a proper fix when I get around to that, meanwhile I don't think the problem is awfully urgent since we've had this going on for quite a while and, if any, it probably only affects/corrupts the raw monitor sockets. johannes
On Mon, 2009-01-05 at 14:21 +0100, Andi Kleen wrote: > > This statement is incompatible with your patch when you think about the > > exact definition of truesize and the (unconditional!) adjustments your > > patch makes. > > __alloc_skb does > > size = SKB_DATA_ALIGN(size); > skb->truesize = size + sizeof(struct sk_buff); > > [ BTW it would be probably better if alloc_skb() just asked > slab what the truesize is for kmalloc instead of guessing wrong like this. > But that's a different topic] > > and SKB_DATA_ALIGN is > > #define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \ > ~(SMP_CACHE_BYTES - 1)) > > and on my configuration SMP_CACHE_BYTES is 64 > > skb_truesize_check does > > int len = sizeof(struct sk_buff) + skb->len; > > if (unlikely((int)skb->truesize < len)) > skb_truesize_bug(skb); > > This means if the change is less than the 64byte cache alignment > (or more commonly 128 bytes on GENERIC_CPU distro kernels) > it won't be reported. To my knowledge header adjustments are usually > smaller and that is what pskb_expand_head() is usually used for. Ah, indeed, I thought I remembered that truesize had to match exactly, but obviously that wasn't done either. We should experiment with that though, and let pskb_expand_head() adjust things. > I didn't use any monitoring with this. No tcpdump, no wireless > sniffer tools or anything. It happened all the time during > normal operation. You did have a monitor interface up though, didn't you? If nothing actually used those skbs then it's likely that the warning didn't result in any corruption at all. johannes
On Mon, Jan 05, 2009 at 09:36:14AM +0100, Johannes Berg wrote: > On Sun, 2009-01-04 at 19:41 +0100, Andi Kleen wrote: > > > I think most adjustments are too small to be noticed. Typically > > they are just for a few bytes in the header. truesize > > is already larger, so it can tolerate some slag. > > This statement is incompatible with your patch when you think about the > exact definition of truesize and the (unconditional!) adjustments your > patch makes. __alloc_skb does size = SKB_DATA_ALIGN(size); skb->truesize = size + sizeof(struct sk_buff); [ BTW it would be probably better if alloc_skb() just asked slab what the truesize is for kmalloc instead of guessing wrong like this. But that's a different topic] and SKB_DATA_ALIGN is #define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \ ~(SMP_CACHE_BYTES - 1)) and on my configuration SMP_CACHE_BYTES is 64 skb_truesize_check does int len = sizeof(struct sk_buff) + skb->len; if (unlikely((int)skb->truesize < len)) skb_truesize_bug(skb); This means if the change is less than the 64byte cache alignment (or more commonly 128 bytes on GENERIC_CPU distro kernels) it won't be reported. To my knowledge header adjustments are usually smaller and that is what pskb_expand_head() is usually used for. > > I also only see it occasionally (maybe 5-10 times/day) when > > the wireless stack appends a lot of data. > > Except the data it appends should generally be of the same or very > similar size under unchanging conditions, so that doesn't make a lot of > sense either. I don't know too much about the packet dynamics of the wireless stack, but I can only report what my machine printed out. Here are some excerpts: SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=769, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1404, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=769, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=920, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=462, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1452, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1452, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1452, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1452, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1452, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=468, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=820, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=820, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=468, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=533, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=468, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=542, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (600) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1349, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 SKB BUG: Invalid truesize (920) len=1440, sizeof(sk_buff)=216 > I disagree, obviously. I knew there was some truesize corruption, and I > think you for tracing down where it occurs. I'll investigate a proper > fix when I get around to that, meanwhile I don't think the problem is > awfully urgent since we've had this going on for quite a while and, if > any, it probably only affects/corrupts the raw monitor sockets. I didn't use any monitoring with this. No tcpdump, no wireless sniffer tools or anything. It happened all the time during normal operation. -Andi
On Mon, 2009-01-05 at 14:36 +0100, Andi Kleen wrote: > On Mon, Jan 05, 2009 at 02:16:19PM +0100, Johannes Berg wrote: > > > I didn't use any monitoring with this. No tcpdump, no wireless > > > sniffer tools or anything. It happened all the time during > > > normal operation. > > > > You did have a monitor interface up though, didn't you? If nothing > > wmaster0 was up. Is that a monitoring interface? No. Only if you do iwconfig ... mode monitor or use iw to create one or something like that. I was under the impression that you only modified monitor mode code, but upon closer inspection (really, you should diff with -p, makes it a lot easier to review) it seems that you also touched packet defragmentation code. Are you running a low fragmentation threshold on your network? Can you check /sys/kernel/debug/ieee80211/phy*/statistics/rx_expand_skb_head* please? > > actually used those skbs then it's likely that the warning didn't result > > in any corruption at all. > > Nothing was corrupted ever to my knowledge, just lots of spam in my kernel logs. You wouldn't easily notice any socket memory charge corruption anyway. johannes
> At a minimum you need to add a skb->sk == NULL warn-on and abort path > here, otherwise we will corrupt socket accounting and just explode > somewhere else. Adding this patch as-is will just introduce a new I ran the patch for a few days now and nothing exploded, no messages. I can add a skb->sk check and see if it triggers. > There are cases where the Tx path of the wireless loops back packets > back to the Rx path, and in such cases we certainly could see sockets > attached to the SKB. > > And Johannes is right, He keeps talking about cases like monitoring sockets that don't apply, which makes me somewhat suspicious of his analysis. > absolutely cannot modify ->truesize blindly. You can't change the > truesize value if a socket is attached. But pskb_expand_head modifies the size so obviously truesize needs to change too. So you're saying pskb_expand_head() is illegal when there might be a socket attached? Somehow I suspect a lot of the pskb_expand_head() callers are failing that requirement. Ok I guess we could call skb_orphan() unconditionally. Or if you guys have a better patch I can test. -Andi
On Mon, Jan 05, 2009 at 02:16:19PM +0100, Johannes Berg wrote: > > I didn't use any monitoring with this. No tcpdump, no wireless > > sniffer tools or anything. It happened all the time during > > normal operation. > > You did have a monitor interface up though, didn't you? If nothing wmaster0 was up. Is that a monitoring interface? > actually used those skbs then it's likely that the warning didn't result > in any corruption at all. Nothing was corrupted ever to my knowledge, just lots of spam in my kernel logs. -Andi
On Mon, Jan 05, 2009 at 02:31:15PM +0100, Johannes Berg wrote: > I was under the impression that you only modified monitor mode code, but > upon closer inspection (really, you should diff with -p, makes it a lot > easier to review) it seems that you also touched packet defragmentation > code. Are you running a low fragmentation threshold on your network? I don't use any special settings, only defaults. > Can you > check /sys/kernel/debug/ieee80211/phy*/statistics/rx_expand_skb_head* > please? The kernel doesn't have 80211 debugging enabled. I can enable it, but it will take some time until the messages reappear (I cannot reproduce them at will, but have to wait) > > > > actually used those skbs then it's likely that the warning didn't result > > > in any corruption at all. > > > > Nothing was corrupted ever to my knowledge, just lots of spam in my kernel logs. > > You wouldn't easily notice any socket memory charge corruption anyway. There are WARN_ON(sk->sk_forward_alloc)s in the socket destroy paths which should trigger. -Andi
Index: linux-2.6.28-test/net/mac80211/rx.c =================================================================== --- linux-2.6.28-test.orig/net/mac80211/rx.c 2008-10-24 13:35:11.000000000 +0200 +++ linux-2.6.28-test/net/mac80211/rx.c 2009-01-01 15:51:34.000000000 +0100 @@ -263,10 +263,12 @@ * probably export the length to drivers so that we can have * them allocate enough headroom to start with. */ - if (skb_headroom(skb) < needed_headroom && - pskb_expand_head(skb, needed_headroom, 0, GFP_ATOMIC)) { - dev_kfree_skb(skb); - return NULL; + if (skb_headroom(skb) < needed_headroom) { + if (pskb_expand_head(skb, needed_headroom, 0, GFP_ATOMIC)) { + dev_kfree_skb(skb); + return NULL; + } + skb->truesize += needed_headroom; } } else { /* @@ -945,6 +947,7 @@ __skb_queue_purge(&entry->skb_list); return RX_DROP_UNUSABLE; } + rx->skb->truesize += entry->extra_len; } while ((skb = __skb_dequeue(&entry->skb_list))) { memcpy(skb_put(rx->skb, skb->len), skb->data, skb->len); @@ -1691,9 +1694,11 @@ if (rx->flags & IEEE80211_RX_CMNTR_REPORTED) goto out_free_skb; - if (skb_headroom(skb) < sizeof(*rthdr) && - pskb_expand_head(skb, sizeof(*rthdr), 0, GFP_ATOMIC)) - goto out_free_skb; + if (skb_headroom(skb) < sizeof(*rthdr)) { + if (pskb_expand_head(skb, sizeof(*rthdr), 0, GFP_ATOMIC)) + goto out_free_skb; + skb->truesize += sizeof(*rthdr); + } rthdr = (void *)skb_push(skb, sizeof(*rthdr)); memset(rthdr, 0, sizeof(*rthdr));
Fix up truesize after pskb_expand_head() in wireless stack When using a zd1211rw wireless usb stick I regularly got truesize warnings in the kernel log. This patch fixes those up in the wireless layer. tx.c already did that, but not rx.c. I think my messages only came from the middle case, but I fixed up all three users in rx.c The underlying problem seems to be that pskb_expand_head() doesn't manipulate truesize. Perhaps it should? I suspect more users of it have the same problem. I didn't change the low level code because I was afraid to break some callers, but perhaps it would be better to do it this way. Anyways here's a patch that only changes it in the wireless layer with minimal risk. Patch against 2.6.28, but I think linus git still has the same issue. I believe this is a 2.6.28 stable candidate. I even saw the same problem in 2.6.27. Signed-off-by: Andi Kleen <ak@linux.intel.com> --- net/mac80211/rx.c | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html