Message ID | 20090126223022.GA24046@gondor.apana.org.au |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Herbert Xu wrote: > On Sun, Jan 25, 2009 at 09:32:22PM -0800, David Miller wrote: >> From: Mike Christie <michaelc@cs.wisc.edu> >> Date: Thu, 22 Jan 2009 18:04:11 -0600 >> >>> With the patch running against linus's git tree, my box locks >>> up. You cannot ping it. I do not get a oops or anything in the logs, >>> and the keyboard does not respond. I will try to get some oops >>> output and more info. >> >> Herbert, any idea offhand? > > Yeah, I missed an offset update in there :) Here's a better version. > > net: Fix frag_list handling in skb_seq_read > > The frag_list handling was broken in skb_seq_read: > > 1) We didn't add the stepped offset when looking at the head > are of fragments other than the first. > > 2) We didn't take the stepped offset away when setting the data > pointer in the head area. > > 3) The frag index wasn't reset. > > This patch fixes both issues. 1-3 (both, :-)) doesn't seem to work here, I applied this and still got a panic when doing a quick test before I went home today. I can get panic trace from netconsole tomorrow.-- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 26, 2009 at 05:40:08PM -0800, Brandeburg, Jesse wrote: > > 1-3 (both, :-)) doesn't seem to work here, I applied this and still got a > panic when doing a quick test before I went home today. > > I can get panic trace from netconsole tomorrow. OK, looking at the original trace it looks like none of the bugs I've fixed actually explain it. The crash looks more like a ref count bug, which means that the issue isn't so much about the skb_seq_read interface (which should be fixed in any case), but in the iSCSI code itself. More digging is required. Cheers,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 27 Jan 2009 09:30:22 +1100 > net: Fix frag_list handling in skb_seq_read > > The frag_list handling was broken in skb_seq_read: > > 1) We didn't add the stepped offset when looking at the head > are of fragments other than the first. > > 2) We didn't take the stepped offset away when setting the data > pointer in the head area. > > 3) The frag index wasn't reset. > > This patch fixes both issues. > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> I see, the code is only clearing the fragment index when it's advancing from one SKB to the next while already in the middle of a ->frag_list, not when transitioning past the root skb in such a list. I bet some weird cases happen when "consumed" it's advanced by the caller the entire length of data returned by the previous skb_seq_read(). It all seems to be designed to work for that case, however. Anyways, Herbert's patch looks definitely correct but until we've gotten these crashes and hangs solved I don't want to apply it. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Herbert Xu wrote: > On Sun, Jan 25, 2009 at 09:32:22PM -0800, David Miller wrote: >> From: Mike Christie <michaelc@cs.wisc.edu> >> Date: Thu, 22 Jan 2009 18:04:11 -0600 >> >>> With the patch running against linus's git tree, my box locks >>> up. You cannot ping it. I do not get a oops or anything in the logs, >>> and the keyboard does not respond. I will try to get some oops >>> output and more info. >> Herbert, any idea offhand? > > Yeah, I missed an offset update in there :) Here's a better version. > > net: Fix frag_list handling in skb_seq_read > > The frag_list handling was broken in skb_seq_read: > > 1) We didn't add the stepped offset when looking at the head > are of fragments other than the first. > > 2) We didn't take the stepped offset away when setting the data > pointer in the head area. > > 3) The frag index wasn't reset. > > This patch fixes both issues. > It oopsd for me in skb_seq_read. addr2line said it was linux-2.6/net/core/skbuff.c:2228, which is this line: while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags) { I added some printks in there and it looks like we hit this: } else if (st->root_skb == st->cur_skb && skb_shinfo(st->root_skb)->frag_list) { st->cur_skb = skb_shinfo(st->root_skb)->frag_list; st->frag_idx = 0; goto next_skb; } Then when we hit the goto and start again, and we oops when we hit that "while (st->frag_idx < skb_shinfo(st->cur_skb)->nr_frags)" line. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jan 28, 2009 at 05:47:57PM +0530, Shyam_Iyer@Dell.com wrote: > > --- skbuff.c.orig 2009-01-29 01:12:03.000000000 +0530 > +++ skbuff.c 2009-01-29 01:34:57.000000000 +0530 > @@ -2039,15 +2039,15 @@ > st->frag_data = NULL; > } > > - if (st->cur_skb->next) { > - st->cur_skb = st->cur_skb->next; > - st->frag_idx = 0; > - goto next_skb; > - } else if (st->root_skb == st->cur_skb && > + if (st->root_skb == st->cur_skb && > skb_shinfo(st->root_skb)->frag_list) { > st->cur_skb = skb_shinfo(st->root_skb)->frag_list; > st->frag_idx=0; > goto next_skb; > + } else if (st->cur_skb->next) { > + st->cur_skb = st->cur_skb->next; > + st->frag_idx = 0; > + goto next_skb; > } > > return 0; Good catch! Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Thanks,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Thu, 29 Jan 2009 08:25:26 +1100 > On Wed, Jan 28, 2009 at 05:47:57PM +0530, Shyam_Iyer@Dell.com wrote: > > > > --- skbuff.c.orig 2009-01-29 01:12:03.000000000 +0530 > > +++ skbuff.c 2009-01-29 01:34:57.000000000 +0530 > > @@ -2039,15 +2039,15 @@ > > st->frag_data = NULL; > > } > > > > - if (st->cur_skb->next) { > > - st->cur_skb = st->cur_skb->next; > > - st->frag_idx = 0; > > - goto next_skb; > > - } else if (st->root_skb == st->cur_skb && > > + if (st->root_skb == st->cur_skb && > > skb_shinfo(st->root_skb)->frag_list) { > > st->cur_skb = skb_shinfo(st->root_skb)->frag_list; > > st->frag_idx=0; > > goto next_skb; > > + } else if (st->cur_skb->next) { > > + st->cur_skb = st->cur_skb->next; > > + st->frag_idx = 0; > > + goto next_skb; > > } > > > > return 0; > > Good catch! > > Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Excellent work everyone. I applied both Herbert's and Shyam's patches, and will queue them up for -stable as well. Thanks again. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d7efaf9..d4d0e31 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2215,10 +2215,10 @@ unsigned int skb_seq_read(unsigned int consumed, const u8 **data, return 0; next_skb: - block_limit = skb_headlen(st->cur_skb); + block_limit = skb_headlen(st->cur_skb) + st->stepped_offset; if (abs_offset < block_limit) { - *data = st->cur_skb->data + abs_offset; + *data = st->cur_skb->data + (abs_offset - st->stepped_offset); return block_limit - abs_offset; } @@ -2260,6 +2260,7 @@ next_skb: } else if (st->root_skb == st->cur_skb && skb_shinfo(st->root_skb)->frag_list) { st->cur_skb = skb_shinfo(st->root_skb)->frag_list; + st->frag_idx = 0; goto next_skb; }