Message ID | 1352316791-16491-1-git-send-email-jwerner@chromium.org |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On Wed, 2012-11-07 at 11:33 -0800, Julius Werner wrote:
> tcp_recvmsg contains a sanity check that WARNs when there is a gap
> between the socket's copied_seq and the first buffer in the
> sk_receive_queue. In theory, the TCP stack makes sure that This Should
> Never Happen (TM)... however, practice shows that there are still a few
> bug reports from it out there (and one in my inbox).
>
> Unfortunately, when it does happen for whatever reason, the situation
> is not handled very well: the kernel logs a warning and breaks out of
> the loop that walks the receive queue. It proceeds to find nothing else
> to do on the socket and hits sk_wait_data, which cannot block because
> the receive queue is not empty. As no data was read, the outer while
> loop repeats (logging the same warning again) ad infinitum until the
> system's syslog exhausts all available hard drive capacity.
>
> This patch addresses that issue by closing the socket outright and
> throwing EBADFD to userspace (which seems most appropriate to me at this
> point). As the underlying bug condition is "impossible" and therefore by
> definition unrecoverable, this is the only sensible action other than a
> full panic.
>
> Signed-off-by: Julius Werner <jwerner@chromium.org>
> ---
>  net/ipv4/tcp.c | 7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 197c000..d612308 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1628,7 +1628,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
>  				 "recvmsg bug: copied %X seq %X rcvnxt %X fl %X\n",
>  				 *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
>  				 flags))
> -				break;
> +				goto selfdestruct;
>
>  			offset = *seq - TCP_SKB_CB(skb)->seq;
>  			if (tcp_hdr(skb)->syn)
> @@ -1936,6 +1936,11 @@ recv_urg:
>  recv_sndq:
>  	err = tcp_peek_sndq(sk, msg, len);
>  	goto out;
> +
> +selfdestruct:
> +	err = -EBADFD;
> +	tcp_done(sk);
> +	goto out;
>  }
>  EXPORT_SYMBOL(tcp_recvmsg);
>

What I find very sad in all this is that you didn't mention the driver
that was triggering this bug.

So instead of making real progress, we are discussing some dubious
'fixes'.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> What I find very sad in all this is that you didn't mention the driver
> that was triggering this bug.

Sorry, I was just trying to keep this thread focussed on one patch.
The bug report that led me to this is publicly accessible at
http://crosbug.com/35827. We have encountered the problem only once,
on an Acer AC700 Chromebook that ran automated tests. The ethernet
interface for the offending socket was provided by a USB-to-Ethernet
dongle using the smsc95xx/usbnet module (v1.0.4).

Don't get me wrong, I do understand the importance of finding the
underlying cause of this... I just don't think I have much of a chance
with one report. I can go through the above-mentioned module and see
if something looks suspicious in the skb handling code if I can find
the time.

But on the other hand, the fact remains that this condition is not
handled well... not just for this particular case, but for all future
kernel and driver bugs that may trigger it again. I am not trying to
"hide" any issues, I am all for making them as visible as possible...
but as Dave pointed out, kernel panics may not be the best way to do
that either, and I think damage mitigation also has some value. The
current code clearly gets the worst of both worlds, so please let's
just improve it one way or the other.
On Wed, 2012-11-07 at 13:14 -0800, Julius Werner wrote:
> > What I find very sad in all this is that you didn't mention the driver
> > that was triggering this bug.
>
> Sorry, I was just trying to keep this thread focussed on one patch.
> The bug report that led me to this is publicly accessible at
> http://crosbug.com/35827. We have encountered the problem only once,
> on an Acer AC700 Chromebook that ran automated tests. The ethernet
> interface for the offending socket was provided by a USB-to-Ethernet
> dongle using the smsc95xx/usbnet module (v1.0.4).

This driver uses interesting skb_clone() games and skb->truesize lies:

	skb->truesize = size + sizeof(struct sk_buff);

So you are probably fighting a bug we already fixed in the upstream
kernel.

(commit c8628155ece363 "tcp: reduce out_of_order memory use" did not
play well with cloned skbs.)

This issue was already discussed on netdev in the past.
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 197c000..d612308 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1628,7 +1628,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 				 "recvmsg bug: copied %X seq %X rcvnxt %X fl %X\n",
 				 *seq, TCP_SKB_CB(skb)->seq, tp->rcv_nxt,
 				 flags))
-				break;
+				goto selfdestruct;
 
 			offset = *seq - TCP_SKB_CB(skb)->seq;
 			if (tcp_hdr(skb)->syn)
@@ -1936,6 +1936,11 @@ recv_urg:
 recv_sndq:
 	err = tcp_peek_sndq(sk, msg, len);
 	goto out;
+
+selfdestruct:
+	err = -EBADFD;
+	tcp_done(sk);
+	goto out;
 }
 EXPORT_SYMBOL(tcp_recvmsg);
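For readers unfamiliar with the check touched by the first hunk, here is a minimal userspace sketch of the comparison it relies on. It assumes the kernel's before() helper is the usual wraparound-safe signed-difference test on 32-bit sequence numbers; the names seq_before, has_gap and read_offset are hypothetical and exist only for this illustration, they are not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe "seq1 comes before seq2" on 32-bit TCP sequence numbers,
 * modelled on the kernel's before() helper. The unsigned subtraction wraps
 * modulo 2^32; reinterpreting the result as signed gives the ordering.
 * (The unsigned-to-signed conversion is implementation-defined in ISO C but
 * behaves as two's complement on mainstream compilers.)
 */
static bool seq_before(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) < 0;
}

/*
 * Hypothetical helper: true when the reader's position (copied_seq) lies
 * before the first byte of the head buffer, i.e. there is a gap and the
 * WARN in tcp_recvmsg would fire.
 */
static bool has_gap(uint32_t copied_seq, uint32_t skb_seq)
{
	return seq_before(copied_seq, skb_seq);
}

/*
 * Hypothetical helper: offset into the head buffer at which reading
 * resumes, mirroring "offset = *seq - TCP_SKB_CB(skb)->seq" in the diff.
 * Only meaningful when has_gap() is false.
 */
static uint32_t read_offset(uint32_t copied_seq, uint32_t skb_seq)
{
	return copied_seq - skb_seq;
}
```

With copied_seq = 100 and a head buffer starting at 150, has_gap() is true: bytes 100..149 are missing from the queue, which is the "impossible" state the patch reacts to. The same test keeps working across the 2^32 wrap, which a plain `<` on unsigned values would not.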
tcp_recvmsg contains a sanity check that WARNs when there is a gap
between the socket's copied_seq and the first buffer in the
sk_receive_queue. In theory, the TCP stack makes sure that This Should
Never Happen (TM)... however, practice shows that there are still a few
bug reports from it out there (and one in my inbox).

Unfortunately, when it does happen for whatever reason, the situation
is not handled very well: the kernel logs a warning and breaks out of
the loop that walks the receive queue. It proceeds to find nothing else
to do on the socket and hits sk_wait_data, which cannot block because
the receive queue is not empty. As no data was read, the outer while
loop repeats (logging the same warning again) ad infinitum until the
system's syslog exhausts all available hard drive capacity.

This patch addresses that issue by closing the socket outright and
throwing EBADFD to userspace (which seems most appropriate to me at this
point). As the underlying bug condition is "impossible" and therefore by
definition unrecoverable, this is the only sensible action other than a
full panic.

Signed-off-by: Julius Werner <jwerner@chromium.org>
---
 net/ipv4/tcp.c | 7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)
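
The spin described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: struct toy_sock, toy_recvmsg and the max_spins bound are all hypothetical, the receive queue is reduced to its head buffer, and the "wait for data" step is modelled as returning immediately whenever the queue is non-empty. EBADFD is a Linux-specific errno value.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is before b" on 32-bit sequence numbers. */
static bool seq_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical toy socket: only the head of the receive queue is modelled. */
struct toy_sock {
	uint32_t copied_seq;	/* next byte userspace expects to read */
	uint32_t head_seq;	/* sequence number of the head buffer's first byte */
	uint32_t head_len;	/* bytes in the head buffer; 0 means queue empty */
	bool closed;		/* set when the patched path tears the socket down */
	unsigned warnings;	/* number of times the sanity check fired */
};

/*
 * Toy receive loop. max_spins bounds the outer loop so that the model
 * terminates; the loop being modelled has no such bound.
 * Returns the number of bytes "copied", 0 if the queue is empty (where real
 * code could block), -EBADFD on the patched error path, or -EAGAIN when the
 * spin bound is exhausted without progress.
 */
static int toy_recvmsg(struct toy_sock *sk, uint32_t len, bool patched,
		       unsigned max_spins)
{
	for (unsigned spin = 0; spin < max_spins; spin++) {
		if (sk->head_len == 0)
			return 0;	/* empty queue: waiting for data is possible */

		if (seq_before(sk->copied_seq, sk->head_seq)) {
			sk->warnings++;		/* stands in for the WARN */
			if (patched) {
				sk->closed = true;	/* stands in for tcp_done() */
				return -EBADFD;
			}
			/*
			 * Unpatched: nothing was read, and because the queue is
			 * not empty the wait returns at once, so we go around again.
			 */
			continue;
		}

		uint32_t offset = sk->copied_seq - sk->head_seq;
		if (offset >= sk->head_len) {
			sk->head_len = 0;	/* head already fully consumed */
			continue;
		}

		uint32_t avail = sk->head_len - offset;
		uint32_t n = len < avail ? len : avail;
		sk->copied_seq += n;
		return (int)n;
	}
	return -EAGAIN;	/* model-only exit: the spin bound was reached */
}
```

With a gap (copied_seq = 100, head_seq = 150, head_len = 50), the unpatched variant fires the warning on every one of the max_spins iterations and never makes progress, while the patched variant warns once, marks the socket closed and returns -EBADFD.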