From patchwork Fri Nov 21 13:07:32 2008 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 10002 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by ozlabs.org (Postfix) with ESMTP id 0B32FDDEEF for ; Sat, 22 Nov 2008 00:07:41 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754046AbYKUNHh (ORCPT ); Fri, 21 Nov 2008 08:07:37 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753113AbYKUNHh (ORCPT ); Fri, 21 Nov 2008 08:07:37 -0500 Received: from courier.cs.helsinki.fi ([128.214.9.1]:57702 "EHLO mail.cs.helsinki.fi" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752663AbYKUNHf (ORCPT ); Fri, 21 Nov 2008 08:07:35 -0500 Received: from wrl-59.cs.helsinki.fi (wrl-59.cs.helsinki.fi [128.214.166.179]) (AUTH: PLAIN cs-relay, TLS: TLSv1/SSLv3,256bits,AES256-SHA) by mail.cs.helsinki.fi with esmtp; Fri, 21 Nov 2008 15:07:33 +0200 id 0009C569.4926B295.00005DD3 Received: by wrl-59.cs.helsinki.fi (Postfix, from userid 50795) id E967BA0096; Fri, 21 Nov 2008 15:07:32 +0200 (EET) Received: from localhost (localhost [127.0.0.1]) by wrl-59.cs.helsinki.fi (Postfix) with ESMTP id DA18CA0091; Fri, 21 Nov 2008 15:07:32 +0200 (EET) Date: Fri, 21 Nov 2008 15:07:32 +0200 (EET) From: "=?ISO-8859-1?Q?Ilpo_J=E4rvinen?=" X-X-Sender: ijjarvin@wrl-59.cs.helsinki.fi To: Petr Tesarik cc: Netdev , LKML , "David S. Miller" , "Jan =?utf-8?q?=C5=A0embera?=" Subject: [PATCH] tcp: fix potential corner case issue in segmentation (Was: Re: [PATCH] Do not use TSO/GSO when there is urgent data) In-Reply-To: <200811211319.45515.ptesarik@suse.cz> Message-ID: References: <200811211319.45515.ptesarik@suse.cz> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Fri, 21 Nov 2008, Petr Tesarik wrote: > This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=12014 > > Since most (if not all) implementations of TSO and even the in-kernel > software GSO do not update the urgent pointer when splitting a large > segment, it is necessary to turn off TSO/GSO for all outgoing traffic > with the URG pointer set. Good observation, I totally missed this possibility of T/GSO while looking. > Looking at tcp_current_mss (and the preceding comment) I even think > this was the original intention. However, this approach is insufficient, > because TSO/GSO is turned off only for newly created frames, not for > frames which were already pending at the arrival of a message with > MSG_OOB set. These frames were created when TSO/GSO was enabled, > so they may be large, and they will have the urgent pointer set > in tcp_transmit_skb(). > > With this patch, such large packets will be fragmented again before > going to the transmit routine. I wonder if there's some corner case which still fails to fragment in tcp_retransmit_xmit's in skb->len <= cur_mss case if cur_mss grew very recently (and therefore skb-len now fits to a single segment). Patch below, changed subject is due to patchwork as requested by DaveM (if you'd find that strange). > As a side note, at least the following NICs are known to screw up > the urgent pointer in the TCP header when doing TSO: > > Intel 82566MM (PCI ID 8086:1049) > Intel 82566DC (PCI ID 8086:104b) > Intel 82541GI (PCI ID 8086:1076) > Broadcom NetXtreme II BCM5708 (PCI ID 14e4:164c) Heh, it might already be longer than the list we would get from the working ones. Not very likely too many people consider urg too seriously to get it right. > Signed-off-by: Petr Tesarik > CC: Jan Sembera > CC: Ilpo Jarvinen > > -- > tcp_output.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c > --- a/net/ipv4/tcp_output.c > +++ b/net/ipv4/tcp_output.c > @@ -722,7 +722,8 @@ static void tcp_queue_skb(struct sock *sk, struct sk_buff > *skb) > static void tcp_set_skb_tso_segs(struct sock *sk, struct sk_buff *skb, > unsigned int mss_now) > { > - if (skb->len <= mss_now || !sk_can_gso(sk)) { > + if (skb->len <= mss_now || !sk_can_gso(sk) || > + tcp_urg_mode(tcp_sk(sk))) { > /* Avoid the costly divide in the normal > * non-TSO case. > */ > @@ -1163,7 +1164,9 @@ static int tcp_init_tso_segs(struct sock *sk, struct > sk_buff *skb, > { > int tso_segs = tcp_skb_pcount(skb); > > - if (!tso_segs || (tso_segs > 1 && tcp_skb_mss(skb) != mss_now)) { > + if (!tso_segs || > + (tso_segs > 1 && (tcp_skb_mss(skb) != mss_now || > + tcp_urg_mode(tcp_sk(sk))))) { > tcp_set_skb_tso_segs(sk, skb, mss_now); > tso_segs = tcp_skb_pcount(skb); > } It's a bit intrusive but I couldn't immediately come up with alternative that would have worked (came up with some not working ones :-)). Acked-by: Ilpo Järvinen diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index a524627..129f46b 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1944,6 +1944,8 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb) if (skb->len > cur_mss) { if (tcp_fragment(sk, skb, cur_mss, cur_mss)) return -ENOMEM; /* We'll try again later. */ + } else { + tcp_init_tso_segs(sk, skb, cur_mss); } /* Collapse two adjacent packets if worthwhile and we can. */