From patchwork Thu Mar 19 00:44:53 2009 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Dykstra X-Patchwork-Id: 24633 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by ozlabs.org (Postfix) with ESMTP id 5BD9ADDDA5 for ; Thu, 19 Mar 2009 11:45:05 +1100 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754220AbZCSApA (ORCPT ); Wed, 18 Mar 2009 20:45:00 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1754064AbZCSAo7 (ORCPT ); Wed, 18 Mar 2009 20:44:59 -0400 Received: from qw-out-2122.google.com ([74.125.92.25]:25371 "EHLO qw-out-2122.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753992AbZCSAo6 (ORCPT ); Wed, 18 Mar 2009 20:44:58 -0400 Received: by qw-out-2122.google.com with SMTP id 8so285630qwh.37 for ; Wed, 18 Mar 2009 17:44:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:subject:from:to:cc :in-reply-to:references:content-type:date:message-id:mime-version :x-mailer:content-transfer-encoding; bh=pvHF8JYQt4+FqmcLESybYnSwKVnFjUCOP8cWmPc1kqM=; b=Es594bNUttiTM5hA12gn5o7qWcqrVtVYMDeY1ADjVVt7IqhDfLI2eU+xy/Pqzvsl4P Tj6iLbxoS0dpnEPc0oGGdAqTGK+lmdpuQrY65MP9GRqvjpqeWQUypndh+QjZqMn3MHtB GsXBloFcE2GI/4FLfTAA6PScsYhKAj+TpJFXA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=subject:from:to:cc:in-reply-to:references:content-type:date :message-id:mime-version:x-mailer:content-transfer-encoding; b=LdrKwptnQS6rEkmL4X/4EClq1JTWyLlFUoDMyHs7lstMGFba7yg25xnfRCEIHQeBIq m8GEyjGOQ4YgtC3evyhOnBrterX3/nZ0h838hsiPmNiAcS32CoIBapx1KxSZYWJoCHDy 3JUWCMAloYERLzOOmtbPLj5wAgZSNrMqiRjdM= Received: by 10.224.54.83 with SMTP id p19mr3050019qag.292.1237423496390; Wed, 18 Mar 2009 17:44:56 -0700 (PDT) Received: from ?192.168.221.201? (c-24-118-80-156.hsd1.mn.comcast.net [24.118.80.156]) by mx.google.com with ESMTPS id 7sm494692qwb.31.2009.03.18.17.44.54 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 18 Mar 2009 17:44:55 -0700 (PDT) Subject: [PATCH net-next-2.6] Re: TCP/IP stack interpretation of acceptable packet From: John Dykstra To: Oliver Zheng Cc: netdev@vger.kernel.org, Andi Kleen In-Reply-To: <2ff60cd60902241459q1de39054lb3dc5233f13b69c3@mail.gmail.com> References: <2ff60cd60902241459q1de39054lb3dc5233f13b69c3@mail.gmail.com> Date: Thu, 19 Mar 2009 00:44:53 +0000 Message-Id: <1237423493.32009.31.camel@Maple> Mime-Version: 1.0 X-Mailer: Evolution 2.22.3.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Tue, 2009-02-24 at 14:59 -0800, Oliver Zheng wrote: > When a packet is received with the correct > sequence number but incorrect acknowledgement number (in my tests, it > was higher than the correct acknowledgement number), the stack accepts > the packet as a valid packet and passes the data up to the application TCP's fast path currently accepts segments who ack data that was never sent. The slow path and processing for states other than ESTABLISHED discard such segments. RFC793 says (Section 3.9) that not only should such segments be discarded, but that an ACK should be sent to the peer. I can't see what that accomplishes, and it seems to badly interact with fast retransmit--under some conditions with crafted packets you can get the two stacks ACKing each other forever. So I left that out of this patch: [PATCH net-next-2.6] tcp: Discard segments that ack data not yet sent Discard incoming packets whose ack field iincludes data not yet sent. This is consistent with RFC 793 Section 3.9. Change tcp_ack() to distinguish between too-small and too-large ack field values. Keep segments with too-large ack fields out of the fast path, and change slow path to discard them. Reported-by: Oliver Zheng Signed-off-by: John Dykstra --- net/ipv4/tcp_input.c | 31 +++++++++++++++++++------------ 1 files changed, 19 insertions(+), 12 deletions(-) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index fae78e3..01544cd 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3587,16 +3587,19 @@ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag) u32 prior_fackets; int prior_packets; int frto_cwnd = 0; - - /* If the ack is newer than sent or older than previous acks + + /* If the ack is older than previous acks * then we can probably ignore it. */ - if (after(ack, tp->snd_nxt)) - goto uninteresting_ack; - if (before(ack, prior_snd_una)) goto old_ack; + /* If the ack includes data we haven't sent yet, discard + * this segment (RFC793 Section 3.9). + */ + if (after(ack, tp->snd_nxt)) + goto invalid_ack; + if (after(ack, prior_snd_una)) flag |= FLAG_SND_UNA_ADVANCED; @@ -3686,6 +3689,10 @@ no_queue: tcp_ack_probe(sk); return 1; +invalid_ack: + SOCK_DEBUG(sk, "Ack %u after %u:%u\n", ack, tp->snd_una, tp->snd_nxt); + return -1; + old_ack: if (TCP_SKB_CB(skb)->sacked) { tcp_sacktag_write_queue(sk, skb, prior_snd_una); @@ -3693,9 +3700,8 @@ old_ack: tcp_try_keep_open(sk); } -uninteresting_ack: - SOCK_DEBUG(sk, "Ack %u out of %u:%u\n", ack, tp->snd_una, tp->snd_nxt); - return 0; + SOCK_DEBUG(sk, "Ack %u before %u:%u\n", ack, tp->snd_una, tp->snd_nxt); + return 0; } /* Look for tcp options. Normally only called on SYN and SYNACK packets. @@ -5158,7 +5164,8 @@ int tcp_rcv_established(struct sock *sk, struct sk_buff *skb, */ if ((tcp_flag_word(th) & TCP_HP_BITS) == tp->pred_flags && - TCP_SKB_CB(skb)->seq == tp->rcv_nxt) { + TCP_SKB_CB(skb)->seq == tp->rcv_nxt && + !after(TCP_SKB_CB(skb)->ack_seq, tp->snd_nxt)) { int tcp_header_len = tp->tcp_header_len; /* Timestamp header prediction: tcp_header_len @@ -5311,8 +5318,8 @@ slow_path: return -res; step5: - if (th->ack) - tcp_ack(sk, skb, FLAG_SLOWPATH); + if (th->ack && tcp_ack(sk, skb, FLAG_SLOWPATH) < 0) + goto discard; tcp_rcv_rtt_measure_ts(sk, skb); @@ -5649,7 +5656,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb, /* step 5: check the ACK field */ if (th->ack) { - int acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH); + int acceptable = tcp_ack(sk, skb, FLAG_SLOWPATH) > 0; switch (sk->sk_state) { case TCP_SYN_RECV: