From patchwork Wed Mar 11 10:03:35 2009
X-Patchwork-Submitter: Andi Kleen
X-Patchwork-Id: 24296
X-Patchwork-Delegate: davem@davemloft.net
To: David Miller
Cc: md@bts.sk, netdev@vger.kernel.org
Subject: Re: TCP rx window autotuning harmful at LAN context
From: Andi Kleen
References: <20090309112521.GB37984@bts.sk>
	<1e41a3230903091101u536a3b3bv7f0dd9da6891781e@mail.gmail.com>
	<20090309200505.GA58375@bts.sk>
	<20090309.170927.130334650.davem@davemloft.net>
Date: Wed, 11 Mar 2009 11:03:35 +0100
In-Reply-To: <20090309.170927.130334650.davem@davemloft.net> (David Miller's
	message of "Mon, 09 Mar 2009 17:09:27 -0700 (PDT)")
Message-ID: <87bps8fkaw.fsf@basil.nowhere.org>
X-Mailing-List: netdev@vger.kernel.org

David Miller writes:

> From: Marian Ďurkovič
> Date: Mon, 9 Mar 2009 21:05:05 +0100
>
>> Well, in practice that was always limited by receive window size, which
>> was by default 64 kB on most operating systems. So this undesirable behavior
>> was limited to hosts where receive window was manually increased to huge
>> values.
>
> You say "was" as if this was a recent change. Linux has been doing
> receive buffer autotuning for at least 5 years if not longer.

I think his point was that only now does it become a visible problem,
as >= 1GB of memory is widespread, which leads to 4MB rx buffer sizes.

Perhaps this means the default buffer sizing heuristics are
too aggressive for >= 1GB?

Perhaps something like this patch? Marian, does that help?

-Andi

TCP: Lower per socket RX buffer sizing threshold

Signed-off-by: Andi Kleen

---
 net/ipv4/tcp.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6.28-test/net/ipv4/tcp.c
===================================================================
--- linux-2.6.28-test.orig/net/ipv4/tcp.c	2009-02-09 11:06:52.000000000 +0100
+++ linux-2.6.28-test/net/ipv4/tcp.c	2009-03-11 11:01:53.000000000 +0100
@@ -2757,9 +2757,9 @@
 	sysctl_tcp_mem[1] = limit;
 	sysctl_tcp_mem[2] = sysctl_tcp_mem[0] * 2;
 
-	/* Set per-socket limits to no more than 1/128 the pressure threshold */
-	limit = ((unsigned long)sysctl_tcp_mem[1]) << (PAGE_SHIFT - 7);
-	max_share = min(4UL*1024*1024, limit);
+	/* Set per-socket limits to no more than 1/256 the pressure threshold */
+	limit = ((unsigned long)sysctl_tcp_mem[1]) << (PAGE_SHIFT - 8);
+	max_share = min(2UL*1024*1024, limit);
 
 	sysctl_tcp_wmem[0] = SK_MEM_QUANTUM;
 	sysctl_tcp_wmem[1] = 16*1024;
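
To put rough numbers on the change in the patch: sysctl_tcp_mem[1] is
counted in pages, so the shift by (PAGE_SHIFT - 7) converts it to bytes
and divides by 128 in one step, and the result is then capped at 4MB.
The patch makes that 1/256 with a 2MB cap. Below is a minimal userspace
sketch of just that arithmetic. It assumes 4 KiB pages (PAGE_SHIFT = 12);
the helper names are made up for illustration and are not kernel
functions.

```c
/* Assumption: 4 KiB pages, as on x86. Other architectures may differ. */
#define PAGE_SHIFT 12

/* Hypothetical helper: smaller of two unsigned long values. */
unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Per-socket cap in bytes for a pressure threshold given in pages.
 * share_shift = 7 means 1/128 of the threshold, 8 means 1/256.
 * cap_bytes is the fixed upper bound applied afterwards.
 */
unsigned long max_share(unsigned long pressure_pages,
			unsigned int share_shift,
			unsigned long cap_bytes)
{
	/* pages << PAGE_SHIFT gives bytes; >> share_shift takes the fraction */
	unsigned long limit = pressure_pages << (PAGE_SHIFT - share_shift);

	return min_ul(cap_bytes, limit);
}

/* Before the patch: 1/128 of the pressure threshold, at most 4MB. */
unsigned long max_share_old(unsigned long pressure_pages)
{
	return max_share(pressure_pages, 7, 4UL * 1024 * 1024);
}

/* After the patch: 1/256 of the pressure threshold, at most 2MB. */
unsigned long max_share_new(unsigned long pressure_pages)
{
	return max_share(pressure_pages, 8, 2UL * 1024 * 1024);
}
```

With a hypothetical pressure threshold of 196608 pages (768MB at 4 KiB
per page), max_share_old() returns 4194304 and max_share_new() returns
2097152, so both hit their caps. With 32768 pages (128MB) neither cap
applies and the results are 1048576 and 524288, i.e. the patch halves
the per-socket limit in both regimes.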