[2/3] xfrm: Increase the garbage collector threshold

Message ID	1383294107-7509-3-git-send-email-steffen.klassert@secunet.com
State	Accepted, archived
Delegated to:	David Miller
Headers	show Return-Path: <netdev-owner@vger.kernel.org> From: Steffen Klassert <steffen.klassert@secunet.com> To: David Miller <davem@davemloft.net> Cc: Herbert Xu <herbert@gondor.apana.org.au>, Steffen Klassert <steffen.klassert@secunet.com>, netdev@vger.kernel.org Subject: [PATCH 2/3] xfrm: Increase the garbage collector threshold Date: Fri, 1 Nov 2013 09:21:46 +0100 Message-Id: <1383294107-7509-3-git-send-email-steffen.klassert@secunet.com> In-Reply-To: <1383294107-7509-1-git-send-email-steffen.klassert@secunet.com> References: <1383294107-7509-1-git-send-email-steffen.klassert@secunet.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk

Steffen Klassert Nov. 1, 2013, 8:21 a.m. UTC

With the removal of the routing cache, we lost the
option to tweak the garbage collector threshold
along with the maximum routing cache size. So git
commit 703fb94ec ("xfrm: Fix the gc threshold value
for ipv4") moved back to a static threshold.

It turned out that the current threshold before we
start garbage collecting is much to small for some
workloads, so increase it from 1024 to 32768. This
means that we start the garbage collector if we have
more than 32768 dst entries in the system and refuse
new allocations if we are above 65536.

Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/xfrm4_policy.c |    2 +-
 net/ipv6/xfrm6_policy.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Maxime Bizon Dec. 10, 2013, 2:38 p.m. UTC | #1

On Fri, 2013-11-01 at 09:21 +0100, Steffen Klassert wrote:

Hello Steffen,

> It turned out that the current threshold before we
> start garbage collecting is much to small for some
> workloads, so increase it from 1024 to 32768. This
> means that we start the garbage collector if we have
> more than 32768 dst entries in the system and refuse
> new allocations if we are above 65536.

sorry to dig an old thread, but could you please explain the lifetime of
those dst entries ?

After basic tests it seems the ipv4 dst entry needs to be released first
before the xfrm dst can be released by gc.

Steffen Klassert Dec. 13, 2013, 10:12 a.m. UTC | #2

On Tue, Dec 10, 2013 at 03:38:43PM +0100, Maxime Bizon wrote:
> 
> On Fri, 2013-11-01 at 09:21 +0100, Steffen Klassert wrote:
> 
> Hello Steffen,
> 
> > It turned out that the current threshold before we
> > start garbage collecting is much to small for some
> > workloads, so increase it from 1024 to 32768. This
> > means that we start the garbage collector if we have
> > more than 32768 dst entries in the system and refuse
> > new allocations if we are above 65536.
> 
> sorry to dig an old thread, but could you please explain the lifetime of
> those dst entries ?
> 
> After basic tests it seems the ipv4 dst entry needs to be released first
> before the xfrm dst can be released by gc.
> 

Can you please be a bit more precise with your problem description?

This patch changes only the number of cache entries before
we start garbage collecting. It does not change anything
on the garbage collector itself.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Maxime Bizon Dec. 13, 2013, 4:13 p.m. UTC | #3

On Fri, 2013-12-13 at 11:12 +0100, Steffen Klassert wrote:

> Can you please be a bit more precise with your problem description?

yes sorry I wasn't clear.

I have a problem with an even simpler workload that the one using apache
bench in the original bug report.

I am using ipsec transport mode between two hosts and just run this on
one side:

  while :; do wget -O /dev/null http://remote_host/; done

I was surprised to see it fails after only 1024 requests (ENOBUF on
connect), and how long I had to wait to be able to do new requests.

After debugging I saw that the xfrm gc was called but was not able to
release anything.

after running "ip route flush cache", which forces all ipv4 dst entries
to be released, suddenly the xfrm gc had something to free, and xfrm
entry count went to zero.

So if it is correct that once a ipv4 dst entry exists, the xfrm entry
cannot be gc-ed, then we need to make sure we allow more xfrm entries to
be allocated than ipv4 dst.

> This patch changes only the number of cache entries before
> we start garbage collecting. It does not change anything
> on the garbage collector itself.

Yes my point was that it just hides another underlying problem.

Steffen Klassert Dec. 16, 2013, 11:51 a.m. UTC | #4

On Fri, Dec 13, 2013 at 05:13:29PM +0100, Maxime Bizon wrote:
> 
> On Fri, 2013-12-13 at 11:12 +0100, Steffen Klassert wrote:
> 
> > Can you please be a bit more precise with your problem description?
> 
> yes sorry I wasn't clear.
> 
> I have a problem with an even simpler workload that the one using apache
> bench in the original bug report.
> 
> I am using ipsec transport mode between two hosts and just run this on
> one side:
> 
>   while :; do wget -O /dev/null http://remote_host/; done
> 
> I was surprised to see it fails after only 1024 requests (ENOBUF on
> connect), and how long I had to wait to be able to do new requests.
> 
> After debugging I saw that the xfrm gc was called but was not able to
> release anything.
> 
> after running "ip route flush cache", which forces all ipv4 dst entries
> to be released, suddenly the xfrm gc had something to free, and xfrm
> entry count went to zero.

Well, "ip route flush cache" bumps the rt_genid what marks all
routes as invalid. The xfrm garbage collector will notice this
on the next run and deletes all invalid cached routes.

Garbage collecting is different from flushing, it only removes
stale routes and keeps everything that was used recently. That's
the whole point of this caching, we cache recently used IPsec
routes to avoid a slow path lookup.

> 
> So if it is correct that once a ipv4 dst entry exists, the xfrm entry
> cannot be gc-ed, then we need to make sure we allow more xfrm entries to
> be allocated than ipv4 dst.

No, we don't cache plain ipv4 routes, we cache only IPsec routes.
The original ipv4 dst_entry is cached together with the xfrm dst_entry
as a xfrm bundle at the IPsec flow cache, so it does not make sense to
ensure something like that.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Maxime Bizon Dec. 16, 2013, 2:29 p.m. UTC | #5

On Mon, 2013-12-16 at 12:51 +0100, Steffen Klassert wrote:

> >   while :; do wget -O /dev/null http://remote_host/; done
> > 
> > I was surprised to see it fails after only 1024 requests (ENOBUF on
> > connect), and how long I had to wait to be able to do new requests.
> > 
> > After debugging I saw that the xfrm gc was called but was not able to
> > release anything.
> > 
> > after running "ip route flush cache", which forces all ipv4 dst entries
> > to be released, suddenly the xfrm gc had something to free, and xfrm
> > entry count went to zero.
> 
> Well, "ip route flush cache" bumps the rt_genid what marks all
> routes as invalid. The xfrm garbage collector will notice this
> on the next run and deletes all invalid cached routes.
> 
> Garbage collecting is different from flushing, it only removes
> stale routes and keeps everything that was used recently. That's
> the whole point of this caching, we cache recently used IPsec
> routes to avoid a slow path lookup.

ok I don't know this part of the code at all.

but if it is only a *cache*, then it should not limit any new
allocations for the reason that the cache is full, it should take slow
path instead.

[2/3] xfrm: Increase the garbage collector threshold

Commit Message

Comments

Patch