Message ID | 1383294107-7509-3-git-send-email-steffen.klassert@secunet.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, 2013-11-01 at 09:21 +0100, Steffen Klassert wrote: Hello Steffen, > It turned out that the current threshold before we > start garbage collecting is much to small for some > workloads, so increase it from 1024 to 32768. This > means that we start the garbage collector if we have > more than 32768 dst entries in the system and refuse > new allocations if we are above 65536. sorry to dig an old thread, but could you please explain the lifetime of those dst entries ? After basic tests it seems the ipv4 dst entry needs to be released first before the xfrm dst can be released by gc.
On Tue, Dec 10, 2013 at 03:38:43PM +0100, Maxime Bizon wrote: > > On Fri, 2013-11-01 at 09:21 +0100, Steffen Klassert wrote: > > Hello Steffen, > > > It turned out that the current threshold before we > > start garbage collecting is much to small for some > > workloads, so increase it from 1024 to 32768. This > > means that we start the garbage collector if we have > > more than 32768 dst entries in the system and refuse > > new allocations if we are above 65536. > > sorry to dig an old thread, but could you please explain the lifetime of > those dst entries ? > > After basic tests it seems the ipv4 dst entry needs to be released first > before the xfrm dst can be released by gc. > Can you please be a bit more precise with your problem description? This patch changes only the number of cache entries before we start garbage collecting. It does not change anything on the garbage collector itself. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-12-13 at 11:12 +0100, Steffen Klassert wrote: > Can you please be a bit more precise with your problem description? yes sorry I wasn't clear. I have a problem with an even simpler workload that the one using apache bench in the original bug report. I am using ipsec transport mode between two hosts and just run this on one side: while :; do wget -O /dev/null http://remote_host/; done I was surprised to see it fails after only 1024 requests (ENOBUF on connect), and how long I had to wait to be able to do new requests. After debugging I saw that the xfrm gc was called but was not able to release anything. after running "ip route flush cache", which forces all ipv4 dst entries to be released, suddenly the xfrm gc had something to free, and xfrm entry count went to zero. So if it is correct that once a ipv4 dst entry exists, the xfrm entry cannot be gc-ed, then we need to make sure we allow more xfrm entries to be allocated than ipv4 dst. > This patch changes only the number of cache entries before > we start garbage collecting. It does not change anything > on the garbage collector itself. Yes my point was that it just hides another underlying problem.
On Fri, Dec 13, 2013 at 05:13:29PM +0100, Maxime Bizon wrote: > > On Fri, 2013-12-13 at 11:12 +0100, Steffen Klassert wrote: > > > Can you please be a bit more precise with your problem description? > > yes sorry I wasn't clear. > > I have a problem with an even simpler workload that the one using apache > bench in the original bug report. > > I am using ipsec transport mode between two hosts and just run this on > one side: > > while :; do wget -O /dev/null http://remote_host/; done > > I was surprised to see it fails after only 1024 requests (ENOBUF on > connect), and how long I had to wait to be able to do new requests. > > After debugging I saw that the xfrm gc was called but was not able to > release anything. > > after running "ip route flush cache", which forces all ipv4 dst entries > to be released, suddenly the xfrm gc had something to free, and xfrm > entry count went to zero. Well, "ip route flush cache" bumps the rt_genid what marks all routes as invalid. The xfrm garbage collector will notice this on the next run and deletes all invalid cached routes. Garbage collecting is different from flushing, it only removes stale routes and keeps everything that was used recently. That's the whole point of this caching, we cache recently used IPsec routes to avoid a slow path lookup. > > So if it is correct that once a ipv4 dst entry exists, the xfrm entry > cannot be gc-ed, then we need to make sure we allow more xfrm entries to > be allocated than ipv4 dst. No, we don't cache plain ipv4 routes, we cache only IPsec routes. The original ipv4 dst_entry is cached together with the xfrm dst_entry as a xfrm bundle at the IPsec flow cache, so it does not make sense to ensure something like that. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2013-12-16 at 12:51 +0100, Steffen Klassert wrote: > > while :; do wget -O /dev/null http://remote_host/; done > > > > I was surprised to see it fails after only 1024 requests (ENOBUF on > > connect), and how long I had to wait to be able to do new requests. > > > > After debugging I saw that the xfrm gc was called but was not able to > > release anything. > > > > after running "ip route flush cache", which forces all ipv4 dst entries > > to be released, suddenly the xfrm gc had something to free, and xfrm > > entry count went to zero. > > Well, "ip route flush cache" bumps the rt_genid what marks all > routes as invalid. The xfrm garbage collector will notice this > on the next run and deletes all invalid cached routes. > > Garbage collecting is different from flushing, it only removes > stale routes and keeps everything that was used recently. That's > the whole point of this caching, we cache recently used IPsec > routes to avoid a slow path lookup. ok I don't know this part of the code at all. but if it is only a *cache*, then it should not limit any new allocations for the reason that the cache is full, it should take slow path instead.
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index ccde542..4764ee4 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -236,7 +236,7 @@ static struct dst_ops xfrm4_dst_ops = { .destroy = xfrm4_dst_destroy, .ifdown = xfrm4_dst_ifdown, .local_out = __ip_local_out, - .gc_thresh = 1024, + .gc_thresh = 32768, }; static struct xfrm_policy_afinfo xfrm4_policy_afinfo = { diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 08ed277..dd503a3 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -285,7 +285,7 @@ static struct dst_ops xfrm6_dst_ops = { .destroy = xfrm6_dst_destroy, .ifdown = xfrm6_dst_ifdown, .local_out = __ip6_local_out, - .gc_thresh = 1024, + .gc_thresh = 32768, }; static struct xfrm_policy_afinfo xfrm6_policy_afinfo = {
With the removal of the routing cache, we lost the option to tweak the garbage collector threshold along with the maximum routing cache size. So git commit 703fb94ec ("xfrm: Fix the gc threshold value for ipv4") moved back to a static threshold. It turned out that the current threshold before we start garbage collecting is much to small for some workloads, so increase it from 1024 to 32768. This means that we start the garbage collector if we have more than 32768 dst entries in the system and refuse new allocations if we are above 65536. Reported-by: Wolfgang Walter <linux@stwm.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> --- net/ipv4/xfrm4_policy.c | 2 +- net/ipv6/xfrm6_policy.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)