Message ID | 20111102153443.38cc1e5c@kryten |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
From: Anton Blanchard <anton@samba.org>
Date: Wed, 2 Nov 2011 15:34:43 +1100

> Any ideas how we could make this behave a bit better? I know setting
> gc_thresh3 higher is the ultimate solution, but if gc_thresh1 and
> gc_thresh2 are always below the route threshold we should either fix
> this issue or remove them completely.

The solution is to do refcount'less RCU lookups into the neigh
tables on every packet send, and long term that's what I intend
to implement.

That's what's behind making the recent change to make the ARP hash
cheaper etc.

See slides 5, 6, and 7 in:

http://vger.kernel.org/netconf2011_slides/davem_netconf2011.pdf

Once that's done you can trim whatever neigh entries you want,
whenever you want.

You are right that the current situation is silly, because if
we're willing to commit to N routing table entries we might as
well be willing to commit to N arp table entries as well.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Dave,

> > Any ideas how we could make this behave a bit better? I know setting
> > gc_thresh3 higher is the ultimate solution, but if gc_thresh1 and
> > gc_thresh2 are always below the route threshold we should either fix
> > this issue or remove them completely.
>
> The solution is to do refcount'less RCU lookups into the neigh
> tables on every packet send, and long term that's what I intend
> to implement.
>
> That's what's behind making the recent change to make the ARP hash
> cheaper etc.
>
> See slides 5, 6, and 7 in:
>
> http://vger.kernel.org/netconf2011_slides/davem_netconf2011.pdf
>
> Once that's done you can trim whatever neigh entries you want,
> whenever you want.
>
> You are right that the current situation is silly, because if
> we're willing to commit to N routing table entries we might as
> well be willing to commit to N arp table entries as well.

Thanks for clearing it up! Looking forwards to the new scheme :)

Anton
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 155138d..8104d41 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -876,7 +876,7 @@ static void rt_emergency_hash_rebuild(struct net *net)
    and when load increases it reduces to limit cache size.
  */
 
-static int rt_garbage_collect(struct dst_ops *ops)
+static int __rt_garbage_collect(struct dst_ops *ops, int force)
 {
 	static unsigned long expire = RT_GC_TIMEOUT;
 	static unsigned long last_gc;
@@ -895,7 +895,7 @@ static int rt_garbage_collect(struct dst_ops *ops)
 
 	RT_CACHE_STAT_INC(gc_total);
 
-	if (now - last_gc < ip_rt_gc_min_interval &&
+	if (!force && now - last_gc < ip_rt_gc_min_interval &&
 	    entries < ip_rt_max_size) {
 		RT_CACHE_STAT_INC(gc_ignored);
 		goto out;
@@ -920,6 +920,9 @@ static int rt_garbage_collect(struct dst_ops *ops)
 		equilibrium = entries - goal;
 	}
 
+	if (force)
+		goal = 1;
+
 	if (now - last_gc >= ip_rt_gc_min_interval)
 		last_gc = now;
 
@@ -996,6 +999,11 @@ work_done:
 out:	return 0;
 }
 
+static int rt_garbage_collect(struct dst_ops *ops)
+{
+	return __rt_garbage_collect(ops, 0);
+}
+
 /*
  * Returns number of entries in a hash chain that have different hash_inputs
  */
@@ -1192,7 +1200,7 @@ restart:
 			int saved_int = ip_rt_gc_min_interval;
 			ip_rt_gc_elasticity = 1;
 			ip_rt_gc_min_interval = 0;
-			rt_garbage_collect(&ipv4_dst_ops);
+			__rt_garbage_collect(&ipv4_dst_ops, 1);
 			ip_rt_gc_min_interval = saved_int;
 			ip_rt_gc_elasticity = saved_elasticity;
 			goto restart;
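
For readers who want to poke at the gating logic outside the kernel, here is a rough user-space sketch of what the patch appears to change: a `force` flag that bypasses the rate limit and overrides the collection goal, with a thin wrapper that preserves the old behaviour. The names `struct gc_model`, `gc_should_run`, `gc_should_run_default` and `gc_goal` are made up for illustration and are not kernel symbols. The fields only stand in for `last_gc`, `ip_rt_gc_min_interval`, `ip_rt_max_size` and the entry count used in `route.c`.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the collector state; not kernel code. */
struct gc_model {
	unsigned long last_gc;      /* time of the last pass */
	unsigned long min_interval; /* stands in for ip_rt_gc_min_interval */
	int entries;                /* current number of cached entries */
	int max_size;               /* stands in for ip_rt_max_size */
};

/*
 * Mirrors the patched early-exit test: a pass is skipped only when it is
 * not forced, the last pass was recent, and the cache is below its limit.
 */
static bool gc_should_run(const struct gc_model *m, unsigned long now,
			  bool force)
{
	if (!force && now - m->last_gc < m->min_interval &&
	    m->entries < m->max_size)
		return false; /* corresponds to the gc_ignored path */
	return true;
}

/* Wrapper with the original signature: never forces a pass. */
static bool gc_should_run_default(const struct gc_model *m, unsigned long now)
{
	return gc_should_run(m, now, false);
}

/* A forced pass overrides whatever goal was computed, as "goal = 1" does. */
static int gc_goal(int computed_goal, bool force)
{
	return force ? 1 : computed_goal;
}
```

With a model where the last pass ran 20 ticks ago and `min_interval` is 50, `gc_should_run_default()` declines to run while `gc_should_run(..., true)` still proceeds. That is the situation the `restart:` path in the diff is in when it needs entries freed immediately.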