diff mbox

[net-next] Re: rib_trie / Fix inflate_threshold_root. Now=15 size=11 bits

Message ID 20090714194100.GA7952@ami.dom.local
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Jarek Poplawski July 14, 2009, 7:41 p.m. UTC
On Thu, Jul 09, 2009 at 10:34:17PM +0200, Paweł Staszewski wrote:
> Jarek Poplawski pisze:
>> On Wed, Jul 08, 2009 at 12:56:13AM +0200, Paweł Staszewski wrote:
...
>>> Also today i have many messages in dmesg like this:
>>> Fix inflate_threshold_root. Now=25 size=11 bits
>>> :)
>>>     
>>
>>   This is something new and a bit surprising to me: the same threshold
>> in previous tests didn't generate this? Do you mean more than: "Fix 
>> inflate_threshold_root. Now=15 size=11 bits" before?
>>
>>   
> Yes. Sorry for that - this info was not all the day but only 5 minutes  
> when i was making tests.
> This info was reported only when all iBGP peers was down/up fast.
>
>>> And after tune :
>>> /sys/module/fib_trie/parameters/inflate_threshold_root
>>> no more info :)
>>>     
>>
>> With what value?
>>
>>   
> When i set 35 as inflate_threshold_root there was no info even if all  
> iBGP peers was down/up.

So it looks like the patch tested earlier could be still useful; after
changing the inflate_threshold_root it seems these warnings should be
very rare but there is no reason to alarm users with something they
can't fix optimally, anyway.

Thanks,
Jarek P.
--------------------->
ipv4: Fix inflate_threshold_root automatically

During large updates there could be triggered warnings like: "Fix
inflate_threshold_root. Now=25 size=11 bits" if inflate() of the root
node isn't finished in 10 loops. It should be much rarer now, after
changing the threshold from 15 to 25, and a temporary problem, so
this patch tries to handle it automatically using a fix variable to
increase by one inflate threshold for next root resizes (up to the 35
limit, max fix = 10). The fix variable is decreased when root's
inflate() finishes below 7 loops (even if some other, smaller table/
trie is updated -- for simplicity the fix variable is global for now).

Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
---

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Robert Olsson July 15, 2009, 7:43 a.m. UTC | #1
Jarek Poplawski writes:


Looks good. Maybe we're getting close to some generic solution to take 
a very optimistic approach wrt thresholds for root node and adjust to 
settings without the warning. Or maybe now even remove warning totally
with stata counter?

Can we even consider some other different strategy for bumping up the root 
node. 

We need all lookup performance we can get when we now try to route without 
the route cache. And we probably need to evaluate the cost for the multiple 
lookups again at least for LOCAL and MAIN when we talking routing well at 
least straight-forward simple routing. (Semantic change)

I think I've got ~6.2 Gbit/s for simplex forwarding using traffic patterns 
we see in/close to Internet core. This w/o route cache on our hi-end opterons
with 8 CPU cores using niu and ixgbe. I'll test again and your patches when
I'm back from vacation.

Cheers
					--ro

 > So it looks like the patch tested earlier could be still useful; after
 > changing the inflate_threshold_root it seems these warnings should be
 > very rare but there is no reason to alarm users with something they
 > can't fix optimally, anyway.
 > 
 > Thanks,
 > Jarek P.
 > --------------------->
 > ipv4: Fix inflate_threshold_root automatically
 > 
 > During large updates there could be triggered warnings like: "Fix
 > inflate_threshold_root. Now=25 size=11 bits" if inflate() of the root
 > node isn't finished in 10 loops. It should be much rarer now, after
 > changing the threshold from 15 to 25, and a temporary problem, so
 > this patch tries to handle it automatically using a fix variable to
 > increase by one inflate threshold for next root resizes (up to the 35
 > limit, max fix = 10). The fix variable is decreased when root's
 > inflate() finishes below 7 loops (even if some other, smaller table/
 > trie is updated -- for simplicity the fix variable is global for now).
 > 
 > Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
 > Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
 > Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
 > Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
 > ---
 > 
 > diff -Nurp a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
 > --- a/net/ipv4/fib_trie.c	2009-07-13 13:32:53.000000000 +0200
 > +++ b/net/ipv4/fib_trie.c	2009-07-13 15:16:18.000000000 +0200
 > @@ -327,6 +327,8 @@ static const int inflate_threshold = 50;
 >  static const int halve_threshold_root = 15;
 >  static const int inflate_threshold_root = 25;
 >  
 > +static int inflate_threshold_root_fix;
 > +#define INFLATE_FIX_MAX 10	/* a comment in resize() */
 >  
 >  static void __alias_free_mem(struct rcu_head *head)
 >  {
 > @@ -617,7 +619,8 @@ static struct node *resize(struct trie *
 >  	/* Keep root node larger  */
 >  
 >  	if (!tn->parent)
 > -		inflate_threshold_use = inflate_threshold_root;
 > +		inflate_threshold_use = inflate_threshold_root +
 > +					inflate_threshold_root_fix;
 >  	else
 >  		inflate_threshold_use = inflate_threshold;
 >  
 > @@ -641,15 +644,27 @@ static struct node *resize(struct trie *
 >  	}
 >  
 >  	if (max_resize < 0) {
 > -		if (!tn->parent)
 > -			pr_warning("Fix inflate_threshold_root."
 > -				   " Now=%d size=%d bits\n",
 > -				   inflate_threshold_root, tn->bits);
 > -		else
 > +		if (!tn->parent) {
 > +			/*
 > +			 * It was observed that during large updates even
 > +			 * inflate_threshold_root = 35 might be needed to avoid
 > +			 * this warning; but it should be temporary, so let's
 > +			 * try to handle this automatically.
 > +			 */
 > +			if (inflate_threshold_root_fix < INFLATE_FIX_MAX)
 > +				inflate_threshold_root_fix++;
 > +			else
 > +				pr_warning("Fix inflate_threshold_root."
 > +					   " Now=%d size=%d bits fix=%d\n",
 > +					   inflate_threshold_root, tn->bits,
 > +					   inflate_threshold_root_fix);
 > +		} else {
 >  			pr_warning("Fix inflate_threshold."
 >  				   " Now=%d size=%d bits\n",
 >  				   inflate_threshold, tn->bits);
 > -	}
 > +		}
 > +	} else if (max_resize > 3 && !tn->parent && inflate_threshold_root_fix)
 > +		inflate_threshold_root_fix--;
 >  
 >  	check_tnode(tn);
 >  
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski July 15, 2009, 1:05 p.m. UTC | #2
On Wed, Jul 15, 2009 at 09:43:11AM +0200, Robert Olsson wrote:
> 
> Jarek Poplawski writes:
> 
> 
> Looks good. Maybe we're getting close to some generic solution to take 
> a very optimistic approach wrt thresholds for root node and adjust to 
> settings without the warning. Or maybe now even remove warning totally
> with stata counter?

I guess, we could, but maybe let's wait a bit to make sure there is
nothing surprising?

> 
> Can we even consider some other different strategy for bumping up the root 
> node. 
> 
> We need all lookup performance we can get when we now try to route without 
> the route cache. And we probably need to evaluate the cost for the multiple 
> lookups again at least for LOCAL and MAIN when we talking routing well at 
> least straight-forward simple routing. (Semantic change)
> 
> I think I've got ~6.2 Gbit/s for simplex forwarding using traffic patterns 
> we see in/close to Internet core. This w/o route cache on our hi-end opterons
> with 8 CPU cores using niu and ixgbe. I'll test again and your patches when
> I'm back from vacation.
> 

Sure, I was mainly aiming at safe defaults (wrt. memory usage), but if
tests show there is a better strategy we should go for it.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robert Olsson July 17, 2009, 8:08 a.m. UTC | #3
Jarek Poplawski writes:
 > On Wed, Jul 15, 2009 at 09:43:11AM +0200, Robert Olsson wrote:

 > > a very optimistic approach wrt thresholds for root node and adjust to 
 > > settings without the warning. Or maybe now even remove warning totally
 > > with stata counter?
 > 
 > I guess, we could, but maybe let's wait a bit to make sure there is
 > nothing surprising?

Yes if Pawel is running it we we'll get reports. I've no chance to upgrade
any of our routers now.  I've seen this printout in one our routers but we 
don't do "clear ip bgp *" to often and besides we try to use soft re-
configuration inbound.


 > > I think I've got ~6.2 Gbit/s for simplex forwarding using traffic patterns 
 > > we see in/close to Internet core. This w/o route cache on our hi-end opterons
 > > with 8 CPU cores using niu and ixgbe. I'll test again and your patches when
 > > I'm back from vacation.
 > > 

 > Sure, I was mainly aiming at safe defaults (wrt. memory usage), but if
 > tests show there is a better strategy we should go for it.

Routing without route cache is "new" area probably for minority of systems
were caching is not possible. Read BGP routers in core. 

Yes we should have safe defults. Thanks for all your work.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>

Cheers 
					--ro
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller July 20, 2009, 2:41 p.m. UTC | #4
From: Jarek Poplawski <jarkao2@gmail.com>
Date: Tue, 14 Jul 2009 21:41:00 +0200

> ipv4: Fix inflate_threshold_root automatically
> 
> During large updates there could be triggered warnings like: "Fix
> inflate_threshold_root. Now=25 size=11 bits" if inflate() of the root
> node isn't finished in 10 loops. It should be much rarer now, after
> changing the threshold from 15 to 25, and a temporary problem, so
> this patch tries to handle it automatically using a fix variable to
> increase by one inflate threshold for next root resizes (up to the 35
> limit, max fix = 10). The fix variable is decreased when root's
> inflate() finishes below 7 loops (even if some other, smaller table/
> trie is updated -- for simplicity the fix variable is global for now).
> 
> Reported-by: Pawel Staszewski <pstaszewski@itcare.pl>
> Reported-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
> Tested-by: Pawel Staszewski <pstaszewski@itcare.pl>
> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff -Nurp a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
--- a/net/ipv4/fib_trie.c	2009-07-13 13:32:53.000000000 +0200
+++ b/net/ipv4/fib_trie.c	2009-07-13 15:16:18.000000000 +0200
@@ -327,6 +327,8 @@  static const int inflate_threshold = 50;
 static const int halve_threshold_root = 15;
 static const int inflate_threshold_root = 25;
 
+static int inflate_threshold_root_fix;
+#define INFLATE_FIX_MAX 10	/* a comment in resize() */
 
 static void __alias_free_mem(struct rcu_head *head)
 {
@@ -617,7 +619,8 @@  static struct node *resize(struct trie *
 	/* Keep root node larger  */
 
 	if (!tn->parent)
-		inflate_threshold_use = inflate_threshold_root;
+		inflate_threshold_use = inflate_threshold_root +
+					inflate_threshold_root_fix;
 	else
 		inflate_threshold_use = inflate_threshold;
 
@@ -641,15 +644,27 @@  static struct node *resize(struct trie *
 	}
 
 	if (max_resize < 0) {
-		if (!tn->parent)
-			pr_warning("Fix inflate_threshold_root."
-				   " Now=%d size=%d bits\n",
-				   inflate_threshold_root, tn->bits);
-		else
+		if (!tn->parent) {
+			/*
+			 * It was observed that during large updates even
+			 * inflate_threshold_root = 35 might be needed to avoid
+			 * this warning; but it should be temporary, so let's
+			 * try to handle this automatically.
+			 */
+			if (inflate_threshold_root_fix < INFLATE_FIX_MAX)
+				inflate_threshold_root_fix++;
+			else
+				pr_warning("Fix inflate_threshold_root."
+					   " Now=%d size=%d bits fix=%d\n",
+					   inflate_threshold_root, tn->bits,
+					   inflate_threshold_root_fix);
+		} else {
 			pr_warning("Fix inflate_threshold."
 				   " Now=%d size=%d bits\n",
 				   inflate_threshold, tn->bits);
-	}
+		}
+	} else if (max_resize > 3 && !tn->parent && inflate_threshold_root_fix)
+		inflate_threshold_root_fix--;
 
 	check_tnode(tn);