
net: fix rtable leak in net/ipv4/route.c

Message ID 4A139FC4.7030309@cosmosbay.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Eric Dumazet May 20, 2009, 6:14 a.m. UTC
Eric Dumazet wrote:
> David Miller wrote:
>> From: Neil Horman <nhorman@tuxdriver.com>
>> Date: Tue, 19 May 2009 15:24:50 -0400
>>
>>>> Moving whole group in front would defeat the purpose of move, actually,
>>>> since rank in chain is used to decay the timeout in garbage collector.
>>>> (search for tmo >>= 1; )
>>>>
>>> Argh, so the list is implicitly ordered by expiration time.  That
>>> really defeats the entire purpose of doing grouping in the list at
>>> all.  If that's the case, then I agree, it's probably better to
>>> take the additional visitation hit in check_expire above than to
>>> try and preserve ordering.
>> Yes, this seems best.
>>
>> I was worried that somehow the ordering also influences lookups,
>> because the TOS bits don't go into the hash so I worried that it would
>> be important that explicit TOS values appear before wildcard ones.
>> But it doesn't appear that this is an issue, we don't have wildcard
>> TOSs in the rtable entries, they are always explicit.
>>
>> So I would like to see an explicit final patch from Eric so we can get
>> this fixed now.
>>
> 
> I would like to split the patches because we have two bugs indeed, and
> I prefer to get attention for both problems; I don't remember Neil acknowledging
> the length computation problem.
> 
> First and small patch, candidate for net-2.6 and stable (for 2.6.29) :
> 

Then here is the patch on top of the previous one.

Thanks to all

[PATCH] net: fix rtable leak in net/ipv4/route.c

Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
analysis, available at http://bugzilla.kernel.org/show_bug.cgi?id=13339
Quoted here because it's a perfect one:

begin_of_quotation 
 2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
 patch has at least one critical flaw, and another problem.

 rt_intern_hash calculates rthi pointer, which is later used for new entry
 insertion. The same loop calculates cand pointer which is used to clean the
 list. If the pointers are the same, rtable leak occurs, as first the cand is
 removed then the new entry is appended to it.

 This leak leads to unregister_netdevice problem (usage count > 0).

 Another problem of the patch is that it tries to insert the entries in certain
 order, to facilitate counting of entries distinct by all but QoS parameters.
 Unfortunately, referencing an existing rtable entry moves it to list beginning,
 to speed up further lookups, so the carefully built order is destroyed.

 For the first problem the simplest patch is to set rthi=0 when rthi==cand, but
 it will also destroy the ordering.
end_of_quotation
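
To make the reported failure concrete, here is a small self-contained C sketch
of that scenario (hypothetical names and types; only the pointer handling
mirrors the rt_intern_hash() path, and in the real code cand is also
rt_free()d):

	#include <stdio.h>

	struct entry {
		const char *name;
		struct entry *next;
	};

	int main(void)
	{
		struct entry c  = { "C", NULL };
		struct entry b  = { "B", &c };		/* picked as both cand and rthi */
		struct entry a  = { "A", &b };
		struct entry rt = { "new", NULL };
		struct entry *chain = &a;
		struct entry *cand = &b, **candp = &a.next;
		struct entry *rthi = &b;

		/* 1) garbage collection unlinks cand: the chain becomes A -> C */
		*candp = cand->next;

		/* 2) the new entry is appended after rthi, but rthi == cand was just
		 *    removed, so rt hangs off a dead entry and never joins the chain */
		rt.next = rthi->next;
		rthi->next = &rt;

		for (struct entry *e = chain; e != NULL; e = e->next)
			printf("%s ", e->name);		/* prints "A C": "new" is leaked */
		printf("\n");
		return 0;
	}

The leaked entry keeps its reference on the output device, which matches the
reported unregister_netdevice problem (usage count > 0).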


Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b
(net: implement emergency route cache rebulds when gc_elasticity is exceeded)

Trying to keep dst_entries ordered is too complex and conflicts with the fact
that chain order should reflect frequency of use for garbage collection.

A possible fix is to make rt_intern_hash() simpler and make
rt_check_expire() a little bit smarter, able to cope with an arbitrary
entry order. The added loop runs on cache-hot data, while the cpu
is prefetching the next object, so it should go unnoticed.
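
As a rough illustration of the reworked scan (a sketch only: struct ent,
scan_chain(), hash_inputs and idle are made-up stand-ins, the real code is in
the diff below), rank in the chain still decays the timeout, while the length
computation walks from the chain head so that entries with identical hash
inputs are counted once, whatever their order:

	struct ent {
		struct ent *next;
		unsigned int hash_inputs;	/* stand-in for the keys compared by compare_hash_inputs() */
		unsigned long idle;		/* time since last use, checked against the decayed timeout */
	};

	static unsigned int scan_chain(struct ent *head, unsigned long gc_timeout)
	{
		unsigned long tmo = gc_timeout;
		unsigned int length = 0;
		struct ent *rth, *aux;

		for (rth = head; rth != NULL; rth = rth->next) {
			if (rth->idle > tmo)
				continue;	/* aged off: unlinked and rt_free()d in the real code */
			tmo >>= 1;		/* rank in the chain still decays the timeout */
			for (aux = head; aux != rth; aux = aux->next)
				if (aux->hash_inputs == rth->hash_inputs)
					break;	/* an earlier entry already counted these inputs */
			if (aux == rth)
				length++;
		}
		return length;
	}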

Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
---
 net/ipv4/route.c |   55 +++++++++++++--------------------------------
 1 files changed, 17 insertions(+), 38 deletions(-)



Comments

Jarek Poplawski May 20, 2009, 10:03 a.m. UTC | #1
On Wed, May 20, 2009 at 08:14:28AM +0200, Eric Dumazet wrote:
> Eric Dumazet wrote:
> > David Miller wrote:
> >> From: Neil Horman <nhorman@tuxdriver.com>
> >> Date: Tue, 19 May 2009 15:24:50 -0400
> >>
> >>>> Moving whole group in front would defeat the purpose of move, actually,
> >>>> since rank in chain is used to decay the timeout in garbage collector.
> >>>> (search for tmo >>= 1; )
> >>>>
> >>> Argh, so the list is implicitly ordered by expiration time.  That
> >>> really defeats the entire purpose of doing grouping in the list at
> >>> all.  If that's the case, then I agree, it's probably better to
> >>> take the additional visitation hit in check_expire above than to
> >>> try and preserve ordering.
> >> Yes, this seems best.
> >>
> >> I was worried that somehow the ordering also influences lookups,
> >> because the TOS bits don't go into the hash so I worried that it would
> >> be important that explicit TOS values appear before wildcard ones.
> >> But it doesn't appear that this is an issue, we don't have wildcard
> >> TOSs in the rtable entries, they are always explicit.
> >>
> >> So I would like to see an explicit final patch from Eric so we can get
> >> this fixed now.
> >>
> > 
> > I would like to split the patches because we have two bugs indeed, and
> > I prefer to get attention for both problems; I don't remember Neil acknowledging
> > the length computation problem.
> > 
> > First and small patch, candidate for net-2.6 and stable (for 2.6.29) :
> > 
> 
> Then here is the patch on top of the previous one.
> 
> Thanks to all
> 
> [PATCH] net: fix rtable leak in net/ipv4/route.c
> 
> Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
> analysis, available at http://bugzilla.kernel.org/show_bug.cgi?id=13339
> Quoted here because it's a perfect one:
> 
> begin_of_quotation 
>  2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
>  patch has at least one critical flaw, and another problem.
> 
>  rt_intern_hash calculates rthi pointer, which is later used for new entry
>  insertion. The same loop calculates cand pointer which is used to clean the
>  list. If the pointers are the same, rtable leak occurs, as first the cand is
>  removed then the new entry is appended to it.
> 
>  This leak leads to unregister_netdevice problem (usage count > 0).
> 
>  Another problem of the patch is that it tries to insert the entries in certain
>  order, to facilitate counting of entries distinct by all but QoS parameters.
>  Unfortunately, referencing an existing rtable entry moves it to list beginning,
>  to speed up further lookups, so the carefully built order is destroyed.
> 
>  For the first problem the simplest patch is to set rthi=0 when rthi==cand, but
>  it will also destroy the ordering.
> end_of_quotation
> 
> 
> Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b
> (net: implement emergency route cache rebulds when gc_elasticity is exceeded)
> 
> Trying to keep dst_entries ordered is too complex and conflicts with the fact
> that chain order should reflect frequency of use for garbage collection.
> 
> A possible fix is to make rt_intern_hash() simpler and make
> rt_check_expire() a little bit smarter, able to cope with an arbitrary
> entry order. The added loop runs on cache-hot data, while the cpu
> is prefetching the next object, so it should go unnoticed.
> 
> Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> ---
>  net/ipv4/route.c |   55 +++++++++++++--------------------------------
>  1 files changed, 17 insertions(+), 38 deletions(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 869cf1c..28205e5 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -784,7 +784,7 @@ static void rt_check_expire(void)
>  {
>  	static unsigned int rover;
>  	unsigned int i = rover, goal;
> -	struct rtable *rth, **rthp;
> +	struct rtable *rth, *aux, **rthp;
>  	unsigned long samples = 0;
>  	unsigned long sum = 0, sum2 = 0;
>  	u64 mult;
> @@ -812,6 +812,7 @@ static void rt_check_expire(void)
>  		length = 0;
>  		spin_lock_bh(rt_hash_lock_addr(i));
>  		while ((rth = *rthp) != NULL) {
> +			prefetch(rth->u.dst.rt_next);
>  			if (rt_is_expired(rth)) {
>  				*rthp = rth->u.dst.rt_next;
>  				rt_free(rth);
> @@ -820,33 +821,30 @@ static void rt_check_expire(void)
>  			if (rth->u.dst.expires) {
>  				/* Entry is expired even if it is in use */
>  				if (time_before_eq(jiffies, rth->u.dst.expires)) {
> +nofree:
>  					tmo >>= 1;
>  					rthp = &rth->u.dst.rt_next;
>  					/*
> -					 * Only bump our length if the hash
> -					 * inputs on entries n and n+1 are not
> -					 * the same, we only count entries on
> +					 * We only count entries on
>  					 * a chain with equal hash inputs once
>  					 * so that entries for different QOS
>  					 * levels, and other non-hash input
>  					 * attributes don't unfairly skew
>  					 * the length computation
>  					 */
> -					if ((*rthp == NULL) ||
> -					    !compare_hash_inputs(&(*rthp)->fl,
> -								 &rth->fl))
> -						length += ONE;
> +					for (aux = rt_hash_table[i].chain;;) {
> +						if (aux == rth) {
> +							length += ONE;
> +							break;
> +						}
> +						if (compare_hash_inputs(&aux->fl, &rth->fl))
> +							break;
> +						aux = aux->u.dst.rt_next;
> +					}

Very "interesting" for() usage, but isn't it more readable like this?:

					aux = rt_hash_table[i].chain;
					while (aux != rth) {
						if (compare_hash_inputs(&aux->fl, &rth->fl))
							break;
						aux = aux->u.dst.rt_next;
					}

					if (aux == rth)
						length += ONE;

Jarek P.

>  					continue;
>  				}
> -			} else if (!rt_may_expire(rth, tmo, ip_rt_gc_timeout)) {
> -				tmo >>= 1;
> -				rthp = &rth->u.dst.rt_next;
> -				if ((*rthp == NULL) ||
> -				    !compare_hash_inputs(&(*rthp)->fl,
> -							 &rth->fl))
> -					length += ONE;
> -				continue;
> -			}
> +			} else if (!rt_may_expire(rth, tmo, ip_rt_gc_timeout))
> +				goto nofree;
>  
>  			/* Cleanup aged off entries. */
>  			*rthp = rth->u.dst.rt_next;
> @@ -1069,7 +1067,6 @@ out:	return 0;
>  static int rt_intern_hash(unsigned hash, struct rtable *rt, struct rtable **rp)
>  {
>  	struct rtable	*rth, **rthp;
> -	struct rtable	*rthi;
>  	unsigned long	now;
>  	struct rtable *cand, **candp;
>  	u32 		min_score;
> @@ -1089,7 +1086,6 @@ restart:
>  	}
>  
>  	rthp = &rt_hash_table[hash].chain;
> -	rthi = NULL;
>  
>  	spin_lock_bh(rt_hash_lock_addr(hash));
>  	while ((rth = *rthp) != NULL) {
> @@ -1135,17 +1131,6 @@ restart:
>  		chain_length++;
>  
>  		rthp = &rth->u.dst.rt_next;
> -
> -		/*
> -		 * check to see if the next entry in the chain
> -		 * contains the same hash input values as rt.  If it does
> -		 * This is where we will insert into the list, instead of
> -		 * at the head.  This groups entries that differ by aspects not
> -		 * relvant to the hash function together, which we use to adjust
> -		 * our chain length
> -		 */
> -		if (*rthp && compare_hash_inputs(&(*rthp)->fl, &rt->fl))
> -			rthi = rth;
>  	}
>  
>  	if (cand) {
> @@ -1206,10 +1191,7 @@ restart:
>  		}
>  	}
>  
> -	if (rthi)
> -		rt->u.dst.rt_next = rthi->u.dst.rt_next;
> -	else
> -		rt->u.dst.rt_next = rt_hash_table[hash].chain;
> +	rt->u.dst.rt_next = rt_hash_table[hash].chain;
>  
>  #if RT_CACHE_DEBUG >= 2
>  	if (rt->u.dst.rt_next) {
> @@ -1225,10 +1207,7 @@ restart:
>  	 * previous writes to rt are comitted to memory
>  	 * before making rt visible to other CPUS.
>  	 */
> -	if (rthi)
> -		rcu_assign_pointer(rthi->u.dst.rt_next, rt);
> -	else
> -		rcu_assign_pointer(rt_hash_table[hash].chain, rt);
> +	rcu_assign_pointer(rt_hash_table[hash].chain, rt);
>  
>  	spin_unlock_bh(rt_hash_lock_addr(hash));
>  	*rp = rt;
> 
Neil Horman May 20, 2009, 10:48 a.m. UTC | #2
On Wed, May 20, 2009 at 08:14:28AM +0200, Eric Dumazet wrote:
> Eric Dumazet wrote:
> > David Miller wrote:
> >> From: Neil Horman <nhorman@tuxdriver.com>
> >> Date: Tue, 19 May 2009 15:24:50 -0400
> >>
> >>>> Moving whole group in front would defeat the purpose of move, actually,
> >>>> since rank in chain is used to decay the timeout in garbage collector.
> >>>> (search for tmo >>= 1; )
> >>>>
> >>> Argh, so the list is implicitly ordered by expiration time.  That
> >>> really defeats the entire purpose of doing grouping in the list at
> >>> all.  If that's the case, then I agree, it's probably better to
> >>> take the additional visitation hit in check_expire above than to
> >>> try and preserve ordering.
> >> Yes, this seems best.
> >>
> >> I was worried that somehow the ordering also influences lookups,
> >> because the TOS bits don't go into the hash so I worried that it would
> >> be important that explicit TOS values appear before wildcard ones.
> >> But it doesn't appear that this is an issue, we don't have wildcard
> >> TOSs in the rtable entries, they are always explicit.
> >>
> >> So I would like to see an explicit final patch from Eric so we can get
> >> this fixed now.
> >>
> > 
> > I would like to split the patches because we have two bugs indeed, and
> > I prefer to get attention for both problems; I don't remember Neil acknowledging
> > the length computation problem.
> > 
> > First and small patch, candidate for net-2.6 and stable (for 2.6.29) :
> > 
> 
> Then here is the patch on top of the previous one.
> 
> Thanks to all
> 
> [PATCH] net: fix rtable leak in net/ipv4/route.c
> 
> Alexander V. Lukyanov found a regression in 2.6.29 and made a complete
> analysis, available at http://bugzilla.kernel.org/show_bug.cgi?id=13339
> Quoted here because it's a perfect one:
> 
> begin_of_quotation 
>  2.6.29 patch has introduced flexible route cache rebuilding. Unfortunately the
>  patch has at least one critical flaw, and another problem.
> 
>  rt_intern_hash calculates rthi pointer, which is later used for new entry
>  insertion. The same loop calculates cand pointer which is used to clean the
>  list. If the pointers are the same, rtable leak occurs, as first the cand is
>  removed then the new entry is appended to it.
> 
>  This leak leads to unregister_netdevice problem (usage count > 0).
> 
>  Another problem of the patch is that it tries to insert the entries in certain
>  order, to facilitate counting of entries distinct by all but QoS parameters.
>  Unfortunately, referencing an existing rtable entry moves it to list beginning,
>  to speed up further lookups, so the carefully built order is destroyed.
> 
>  For the first problem the simplest patch is to set rthi=0 when rthi==cand, but
>  it will also destroy the ordering.
> end_of_quotation
> 
> 
> Problematic commit is 1080d709fb9d8cd4392f93476ee46a9d6ea05a5b
> (net: implement emergency route cache rebulds when gc_elasticity is exceeded)
> 
> Trying to keep dst_entries ordered is too complex and conflicts with the fact
> that chain order should reflect frequency of use for garbage collection.
> 
> A possible fix is to make rt_intern_hash() simpler and make
> rt_check_expire() a little bit smarter, able to cope with an arbitrary
> entry order. The added loop runs on cache-hot data, while the cpu
> is prefetching the next object, so it should go unnoticed.
> 
> Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

Eric Dumazet May 20, 2009, 11:13 a.m. UTC | #3
Jarek Poplawski wrote:
> On Wed, May 20, 2009 at 08:14:28AM +0200, Eric Dumazet wrote:
>> +					for (aux = rt_hash_table[i].chain;;) {
>> +						if (aux == rth) {
>> +							length += ONE;
>> +							break;
>> +						}
>> +						if (compare_hash_inputs(&aux->fl, &rth->fl))
>> +							break;
>> +						aux = aux->u.dst.rt_next;
>> +					}
> 
> Very "interesting" for() usage, but isn't it more readable like this?:
> 
> 					aux = rt_hash_table[i].chain;
> 					while (aux != rth) {
> 						if (compare_hash_inputs(&aux->fl, &rth->fl))
> 							break;
> 						aux = aux->u.dst.rt_next;
> 					}

Well, this test is done twice; that is the difference...

> 
> 					if (aux == rth)
> 						length += ONE;
> 
> Jarek P.

I first wrote:

					for (aux = rt_hash_table[i].chain ; ; aux = aux->u.dst.rt_next) {
						if (aux == rth) {
							length += ONE;
							break;
						}
						if (compare_hash_inputs(&aux->fl, &rth->fl))
							break;
					}

but I had to split the overly long line, so I ended up with the form in the patch :)


Thank you


Jarek Poplawski May 20, 2009, 11:37 a.m. UTC | #4
On Wed, May 20, 2009 at 01:13:41PM +0200, Eric Dumazet wrote:
> Jarek Poplawski wrote:
> > On Wed, May 20, 2009 at 08:14:28AM +0200, Eric Dumazet wrote:
> >> +					for (aux = rt_hash_table[i].chain;;) {
> >> +						if (aux == rth) {
> >> +							length += ONE;
> >> +							break;
> >> +						}
> >> +						if (compare_hash_inputs(&aux->fl, &rth->fl))
> >> +							break;
> >> +						aux = aux->u.dst.rt_next;
> >> +					}
> > 
> > Very "interesting" for() usage, but isn't it more readable like this?:
> > 
> > 					aux = rt_hash_table[i].chain;
> > 					while (aux != rth) {
> > 						if (compare_hash_inputs(&aux->fl, &rth->fl))
> > 							break;
> > 						aux = aux->u.dst.rt_next;
> > 					}
> 
> Well, this test is done twice; that is the difference...

I know, but I guess this is used quite often, and it's probably not a
very hard optimization for a compiler (while - else). As a matter of
fact, even this would confuse me less here:
					aux = rt_hash_table[i].chain;
					for (;;) {

But of course, it's a matter of taste, so no big deal.

Jarek P.
David Miller May 21, 2009, 12:19 a.m. UTC | #5
From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed, 20 May 2009 06:48:38 -0400

> On Wed, May 20, 2009 at 08:14:28AM +0200, Eric Dumazet wrote:
>> [PATCH] net: fix rtable leak in net/ipv4/route.c
 ...
>> Reported-and-analyzed-by: Alexander V. Lukyanov <lav@yar.ru>
>> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> 

Also applied and queued up for -stable, thanks!

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 869cf1c..28205e5 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -784,7 +784,7 @@  static void rt_check_expire(void)
 {
 	static unsigned int rover;
 	unsigned int i = rover, goal;
-	struct rtable *rth, **rthp;
+	struct rtable *rth, *aux, **rthp;
 	unsigned long samples = 0;
 	unsigned long sum = 0, sum2 = 0;
 	u64 mult;
@@ -812,6 +812,7 @@  static void rt_check_expire(void)
 		length = 0;
 		spin_lock_bh(rt_hash_lock_addr(i));
 		while ((rth = *rthp) != NULL) {
+			prefetch(rth->u.dst.rt_next);
 			if (rt_is_expired(rth)) {
 				*rthp = rth->u.dst.rt_next;
 				rt_free(rth);
@@ -820,33 +821,30 @@  static void rt_check_expire(void)
 			if (rth->u.dst.expires) {
 				/* Entry is expired even if it is in use */
 				if (time_before_eq(jiffies, rth->u.dst.expires)) {
+nofree:
 					tmo >>= 1;
 					rthp = &rth->u.dst.rt_next;
 					/*
-					 * Only bump our length if the hash
-					 * inputs on entries n and n+1 are not
-					 * the same, we only count entries on
+					 * We only count entries on
 					 * a chain with equal hash inputs once
 					 * so that entries for different QOS
 					 * levels, and other non-hash input
 					 * attributes don't unfairly skew
 					 * the length computation
 					 */
-					if ((*rthp == NULL) ||
-					    !compare_hash_inputs(&(*rthp)->fl,
-								 &rth->fl))
-						length += ONE;
+					for (aux = rt_hash_table[i].chain;;) {
+						if (aux == rth) {
+							length += ONE;
+							break;
+						}
+						if (compare_hash_inputs(&aux->fl, &rth->fl))
+							break;
+						aux = aux->u.dst.rt_next;
+					}
 					continue;
 				}
-			} else if (!rt_may_expire(rth, tmo, ip_rt_gc_timeout)) {
-				tmo >>= 1;
-				rthp = &rth->u.dst.rt_next;
-				if ((*rthp == NULL) ||
-				    !compare_hash_inputs(&(*rthp)->fl,
-							 &rth->fl))
-					length += ONE;
-				continue;
-			}
+			} else if (!rt_may_expire(rth, tmo, ip_rt_gc_timeout))
+				goto nofree;
 
 			/* Cleanup aged off entries. */
 			*rthp = rth->u.dst.rt_next;
@@ -1069,7 +1067,6 @@  out:	return 0;
 static int rt_intern_hash(unsigned hash, struct rtable *rt, struct rtable **rp)
 {
 	struct rtable	*rth, **rthp;
-	struct rtable	*rthi;
 	unsigned long	now;
 	struct rtable *cand, **candp;
 	u32 		min_score;
@@ -1089,7 +1086,6 @@  restart:
 	}
 
 	rthp = &rt_hash_table[hash].chain;
-	rthi = NULL;
 
 	spin_lock_bh(rt_hash_lock_addr(hash));
 	while ((rth = *rthp) != NULL) {
@@ -1135,17 +1131,6 @@  restart:
 		chain_length++;
 
 		rthp = &rth->u.dst.rt_next;
-
-		/*
-		 * check to see if the next entry in the chain
-		 * contains the same hash input values as rt.  If it does
-		 * This is where we will insert into the list, instead of
-		 * at the head.  This groups entries that differ by aspects not
-		 * relvant to the hash function together, which we use to adjust
-		 * our chain length
-		 */
-		if (*rthp && compare_hash_inputs(&(*rthp)->fl, &rt->fl))
-			rthi = rth;
 	}
 
 	if (cand) {
@@ -1206,10 +1191,7 @@  restart:
 		}
 	}
 
-	if (rthi)
-		rt->u.dst.rt_next = rthi->u.dst.rt_next;
-	else
-		rt->u.dst.rt_next = rt_hash_table[hash].chain;
+	rt->u.dst.rt_next = rt_hash_table[hash].chain;
 
 #if RT_CACHE_DEBUG >= 2
 	if (rt->u.dst.rt_next) {
@@ -1225,10 +1207,7 @@  restart:
 	 * previous writes to rt are comitted to memory
 	 * before making rt visible to other CPUS.
 	 */
-	if (rthi)
-		rcu_assign_pointer(rthi->u.dst.rt_next, rt);
-	else
-		rcu_assign_pointer(rt_hash_table[hash].chain, rt);
+	rcu_assign_pointer(rt_hash_table[hash].chain, rt);
 
 	spin_unlock_bh(rt_hash_lock_addr(hash));
 	*rp = rt;