diff mbox

[RFC,1/3] ipv6: Fix after pmtu events dissapearing host routes

Message ID 20150330103313.GJ3311@secunet.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Steffen Klassert March 30, 2015, 10:33 a.m. UTC
We currently don't clone host routes before we use them.
If a pmtu event is received on such a route, it gets
an expires value. As soon as the expiration time is
elapsed, the route is deleted. As a result, the host
es not reachable any more.

We fix this by cloning host routes if they are not local,
i.e. if pmtu events can happen. Also adjust the route
deletion because the cached and the original host route
are on the same fib node.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/route.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

shengyong March 30, 2015, 11:15 a.m. UTC | #1
On 3/30/2015 6:33 PM, Steffen Klassert wrote:
> We currently don't clone host routes before we use them.
> If a pmtu event is received on such a route, it gets
> an expires value. As soon as the expiration time is
> elapsed, the route is deleted. As a result, the host
> es not reachable any more.
> 
> We fix this by cloning host routes if they are not local,
> i.e. if pmtu events can happen. Also adjust the route
> deletion because the cached and the original host route
> are on the same fib node.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>  net/ipv6/route.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 4688bd4..4318626 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -961,7 +961,7 @@ redo_rt6_select:
>  
>  	if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
>  		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> -	else if (!(rt->dst.flags & DST_HOST))
> +	else if (!(rt->dst.flags & DST_HOST) || !(rt->rt6i_flags & RTF_LOCAL))
>  		nrt = rt6_alloc_clone(rt, &fl6->daddr);
>  	else
>  		goto out2;
> @@ -1796,6 +1796,8 @@ static int ip6_route_del(struct fib6_config *cfg)
>  				continue;
>  			if (cfg->fc_metric && cfg->fc_metric != rt->rt6i_metric)
>  				continue;
> +			if (!(cfg->fc_flags & RTF_CACHE) && (rt->rt6i_flags & RTF_CACHE))
There is a trivial coding style problem, which makes checkpatch.pl unhappy. The line is
over 80 characters.
> +				continue;
>  			dst_hold(&rt->dst);
>  			read_unlock_bh(&table->tb6_lock);
>  
> @@ -2433,6 +2435,9 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
>  	if (rtm->rtm_type == RTN_LOCAL)
>  		cfg->fc_flags |= RTF_LOCAL;
>  
> +	if (rtm->rtm_flags & RTM_F_CLONED)
> +		cfg->fc_flags |= RTF_CACHE;
> +
>  	cfg->fc_nlinfo.portid = NETLINK_CB(skb).portid;
>  	cfg->fc_nlinfo.nlh = nlh;
>  	cfg->fc_nlinfo.nl_net = sock_net(skb->sk);
> 

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Martin KaFai Lau March 30, 2015, 6:24 p.m. UTC | #2
On Mon, Mar 30, 2015 at 12:33:13PM +0200, Steffen Klassert wrote:
> We currently don't clone host routes before we use them.
> If a pmtu event is received on such a route, it gets
> an expires value. As soon as the expiration time is
> elapsed, the route is deleted. As a result, the host
> es not reachable any more.
s/es/is

> We fix this by cloning host routes if they are not local,
> i.e. if pmtu events can happen. Also adjust the route
> deletion because the cached and the original host route
> are on the same fib node.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>  net/ipv6/route.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 4688bd4..4318626 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -961,7 +961,7 @@ redo_rt6_select:
>  
>  	if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
>  		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> -	else if (!(rt->dst.flags & DST_HOST))
> +	else if (!(rt->dst.flags & DST_HOST) || !(rt->rt6i_flags & RTF_LOCAL))
>  		nrt = rt6_alloc_clone(rt, &fl6->daddr);
I suspect the nrt and rt will share the same inetpeer.  Hence, after the nrt is
removed, the MTU (obtained from PMTU update) will stay.

> @@ -1796,6 +1796,8 @@ static int ip6_route_del(struct fib6_config *cfg)
>  				continue;
>  			if (cfg->fc_metric && cfg->fc_metric != rt->rt6i_metric)
>  				continue;
> +			if (!(cfg->fc_flags & RTF_CACHE) && (rt->rt6i_flags & RTF_CACHE))
> +				continue;
>  			dst_hold(&rt->dst);
>  			read_unlock_bh(&table->tb6_lock);
>  
> @@ -2433,6 +2435,9 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
>  	if (rtm->rtm_type == RTN_LOCAL)
>  		cfg->fc_flags |= RTF_LOCAL;
>  
> +	if (rtm->rtm_flags & RTM_F_CLONED)
> +		cfg->fc_flags |= RTF_CACHE;
> +
>  	cfg->fc_nlinfo.portid = NETLINK_CB(skb).portid;
>  	cfg->fc_nlinfo.nlh = nlh;
>  	cfg->fc_nlinfo.nl_net = sock_net(skb->sk);
I had pulled your last patch to my local branch that I am cooking.  I also
added similar changes to ip6_route_del() and rtm_to_fib_config(), and confirmed
it can keep 'ip -6 r del' and 'ip -6 r flush table cache' to keep working
properly.

Thanks,
--Martin
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4688bd4..4318626 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -961,7 +961,7 @@  redo_rt6_select:
 
 	if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
 		nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
-	else if (!(rt->dst.flags & DST_HOST))
+	else if (!(rt->dst.flags & DST_HOST) || !(rt->rt6i_flags & RTF_LOCAL))
 		nrt = rt6_alloc_clone(rt, &fl6->daddr);
 	else
 		goto out2;
@@ -1796,6 +1796,8 @@  static int ip6_route_del(struct fib6_config *cfg)
 				continue;
 			if (cfg->fc_metric && cfg->fc_metric != rt->rt6i_metric)
 				continue;
+			if (!(cfg->fc_flags & RTF_CACHE) && (rt->rt6i_flags & RTF_CACHE))
+				continue;
 			dst_hold(&rt->dst);
 			read_unlock_bh(&table->tb6_lock);
 
@@ -2433,6 +2435,9 @@  static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (rtm->rtm_type == RTN_LOCAL)
 		cfg->fc_flags |= RTF_LOCAL;
 
+	if (rtm->rtm_flags & RTM_F_CLONED)
+		cfg->fc_flags |= RTF_CACHE;
+
 	cfg->fc_nlinfo.portid = NETLINK_CB(skb).portid;
 	cfg->fc_nlinfo.nlh = nlh;
 	cfg->fc_nlinfo.nl_net = sock_net(skb->sk);