diff mbox

[RFC] ipv4: Do not cache routing failures due to disabled forwarding.

Message ID 1410531260-13794-2-git-send-email-nicolas.cavallari@green-communications.fr
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Nicolas Cavallari Sept. 12, 2014, 2:14 p.m. UTC
If we cache them, the kernel will reuse them, independently of
whether forwarding is enabled or not.  Which means that if forwarding is
disabled on the input interface where the first routing request comes
from, then that unreachable result will be cached and reused for
other interfaces, even if forwarding is enabled on them.

This can be verified with two interfaces A and B and an output interface
C, where B has forwarding enabled, but not A and trying
ip route get $dst iif A from $src && ip route get $dst iif B from $src

Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
---
based on net-next, but not really tested on top of it.

 net/ipv4/route.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

Comments

Julian Anastasov Sept. 12, 2014, 10:13 p.m. UTC | #1
Hello,

On Fri, 12 Sep 2014, Nicolas Cavallari wrote:

> If we cache them, the kernel will reuse them, independently of
> whether forwarding is enabled or not.  Which means that if forwarding is
> disabled on the input interface where the first routing request comes
> from, then that unreachable result will be cached and reused for
> other interfaces, even if forwarding is enabled on them.
> 
> This can be verified with two interfaces A and B and an output interface
> C, where B has forwarding enabled, but not A and trying
> ip route get $dst iif A from $src && ip route get $dst iif B from $src

	Correct. While failed fib_lookup() does not set
res.fi in net/ipv4/fib_trie.c:check_leaf(), on fib_lookup()
success we have res.fi != NULL and it remains for the
!IN_DEV_FORWARD case (the 2nd 'goto no_route').

> Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
> ---
> based on net-next, but not really tested on top of it.
> 
>  net/ipv4/route.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 234a43e..b537997 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1655,7 +1655,7 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
>  	struct rtable	*rth;
>  	int		err = -EINVAL;
>  	struct net    *net = dev_net(dev);
> -	bool do_cache;
> +	bool do_cache = true;
>  
>  	/* IP on this device is disabled. */
>  
> @@ -1723,6 +1723,7 @@ static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
>  
>  	if (!IN_DEV_FORWARD(in_dev)) {
>  		err = -EHOSTUNREACH;
> +		do_cache = false;
>  		goto no_route;
>  	}
>  	if (res.type != RTN_UNICAST)
> @@ -1746,16 +1747,14 @@ brd_input:
>  	RT_CACHE_STAT_INC(in_brd);
>  
>  local_input:
> -	do_cache = false;
> -	if (res.fi) {
> -		if (!itag) {
> -			rth = rcu_dereference(FIB_RES_NH(res).nh_rth_input);
> -			if (rt_cache_valid(rth)) {
> -				skb_dst_set_noref(skb, &rth->dst);
> -				err = 0;
> -				goto out;
> -			}
> -			do_cache = true;
> +	if (!res.fi || itag) {
> +		do_cache = false;
> +	} else if (do_cache) {
> +		rth = rcu_dereference(FIB_RES_NH(res).nh_rth_input);
> +		if (rt_cache_valid(rth)) {
> +			skb_dst_set_noref(skb, &rth->dst);
> +			err = 0;
> +			goto out;
>  		}
>  	}
>  
> -- 
> 2.1.0

	Two alternatives are possible:

1. set res.fi = NULL after 'no_route:' label

or better

2. set do_cache = false after 'no_route:' label,
then instead of 'goto local_input;' jump to a new
label 'create_rt:' just before rt_dst_alloc.

	Not sure, they may generate less code in the fast path.

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Oct. 29, 2014, 7:03 p.m. UTC | #2
From: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
Date: Fri, 12 Sep 2014 16:14:20 +0200

> If we cache them, the kernel will reuse them, independently of
> whether forwarding is enabled or not.  Which means that if forwarding is
> disabled on the input interface where the first routing request comes
> from, then that unreachable result will be cached and reused for
> other interfaces, even if forwarding is enabled on them.
> 
> This can be verified with two interfaces A and B and an output interface
> C, where B has forwarding enabled, but not A and trying
> ip route get $dst iif A from $src && ip route get $dst iif B from $src
> 
> Signed-off-by: Nicolas Cavallari <nicolas.cavallari@green-communications.fr>
> ---
> based on net-next, but not really tested on top of it.

Sorry Nicolas, this seems to have fallen on the floor.  Could you please
resubmit your most uptodate version of this patch so I can apply it?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 234a43e..b537997 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1655,7 +1655,7 @@  static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	struct rtable	*rth;
 	int		err = -EINVAL;
 	struct net    *net = dev_net(dev);
-	bool do_cache;
+	bool do_cache = true;
 
 	/* IP on this device is disabled. */
 
@@ -1723,6 +1723,7 @@  static int ip_route_input_slow(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 
 	if (!IN_DEV_FORWARD(in_dev)) {
 		err = -EHOSTUNREACH;
+		do_cache = false;
 		goto no_route;
 	}
 	if (res.type != RTN_UNICAST)
@@ -1746,16 +1747,14 @@  brd_input:
 	RT_CACHE_STAT_INC(in_brd);
 
 local_input:
-	do_cache = false;
-	if (res.fi) {
-		if (!itag) {
-			rth = rcu_dereference(FIB_RES_NH(res).nh_rth_input);
-			if (rt_cache_valid(rth)) {
-				skb_dst_set_noref(skb, &rth->dst);
-				err = 0;
-				goto out;
-			}
-			do_cache = true;
+	if (!res.fi || itag) {
+		do_cache = false;
+	} else if (do_cache) {
+		rth = rcu_dereference(FIB_RES_NH(res).nh_rth_input);
+		if (rt_cache_valid(rth)) {
+			skb_dst_set_noref(skb, &rth->dst);
+			err = 0;
+			goto out;
 		}
 	}