| Submitter | Julian Anastasov |
|---|---|
| Date | Oct. 7, 2012, 11:26 a.m. |
| Message ID | <1349609168-9848-2-git-send-email-ja@ssi.bg> |
| Download | mbox | patch |
| Permalink | /patch/189805/ |
| State | Changes Requested |
| Delegated to: | David Miller |
| Headers | show |
Comments
From: Julian Anastasov <ja@ssi.bg> Date: Sun, 7 Oct 2012 14:26:03 +0300 > @@ -322,7 +322,8 @@ int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst, > { > int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev); > > - if (!r && !fib_num_tclassid_users(dev_net(dev))) { > + if (!r && !fib_num_tclassid_users(dev_net(dev)) && > + dev->ifindex != oif) { > *itag = 0; > return 0; > } Hmmm, won't this cause the slow path to be taken for locally destined traffic? > + rt->rt_gateway ? : ip_hdr(skb)->daddr); Please use the rt_nexthop() helper. > + __be32 gw = rt->rt_gateway ? : ip_hdr(skb)->daddr; Likewise. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Julian Anastasov <ja@ssi.bg> Date: Mon, 8 Oct 2012 23:43:45 +0300 (EEST) > On Mon, 8 Oct 2012, David Miller wrote: > >> From: Julian Anastasov <ja@ssi.bg> >> Date: Sun, 7 Oct 2012 14:26:03 +0300 >> >> > @@ -322,7 +322,8 @@ int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst, >> > { >> > int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev); >> > >> > - if (!r && !fib_num_tclassid_users(dev_net(dev))) { >> > + if (!r && !fib_num_tclassid_users(dev_net(dev)) && >> > + dev->ifindex != oif) { >> > *itag = 0; >> > return 0; >> > } >> >> Hmmm, won't this cause the slow path to be taken for locally >> destined traffic? > > In this case idev=eth0 and oif=lo. The only case > where we can see same input and output device is for > forwarding and loopback (but only output routes where > there is no such validation). Ok, now I understand. This added condition is fine. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Mon, 8 Oct 2012, David Miller wrote: > From: Julian Anastasov <ja@ssi.bg> > Date: Sun, 7 Oct 2012 14:26:03 +0300 > > > @@ -322,7 +322,8 @@ int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst, > > { > > int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev); > > > > - if (!r && !fib_num_tclassid_users(dev_net(dev))) { > > + if (!r && !fib_num_tclassid_users(dev_net(dev)) && > > + dev->ifindex != oif) { > > *itag = 0; > > return 0; > > } > > Hmmm, won't this cause the slow path to be taken for locally > destined traffic? In this case idev=eth0 and oif=lo. The only case where we can see same input and output device is for forwarding and loopback (but only output routes where there is no such validation). Of course, it slows down on traffic that ignores our redirects. We can save some cycles if IN_DEV_TX_REDIRECTS is checked before setting RTCF_DOREDIRECT: (dev->ifindex != oif || !IN_DEV_TX_REDIRECTS(idev)) { RTCF_DOREDIRECT is used just to call ip_rt_send_redirect where all depends on IN_DEV_TX_REDIRECTS. The only difference can be that user space will not see RTCF_DOREDIRECT flag if IN_DEV_TX_REDIRECTS is false. But may be it is better to show "redirect" in ip route only when IN_DEV_TX_REDIRECTS are enabled. The old way is preferred only when there is routing cache and changes in IN_DEV_TX_REDIRECTS do not flush cache in devinet_conf_proc(). > > + rt->rt_gateway ? : ip_hdr(skb)->daddr); > > Please use the rt_nexthop() helper. > > > + __be32 gw = rt->rt_gateway ? : ip_hdr(skb)->daddr; Good idea. Regards -- Julian Anastasov <ja@ssi.bg> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Patch
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 68c93d1..6dacae6 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -322,7 +322,8 @@ int fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst, { int r = secpath_exists(skb) ? 0 : IN_DEV_RPFILTER(idev); - if (!r && !fib_num_tclassid_users(dev_net(dev))) { + if (!r && !fib_num_tclassid_users(dev_net(dev)) && + dev->ifindex != oif) { *itag = 0; return 0; } diff --git a/net/ipv4/route.c b/net/ipv4/route.c index ff62206..488a8bb 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -802,7 +802,8 @@ void ip_rt_send_redirect(struct sk_buff *skb) net = dev_net(rt->dst.dev); peer = inet_getpeer_v4(net->ipv4.peers, ip_hdr(skb)->saddr, 1); if (!peer) { - icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, rt->rt_gateway); + icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, + rt->rt_gateway ? : ip_hdr(skb)->daddr); return; } @@ -827,7 +828,9 @@ void ip_rt_send_redirect(struct sk_buff *skb) time_after(jiffies, (peer->rate_last + (ip_rt_redirect_load << peer->rate_tokens)))) { - icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, rt->rt_gateway); + __be32 gw = rt->rt_gateway ? : ip_hdr(skb)->daddr; + + icmp_send(skb, ICMP_REDIRECT, ICMP_REDIR_HOST, gw); peer->rate_last = jiffies; ++peer->rate_tokens; #ifdef CONFIG_IP_ROUTE_VERBOSE @@ -835,7 +838,7 @@ void ip_rt_send_redirect(struct sk_buff *skb) peer->rate_tokens == ip_rt_redirect_number) net_warn_ratelimited("host %pI4/if%d ignores redirects for %pI4 to %pI4\n", &ip_hdr(skb)->saddr, inet_iif(skb), - &ip_hdr(skb)->daddr, &rt->rt_gateway); + &ip_hdr(skb)->daddr, &gw); #endif } out_put_peer: @@ -1439,10 +1442,13 @@ static int __mkroute_input(struct sk_buff *skb, goto cleanup; } + do_cache = res->fi && !itag; if (out_dev == in_dev && err && (IN_DEV_SHARED_MEDIA(out_dev) || - inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res)))) + inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res)))) { flags |= RTCF_DOREDIRECT; + do_cache = false; + } if (skb->protocol != htons(ETH_P_IP)) { /* Not IP (i.e. ARP). Do not create route, if it is @@ -1459,15 +1465,11 @@ static int __mkroute_input(struct sk_buff *skb, } } - do_cache = false; - if (res->fi) { - if (!itag) { - rth = rcu_dereference(FIB_RES_NH(*res).nh_rth_input); - if (rt_cache_valid(rth)) { - skb_dst_set_noref(skb, &rth->dst); - goto out; - } - do_cache = true; + if (do_cache) { + rth = rcu_dereference(FIB_RES_NH(*res).nh_rth_input); + if (rt_cache_valid(rth)) { + skb_dst_set_noref(skb, &rth->dst); + goto out; } }
After "Cache input routes in fib_info nexthops" (commit d2d68ba9fe) and "Elide fib_validate_source() completely when possible" (commit 7a9bc9b81a) we can not send ICMP redirects. It seems we should not cache the RTCF_DOREDIRECT flag in nh_rth_input because the same fib_info can be used for traffic that is not redirected, eg. from other input devices or from sources that are not in same subnet. As result, we have to disable the caching of RTCF_DOREDIRECT flag and to force source validation for the case when forwarding traffic to the input device. If traffic comes from directly connected source we allow redirection as it was done before both changes. After the change "Adjust semantics of rt->rt_gateway" (commit f8126f1d51) we should make sure our ICMP_REDIR_HOST messages contain daddr instead of 0.0.0.0 when target is directly connected. Signed-off-by: Julian Anastasov <ja@ssi.bg> --- net/ipv4/fib_frontend.c | 3 ++- net/ipv4/route.c | 28 +++++++++++++++------------- 2 files changed, 17 insertions(+), 14 deletions(-)