Message ID | 20170724030907.GC2938@leo.usersys.redhat.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Sun, Jul 23, 2017 at 8:09 PM, Hangbin Liu <liuhangbin@gmail.com> wrote: > Do we still need this net->ipv6.ip6_null_entry check? How about remove all > the checks? I believe you only need to check for rt->dst.error, no need to check against NULL or ip6_null_entry. Take a look at other ip6_route_lookup() callers.
On Mon, Jul 24, 2017 at 12:57:43PM -0700, Cong Wang wrote: > On Sun, Jul 23, 2017 at 8:09 PM, Hangbin Liu <liuhangbin@gmail.com> wrote: > > Do we still need this net->ipv6.ip6_null_entry check? How about remove all > > the checks? > > I believe you only need to check for rt->dst.error, no need to check against > NULL or ip6_null_entry. > > Take a look at other ip6_route_lookup() callers. Yes, I saw it. That why I send v2 patch to check both rt->dst.error and ip6_null_entry. The question is the other two caller are rpfilter_lookup_reverse6() and nft_fib6_eval(). From the code it looks these two caller only care about device match. if (rt->rt6i_idev->dev == dev || (flags & XT_RPFILTER_LOOSE)) ret = true; And the device would be lo if it is ip6_null_entry. So they just discard it. I'm not familiar with netfilter, Please correct me if I make any mistake. But what we want in inet6_rtm_getroute() and rt6_dump_route() is to get/dump the route info. So we should get the info even it's unreachable or prohibit. That's why I think we should remove both rt->dst.error and ip6_null_entry check in inet6_rtm_getroute(). And even further, remove the ip6_null_entry check in rt6_dump_route(). What do you think? Thanks Hangbin
On 7/24/17 6:08 PM, Hangbin Liu wrote: > That's why I think we should remove both rt->dst.error and ip6_null_entry > check in inet6_rtm_getroute(). And even further, remove the ip6_null_entry > check in rt6_dump_route(). git blame net/ipv6/route.c find the commits and review.
On Mon, Jul 24, 2017 at 09:28:07PM -0600, David Ahern wrote: > On 7/24/17 6:08 PM, Hangbin Liu wrote: > > That's why I think we should remove both rt->dst.error and ip6_null_entry > > check in inet6_rtm_getroute(). And even further, remove the ip6_null_entry > > check in rt6_dump_route(). > > git blame net/ipv6/route.c > > find the commits and review. Hi David, Sorry, would you like to give me more hints? I saw your commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route dumps"). But I think this issue has been fixed by 2f460933f58e ("ipv6: initialize route null entry in addrconf_init()") and 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") I use a reproducer which add unreachable route in netns with lo down. And I could not trigger Panic in the fixed kernel. That's why I think we could remove ip6_null_entry check in rt6_dump_route(). Please correct me if I make any mistake. Thanks Hangbin
On Mon, Jul 24, 2017 at 5:08 PM, Hangbin Liu <liuhangbin@gmail.com> wrote: > But what we want in inet6_rtm_getroute() and rt6_dump_route() is to > get/dump the route info. So we should get the info even it's unreachable or > prohibit. If you want to dump prohibit/blackhole entry, then you have to check for null_entry, and rt->dst.error check is still needed because we could return error on other normal entries too, IOW, your v2 is correct if dumping prohibit/blackhole is expected. I thought we don't dump them. I am not sure about the behavior either.
On Tue, Jul 25, 2017 at 10:49:05AM -0700, Cong Wang wrote: > On Mon, Jul 24, 2017 at 5:08 PM, Hangbin Liu <liuhangbin@gmail.com> wrote: > > But what we want in inet6_rtm_getroute() and rt6_dump_route() is to > > get/dump the route info. So we should get the info even it's unreachable or > > prohibit. > > If you want to dump prohibit/blackhole entry, then you have to check > for null_entry, and rt->dst.error check is still needed because we I could not reproduce the NULL rt6i_idev issue after you route init fix, So I think it's also safe to dump null_entry now. There is a v3 patch. After the patch, we could dump unreachable route info like: + ip netns exec client ip -6 route get 2003::1 unreachable 2003::1 dev lo table unspec proto kernel metric 4294967295 error -101 > could return error on other normal entries too, IOW, your v2 is correct > if dumping prohibit/blackhole is expected. Would you hlep review the v3 patch? I prepare not touch the check in function rt6_dump_route() now. Thanks Hangbin
On 7/25/17 1:32 AM, Hangbin Liu wrote: > On Mon, Jul 24, 2017 at 09:28:07PM -0600, David Ahern wrote: >> On 7/24/17 6:08 PM, Hangbin Liu wrote: >>> That's why I think we should remove both rt->dst.error and ip6_null_entry >>> check in inet6_rtm_getroute(). And even further, remove the ip6_null_entry >>> check in rt6_dump_route(). >> >> git blame net/ipv6/route.c >> >> find the commits and review. > > Hi David, > > Sorry, would you like to give me more hints? > > I saw your commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route > dumps"). But I think this issue has been fixed by > > 2f460933f58e ("ipv6: initialize route null entry in addrconf_init()") and > 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") > > I use a reproducer which add unreachable route in netns with lo down. And I > could not trigger Panic in the fixed kernel. That's why I think we could > remove ip6_null_entry check in rt6_dump_route(). Understood. Cong's patch to fix the intialization order (lo device before route init) makes sure the idev is not null. That said, the null_entry route is internal ipv6 logic and is not a route entry to be returned to userspace.
On Wed, Jul 26, 2017 at 10:18 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/25/17 1:32 AM, Hangbin Liu wrote: >> On Mon, Jul 24, 2017 at 09:28:07PM -0600, David Ahern wrote: >>> On 7/24/17 6:08 PM, Hangbin Liu wrote: >>>> That's why I think we should remove both rt->dst.error and ip6_null_entry >>>> check in inet6_rtm_getroute(). And even further, remove the ip6_null_entry >>>> check in rt6_dump_route(). >>> >>> git blame net/ipv6/route.c >>> >>> find the commits and review. >> >> Hi David, >> >> Sorry, would you like to give me more hints? >> >> I saw your commit 1f17e2f2c8a8 ("net: ipv6: ignore null_entry on route >> dumps"). But I think this issue has been fixed by >> >> 2f460933f58e ("ipv6: initialize route null entry in addrconf_init()") and >> 242d3a49a2a1 ("ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf") >> >> I use a reproducer which add unreachable route in netns with lo down. And I >> could not trigger Panic in the fixed kernel. That's why I think we could >> remove ip6_null_entry check in rt6_dump_route(). > > Understood. Cong's patch to fix the intialization order (lo device > before route init) makes sure the idev is not null. That said, the > null_entry route is internal ipv6 logic and is not a route entry to be > returned to userspace. agreed...so looks like the check in v3 should be + if ( rt == net->ipv6.ip6_null_entry || + (rt->dst.error && + #ifdef CONFIG_IPV6_MULTIPLE_TABLES + rt != net->ipv6.ip6_prohibit_entry && + rt != net->ipv6.ip6_blk_hole_entry && +#endif + )) { err = rt->dst.error; ip6_rt_put(rt); goto errout;
On 7/26/17 12:27 PM, Roopa Prabhu wrote: > agreed...so looks like the check in v3 should be > > > + if ( rt == net->ipv6.ip6_null_entry || > + (rt->dst.error && > + #ifdef CONFIG_IPV6_MULTIPLE_TABLES > + rt != net->ipv6.ip6_prohibit_entry && > + rt != net->ipv6.ip6_blk_hole_entry && > +#endif > + )) { > err = rt->dst.error; > ip6_rt_put(rt); > goto errout; > I don't think so. If I add a prohibit route and use the fibmatch attribute, I want to see the route from the FIB that was matched.
On Wed, Jul 26, 2017 at 11:49 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/26/17 12:27 PM, Roopa Prabhu wrote: >> agreed...so looks like the check in v3 should be >> >> >> + if ( rt == net->ipv6.ip6_null_entry || >> + (rt->dst.error && >> + #ifdef CONFIG_IPV6_MULTIPLE_TABLES >> + rt != net->ipv6.ip6_prohibit_entry && >> + rt != net->ipv6.ip6_blk_hole_entry && >> +#endif >> + )) { >> err = rt->dst.error; >> ip6_rt_put(rt); >> goto errout; >> > > I don't think so. If I add a prohibit route and use the fibmatch > attribute, I want to see the route from the FIB that was matched. yes, exactly. wouldn't 'rt != net->ipv6.ip6_prohibit_entry' above let it fall through to the route fill code ? ah...but i guess you are saying that they will have rt6_info's of their own and will not match. got it. ack.
On Wed, Jul 26, 2017 at 11:49 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/26/17 12:27 PM, Roopa Prabhu wrote: >> agreed...so looks like the check in v3 should be >> >> >> + if ( rt == net->ipv6.ip6_null_entry || >> + (rt->dst.error && >> + #ifdef CONFIG_IPV6_MULTIPLE_TABLES >> + rt != net->ipv6.ip6_prohibit_entry && >> + rt != net->ipv6.ip6_blk_hole_entry && >> +#endif >> + )) { >> err = rt->dst.error; >> ip6_rt_put(rt); >> goto errout; >> > > I don't think so. If I add a prohibit route and use the fibmatch > attribute, I want to see the route from the FIB that was matched. But net->ipv6.ip6_prohibit_entry is not the prohibit route you can add in user-space, it is only used by rule actions. So do you really want to dump it?? My gut feeling is no, but I am definitely not sure. When you add a prohibit route, a new rt is allocated dynamically, net->ipv6.ip6_prohibit_entry is relatively static, internal and is the only one per netns. (Same for net->ipv6.ip6_blk_hole_entry) I think Hangbin's example doesn't have ip rules, so this case is not shown up.
On Thu, Jul 27, 2017 at 09:56:08PM -0700, Cong Wang wrote: > On Wed, Jul 26, 2017 at 11:49 AM, David Ahern <dsahern@gmail.com> wrote: > > On 7/26/17 12:27 PM, Roopa Prabhu wrote: > >> agreed...so looks like the check in v3 should be > >> > >> > >> + if ( rt == net->ipv6.ip6_null_entry || > >> + (rt->dst.error && > >> + #ifdef CONFIG_IPV6_MULTIPLE_TABLES > >> + rt != net->ipv6.ip6_prohibit_entry && > >> + rt != net->ipv6.ip6_blk_hole_entry && > >> +#endif > >> + )) { > >> err = rt->dst.error; > >> ip6_rt_put(rt); > >> goto errout; > >> > > > > I don't think so. If I add a prohibit route and use the fibmatch > > attribute, I want to see the route from the FIB that was matched. > > But net->ipv6.ip6_prohibit_entry is not the prohibit route you can > add in user-space, it is only used by rule actions. So do you really > want to dump it?? My gut feeling is no, but I am definitely not sure. > > When you add a prohibit route, a new rt is allocated dynamically, > net->ipv6.ip6_prohibit_entry is relatively static, internal and is the > only one per netns. (Same for net->ipv6.ip6_blk_hole_entry) > > I think Hangbin's example doesn't have ip rules, so this case > is not shown up. I mixed the rule entry and route entry these days. And with your help I can separate them now. When first time I find the rt->dst.error return directly issue, I was testing ip rule actually. e.g. + ip netns exec client ip -6 rule add to 2003::1/64 table 100 unreachable + ip netns exec server ip -6 rule add to 2001::1/64 table 100 prohibit + ip netns exec client ip -6 route get 2003::1 RTNETLINK answers: Network is unreachable + ip netns exec client ip -6 route get 2001::1 RTNETLINK answers: Permission denied After check I thought we returned net->ipv6.ip6_null_entry / net->ipv6.ip6_blk_hole_entry in function fib6_rule_action(). That's the reason I want to delete both rt->dst.error and net->ipv6.ip6_null_entry check in patch v2 and v3. Then with David's comments I realise we also need to take care about ip route entrys. my last mail's comment: > Thanks for your explains. Now I know where I made the mistake. I mis-looked > FR_ACT_UNREACHABLE to RTN_UNREACHABLE and thought we return rt = > net->ipv6.ip6_null_entry in fib6_rule_action(). But then I fall in to the code logic and get lost... And thought FR_ACT_UNREACHABLE and RTN_UNREACHABLE are not same. Today I re-check the code and realise RTN_UNREACHABLE is defined in user space and FR_ACT_UNREACHABLE is in kernel. Actually they are the same. So after PATCH v4, we fixed the route side. And part of ip rule(prohibit and blk hole). I will think over of this. Thanks Hangbin
On 7/27/17 10:56 PM, Cong Wang wrote: > On Wed, Jul 26, 2017 at 11:49 AM, David Ahern <dsahern@gmail.com> wrote: >> On 7/26/17 12:27 PM, Roopa Prabhu wrote: >>> agreed...so looks like the check in v3 should be >>> >>> >>> + if ( rt == net->ipv6.ip6_null_entry || >>> + (rt->dst.error && >>> + #ifdef CONFIG_IPV6_MULTIPLE_TABLES >>> + rt != net->ipv6.ip6_prohibit_entry && >>> + rt != net->ipv6.ip6_blk_hole_entry && >>> +#endif >>> + )) { >>> err = rt->dst.error; >>> ip6_rt_put(rt); >>> goto errout; >>> >> >> I don't think so. If I add a prohibit route and use the fibmatch >> attribute, I want to see the route from the FIB that was matched. > > But net->ipv6.ip6_prohibit_entry is not the prohibit route you can > add in user-space, it is only used by rule actions. So do you really > want to dump it?? My gut feeling is no, but I am definitely not sure. > > When you add a prohibit route, a new rt is allocated dynamically, > net->ipv6.ip6_prohibit_entry is relatively static, internal and is the > only one per netns. (Same for net->ipv6.ip6_blk_hole_entry) > > I think Hangbin's example doesn't have ip rules, so this case > is not shown up. > Understood. The v4 patch returns getroute to the original behavior. The original behavior returned a route entry not just an error code. The following is at 5dafc87f40d7 which the commit before roopa's patch set: # ip -6 ru add to 6000::/120 prohibit # ip -6 ro get 6000::1 prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric 4294967295 error -13 pref medium # ip -6 ro add vrf red prohibit 5000::1/120 # ip -6 ro get vrf red 5000::1 prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric 1024 error -13 pref medium Generically, the only time you get just an error response is when the lookup fails to find a match and returns the null_entry which has dst.error = -ENETUNREACH. Now to your point about the new fibmatch option I have gone back and forth but in the end I think returning the route associated with the FIB rule is better than just failing with an error code. Roopa?
On Fri, Jul 28, 2017 at 8:10 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/27/17 10:56 PM, Cong Wang wrote: >> On Wed, Jul 26, 2017 at 11:49 AM, David Ahern <dsahern@gmail.com> wrote: >>> On 7/26/17 12:27 PM, Roopa Prabhu wrote: >>>> agreed...so looks like the check in v3 should be >>>> >>>> >>>> + if ( rt == net->ipv6.ip6_null_entry || >>>> + (rt->dst.error && >>>> + #ifdef CONFIG_IPV6_MULTIPLE_TABLES >>>> + rt != net->ipv6.ip6_prohibit_entry && >>>> + rt != net->ipv6.ip6_blk_hole_entry && >>>> +#endif >>>> + )) { >>>> err = rt->dst.error; >>>> ip6_rt_put(rt); >>>> goto errout; >>>> >>> >>> I don't think so. If I add a prohibit route and use the fibmatch >>> attribute, I want to see the route from the FIB that was matched. >> >> But net->ipv6.ip6_prohibit_entry is not the prohibit route you can >> add in user-space, it is only used by rule actions. So do you really >> want to dump it?? My gut feeling is no, but I am definitely not sure. >> >> When you add a prohibit route, a new rt is allocated dynamically, >> net->ipv6.ip6_prohibit_entry is relatively static, internal and is the >> only one per netns. (Same for net->ipv6.ip6_blk_hole_entry) >> >> I think Hangbin's example doesn't have ip rules, so this case >> is not shown up. >> > > Understood. The v4 patch returns getroute to the original behavior. The > original behavior returned a route entry not just an error code. The > following is at 5dafc87f40d7 which the commit before roopa's patch set: > > # ip -6 ru add to 6000::/120 prohibit > # ip -6 ro get 6000::1 > prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric > 4294967295 error -13 pref medium > > # ip -6 ro add vrf red prohibit 5000::1/120 > # ip -6 ro get vrf red 5000::1 > prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric 1024 > error -13 pref medium > > Generically, the only time you get just an error response is when the > lookup fails to find a match and returns the null_entry which has > dst.error = -ENETUNREACH. > > > Now to your point about the new fibmatch option I have gone back and > forth but in the end I think returning the route associated with the FIB > rule is better than just failing with an error code. > > Roopa? for fibmatch, my original intent was to return with an error code. This is similar to the ipv4 behavior. One option is to keep the check in there and put the 'fibmatch' condition around it. But, i do want to make sure that for the fibmatch case, it does not return an error directly on an existing prohibit route entry in the fib. This is probably doable by checking for appropriate net->ipv6.ip6_prohibit_entry entries.
On 7/28/17 11:13 AM, Roopa Prabhu wrote: > for fibmatch, my original intent was to return with an error code. > This is similar > to the ipv4 behavior. One option is to keep the check in there and put > the 'fibmatch' > condition around it. But, i do want to make sure that for the fibmatch case, > it does not return an error directly on an existing prohibit route > entry in the fib. > This is probably doable by checking for appropriate > net->ipv6.ip6_prohibit_entry entries. > IPv4 does not have the notion of null_entry or prohibit route entries which makes IPv4 and IPv6 inconsistent - something we really need to be avoiding from a user experience. We have the following cases: # ip -4 rule add to 172.16.60.0/24 prohibit # ip -4 route add prohibit 172.16.50.0/24 # ip -6 rule add to 6000::/120 prohibit # ip -6 route add prohibit 5000::/120 Behavior before Roopa's patch set: Rule match: # ip ro get 172.16.60.1 RTNETLINK answers: Permission denied # ip -6 ro get 6000::1 prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric 4294967295 error -13 pref medium Route match: # ip ro get 172.16.50.1 RTNETLINK answers: Permission denied # ip -6 ro get 5000::1 prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric 1024 error -13 pref medium Behavior after Roopa's patch set: Rule match: # ip ro get 172.16.60.1 RTNETLINK answers: Permission denied # ip -6 ro get 6000::1 RTNETLINK answers: Permission denied Route match: # ip ro get 172.16.50.1 RTNETLINK answers: Permission denied # ip -6 ro get 5000::1 RTNETLINK answers: Permission denied So Roopa's fibmatch patches brings consistency between IPv4 and IPv6 at the cost of breaking backwards compatibility for IPv6 when the prohibit or blackhole routes are hit. If that is not acceptable, then let's wrap the change in 'if (fibmatch)' so that when fibmatch is requested we have consistency between IPv4 and IPv6 when it is set.
On Fri, Jul 28, 2017 at 10:39 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/28/17 11:13 AM, Roopa Prabhu wrote: >> for fibmatch, my original intent was to return with an error code. >> This is similar >> to the ipv4 behavior. One option is to keep the check in there and put >> the 'fibmatch' >> condition around it. But, i do want to make sure that for the fibmatch case, >> it does not return an error directly on an existing prohibit route >> entry in the fib. >> This is probably doable by checking for appropriate >> net->ipv6.ip6_prohibit_entry entries. >> > > IPv4 does not have the notion of null_entry or prohibit route entries > which makes IPv4 and IPv6 inconsistent - something we really need to be > avoiding from a user experience. > > We have the following cases: > > # ip -4 rule add to 172.16.60.0/24 prohibit > # ip -4 route add prohibit 172.16.50.0/24 > # ip -6 rule add to 6000::/120 prohibit > # ip -6 route add prohibit 5000::/120 > > > Behavior before Roopa's patch set: > Rule match: > # ip ro get 172.16.60.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 6000::1 > prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric > 4294967295 error -13 pref medium > > Route match: > # ip ro get 172.16.50.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 5000::1 > prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric > 1024 error -13 pref medium > > > Behavior after Roopa's patch set: > Rule match: > # ip ro get 172.16.60.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 6000::1 > RTNETLINK answers: Permission denied > > Route match: > # ip ro get 172.16.50.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 5000::1 > RTNETLINK answers: Permission denied > > > So Roopa's fibmatch patches brings consistency between IPv4 and IPv6 at > the cost of breaking backwards compatibility for IPv6 when the prohibit > or blackhole routes are hit. > > If that is not acceptable, then let's wrap the change in 'if (fibmatch)' > so that when fibmatch is requested we have consistency between IPv4 and > IPv6 when it is set. David, Thanks for listing all the cases and options. for the route match fibmatch case, if a prohibit route entry exists (added by user), I was hoping fibmatch can return that entry... # ip -6 ro get fibmatch 5000::1 prohibit 5000::1 from :: dev lo because the semantics of fibmatch is to return the matching route entry if exists. I am assuming that is possible with appropriate checks around the dst.error check for fibmatch. what do you say ? I need to verify if this can work for ipv4 the same way.
On 7/28/17 1:52 PM, Roopa Prabhu wrote: > On Fri, Jul 28, 2017 at 10:39 AM, David Ahern <dsahern@gmail.com> wrote: >> IPv4 does not have the notion of null_entry or prohibit route entries >> which makes IPv4 and IPv6 inconsistent - something we really need to be >> avoiding from a user experience. >> >> We have the following cases: >> >> # ip -4 rule add to 172.16.60.0/24 prohibit >> # ip -4 route add prohibit 172.16.50.0/24 >> # ip -6 rule add to 6000::/120 prohibit >> # ip -6 route add prohibit 5000::/120 >> >> >> Behavior before Roopa's patch set: >> Rule match: >> # ip ro get 172.16.60.1 >> RTNETLINK answers: Permission denied >> >> # ip -6 ro get 6000::1 >> prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric >> 4294967295 error -13 pref medium >> >> Route match: >> # ip ro get 172.16.50.1 >> RTNETLINK answers: Permission denied >> >> # ip -6 ro get 5000::1 >> prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric >> 1024 error -13 pref medium >> >> >> Behavior after Roopa's patch set: >> Rule match: >> # ip ro get 172.16.60.1 >> RTNETLINK answers: Permission denied >> >> # ip -6 ro get 6000::1 >> RTNETLINK answers: Permission denied >> >> Route match: >> # ip ro get 172.16.50.1 >> RTNETLINK answers: Permission denied >> >> # ip -6 ro get 5000::1 >> RTNETLINK answers: Permission denied >> >> >> So Roopa's fibmatch patches brings consistency between IPv4 and IPv6 at >> the cost of breaking backwards compatibility for IPv6 when the prohibit >> or blackhole routes are hit. >> >> If that is not acceptable, then let's wrap the change in 'if (fibmatch)' >> so that when fibmatch is requested we have consistency between IPv4 and >> IPv6 when it is set. > > > David, Thanks for listing all the cases and options. > > for the route match fibmatch case, if a prohibit route entry exists > (added by user), I was hoping fibmatch can return that entry... > > # ip -6 ro get fibmatch 5000::1 > prohibit 5000::1 from :: dev lo > > because the semantics of fibmatch is to return the matching route > entry if exists. > I am assuming that is possible with appropriate checks around the > dst.error check for fibmatch. what do you say ? > I need to verify if this can work for ipv4 the same way. > Routes such as prohibit, blackhole and unreachable cause fib_table_lookup to return an error (see the fib_props reference in fib_table_lookup) and the IPv4 code does not generate an rtable for them. IMO it does not make sense to complicate the general IPv4 lookup code to create an rtable for these 'special' routes just for the fibmatch getroute case. Furthermore, I think consistency between IPv4 and IPv6 is important from a programmability perspective which is what we have now. DaveM: as I outlined above, Roopa's fibmatch patches caused a change in user behavior in IPv6 getroute for prohibit, blackhole and unreachable route entries. Opinions on whether we should limit that new behavior to just the fibmatch lookup in which case a patch is needed or take the new behavior and consistency in which case nothing is needed?
On Fri, Jul 28, 2017 at 10:39 AM, David Ahern <dsahern@gmail.com> wrote: > On 7/28/17 11:13 AM, Roopa Prabhu wrote: >> for fibmatch, my original intent was to return with an error code. >> This is similar >> to the ipv4 behavior. One option is to keep the check in there and put >> the 'fibmatch' >> condition around it. But, i do want to make sure that for the fibmatch case, >> it does not return an error directly on an existing prohibit route >> entry in the fib. >> This is probably doable by checking for appropriate >> net->ipv6.ip6_prohibit_entry entries. >> > > IPv4 does not have the notion of null_entry or prohibit route entries > which makes IPv4 and IPv6 inconsistent - something we really need to be > avoiding from a user experience. > > We have the following cases: > > # ip -4 rule add to 172.16.60.0/24 prohibit > # ip -4 route add prohibit 172.16.50.0/24 > # ip -6 rule add to 6000::/120 prohibit > # ip -6 route add prohibit 5000::/120 > > > Behavior before Roopa's patch set: > Rule match: > # ip ro get 172.16.60.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 6000::1 > prohibit 6000::1 from :: dev lo proto kernel src 2001:db8::3 metric > 4294967295 error -13 pref medium > > Route match: > # ip ro get 172.16.50.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 5000::1 > prohibit 5000::1 from :: dev lo table red src 2001:db8::3 metric > 1024 error -13 pref medium > > > Behavior after Roopa's patch set: > Rule match: > # ip ro get 172.16.60.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 6000::1 > RTNETLINK answers: Permission denied > > Route match: > # ip ro get 172.16.50.1 > RTNETLINK answers: Permission denied > > # ip -6 ro get 5000::1 > RTNETLINK answers: Permission denied > There must be a reason why we allocate prohibit entries dynamically for IPv6 despite we already have a (relatively) static one. From this point of view, we need to dump them, that is, restore the behavior before Roopa's patch. > > So Roopa's fibmatch patches brings consistency between IPv4 and IPv6 at > the cost of breaking backwards compatibility for IPv6 when the prohibit > or blackhole routes are hit. > There are already many differences between IPv4 and IPv6 behaviors, I don't see why this one is so special that we have to make it consistent.
On 7/31/17 12:37 PM, Cong Wang wrote: > >> So Roopa's fibmatch patches brings consistency between IPv4 and IPv6 at >> the cost of breaking backwards compatibility for IPv6 when the prohibit >> or blackhole routes are hit. >> > There are already many differences between IPv4 and > IPv6 behaviors, I don't see why this one is so special > that we have to make it consistent. yes, and I have sent a number of patches closing the gap.
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 4d30c96..b5e3fe0 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3637,17 +3637,6 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, dst = ip6_route_lookup(net, &fl6, 0); rt = container_of(dst, struct rt6_info, dst); - if (rt->dst.error) { - err = rt->dst.error; - ip6_rt_put(rt); - goto errout; - } - - if (rt == net->ipv6.ip6_null_entry) { - err = rt->dst.error; - ip6_rt_put(rt); - goto errout; - } skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); if (!skb) {