Message ID | 1275500802.2519.7.camel@edumazet-laptop |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, 2 Jun 2010, Eric Dumazet wrote: > Here is the patch I cooked to account for RP_FILTER errors in multicast > path. > > I will complete it to also do the unicast part before official > submission. > > Christoph, the official counter would be IPSTATS_MIB_INNOROUTES Great. Thanks. > ipSystemStatsInNoRoutes OBJECT-TYPE > SYNTAX Counter32 > MAX-ACCESS read-only > STATUS current > DESCRIPTION > "The number of input IP datagrams discarded because no route > could be found to transmit them to their destination. add "or because the rp_filter rejected the packet"? In the case of MC traffic you dont really need a route. In my particular case it is a weird corner case for the rp_filter. Two NICs are on the same subnet. Different multicast groups are joined on each (Using two NICs to balance the MC load since the drivers have some multicast limitations and having different interrupt lines for each NIC is also beneficial). The rp_filter rejects all multicast traffic to the subscriptions on the second NIC. I guess this is because the source address of the MC traffic (on the same subnet) is also reachable via the first NIC. So you could add also "because of breakage in the rp_filter (rp_filter ignores the multicast subscription tables when determining the correct reverse path of the packet)" -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le mercredi 02 juin 2010 à 13:01 -0500, Christoph Lameter a écrit : > On Wed, 2 Jun 2010, Eric Dumazet wrote: > > > Here is the patch I cooked to account for RP_FILTER errors in multicast > > path. > > > > I will complete it to also do the unicast part before official > > submission. > > > > Christoph, the official counter would be IPSTATS_MIB_INNOROUTES > > Great. Thanks. > > > ipSystemStatsInNoRoutes OBJECT-TYPE > > SYNTAX Counter32 > > MAX-ACCESS read-only > > STATUS current > > DESCRIPTION > > "The number of input IP datagrams discarded because no route > > could be found to transmit them to their destination. > > add "or because the rp_filter rejected the packet"? In the case of MC > traffic you dont really need a route. > Unicast trafic dont need a reverse route, if you only receive packets. rp_filter is an optional check, not covered by standard MIBS, so its borderline. > In my particular case it is a weird corner case for the rp_filter. > > Two NICs are on the same subnet. Different multicast groups are joined > on each (Using two NICs to balance the MC load since the drivers have > some multicast limitations and having different interrupt lines for each > NIC is also beneficial). > yeah, I know about this problem, and am working on it too... > The rp_filter rejects all multicast traffic to the subscriptions on the > second NIC. I guess this is because the source address of the MC traffic > (on the same subnet) is also reachable via the first NIC. > Its clearly a case were rp_filter should be set to 2, dont you think ? > So you could add also "because of breakage in the rp_filter (rp_filter > ignores the multicast subscription tables when determining the correct > reverse path of the packet)" > In standard RFC ? I wont change it :) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2 Jun 2010, Eric Dumazet wrote: > > In my particular case it is a weird corner case for the rp_filter. > > > > Two NICs are on the same subnet. Different multicast groups are joined > > on each (Using two NICs to balance the MC load since the drivers have > > some multicast limitations and having different interrupt lines for each > > NIC is also beneficial). > > > > yeah, I know about this problem, and am working on it too... > > > The rp_filter rejects all multicast traffic to the subscriptions on the > > second NIC. I guess this is because the source address of the MC traffic > > (on the same subnet) is also reachable via the first NIC. > > > > Its clearly a case were rp_filter should be set to 2, dont you think ? The rp_filter is rejecting traffic coming into a NIC for which the kernel has a multicast join list that indicates that this traffic is expected on this NIC. You could consult the MC subscription list to verify that the traffic is coming into the right NIC. In the MC case the user can explicitly specify through which NIC the traffic is expected. See IP_ADD_MEMBERSHIP. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 4f0ed45..f207289 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -284,7 +284,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif, if (no_addr) goto last_resort; if (rpf == 1) - goto e_inval; + goto e_rpf; fl.oif = dev->ifindex; ret = 0; @@ -299,7 +299,7 @@ int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif, last_resort: if (rpf) - goto e_inval; + goto e_rpf; *spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE); *itag = 0; return 0; @@ -308,6 +308,8 @@ e_inval_res: fib_res_put(&res); e_inval: return -EINVAL; +e_rpf: + return -ENETUNREACH; } static inline __be32 sk_extract_addr(struct sockaddr *addr) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 8495bce..8e9e2f9 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -1851,6 +1851,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr, __be32 spec_dst; struct in_device *in_dev = in_dev_get(dev); u32 itag = 0; + int err; /* Primary sanity checks. */ @@ -1865,10 +1866,12 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr, if (!ipv4_is_local_multicast(daddr)) goto e_inval; spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK); - } else if (fib_validate_source(saddr, 0, tos, 0, - dev, &spec_dst, &itag, 0) < 0) - goto e_inval; - + } else { + err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst, + &itag, 0); + if (err < 0) + goto e_err; + } rth = dst_alloc(&ipv4_dst_ops); if (!rth) goto e_nobufs; @@ -1922,6 +1925,9 @@ e_nobufs: e_inval: in_dev_put(in_dev); return -EINVAL; +e_err: + in_dev_put(in_dev); + return err; }