Patchwork IP: Increment INADDRERRORS if routing for a packet is not successful

login
register
mail settings
Submitter Eric Dumazet
Date June 2, 2010, 5:46 p.m.
Message ID <1275500802.2519.7.camel@edumazet-laptop>
Download mbox | patch
Permalink /patch/54407/
State RFC
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - June 2, 2010, 5:46 p.m.
Le mercredi 02 juin 2010 à 10:31 -0700, David Miller a écrit :
> Just in case people are really so clueless as to be unable to figure
> this out:
> 
> echo 1 >/sys/kernel/debug/tracing/events/skb/kfree_skb/enable
> ...do some stuff...
> cat /sys/kernel/debug/tracing/trace
> 
> You can even trace it using 'perf' by passing "skb:kfree_skb"
> as the event specifier.

Thanks !

Here is the patch I cooked to account for RP_FILTER errors in multicast
path.

I will complete it to also do the unicast part before official
submission.

Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

ipSystemStatsInNoRoutes OBJECT-TYPE
    SYNTAX     Counter32
    MAX-ACCESS read-only
    STATUS     current
    DESCRIPTION
           "The number of input IP datagrams discarded because no route
            could be found to transmit them to their destination.





--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Lameter - June 2, 2010, 6:01 p.m.
On Wed, 2 Jun 2010, Eric Dumazet wrote:

> Here is the patch I cooked to account for RP_FILTER errors in multicast
> path.
>
> I will complete it to also do the unicast part before official
> submission.
>
> Christoph, the official counter would be IPSTATS_MIB_INNOROUTES

Great. Thanks.

> ipSystemStatsInNoRoutes OBJECT-TYPE
>     SYNTAX     Counter32
>     MAX-ACCESS read-only
>     STATUS     current
>     DESCRIPTION
>            "The number of input IP datagrams discarded because no route
>             could be found to transmit them to their destination.

add "or because the rp_filter rejected the packet"? In the case of MC
traffic you dont really need a route.

In my particular case it is a weird corner case for the rp_filter.

Two NICs are on the same subnet. Different multicast groups are joined
on each (Using two NICs to balance the MC load since the drivers have
some multicast limitations and having different interrupt lines for each
NIC is also beneficial).

The rp_filter rejects all multicast traffic to the subscriptions on the
second NIC. I guess this is because the source address of the MC traffic
(on the same subnet) is also reachable via the first NIC.

So you could add also "because of breakage in the rp_filter (rp_filter
ignores the multicast subscription tables when determining the correct
reverse path of the packet)"

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - June 2, 2010, 6:41 p.m.
Le mercredi 02 juin 2010 à 13:01 -0500, Christoph Lameter a écrit :
> On Wed, 2 Jun 2010, Eric Dumazet wrote:
> 
> > Here is the patch I cooked to account for RP_FILTER errors in multicast
> > path.
> >
> > I will complete it to also do the unicast part before official
> > submission.
> >
> > Christoph, the official counter would be IPSTATS_MIB_INNOROUTES
> 
> Great. Thanks.
> 
> > ipSystemStatsInNoRoutes OBJECT-TYPE
> >     SYNTAX     Counter32
> >     MAX-ACCESS read-only
> >     STATUS     current
> >     DESCRIPTION
> >            "The number of input IP datagrams discarded because no route
> >             could be found to transmit them to their destination.
> 
> add "or because the rp_filter rejected the packet"? In the case of MC
> traffic you dont really need a route.
> 

Unicast trafic dont need a reverse route, if you only receive packets.

rp_filter is an optional check, not covered by standard MIBS, so its
borderline.


> In my particular case it is a weird corner case for the rp_filter.
> 
> Two NICs are on the same subnet. Different multicast groups are joined
> on each (Using two NICs to balance the MC load since the drivers have
> some multicast limitations and having different interrupt lines for each
> NIC is also beneficial).
> 

yeah, I know about this problem, and am working on it too...

> The rp_filter rejects all multicast traffic to the subscriptions on the
> second NIC. I guess this is because the source address of the MC traffic
> (on the same subnet) is also reachable via the first NIC.
> 

Its clearly a case were rp_filter should be set to 2, dont you think ?

> So you could add also "because of breakage in the rp_filter (rp_filter
> ignores the multicast subscription tables when determining the correct
> reverse path of the packet)"
> 

In standard RFC ? I wont change it :)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Lameter - June 2, 2010, 6:59 p.m.
On Wed, 2 Jun 2010, Eric Dumazet wrote:

> > In my particular case it is a weird corner case for the rp_filter.
> >
> > Two NICs are on the same subnet. Different multicast groups are joined
> > on each (Using two NICs to balance the MC load since the drivers have
> > some multicast limitations and having different interrupt lines for each
> > NIC is also beneficial).
> >
>
> yeah, I know about this problem, and am working on it too...
>
> > The rp_filter rejects all multicast traffic to the subscriptions on the
> > second NIC. I guess this is because the source address of the MC traffic
> > (on the same subnet) is also reachable via the first NIC.
> >
>
> Its clearly a case were rp_filter should be set to 2, dont you think ?

The rp_filter is rejecting traffic coming into a NIC for which the kernel
has a multicast join list that indicates that this traffic is expected on
this NIC. You could consult the MC subscription list to verify that the
traffic is coming into the right NIC.

In the MC case the user can explicitly specify through which NIC the
traffic is expected. See IP_ADD_MEMBERSHIP.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 4f0ed45..f207289 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -284,7 +284,7 @@  int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 	if (no_addr)
 		goto last_resort;
 	if (rpf == 1)
-		goto e_inval;
+		goto e_rpf;
 	fl.oif = dev->ifindex;
 
 	ret = 0;
@@ -299,7 +299,7 @@  int fib_validate_source(__be32 src, __be32 dst, u8 tos, int oif,
 
 last_resort:
 	if (rpf)
-		goto e_inval;
+		goto e_rpf;
 	*spec_dst = inet_select_addr(dev, 0, RT_SCOPE_UNIVERSE);
 	*itag = 0;
 	return 0;
@@ -308,6 +308,8 @@  e_inval_res:
 	fib_res_put(&res);
 e_inval:
 	return -EINVAL;
+e_rpf:
+	return -ENETUNREACH;
 }
 
 static inline __be32 sk_extract_addr(struct sockaddr *addr)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8495bce..8e9e2f9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1851,6 +1851,7 @@  static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	__be32 spec_dst;
 	struct in_device *in_dev = in_dev_get(dev);
 	u32 itag = 0;
+	int err;
 
 	/* Primary sanity checks. */
 
@@ -1865,10 +1866,12 @@  static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 		if (!ipv4_is_local_multicast(daddr))
 			goto e_inval;
 		spec_dst = inet_select_addr(dev, 0, RT_SCOPE_LINK);
-	} else if (fib_validate_source(saddr, 0, tos, 0,
-					dev, &spec_dst, &itag, 0) < 0)
-		goto e_inval;
-
+	} else {
+		err = fib_validate_source(saddr, 0, tos, 0, dev, &spec_dst,
+					  &itag, 0);
+		if (err < 0)
+			goto e_err;
+	}
 	rth = dst_alloc(&ipv4_dst_ops);
 	if (!rth)
 		goto e_nobufs;
@@ -1922,6 +1925,9 @@  e_nobufs:
 e_inval:
 	in_dev_put(in_dev);
 	return -EINVAL;
+e_err:
+	in_dev_put(in_dev);
+	return err;
 }