diff mbox

[net-next] udp: increment UDP_NO_PORTS when dropping unmatched multicasts

Message ID 20141001151921.7131E29003A2@tardy
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Rick Jones Oct. 1, 2014, 3:19 p.m. UTC
From: Rick Jones <rick.jones2@hp.com>

When there is absolutely no match for an incoming UDP multicast we
should increment a suitable UDP statistic in addition to the kfree_skb().
When there were some matches, we should not and simply call consume_skb().

Signed-off-by: Rick Jones <rick.jones2@hp.com>

---

Noticed __udp4_lib_mcast_deliver showing-up in a perf dropped packet
profile on a system sitting on a network with a bunch of Windows boxes
sending what they are fond of sending.

Verified that UDP_MIB_NOPORTS increments when it was not before and hit
the system with a bit of netperf multicast UDP_RR but that is the extent
of the testing performed.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Oct. 1, 2014, 3:41 p.m. UTC | #1
On Wed, 2014-10-01 at 08:19 -0700, Rick Jones wrote:
> From: Rick Jones <rick.jones2@hp.com>
> 
> When there is absolutely no match for an incoming UDP multicast we
> should increment a suitable UDP statistic in addition to the kfree_skb().
> When there were some matches, we should not and simply call consume_skb().
> 
> Signed-off-by: Rick Jones <rick.jones2@hp.com>
> 
> ---
> 
> Noticed __udp4_lib_mcast_deliver showing-up in a perf dropped packet
> profile on a system sitting on a network with a bunch of Windows boxes
> sending what they are fond of sending.
> 
> Verified that UDP_MIB_NOPORTS increments when it was not before and hit
> the system with a bit of netperf multicast UDP_RR but that is the extent
> of the testing performed.
> 
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index cd0db54..376e3d3 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1656,6 +1656,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
>  	int dif = skb->dev->ifindex;
>  	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
>  	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
> +	unsigned int inner_flushed = 0;
>  
>  	if (use_hash2) {
>  		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
> @@ -1694,8 +1695,12 @@ start_lookup:
>  	 */
>  	if (count) {
>  		flush_stack(stack, count, skb, count - 1);
> -	} else {
> +	} else if (!inner_flushed) {
> +		UDP_INC_STATS_BH(net, UDP_MIB_NOPORTS, 0);
>  		kfree_skb(skb);
> +	} else {
> +		/* there were matches flushed in the for_each */
> +		consume_skb(skb);
>  	}
>  	return 0;
>  }

It seems you imply UDP Lite do not use multicast ?

I believe we allow them, so you need to add a 'int proto' parameter to
__udp4_lib_mcast_deliver()

Also, please take care of IPv6 in the same time/patch.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David L Stevens Oct. 1, 2014, 3:54 p.m. UTC | #2
I think this would have the unpleasant side-effect of incrementing a drop
stat when we have not joined a multicast group that has UDP traffic, but the
interface is in promiscuous mode. Also, false positives for the multicast
address filter.

Multicast address filters are not perfect matches, so it is "normal" to receive
multicasts and broadcasts that are not addressed to our host. I'm not sure those
should count as "noports" any more than traffic addressed to someone else's IP
address if we're in promiscuous mode should.

In the multicast case, it would really only make sense if we have actually joined
the group it's addressed to.

						+-DLS

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rick Jones Oct. 1, 2014, 4:32 p.m. UTC | #3
On 10/01/2014 08:54 AM, David L Stevens wrote:
> I think this would have the unpleasant side-effect of incrementing a drop
> stat when we have not joined a multicast group that has UDP traffic, but the
> interface is in promiscuous mode. Also, false positives for the multicast
> address filter.
>
> Multicast address filters are not perfect matches, so it is "normal" to receive
> multicasts and broadcasts that are not addressed to our host. I'm not sure those
> should count as "noports" any more than traffic addressed to someone else's IP
> address if we're in promiscuous mode should.
>
> In the multicast case, it would really only make sense if we have actually joined
> the group it's addressed to.

Do you think an "ignored" statistic would be appropriate then?

rick
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David L Stevens Oct. 1, 2014, 4:59 p.m. UTC | #4
On 10/01/2014 12:32 PM, Rick Jones wrote:

> 
> Do you think an "ignored" statistic would be appropriate then?

I guess I don't know what that means. If we didn't join the group, it
should be treated like anything we'd receive in promiscuous mode that
is not addressed to us. I'd assume interface stats are going up, but not
any protocol stats in that case. (already the case)

If we did join the group, it should be treated the way we would, say,
a broadcast UDP packet where we have no listener on that port.

It isn't clear to me that maintaining the stat is worth checking for
group membership, but we wouldn't want an admin to conclude there is
a problem or an error because we increment a stat when doing the
equivalent of hardware MAC address filtering, just because multicast
address filters are defined to be leaky.

						+-DLS

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sergei Shtylyov Oct. 1, 2014, 5:14 p.m. UTC | #5
Hello.

On 10/01/2014 07:19 PM, Rick Jones wrote:

> From: Rick Jones <rick.jones2@hp.com>

> When there is absolutely no match for an incoming UDP multicast we
> should increment a suitable UDP statistic in addition to the kfree_skb().
> When there were some matches, we should not and simply call consume_skb().

> Signed-off-by: Rick Jones <rick.jones2@hp.com>

> ---

> Noticed __udp4_lib_mcast_deliver showing-up in a perf dropped packet
> profile on a system sitting on a network with a bunch of Windows boxes
> sending what they are fond of sending.

> Verified that UDP_MIB_NOPORTS increments when it was not before and hit
> the system with a bit of netperf multicast UDP_RR but that is the extent
> of the testing performed.

> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index cd0db54..376e3d3 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1656,6 +1656,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
>   	int dif = skb->dev->ifindex;
>   	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
>   	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
> +	unsigned int inner_flushed = 0;
>
>   	if (use_hash2) {
>   		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
> @@ -1694,8 +1695,12 @@ start_lookup:
>   	 */
>   	if (count) {
>   		flush_stack(stack, count, skb, count - 1);
> -	} else {
> +	} else if (!inner_flushed) {

    I don't see where this new variable is changed. It's always 0 here.

> +		UDP_INC_STATS_BH(net, UDP_MIB_NOPORTS, 0);
>   		kfree_skb(skb);
> +	} else {
> +		/* there were matches flushed in the for_each */
> +		consume_skb(skb);
>   	}
>   	return 0;
>   }

WBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rick Jones Oct. 1, 2014, 5:32 p.m. UTC | #6
On 10/01/2014 09:59 AM, David L Stevens wrote:

> On 10/01/2014 12:32 PM, Rick Jones wrote:
>
>>
>> Do you think an "ignored" statistic would be appropriate then?
>
> I guess I don't know what that means.

It would be an added statistic for "ignored" UDP multicast datagrams, 
incremented instead of UDP_MIB_NOPORTS.  "UDP_MIB_IGNOREDMULTI" if you 
will.

Similar in concept to what HP-UX NIC drivers would increment when they 
received a frame for which there was no bound protocol - "inbound 
unknown protocols" but not a drop 
http://ptgmedia.pearsoncmg.com/images/chap1_0130428167/elementLinks/01fig16.gif

rick
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David L Stevens Oct. 1, 2014, 5:43 p.m. UTC | #7
On 10/01/2014 01:32 PM, Rick Jones wrote:

> It would be an added statistic for "ignored" UDP multicast datagrams, incremented instead of UDP_MIB_NOPORTS.  "UDP_MIB_IGNOREDMULTI" if you will.
> 
> Similar in concept to what HP-UX NIC drivers would increment when they received a frame for which there was no bound protocol - "inbound unknown protocols" but not a drop
> http://ptgmedia.pearsoncmg.com/images/chap1_0130428167/elementLinks/01fig16.gif

I guess I'm ok with that.

Ideally, it wouldn't be affected by running a sniffer, so I think best would be to increment
NOPORTS when someone has joined the group on the interface and IGNOREDMULTI when nobody has,
but I'm not sure it's worth the trouble to check. So I'm good with that, or IGNOREDMULTI all
the time.

								+-DLS

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rick Jones Oct. 1, 2014, 5:51 p.m. UTC | #8
On 10/01/2014 10:43 AM, David L Stevens wrote:
>
>
> On 10/01/2014 01:32 PM, Rick Jones wrote:
>
>> It would be an added statistic for "ignored" UDP multicast datagrams, incremented instead of UDP_MIB_NOPORTS.  "UDP_MIB_IGNOREDMULTI" if you will.
>>
>> Similar in concept to what HP-UX NIC drivers would increment when they received a frame for which there was no bound protocol - "inbound unknown protocols" but not a drop
>> http://ptgmedia.pearsoncmg.com/images/chap1_0130428167/elementLinks/01fig16.gif
>
> I guess I'm ok with that.
>
> Ideally, it wouldn't be affected by running a sniffer, so I think best would be to increment
> NOPORTS when someone has joined the group on the interface and IGNOREDMULTI when nobody has,
> but I'm not sure it's worth the trouble to check. So I'm good with that, or IGNOREDMULTI all
> the time.


Eric -

What is your feeling?  I have a UDP_MIB_NOPORTS v2 with UDP lite, fixed 
inner_flushed and IPv6 I can post now, or I can tweak it to add an 
IGNOREDMULTI stat and use that instead of NOPORTS.  If taking the 
IGNOREDMULTI path, I'd be inclined to go with consume_skb() 
unconditionally since the philosophy would be that it is an ignored 
packet rather than a drop (similar to the recent change in ARP).

rick

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd0db54..376e3d3 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1656,6 +1656,7 @@  static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	int dif = skb->dev->ifindex;
 	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
 	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
+	unsigned int inner_flushed = 0;
 
 	if (use_hash2) {
 		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
@@ -1694,8 +1695,12 @@  start_lookup:
 	 */
 	if (count) {
 		flush_stack(stack, count, skb, count - 1);
-	} else {
+	} else if (!inner_flushed) {
+		UDP_INC_STATS_BH(net, UDP_MIB_NOPORTS, 0);
 		kfree_skb(skb);
+	} else {
+		/* there were matches flushed in the for_each */
+		consume_skb(skb);
 	}
 	return 0;
 }