diff mbox

net: rps: support 802.1Q

Message ID 1313730353-25379-1-git-send-email-xiaosuo@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Changli Gao Aug. 19, 2011, 5:05 a.m. UTC
For the 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS
won't inspect the internal 4 tuples to generate skb->rxhash, so this kind
of traffic can't get any benefit from RPS.

This patch adds the support for 802.1Q to RPS.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
---
 net/core/dev.c |    8 ++++++++
 1 file changed, 8 insertions(+)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Aug. 19, 2011, 5:08 a.m. UTC | #1
From: Changli Gao <xiaosuo@gmail.com>
Date: Fri, 19 Aug 2011 13:05:53 +0800

> For the 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS
> won't inspect the internal 4 tuples to generate skb->rxhash, so this kind
> of traffic can't get any benefit from RPS.
> 
> This patch adds the support for 802.1Q to RPS.
> 
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>

Looks fine, applied to net-next, thanks!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Changli Gao Aug. 19, 2011, 5:22 a.m. UTC | #2
On Fri, Aug 19, 2011 at 1:08 PM, David Miller <davem@davemloft.net> wrote:
> From: Changli Gao <xiaosuo@gmail.com>
> Date: Fri, 19 Aug 2011 13:05:53 +0800
>
>> For the 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS
>> won't inspect the internal 4 tuples to generate skb->rxhash, so this kind
>> of traffic can't get any benefit from RPS.
>>
>> This patch adds the support for 802.1Q to RPS.
>>
>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
>
> Looks fine, applied to net-next, thanks!
>

Thanks for applying it. I have another patch which adds the support
for PPPOE session messages. Do you think it worth doing?
David Miller Aug. 19, 2011, 5:26 a.m. UTC | #3
From: Changli Gao <xiaosuo@gmail.com>
Date: Fri, 19 Aug 2011 13:22:03 +0800

> On Fri, Aug 19, 2011 at 1:08 PM, David Miller <davem@davemloft.net> wrote:
>> From: Changli Gao <xiaosuo@gmail.com>
>> Date: Fri, 19 Aug 2011 13:05:53 +0800
>>
>>> For the 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS
>>> won't inspect the internal 4 tuples to generate skb->rxhash, so this kind
>>> of traffic can't get any benefit from RPS.
>>>
>>> This patch adds the support for 802.1Q to RPS.
>>>
>>> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
>>
>> Looks fine, applied to net-next, thanks!
>>
> 
> Thanks for applying it. I have another patch which adds the support
> for PPPOE session messages. Do you think it worth doing?

Sure.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Aug. 19, 2011, 11:54 a.m. UTC | #4
On Fri, 2011-08-19 at 13:05 +0800, Changli Gao wrote:
> For the 802.1Q packets, if the NIC doesn't support hw-accel-vlan-rx, RPS
> won't inspect the internal 4 tuples to generate skb->rxhash, so this kind
> of traffic can't get any benefit from RPS.
> 
> This patch adds the support for 802.1Q to RPS.
[...]
> @@ -2565,6 +2566,13 @@ again:
>  		addr2 = (__force u32) ip6->daddr.s6_addr32[3];
>  		nhoff += 40;
>  		break;
> +	case __constant_htons(ETH_P_8021Q):
> +		if (!pskb_may_pull(skb, sizeof(*vlan) + nhoff))
> +			goto done;
> +		vlan = (const struct vlan_hdr *) (skb->data + nhoff);
> +		proto = vlan->h_vlan_encapsulated_proto;
> +		nhoff += sizeof(*vlan);
> +		goto again;
>  	default:
>  		goto done;
>  	}

Should this really be reading an unlimited number of tags?  What if an
attacker starts sending packets full of VLAN tags?  Since this runs
before netfilter, there would be no way to prevent those packets burning
our CPU time.  And if there are legitimately multiple VLAN tags, they
presumably won't all have the 802.1q Ethertype.

Ben.
Changli Gao Aug. 19, 2011, 3:05 p.m. UTC | #5
On Fri, Aug 19, 2011 at 7:54 PM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
>
> Should this really be reading an unlimited number of tags?

Not unlimited, but it won't stop until reaching the end of the packet.

>  What if an
> attacker starts sending packets full of VLAN tags?  Since this runs
> before netfilter, there would be no way to prevent those packets burning
> our CPU time.  And if there are legitimately multiple VLAN tags, they
> presumably won't all have the 802.1q Ethertype.
>

Do we need to limit the number of rounds to stop this kind of "bad"
packets from burning our CPU time? Then,  __netif_receive_skb() has to
be update too, so the inspection of tunnel in __skb_get_rxhash() does.
Is there a such limitation in xfrm?

Thanks.
Ben Hutchings Aug. 20, 2011, 1:12 a.m. UTC | #6
On Fri, 2011-08-19 at 23:05 +0800, Changli Gao wrote:
> On Fri, Aug 19, 2011 at 7:54 PM, Ben Hutchings
> <bhutchings@solarflare.com> wrote:
> >
> > Should this really be reading an unlimited number of tags?
> 
> Not unlimited, but it won't stop until reaching the end of the packet.

Right, I understand that the parsing is properly range-checked against
the length of the packet.

> >  What if an
> > attacker starts sending packets full of VLAN tags?  Since this runs
> > before netfilter, there would be no way to prevent those packets burning
> > our CPU time.  And if there are legitimately multiple VLAN tags, they
> > presumably won't all have the 802.1q Ethertype.
> >
> 
> Do we need to limit the number of rounds to stop this kind of "bad"
> packets from burning our CPU time?

Well, maybe.  Then again, the most effective way for an attacker to
waste a target's CPU time (aside from application-level vulnerabilities)
will often be just to send minimum size packets.

> Then,  __netif_receive_skb() has to
> be update too, so the inspection of tunnel in __skb_get_rxhash() does.

Yes, if we agree this is something worth defending against then we would
need to be consistent in limiting any such parsing loop in pre-netfilter
processing.

> Is there a such limitation in xfrm?

It appears to be limited to 6 levels of encapsulation.

Ben.
diff mbox

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index ead0366..be7ee50 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2529,6 +2529,7 @@  void __skb_get_rxhash(struct sk_buff *skb)
 	int nhoff, hash = 0, poff;
 	const struct ipv6hdr *ip6;
 	const struct iphdr *ip;
+	const struct vlan_hdr *vlan;
 	u8 ip_proto;
 	u32 addr1, addr2;
 	u16 proto;
@@ -2565,6 +2566,13 @@  again:
 		addr2 = (__force u32) ip6->daddr.s6_addr32[3];
 		nhoff += 40;
 		break;
+	case __constant_htons(ETH_P_8021Q):
+		if (!pskb_may_pull(skb, sizeof(*vlan) + nhoff))
+			goto done;
+		vlan = (const struct vlan_hdr *) (skb->data + nhoff);
+		proto = vlan->h_vlan_encapsulated_proto;
+		nhoff += sizeof(*vlan);
+		goto again;
 	default:
 		goto done;
 	}