diff mbox

[v2] ethernet: call __skb_pull() in eth_type_trans()

Message ID 1272895972-13799-1-git-send-email-xiaosuo@gmail.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Changli Gao May 3, 2010, 2:12 p.m. UTC
call __skb_pull() in eth_type_trans().

The callers of eth_type_trans() should always feed it long enough packets. When
the length of the packet is less than ETH_ZLEN, a warning message will be shown,
and the later behaviors are undefined.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
 net/ethernet/eth.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet May 3, 2010, 2:44 p.m. UTC | #1
Le lundi 03 mai 2010 à 22:12 +0800, Changli Gao a écrit :
> call __skb_pull() in eth_type_trans().
> 
> The callers of eth_type_trans() should always feed it long enough packets. When
> the length of the packet is less than ETH_ZLEN, a warning message will be shown,
> and the later behaviors are undefined.
> 
> Signed-off-by: Changli Gao <xiaosuo@gmail.com>
> ----
>  net/ethernet/eth.c |    5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
> index 61ec032..1df31cc 100644
> --- a/net/ethernet/eth.c
> +++ b/net/ethernet/eth.c
> @@ -162,7 +162,10 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
>  
>  	skb->dev = dev;
>  	skb_reset_mac_header(skb);
> -	skb_pull_inline(skb, ETH_HLEN);
> +	if (unlikely(skb->len < ETH_ZLEN))
> +		dev_warn(&dev->dev, "too small ethernet packet: %u bytes\n",
> +			 skb->len);
> +	__skb_pull(skb, ETH_HLEN);
>  	eth = eth_hdr(skb);
>  
>  	if (unlikely(is_multicast_ether_addr(eth->h_dest))) {


Hmm, I feel very uncompfortable with this patch.

I am pretty sure some callers dont check minimum ethernet frame length.

At least a WARN_ON_ONCE() is needed, just in case...
In fact our stack has different requirements.

Check net/ipv4/ip_gre.c for example.

                if (tunnel->dev->type == ARPHRD_ETHER) {
                        if (!pskb_may_pull(skb, ETH_HLEN)) {
                                stats->rx_length_errors++;
                                stats->rx_errors++;
                                goto drop;
                        }

                        iph = ip_hdr(skb);
                        skb->protocol = eth_type_trans(skb, tunnel->dev);
                        skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
                }


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller May 3, 2010, 7:54 p.m. UTC | #2
From: Changli Gao <xiaosuo@gmail.com>
Date: Mon,  3 May 2010 22:12:52 +0800

> @@ -162,7 +162,10 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
>  
>  	skb->dev = dev;
>  	skb_reset_mac_header(skb);
> -	skb_pull_inline(skb, ETH_HLEN);
> +	if (unlikely(skb->len < ETH_ZLEN))
> +		dev_warn(&dev->dev, "too small ethernet packet: %u bytes\n",
> +			 skb->len);
> +	__skb_pull(skb, ETH_HLEN);
>  	eth = eth_hdr(skb);

And now it's even more expensive than skb_pull_inline() :-)

Really, things are fine as-is.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Changli Gao May 4, 2010, 2:05 a.m. UTC | #3
On Mon, May 3, 2010 at 10:44 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Hmm, I feel very uncompfortable with this patch.
>
> I am pretty sure some callers dont check minimum ethernet frame length.
>
> At least a WARN_ON_ONCE() is needed, just in case...
> In fact our stack has different requirements.
>
> Check net/ipv4/ip_gre.c for example.
>
>                if (tunnel->dev->type == ARPHRD_ETHER) {
>                        if (!pskb_may_pull(skb, ETH_HLEN)) {
>                                stats->rx_length_errors++;
>                                stats->rx_errors++;
>                                goto drop;
>                        }
>
>                        iph = ip_hdr(skb);
>                        skb->protocol = eth_type_trans(skb, tunnel->dev);
>                        skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
>                }
>

So the minimal packet length eth_type_trans() requires should be
ETH_HLEN, not ETH_ZLEN.
Changli Gao May 4, 2010, 2:34 a.m. UTC | #4
On Tue, May 4, 2010 at 3:54 AM, David Miller <davem@davemloft.net> wrote:
> From: Changli Gao <xiaosuo@gmail.com>
> Date: Mon,  3 May 2010 22:12:52 +0800
>
>> @@ -162,7 +162,10 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
>>
>>       skb->dev = dev;
>>       skb_reset_mac_header(skb);
>> -     skb_pull_inline(skb, ETH_HLEN);
>> +     if (unlikely(skb->len < ETH_ZLEN))
>> +             dev_warn(&dev->dev, "too small ethernet packet: %u bytes\n",
>> +                      skb->len);
>> +     __skb_pull(skb, ETH_HLEN);
>>       eth = eth_hdr(skb);
>
> And now it's even more expensive than skb_pull_inline() :-)
>
> Really, things are fine as-is.
>

It seems no callers pass eth_type_trans() a packet, whose length is
less than ETH_HLEN. It means that skb_pull() always returns non-NULL.
And if skb_pull() returns NULL, the later memory dereferences must be
invalid. So, we can safely call __skb_pull() instead of skb_pull().
And If the current code works, there is no reason the new code without
the check(skb->len < ETH_HLEN) doesn't work.

As Eric mentioned above, GRE only assures the length of the packets
passed to eth_type_trans() isn't less than ETH_HLEN, we should check
skb->len before we dereference skb->data.

        rawp = skb->data;

        /*
         *      This is a magic hack to spot IPX packets. Older Novell breaks
         *      the protocol design and runs IPX over 802.3 without an 802.2 LLC
         *      layer. We look for FFFF which isn't a used 802.2 SSAP/DSAP. This
         *      won't work for fault tolerant netware but does for the rest.
         */
        if (*(unsigned short *)rawp == 0xFFFF)
                return htons(ETH_P_802_3);


For performance, how about inlining eth_type_trans(). Because its main
users are NIC drivers, and there aren't likely many kinds of NICs at
the same time, inlining it won't increases the size of the kernel
image much.
David Miller May 4, 2010, 6:16 a.m. UTC | #5
From: Changli Gao <xiaosuo@gmail.com>
Date: Tue, 4 May 2010 10:34:07 +0800

> It seems no callers pass eth_type_trans() a packet, whose length is
> less than ETH_HLEN. It means that skb_pull() always returns non-NULL.
> And if skb_pull() returns NULL, the later memory dereferences must be
> invalid.

In your opinion.  As I explained in the reply to your latest
eth_type_trans() patch, we can defererence several bytes past
the end of skb->data's valid packet data area without faulting.

We'll just read in garbage, because it's things like skb_shared_info()
and friends might be there.

But it's completely safe, and frankly I'm fine with the kernel doing
this when runts make it into this code if that allows us to avoid
stupid checks.

> As Eric mentioned above, GRE only assures the length of the packets
> passed to eth_type_trans() isn't less than ETH_HLEN, we should check
> skb->len before we dereference skb->data.

The code needs only ETH_HLEN, but valid ethernet packets must be at
least ETH_ZLEN.

> For performance, how about inlining eth_type_trans(). Because its main
> users are NIC drivers, and there aren't likely many kinds of NICs at
> the same time, inlining it won't increases the size of the kernel
> image much.

No, that's unnecessary bloat, plus I want to make ->ndo_type_trans()
a netdev operation so it can possibly be deferred.

Changli, please go hack elsewhere and on some other piece of code,
all your ideas here in eth_type_trans() are not well founded.

I see from your struct dst union removal patch that you don't even
build test your changes, so why don't you spend your excess energy on
making sure your code at least compiles successfully?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c
index 61ec032..1df31cc 100644
--- a/net/ethernet/eth.c
+++ b/net/ethernet/eth.c
@@ -162,7 +162,10 @@  __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
 
 	skb->dev = dev;
 	skb_reset_mac_header(skb);
-	skb_pull_inline(skb, ETH_HLEN);
+	if (unlikely(skb->len < ETH_ZLEN))
+		dev_warn(&dev->dev, "too small ethernet packet: %u bytes\n",
+			 skb->len);
+	__skb_pull(skb, ETH_HLEN);
 	eth = eth_hdr(skb);
 
 	if (unlikely(is_multicast_ether_addr(eth->h_dest))) {