
Repeatable IPv6 crash in 3.19.0-1

Message ID 1425086169.5130.57.camel@edumazet-glaptop2.roam.corp.google.com
State RFC, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Feb. 28, 2015, 1:16 a.m. UTC
On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
> > I've been seeing a crash under 3.19.0 that seems to occur when I put 
> > heavy traffic across a macvtap/veth interface.
> > 
> > We have a KVM guest attached to a veth pair using macvtap.  We're 
> > routing IPv6 traffic into one end of the veth pair using some static 
> > routes.  We do *not* have proxy_ndp enabled (though, we are using some 
> > software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
> > 
> > I've been able to reproduce this pretty easily by downloading some large 
> > files from the guest.  We see two traces in a row when this occurs:
> 
> 
> Nice !
> 
> Crash is in neigh_hh_output()
> 
> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
> 
> And there are only 14 bytes of headroom instead of 16.
> 
> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
> header.
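A minimal userspace sketch of the overrun described above. Only the
HH_DATA_MOD value and the shape of the memcpy() come from the kernel;
struct toy_skb and the toy_* helpers are hypothetical stand-ins, not
kernel APIs. The sketch refuses the copy and reports the shortfall, where
the real fast path would write in front of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HH_DATA_MOD 16	/* same value as include/linux/netdevice.h */

/* Hypothetical stand-in for the sk_buff fields that matter here. */
struct toy_skb {
	unsigned char buf[64];
	unsigned char *head;	/* start of the buffer */
	unsigned char *data;	/* start of the packet data */
};

/* Place data `headroom` bytes past head, as skb_reserve() would. */
static void toy_init(struct toy_skb *skb, size_t headroom)
{
	assert(headroom <= sizeof(skb->buf));
	memset(skb->buf, 0, sizeof(skb->buf));
	skb->head = skb->buf;
	skb->data = skb->buf + headroom;
}

/* Bytes available in front of data, in the spirit of skb_headroom(). */
static size_t toy_headroom(const struct toy_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

/*
 * Checked variant of the fast-path copy. Returns 0 when the
 * HH_DATA_MOD-byte copy fits, otherwise the number of bytes by which
 * it would start before head (no copy is done in that case).
 */
static size_t toy_hh_copy(struct toy_skb *skb,
			  const unsigned char hh_data[HH_DATA_MOD])
{
	size_t room = toy_headroom(skb);

	if (room < HH_DATA_MOD)
		return HH_DATA_MOD - room;
	memcpy(skb->data - HH_DATA_MOD, hh_data, HH_DATA_MOD);
	return 0;
}

/* Convenience wrapper: shortfall for a buffer with the given headroom. */
static size_t toy_shortfall(size_t headroom)
{
	static const unsigned char hh_data[HH_DATA_MOD] = { 0 };
	struct toy_skb skb;

	toy_init(&skb, headroom);
	return toy_hh_copy(&skb, hh_data);
}
```

With these definitions, toy_shortfall(14) reports a 2-byte shortfall and
toy_shortfall(16) reports none, matching the 14-versus-16 observation.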

Could you try the following patch?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Brian Rak Feb. 28, 2015, 1:54 a.m. UTC | #1
On 2/27/2015 8:16 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 16:48 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 16:37 -0500, Brian Rak wrote:
>>> I've been seeing a crash under 3.19.0 that seems to occur when I put
>>> heavy traffic across a macvtap/veth interface.
>>>
>>> We have a KVM guest attached to a veth pair using macvtap.  We're
>>> routing IPv6 traffic into one end of the veth pair using some static
>>> routes.  We do *not* have proxy_ndp enabled (though, we are using some
>>> software to do neighbor proxying - http://priv.nu/projects/ndppd/ ).
>>>
>>> I've been able to reproduce this pretty easily by downloading some large
>>> files from the guest.  We see two traces in a row when this occurs:
>>
>>
>> Nice !
>>
>> Crash is in neigh_hh_output()
>>
>> -> memcpy(skb->data - HH_DATA_MOD, hh->hh_data, HH_DATA_MOD);
>>
>> And there are only 14 bytes of headroom instead of 16.
>>
>> Some layer did not align skb_headroom(skb) to HH_DATA_MOD for ethernet
>> header.
>
> Could you try the following patch?
>
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -654,11 +654,14 @@ static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
>   	} /* else everything is zero */
>   }
>
> +/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
> +#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
> +
>   /* Get packet from user space buffer */
>   static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>   				struct iov_iter *from, int noblock)
>   {
> -	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
> +	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
>   	struct sk_buff *skb;
>   	struct macvlan_dev *vlan;
>   	unsigned long total_len = iov_iter_count(from);
> @@ -722,7 +725,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>   			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
>   	}
>
> -	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
> +	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
>   				linear, noblock, &err);
>   	if (!skb)
>   		goto err;
>
>

Wow, that was *much* faster than I was expecting, thanks a bunch!

I can confirm that resolves the issue. I've tested this and it fixes
the issue perfectly.  I've been able to put a whole bunch of IPv6 
traffic through the interface now, whereas before even a minor amount of 
traffic would crash the host.

Thanks again!
Eric Dumazet Feb. 28, 2015, 2:01 a.m. UTC | #2
On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:

> Wow, that was *much* faster than I was expecting, thanks a bunch!
> 
> I can confirm that resolves the issue. I've tested this and it fixes
> the issue perfectly.  I've been able to put a whole bunch of IPv6 
> traffic through the interface now, whereas before even a minor amount of 
> traffic would crash the host.
> 
> Thanks again!

Interesting...

Was a prior version of the Linux kernel fine?



Eric Dumazet Feb. 28, 2015, 2:03 a.m. UTC | #3
On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
> 
> > Wow, that was *much* faster than I was expecting, thanks a bunch!
> > 
> > I can confirm that resolves the issue. I've tested this and it fixes
> > the issue perfectly.  I've been able to put a whole bunch of IPv6 
> > traffic through the interface now, whereas before even a minor amount of 
> > traffic would crash the host.
> > 
> > Thanks again!
> 
> Interesting...
> 
> Was a prior version of the Linux kernel fine?

Or maybe you recently switched on this config option ?

CONFIG_DEBUG_PAGEALLOC=y



Brian Rak Feb. 28, 2015, 2:11 a.m. UTC | #4
On 2/27/2015 9:03 PM, Eric Dumazet wrote:
> On Fri, 2015-02-27 at 18:01 -0800, Eric Dumazet wrote:
>> On Fri, 2015-02-27 at 20:54 -0500, Brian Rak wrote:
>>
>>> Wow, that was *much* faster than I was expecting, thanks a bunch!
>>>
>>> I can confirm that resolves the issue. I've tested this and it fixes
>>> the issue perfectly.  I've been able to put a whole bunch of IPv6
>>> traffic through the interface now, whereas before even a minor amount of
>>> traffic would crash the host.
>>>
>>> Thanks again!
>>
>> Interesting...
>>
>> Was a prior version of the Linux kernel fine?
>
> Or maybe you recently switched on this config option ?
>
> CONFIG_DEBUG_PAGEALLOC=y
>
>
>

We've only recently started using this veth/macvtap combo, so it's
possible this has been around for a while and we just hadn't noticed.

I don't have any info on older kernels currently.  I *think* I've seen 
crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.

CONFIG_DEBUG_PAGEALLOC is not set, and never has been.
Eric Dumazet Feb. 28, 2015, 2:21 a.m. UTC | #5
On Fri, 2015-02-27 at 21:11 -0500, Brian Rak wrote:

> We've only recently started using this veth/macvtap combo, so it's
> possible this has been around for a while and we just hadn't noticed.
> 
> I don't have any info on older kernels currently.  I *think* I've seen 
> crashes on 3.17.1, but I didn't save any stack traces, so I can't be sure.
> 
> CONFIG_DEBUG_PAGEALLOC is not set, and never has been.

OK, thanks for the confirmation. I'll send an official patch.

(I guess the same patch is also needed for drivers/net/tun.c.)



Patch

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index e40fdfccc9c10df4ea8676a1dd59275d5d9c6b88..27ecc5c4fa2665cd42ac1ca81717255f85507113 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -654,11 +654,14 @@  static void macvtap_skb_to_vnet_hdr(struct macvtap_queue *q,
 	} /* else everything is zero */
 }
 
+/* Neighbour code has some assumptions on HH_DATA_MOD alignment */
+#define MACVTAP_RESERVE HH_DATA_OFF(ETH_HLEN)
+
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 				struct iov_iter *from, int noblock)
 {
-	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
+	int good_linear = SKB_MAX_HEAD(MACVTAP_RESERVE);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
 	unsigned long total_len = iov_iter_count(from);
@@ -722,7 +725,7 @@  static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			linear = macvtap16_to_cpu(q, vnet_hdr.hdr_len);
 	}
 
-	skb = macvtap_alloc_skb(&q->sk, NET_IP_ALIGN, copylen,
+	skb = macvtap_alloc_skb(&q->sk, MACVTAP_RESERVE, copylen,
 				linear, noblock, &err);
 	if (!skb)
 		goto err;
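
The arithmetic behind MACVTAP_RESERVE can be checked in userspace. The
sketch below copies the ETH_HLEN, HH_DATA_MOD and HH_DATA_OFF definitions
from the kernel headers; headroom_after_pull() is a hypothetical helper
that assumes the headroom seen by the neighbour code is the reserved
padding plus the already-pulled Ethernet header. A NET_IP_ALIGN of 0 (as
on x86) is also an assumption, chosen because it matches the 14 bytes of
headroom reported in the thread.

```c
#include <assert.h>

#define ETH_HLEN	14	/* Ethernet header length */
#define HH_DATA_MOD	16	/* hardware header cache granularity */

/* Padding needed so that a header of __len bytes ends on a
 * HH_DATA_MOD boundary (definition as in include/linux/netdevice.h).
 */
#define HH_DATA_OFF(__len) \
	(HH_DATA_MOD - (((__len - 1) & (HH_DATA_MOD - 1)) + 1))

/* The value the patch uses for the reserve: 2 bytes for Ethernet. */
static_assert(HH_DATA_OFF(ETH_HLEN) == 2,
	      "Ethernet needs 2 bytes of padding to reach HH_DATA_MOD");

/* Hypothetical helper: headroom in front of the network header once
 * the Ethernet header has been pulled, for a buffer that was
 * allocated with `reserve` bytes of leading padding.
 */
static int headroom_after_pull(int reserve)
{
	return reserve + ETH_HLEN;
}
```

Under these assumptions, headroom_after_pull(0) gives 14 (the old
NET_IP_ALIGN case) and headroom_after_pull(HH_DATA_OFF(ETH_HLEN)) gives
16, which is exactly HH_DATA_MOD.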