ipv4: ip_check_defrag must not modify skb before unsharing

Message ID 1355132466.9857.6.camel@jlt4.sipsolutions.net
State Accepted, archived
Delegated to: David Miller

Commit Message

Johannes Berg Dec. 10, 2012, 9:41 a.m. UTC
From: Johannes Berg <johannes.berg@intel.com>

ip_check_defrag() might be called from af_packet within the
RX path where shared SKBs are used, so it must not modify
the input SKB before it has unshared it for defragmentation.
Use skb_copy_bits() to get the IP header and only pull in
everything later.

The same is true for the other caller in macvlan as it is
called from dev->rx_handler which can also get a shared SKB.

Reported-by: Eric Leblond <eric@regit.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
---
For some versions of the kernel, this code goes into af_packet.c

 net/ipv4/ip_fragment.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
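
As a rough illustration of the read-only header access the commit message describes, the userspace sketch below copies a byte range out of a packet whose data is split across several segments, leaving the packet itself untouched. The names struct seg, struct pkt and pkt_copy_bits() are hypothetical and are not kernel APIs; the sketch only mirrors the return convention of skb_copy_bits(), which is 0 on success and a negative errno on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one piece of a non-linear packet. */
struct seg {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical stand-in for a packet made of several segments. */
struct pkt {
	const struct seg *segs;
	size_t nsegs;
};

/*
 * Copy len bytes starting at offset into dst. The packet is const and is
 * never modified. Returns 0 on success and -EFAULT if the requested range
 * runs past the end of the packet.
 */
int pkt_copy_bits(const struct pkt *p, size_t offset, void *dst, size_t len)
{
	unsigned char *out = dst;
	size_t i;

	assert(p != NULL);

	for (i = 0; i < p->nsegs && len > 0; i++) {
		const struct seg *s = &p->segs[i];
		size_t n;

		if (offset >= s->len) {
			offset -= s->len;	/* range starts in a later segment */
			continue;
		}
		n = s->len - offset;		/* bytes available in this segment */
		if (n > len)
			n = len;
		memcpy(out, s->data + offset, n);
		out += n;
		len -= n;
		offset = 0;			/* later segments are read from their start */
	}
	return len ? -EFAULT : 0;
}
```

Because success is reported as 0, a caller in this style bails out when the result is negative, not when it is zero.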

Comments

Eric Leblond Dec. 10, 2012, 11:02 a.m. UTC | #1
Hello,

On Mon, 2012-12-10 at 10:41 +0100, Johannes Berg wrote:
> From: Johannes Berg <johannes.berg@intel.com>
> 
> ip_check_defrag() might be called from af_packet within the
> RX path where shared SKBs are used, so it must not modify
> the input SKB before it has unshared it for defragmentation.
> Use skb_copy_bits() to get the IP header and only pull in
> everything later.
> 
> The same is true for the other caller in macvlan as it is
> called from dev->rx_handler which can also get a shared SKB.

I've applied the patch and built a new kernel. I could not get it to
crash with either of the two techniques (suspend to RAM and bringing the
interface down/up) that readily crashed the kernel without the patch.

BR,

> Reported-by: Eric Leblond <eric@regit.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> For some versions of the kernel, this code goes into af_packet.c
> 
>  net/ipv4/ip_fragment.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> index 448e685..8d5cc75 100644
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag);
>  
>  struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user)
>  {
> -	const struct iphdr *iph;
> +	struct iphdr iph;
>  	u32 len;
>  
>  	if (skb->protocol != htons(ETH_P_IP))
>  		return skb;
>  
> -	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> +	if (skb_copy_bits(skb, 0, &iph, sizeof(iph)) < 0)
>  		return skb;
>  
> -	iph = ip_hdr(skb);
> -	if (iph->ihl < 5 || iph->version != 4)
> +	if (iph.ihl < 5 || iph.version != 4)
>  		return skb;
> -	if (!pskb_may_pull(skb, iph->ihl*4))
> -		return skb;
> -	iph = ip_hdr(skb);
> -	len = ntohs(iph->tot_len);
> -	if (skb->len < len || len < (iph->ihl * 4))
> +
> +	len = ntohs(iph.tot_len);
> +	if (skb->len < len || len < (iph.ihl * 4))
>  		return skb;
>  
> -	if (ip_is_fragment(ip_hdr(skb))) {
> +	if (ip_is_fragment(&iph)) {
>  		skb = skb_share_check(skb, GFP_ATOMIC);
>  		if (skb) {
> +			if (!pskb_may_pull(skb, iph.ihl*4))
> +				return skb;
>  			if (pskb_trim_rcsum(skb, len))
>  				return skb;
>  			memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
David Miller Dec. 10, 2012, 6:41 p.m. UTC | #2
From: Johannes Berg <johannes@sipsolutions.net>
Date: Mon, 10 Dec 2012 10:41:06 +0100

> From: Johannes Berg <johannes.berg@intel.com>
> 
> ip_check_defrag() might be called from af_packet within the
> RX path where shared SKBs are used, so it must not modify
> the input SKB before it has unshared it for defragmentation.
> Use skb_copy_bits() to get the IP header and only pull in
> everything later.
> 
> The same is true for the other caller in macvlan as it is
> called from dev->rx_handler which can also get a shared SKB.
> 
> Reported-by: Eric Leblond <eric@regit.org>
> Cc: stable@vger.kernel.org
> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> ---
> For some versions of the kernel, this code goes into af_packet.c

So the bug is that ip_check_defrag() has a precondition which is met
properly by all callers except AF_PACKET.

If this is the case, remind me why we are changing ip_check_defrag()
rather than the violator of the precondition?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Johannes Berg Dec. 10, 2012, 6:45 p.m. UTC | #3
On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote:
> From: Johannes Berg <johannes@sipsolutions.net>
> Date: Mon, 10 Dec 2012 10:41:06 +0100
> 
> > From: Johannes Berg <johannes.berg@intel.com>
> > 
> > ip_check_defrag() might be called from af_packet within the
> > RX path where shared SKBs are used, so it must not modify
> > the input SKB before it has unshared it for defragmentation.
> > Use skb_copy_bits() to get the IP header and only pull in
> > everything later.
> > 
> > The same is true for the other caller in macvlan as it is
> > called from dev->rx_handler which can also get a shared SKB.
> > 
> > Reported-by: Eric Leblond <eric@regit.org>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Johannes Berg <johannes.berg@intel.com>
> > ---
> > For some versions of the kernel, this code goes into af_packet.c
> 
> So the bug is that ip_check_defrag() has a precondition which is met
> properly by all callers except AF_PACKET.
> 
> If this is the case, remind me why we are changing ip_check_defrag()
> rather than the violator of the precondition?

I don't think this is the case.

If you're referring to my note about af_packet: the kernels where this
goes into af_packet.c are the kernels that don't even have
ip_check_defrag() because macvlan didn't exist/didn't have ip defrag
support and af_packet had this code there -- see commit bc416d9768a.

If you're not referring to my note about af_packet: both callers (there
are only two) of ip_check_defrag() have this bug as far as I can tell
because they're both in the part of the RX path where shared SKBs might
happen.

johannes


David Miller Dec. 10, 2012, 6:50 p.m. UTC | #4
From: Johannes Berg <johannes@sipsolutions.net>
Date: Mon, 10 Dec 2012 19:45:52 +0100

> On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote:
>> So the bug is that ip_check_defrag() has a precondition which is met
>> properly by all callers except AF_PACKET.
>> 
>> If this is the case, remind me why we are changing ip_check_defrag()
>> rather than the violator of the precondition?
> 
> I don't think this is the case.
> 
> If you're referring to my note about af_packet: the kernels where this
> goes into af_packet.c are the kernels that don't even have
> ip_check_defrag() because macvlan didn't exist/didn't have ip defrag
> support and af_packet had this code there -- see commit bc416d9768a.
> 
> If you're not referring to my note about af_packet: both callers (there
> are only two) of ip_check_defrag() have this bug as far as I can tell
> because they're both in the part of the RX path where shared SKBs might
> happen.

You're right, I misinterpreted what's happening here.

My misunderstanding was that this was a situation where normal IPv4
input processing makes sure the SKB is unshared, and we had special
code paths that didn't make sure that was the case.

Rather, here, we have a special entrypoint for macvlan and AF_PACKET
which is supposed to take care of such issues since it is known to
execute in a different kind of environment.

I'm pretty sure I'll apply this, after I check a few more things,
thanks Johannes!
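
To make the notion of a shared buffer in the discussion above more concrete, here is a minimal userspace sketch of a share check on a reference-counted buffer. Everything in it (struct buf, buf_alloc(), buf_put(), buf_share_check(), buf_trim()) is hypothetical and only loosely modelled on skb_share_check(). In particular, this model copies the data wholesale, which is a simplification and not a description of how the kernel clones an SKB.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for a reference-counted packet buffer. */
struct buf {
	int users;		/* number of holders of this buffer */
	size_t len;
	unsigned char data[64];
};

struct buf *buf_alloc(const void *src, size_t len)
{
	struct buf *b;

	if (len > sizeof(b->data))
		return NULL;
	b = calloc(1, sizeof(*b));
	if (!b)
		return NULL;
	b->users = 1;
	b->len = len;
	memcpy(b->data, src, len);
	return b;
}

/* Drop one reference and free the buffer when the last holder is gone. */
void buf_put(struct buf *b)
{
	if (b && --b->users == 0)
		free(b);
}

/*
 * Return a buffer the caller may modify. If other holders exist, hand back
 * a private copy and drop the caller's reference on the shared buffer.
 * Returns NULL if the copy cannot be allocated; the reference is dropped
 * in that case as well.
 */
struct buf *buf_share_check(struct buf *b)
{
	struct buf *copy;

	if (b->users == 1)
		return b;
	copy = buf_alloc(b->data, b->len);
	buf_put(b);
	return copy;
}

/* Mutating step: only call this on the result of buf_share_check(). */
void buf_trim(struct buf *b, size_t len)
{
	assert(b->users == 1);
	if (len < b->len)
		b->len = len;
}
```

In this model, trimming the private copy leaves the length seen by the other holder unchanged, which is the property a caller in a shared receive path relies on.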

Patch

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 448e685..8d5cc75 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -707,28 +707,27 @@  EXPORT_SYMBOL(ip_defrag);
 
 struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user)
 {
-	const struct iphdr *iph;
+	struct iphdr iph;
 	u32 len;
 
 	if (skb->protocol != htons(ETH_P_IP))
 		return skb;
 
-	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+	if (skb_copy_bits(skb, 0, &iph, sizeof(iph)) < 0)
 		return skb;
 
-	iph = ip_hdr(skb);
-	if (iph->ihl < 5 || iph->version != 4)
+	if (iph.ihl < 5 || iph.version != 4)
 		return skb;
-	if (!pskb_may_pull(skb, iph->ihl*4))
-		return skb;
-	iph = ip_hdr(skb);
-	len = ntohs(iph->tot_len);
-	if (skb->len < len || len < (iph->ihl * 4))
+
+	len = ntohs(iph.tot_len);
+	if (skb->len < len || len < (iph.ihl * 4))
 		return skb;
 
-	if (ip_is_fragment(ip_hdr(skb))) {
+	if (ip_is_fragment(&iph)) {
 		skb = skb_share_check(skb, GFP_ATOMIC);
 		if (skb) {
+			if (!pskb_may_pull(skb, iph.ihl*4))
+				return skb;
 			if (pskb_trim_rcsum(skb, len))
 				return skb;
 			memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
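
The sequence of checks in the patch can be tried outside the kernel with a small sketch that validates a local copy of the IPv4 header and reports whether the packet is a fragment. The names struct ipv4_info, be16() and ipv4_needs_defrag() are hypothetical, the memcpy() merely stands in for skb_copy_bits(), and the flag constants follow the IPv4 header layout. It is an approximation of the logic under those assumptions, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IPV4_MF      0x2000	/* "more fragments" flag */
#define IPV4_OFFMASK 0x1fff	/* fragment offset, in 8-byte units */

/* Fields of interest, decoded from a 20-byte copy of the header. */
struct ipv4_info {
	unsigned int version;
	unsigned int ihl;	/* header length in 32-bit words */
	unsigned int tot_len;	/* total datagram length in bytes */
	unsigned int frag_off;	/* flags and fragment offset, host order */
};

/* Read a big-endian 16-bit value. */
static unsigned int be16(const unsigned char *p)
{
	return ((unsigned int)p[0] << 8) | p[1];
}

/*
 * Decide whether a packet needs defragmentation by looking only at a local
 * copy of its header. Returns 1 for a well-formed fragment, 0 otherwise.
 * pkt is const: nothing here writes to the (possibly shared) packet.
 */
int ipv4_needs_defrag(const unsigned char *pkt, size_t pkt_len,
		      struct ipv4_info *out)
{
	unsigned char hdr[20];

	assert(out != NULL);

	if (pkt_len < sizeof(hdr))
		return 0;
	memcpy(hdr, pkt, sizeof(hdr));	/* stands in for skb_copy_bits() */

	out->version = hdr[0] >> 4;
	out->ihl = hdr[0] & 0x0f;
	out->tot_len = be16(hdr + 2);
	out->frag_off = be16(hdr + 6);

	if (out->ihl < 5 || out->version != 4)
		return 0;
	if (pkt_len < out->tot_len || out->tot_len < out->ihl * 4)
		return 0;
	return (out->frag_off & (IPV4_MF | IPV4_OFFMASK)) != 0;
}
```

Only when such a check reports a fragment would a caller go on to unshare the buffer and then pull or trim it, matching the order in which the patch performs those steps.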