
net: ipv4: Fix incorrect free in ICMP receive

Message ID 1421725837.17892.6.camel@edumazet-glaptop2.roam.corp.google.com
State Superseded, archived
Delegated to: David Miller

Commit Message

Eric Dumazet Jan. 20, 2015, 3:50 a.m. UTC
On Tue, 2015-01-20 at 02:34 +0000, subashab@codeaurora.org wrote:
> Thanks David and Eric for the insights. In order for me to steer this
> debug in the right direction, can you please help me? Based on your input
> I looked into this a little deeper to understand the refcnts for sockets
> and skb's in this ping receive path.
> 
> from ping_rcv()
> 
>                 sk = ping_lookup(net, skb, ntohs(icmph->un.echo.id));
>                 if (sk != NULL) {
>                                 pr_debug("rcv on socket %p\n", sk);
>                                 ping_queue_rcv_skb(sk, skb_get(skb));
>                                 sock_put(sk);
>                                 return;
>                 }
> 
> From my understanding I have made the following analysis; please correct
> me if I am wrong.
> 
> 1) There is no guarantee that sock_put() in the above code snippet
> will not drop the socket refcount to 0 and free the socket. This can
> hypothetically happen if say
> sock_close()->ping_close()->*->ping_unhash()->sock_put()
> can happen in a different context between ping_lookup() and sock_put()
> in the above code snippet. Is this observation accurate?
> 
> 2) Now, since this socket is being freed in the ping receive path, I think
> the following is what is happening with the skb:
> alloc_skb() [skb->users=1] -> deliver_skb() [skb->users=2] -> * ->
> icmp_rcv() -> ping_rcv() -> sk_free() -> inet_sock_destruct() ->
> __skb_queue_purge() -> kfree_skb() [dec refcnt, skb->users=1]
> 
> When the stack unwinds to icmp_rcv(), the refcnt actually hits zero and the
> packet is freed, calling the destructor, which tries to access the freed socket.
> 
> If these observations are right, can you please tell me which call flow
> is not supposed to happen but is happening in this issue? I am trying to
> understand this better so I can identify the next steps to tackle it.

This is why skb_get() is very often a bug.

There is no guarantee the consume_skb() in icmp_rcv() is done before the
skb_queue_purge().
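
To make the interleaving concrete, here is a rough, commented timeline of the
problematic ordering being described (illustrative only; it assumes the 3.x-era
ping_rcv(), skb_set_owner_r() setting skb->destructor = sock_rfree, and the
close path already quoted in the report):

/*
 * CPU0: ICMP receive (softirq)          CPU1: process closing the socket
 *
 * icmp_rcv(skb)               [skb->users == 1]
 *   ping_rcv(skb)
 *     sk = ping_lookup(...)   [reference on sk held for us by the lookup]
 *     skb_get(skb)            [skb->users == 2]
 *     ping_queue_rcv_skb(sk, skb)
 *       skb_set_owner_r(skb, sk)  [skb->sk = sk, destructor = sock_rfree]
 *     sock_put(sk)
 *                                        close()
 *                                          ping_close() -> ... -> ping_unhash()
 *                                          sock_put(sk)     [refcnt -> 0]
 *                                          sk_free(sk)
 *                                            inet_sock_destruct(sk)
 *                                              __skb_queue_purge(&sk->sk_receive_queue)
 *                                                kfree_skb(skb)  [users 2 -> 1,
 *                                                 destructor not run yet]
 *                                          ... sk memory is freed ...
 *   consume_skb(skb)          [users 1 -> 0, __kfree_skb() now runs
 *                              sock_rfree(skb), touching the freed sk]
 */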




Comments

Cong Wang Jan. 20, 2015, 7:42 p.m. UTC | #1
On Mon, Jan 19, 2015 at 7:50 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2015-01-20 at 02:34 +0000, subashab@codeaurora.org wrote:
>> [...]
>
> This is why skb_get() is very often a bug.
>
> There is no guarantee the consume_skb() in icmp_rcv() is done before the
> skb_queue_purge().
>

Or let the socket layer drop the packet, instead of unconditionally
dropping all ICMP packets after success?
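
For illustration only, here is a minimal sketch of what that alternative might
look like; the enum, the _sketch suffix and the exact placement are hypothetical,
and this is not the posted patch:

/* Let the socket layer take ownership of the original skb instead of
 * taking an extra reference; the caller is told whether it still owns
 * the skb.
 */
enum icmp_skb_disposition {
        ICMP_SKB_DROPPED,       /* caller should free the skb */
        ICMP_SKB_CONSUMED,      /* handler queued the skb; caller must not touch it */
};

static enum icmp_skb_disposition ping_rcv_sketch(struct sk_buff *skb)
{
        struct net *net = dev_net(skb->dev);
        struct icmphdr *icmph = icmp_hdr(skb);
        struct sock *sk = ping_lookup(net, skb, ntohs(icmph->un.echo.id));

        if (!sk)
                return ICMP_SKB_DROPPED;

        pr_debug("rcv on socket %p\n", sk);
        /* No skb_get(): the last reference to the skb is now always dropped
         * under the socket's control, either by the reader or by
         * inet_sock_destruct() while sk is still valid.
         */
        ping_queue_rcv_skb(sk, skb);
        sock_put(sk);
        return ICMP_SKB_CONSUMED;
}

icmp_rcv() would then free the skb only for ICMP_SKB_DROPPED rather than
unconditionally calling consume_skb()/kfree_skb() after the handler, which is
why the change spills into the icmp_control handler convention.
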
Eric Dumazet Jan. 20, 2015, 9:35 p.m. UTC | #2
On Tue, 2015-01-20 at 11:42 -0800, Cong Wang wrote:

> 
> Or let the socket layer drop the packet instead of unconditionally
> dropping all icmp packets after success?

Sure, but that's probably at least a 50-line patch.

net-next candidate most likely.




Patch

diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index c0d82f78d364..2a3720fb5a5f 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -966,8 +966,11 @@  bool ping_rcv(struct sk_buff *skb)
 
 	sk = ping_lookup(net, skb, ntohs(icmph->un.echo.id));
 	if (sk != NULL) {
+		struct sk_buff *skb2 = skb_clone(skb, GFP_ATOMIC);
+
 		pr_debug("rcv on socket %p\n", sk);
-		ping_queue_rcv_skb(sk, skb_get(skb));
+		if (skb2)
+			ping_queue_rcv_skb(sk, skb2);
 		sock_put(sk);
 		return true;
 	}
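
Why the clone is sufficient (a reading of the patch, not text from the
submission): the original skb and the clone have independent reference counts,
so the socket's copy can only ever be freed while the socket is still valid.

/*
 * skb  (original): owned by the ICMP layer only; icmp_rcv() frees it via
 *                  consume_skb()/kfree_skb() with no socket destructor
 *                  attached, so it never touches sk.
 * skb2 (clone):    owned by the socket after skb_set_owner_r(); its single
 *                  reference is dropped either by the process reading from
 *                  the socket or by __skb_queue_purge() in
 *                  inet_sock_destruct(), while sk is still valid.
 */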