Patchwork net: loopback: fix a dst refcounting issue

login
register
mail settings
Submitter Eric Dumazet
Date Jan. 25, 2013, 5:44 p.m.
Message ID <1359135881.12374.3249.camel@edumazet-glaptop>
Download mbox | patch
Permalink /patch/215818/
State Accepted
Delegated to: David Miller
Headers show

Comments

Eric Dumazet - Jan. 25, 2013, 5:44 p.m.
From: Eric Dumazet <edumazet@google.com>

Ben Greear reported crashes in ip_rcv_finish() on a stress
test involving many macvlans.

We tracked the bug to a dst use after free. ip_rcv_finish()
was calling dst->input() and got garbage for dst->input value.

It appears the bug is in loopback driver, lacking
a skb_dst_force() before calling netif_rx().

As a result, a non refcounted dst, normally protected by a
RCU read_lock section, was escaping this section and could
be freed before the packet being processed.

  [<ffffffff813a3c4d>] loopback_xmit+0x64/0x83
  [<ffffffff81477364>] dev_hard_start_xmit+0x26c/0x35e
  [<ffffffff8147771a>] dev_queue_xmit+0x2c4/0x37c
  [<ffffffff81477456>] ? dev_hard_start_xmit+0x35e/0x35e
  [<ffffffff8148cfa6>] ? eth_header+0x28/0xb6
  [<ffffffff81480f09>] neigh_resolve_output+0x176/0x1a7
  [<ffffffff814ad835>] ip_finish_output2+0x297/0x30d
  [<ffffffff814ad6d5>] ? ip_finish_output2+0x137/0x30d
  [<ffffffff814ad90e>] ip_finish_output+0x63/0x68
  [<ffffffff814ae412>] ip_output+0x61/0x67
  [<ffffffff814ab904>] dst_output+0x17/0x1b
  [<ffffffff814adb6d>] ip_local_out+0x1e/0x23
  [<ffffffff814ae1c4>] ip_queue_xmit+0x315/0x353
  [<ffffffff814adeaf>] ? ip_send_unicast_reply+0x2cc/0x2cc
  [<ffffffff814c018f>] tcp_transmit_skb+0x7ca/0x80b
  [<ffffffff814c3571>] tcp_connect+0x53c/0x587
  [<ffffffff810c2f0c>] ? getnstimeofday+0x44/0x7d
  [<ffffffff810c2f56>] ? ktime_get_real+0x11/0x3e
  [<ffffffff814c6f9b>] tcp_v4_connect+0x3c2/0x431
  [<ffffffff814d6913>] __inet_stream_connect+0x84/0x287
  [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49
  [<ffffffff8108d695>] ? _local_bh_enable_ip+0x84/0x9f
  [<ffffffff8108d6c8>] ? local_bh_enable+0xd/0x11
  [<ffffffff8146763c>] ? lock_sock_nested+0x6e/0x79
  [<ffffffff814d6b38>] ? inet_stream_connect+0x22/0x49
  [<ffffffff814d6b49>] inet_stream_connect+0x33/0x49
  [<ffffffff814632c6>] sys_connect+0x75/0x98

This bug was introduced in linux-2.6.35, in commit
7fee226ad2397b (net: add a noref bit on skb dst)

skb_dst_force() is enforced in dev_queue_xmit() for devices having a
qdisc. 

Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Ben Greear <greearb@candelatech.com>
---
 drivers/net/loopback.c |    5 +++++
 1 file changed, 5 insertions(+)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 27, 2013, 6:32 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 25 Jan 2013 09:44:41 -0800

> From: Eric Dumazet <edumazet@google.com>
> 
> Ben Greear reported crashes in ip_rcv_finish() on a stress
> test involving many macvlans.
> 
> We tracked the bug to a dst use after free. ip_rcv_finish()
> was calling dst->input() and got garbage for dst->input value.
> 
> It appears the bug is in loopback driver, lacking
> a skb_dst_force() before calling netif_rx().
> 
> As a result, a non refcounted dst, normally protected by a
> RCU read_lock section, was escaping this section and could
> be freed before the packet being processed.
 ...
> This bug was introduced in linux-2.6.35, in commit
> 7fee226ad2397b (net: add a noref bit on skb dst)
> 
> skb_dst_force() is enforced in dev_queue_xmit() for devices having a
> qdisc. 
> 
> Reported-by: Ben Greear <greearb@candelatech.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Tested-by: Ben Greear <greearb@candelatech.com>

Applied, I suspect we need to tree wide audit for other instances of
this situation.  Probably greps for skb_orphan() would be a good
start.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet - Jan. 27, 2013, 5:25 p.m.
On Sun, 2013-01-27 at 01:32 -0500, David Miller wrote:

> Applied, I suspect we need to tree wide audit for other instances of
> this situation.  Probably greps for skb_orphan() would be a good
> start.

I highly suspect macvlan needs a skb_dst_drop(), but couldnt trigger
a case to trigger the fault.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 28, 2013, 12:26 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sun, 27 Jan 2013 09:25:08 -0800

> On Sun, 2013-01-27 at 01:32 -0500, David Miller wrote:
> 
>> Applied, I suspect we need to tree wide audit for other instances of
>> this situation.  Probably greps for skb_orphan() would be a good
>> start.
> 
> I highly suspect macvlan needs a skb_dst_drop(), but couldnt trigger
> a case to trigger the fault.

I completely agree, when we hit the ->forward() or ->receive() method
we are leaving the RCU safety zone for the cached dst.

Even if you can't trigger it, this seems clear enough to me that we
should put those skb_dst_drop() calls in, so please formally submit
that patch you posted for Ben as an RFC last week.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/loopback.c b/drivers/net/loopback.c
index 81f8f9e..fcbf680 100644
--- a/drivers/net/loopback.c
+++ b/drivers/net/loopback.c
@@ -77,6 +77,11 @@  static netdev_tx_t loopback_xmit(struct sk_buff *skb,
 
 	skb_orphan(skb);
 
+	/* Before queueing this packet to netif_rx(),
+	 * make sure dst is refcounted.
+	 */
+	skb_dst_force(skb);
+
 	skb->protocol = eth_type_trans(skb, dev);
 
 	/* it's OK to use per_cpu_ptr() because BHs are off */