diff mbox

ipv4: dst_entry leak in ip_append_data()

Message ID 543BA6BE.3040509@parallels.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Vasily Averin Oct. 13, 2014, 10:17 a.m. UTC
Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()")

If sk_write_queue is empty ip_append_data() executes ip_setup_cork()
that "steals" dst entry from rt to cork. Later it calls __ip_append_data()
that creates skb and adds it to sk_write_queue.

If skb was added successfully following ip_push_pending_frames() call
reassign dst entry from cork to skb, and kfree_skb frees dst_entry.

However nobody frees stolen dst_entry if skb was not added into sk_write_queue.

Signed-off-by: Vasily Averin <vvs@parallels.com>
---
 net/ipv4/ip_output.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

Comments

David Miller Oct. 13, 2014, 4:03 p.m. UTC | #1
From: Vasily Averin <vvs@parallels.com>
Date: Mon, 13 Oct 2014 14:17:34 +0400

> @@ -1152,9 +1161,14 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4,
>  		transhdrlen = 0;
>  	}
>  
> -	return __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base,
> +	err = __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base,
>  				sk_page_frag(sk), getfrag,
>  				from, length, transhdrlen, flags);

If you are changing the column of the openning parenthesis of the function
call, you must adjust the indentation of the arguments on the subsequent
lines so that they start exactly at the first column after that openning
parenthesis.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e35b712..cc7b579 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1120,6 +1120,15 @@  static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
 	return 0;
 }
 
+static void ip_cork_release(struct inet_cork *cork)
+{
+	cork->flags &= ~IPCORK_OPT;
+	kfree(cork->opt);
+	cork->opt = NULL;
+	dst_release(cork->dst);
+	cork->dst = NULL;
+}
+
 /*
  *	ip_append_data() and ip_append_page() can make one large IP datagram
  *	from many pieces of data. Each pieces will be holded on the socket
@@ -1152,9 +1161,14 @@  int ip_append_data(struct sock *sk, struct flowi4 *fl4,
 		transhdrlen = 0;
 	}
 
-	return __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base,
+	err = __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base,
 				sk_page_frag(sk), getfrag,
 				from, length, transhdrlen, flags);
+
+	if (skb_queue_empty(&sk->sk_write_queue))
+		ip_cork_release(&inet->cork.base);
+
+	return err;
 }
 
 ssize_t	ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
@@ -1304,15 +1318,6 @@  error:
 	return err;
 }
 
-static void ip_cork_release(struct inet_cork *cork)
-{
-	cork->flags &= ~IPCORK_OPT;
-	kfree(cork->opt);
-	cork->opt = NULL;
-	dst_release(cork->dst);
-	cork->dst = NULL;
-}
-
 /*
  *	Combined all pending IP fragments on the socket as one IP datagram
  *	and push them out.