Deadlock with icmpv6fuzz

Message ID 20090205130149.GA28152@gondor.apana.org.au
State Accepted, archived
Delegated to: David Miller

Commit Message

Herbert Xu Feb. 5, 2009, 1:01 p.m. UTC
On Thu, Jan 29, 2009 at 05:49:54PM -0800, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Wed, 28 Jan 2009 20:35:07 +1100
> 
> > Any volunteers to fix this?
> 
> I'll try to take a stab at it later tonight.

I took a stab at it.

ipv6: Copy cork options in ip6_append_data

As the options passed to ip6_append_data may be ephemeral, we need
to duplicate it for corking.  This patch applies the simplest fix
which is to memdup all the relevant bits.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


Cheers,
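
The deep copy described in the commit message can be sketched in userspace C. This is only an illustration, not the kernel code: struct ext_hdr, struct tx_opts, ext_hdr_dup(), tx_opts_dup() and tx_opts_free() are hypothetical stand-ins for struct ipv6_opt_hdr, struct ipv6_txoptions and the helpers in the patch. The sketch assumes each source header really occupies (hdrlen + 1) * 8 bytes, and it zero-initialises the container with calloc() (a choice made for this sketch) so that cleanup after a partial failure only ever frees NULL or valid pointers.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct ipv6_opt_hdr. */
struct ext_hdr {
	uint8_t nexthdr;
	uint8_t hdrlen;	/* in 8-octet units, not counting the first 8 octets */
	/* variable-length option data follows in memory */
};

/* Hypothetical stand-in for the pointer members of struct ipv6_txoptions. */
struct tx_opts {
	struct ext_hdr *hopopt;
	struct ext_hdr *dst0opt;
	struct ext_hdr *srcrt;
	struct ext_hdr *dst1opt;
};

/* Total size in bytes of one extension header. */
static size_t ext_hdr_size(const struct ext_hdr *h)
{
	return ((size_t)h->hdrlen + 1) * 8;
}

/* Copy one header; NULL for a NULL source or on allocation failure. */
static struct ext_hdr *ext_hdr_dup(const struct ext_hdr *src)
{
	struct ext_hdr *copy;

	if (!src)
		return NULL;
	copy = malloc(ext_hdr_size(src));
	if (copy)
		memcpy(copy, src, ext_hdr_size(src));
	return copy;
}

/* Release a container and every header it owns; free(NULL) is a no-op. */
static void tx_opts_free(struct tx_opts *o)
{
	if (!o)
		return;
	free(o->hopopt);
	free(o->dst0opt);
	free(o->srcrt);
	free(o->dst1opt);
	free(o);
}

/* Deep-copy the container so the copy outlives the caller's buffers. */
static struct tx_opts *tx_opts_dup(const struct tx_opts *src)
{
	struct tx_opts *o = calloc(1, sizeof(*o));

	if (!o)
		return NULL;
	o->hopopt = ext_hdr_dup(src->hopopt);
	o->dst0opt = ext_hdr_dup(src->dst0opt);
	o->srcrt = ext_hdr_dup(src->srcrt);
	o->dst1opt = ext_hdr_dup(src->dst1opt);

	/* A header that was present in src but is missing in o means a failed copy. */
	if ((src->hopopt && !o->hopopt) || (src->dst0opt && !o->dst0opt) ||
	    (src->srcrt && !o->srcrt) || (src->dst1opt && !o->dst1opt)) {
		tx_opts_free(o);
		return NULL;
	}
	return o;
}
```

A caller would hand tx_opts_dup() options that may live on its stack, keep the returned copy for as long as the socket stays corked, and call tx_opts_free() when the cork is released.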

Comments

Eric Sesterhenn Feb. 5, 2009, 2:31 p.m. UTC | #1
* Herbert Xu (herbert@gondor.apana.org.au) wrote:
> On Thu, Jan 29, 2009 at 05:49:54PM -0800, David Miller wrote:
> > From: Herbert Xu <herbert@gondor.apana.org.au>
> > Date: Wed, 28 Jan 2009 20:35:07 +1100
> > 
> > > Any volunteers to fix this?
> > 
> > I'll try to take a stab at it later tonight.
> 
> I took a stab at it.
> 
> ipv6: Copy cork options in ip6_append_data
> 
> As the options passed to ip6_append_data may be ephemeral, we need
> to duplicate it for corking.  This patch applies the simplest fix
> which is to memdup all the relevant bits.

Thanks, this fixes the issue. I've been running icmpv6fuzz for a while
again and the only issue I saw so far was a page allocation failure:

Kernel is only dirty from your patch

[ 2880.044328] icmpv6fuzz: page allocation failure. order:9, mode:0x40d0
[ 2880.044495] Pid: 10968, comm: icmpv6fuzz Not tainted 2.6.29-rc3-00580-ga2fe994-dirty #239
[ 2880.044694] Call Trace:
[ 2880.044802]  [<c016886a>] __alloc_pages_internal+0x38e/0x3aa
[ 2880.044954]  [<c016889a>] __get_free_pages+0x14/0x24
[ 2880.071336]  [<c018412c>] __kmalloc+0x2e/0x122
[ 2880.071466]  [<c06fa1f9>] ? ipv6_flowlabel_opt+0x1b2/0x7b1
[ 2880.071589]  [<c06fa227>] ipv6_flowlabel_opt+0x1e0/0x7b1
[ 2880.071689]  [<c013e32f>] ? mark_held_locks+0x43/0x5a
[ 2880.071818]  [<c0125ecf>] ? local_bh_enable+0xa1/0xba
[ 2880.071910]  [<c013e4f1>] ? trace_hardirqs_on_caller+0x10d/0x14b
[ 2880.092665]  [<c066cfb6>] ? lock_sock_nested+0xb2/0xbd
[ 2880.092800]  [<c06e831d>] ? ipv6_setsockopt+0x8e/0xb89
[ 2880.092922]  [<c06e8c9e>] ipv6_setsockopt+0xa0f/0xb89
[ 2880.093098]  [<c013fce5>] ? __lock_acquire+0x6a8/0x6fe
[ 2880.093192]  [<c013fce5>] ? __lock_acquire+0x6a8/0x6fe
[ 2880.093323]  [<c0106d8d>] ? native_sched_clock+0x41/0x68
[ 2880.093420]  [<c013be58>] ? lock_release_holdtime+0x9f/0xa7
[ 2880.093541]  [<c0106d8d>] ? native_sched_clock+0x41/0x68
[ 2880.093634]  [<c013bda5>] ? put_lock_stats+0xd/0x21
[ 2880.093748]  [<c013be58>] ? lock_release_holdtime+0x9f/0xa7
[ 2880.093847]  [<c06edf93>] rawv6_setsockopt+0x78/0xe9
[ 2880.093963]  [<c066c9dd>] sock_common_setsockopt+0x13/0x18
[ 2880.094257]  [<c066b098>] sys_setsockopt+0x59/0x77
[ 2880.094424]  [<c066c58a>] sys_socketcall+0x12a/0x17b
[ 2880.094631]  [<c0102dc1>] sysenter_do_call+0x12/0x31
[ 2880.094797] Mem-Info:
[ 2880.094961] DMA per-cpu:
[ 2880.095200] CPU    0: hi:    0, btch:   1 usd:   0
[ 2880.095363] Normal per-cpu:
[ 2880.095536] CPU    0: hi:   90, btch:  15 usd:   5
[ 2880.095703] Active_anon:15024 active_file:510 inactive_anon:36637
[ 2880.095707]  inactive_file:761 unevictable:0 dirty:13 writeback:457 unstable:0
[ 2880.095712]  free:1798 slab:3367 mapped:300 pagetables:387 bounce:0
[ 2880.096242] DMA free:300kB min:120kB low:148kB high:180kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15752kB pages_scanned:0 all_unreclaimable? no
[ 2880.096523] lowmem_reserve[]: 0 238 238 238
[ 2880.096838] Normal free:6892kB min:1912kB low:2388kB high:2868kB active_anon:60096kB inactive_anon:146548kB active_file:2040kB inactive_file:3044kB unevictable:0kB present:243824kB pages_scanned:0 all_unreclaimable? no
[ 2880.097479] lowmem_reserve[]: 0 0 0 0
[ 2880.097749] DMA: 1*4kB 1*8kB 0*16kB 1*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 300kB
[ 2880.098406] Normal: 415*4kB 130*8kB 10*16kB 4*32kB 5*64kB 2*128kB 1*256kB 2*512kB 2*1024kB 0*2048kB 0*4096kB = 6892kB
[ 2880.098994] 24927 total pagecache pages
[ 2880.130231] 23693 pages in swap cache
[ 2880.130343] Swap cache stats: add 424475, delete 400782, find 17623/61701
[ 2880.130979] Free swap  = 311644kB
[ 2880.131249] Total swap = 746980kB
[ 2880.225394] 65532 pages RAM
[ 2880.225551] 0 pages HighMem
[ 2880.225677] 4932 pages reserved
[ 2880.225772] 1554 pages shared
[ 2880.225889] 57445 pages non-shared

Greetings, Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roland Dreier Feb. 5, 2009, 10:24 p.m. UTC | #2
 > [ 2880.044328] icmpv6fuzz: page allocation failure. order:9, mode:0x40d0
 > [ 2880.044495] Pid: 10968, comm: icmpv6fuzz Not tainted 2.6.29-rc3-00580-ga2fe994-dirty #239
 > [ 2880.044694] Call Trace:
 > [ 2880.044802]  [<c016886a>] __alloc_pages_internal+0x38e/0x3aa
 > [ 2880.044954]  [<c016889a>] __get_free_pages+0x14/0x24
 > [ 2880.071336]  [<c018412c>] __kmalloc+0x2e/0x122
 > [ 2880.071589]  [<c06fa227>] ipv6_flowlabel_opt+0x1e0/0x7b1
 > [ 2880.092922]  [<c06e8c9e>] ipv6_setsockopt+0xa0f/0xb89

From a quick scan of the code, it looks as if optlen is never sanity
checked in the case of setsockopt(IPV6_FLOWLABEL_MGR), and
ipv6_flowlabel_opt() calls into fl_create() with whatever value
userspace passes in, which then pretty much does kmalloc(optlen).
So if icmpv6fuzz passes some big random value, it can cause this failure.

I don't know what the appropriate limit should be, so no patch, sorry.

 - R.
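
To make the "order:9" in the trace concrete, here is a small userspace sketch of the arithmetic. It assumes 4 kB pages, which the 4kB entries in the buddy lists above suggest, and it assumes a large kmalloc() is rounded up to a power-of-two number of pages, as the __kmalloc -> __get_free_pages path in the trace hints. alloc_order() and order_bytes() are hypothetical helpers written for this note, not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 kB pages, as the buddy lists in the log suggest. */
#define PAGE_SIZE_BYTES ((size_t)4096)

/*
 * Smallest order n such that (PAGE_SIZE_BYTES << n) >= size.
 * The second loop condition stops before the shift could overflow,
 * so absurdly large sizes return the largest representable order.
 */
static unsigned int alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = PAGE_SIZE_BYTES;

	while (span < size && span <= SIZE_MAX / 2) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Bytes of physically contiguous memory that an allocation of this order needs. */
static size_t order_bytes(unsigned int order)
{
	return PAGE_SIZE_BYTES << order;
}
```

Under these assumptions, order 9 corresponds to 512 contiguous pages (2 MB), and any request larger than 1 MB and up to 2 MB lands in that order. The Normal zone line in the log shows 0*2048kB and 0*4096kB free blocks, which is consistent with the allocation failing even though 6892kB was free in total.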
David Miller Feb. 5, 2009, 11:16 p.m. UTC | #3
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 6 Feb 2009 00:01:49 +1100

> ipv6: Copy cork options in ip6_append_data
> 
> As the options passed to ip6_append_data may be ephemeral, we need
> to duplicate it for corking.  This patch applies the simplest fix
> which is to memdup all the relevant bits.
> 
> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Great work, applied.
Herbert Xu Feb. 5, 2009, 11:43 p.m. UTC | #4
On Thu, Feb 05, 2009 at 02:24:02PM -0800, Roland Dreier wrote:
>
> From a quick scan of the code, it looks as if optlen is never sanity
> checked in the case of setsockopt(IPV6_FLOWLABEL_MGR), and
> ipv6_flowlabel_opt() calls into fl_create() with whatever value
> userspace passes in, which then pretty much does kmalloc(optlen).
> So if icmpv6fuzz passes some big random value, it can cause this failure.
> 
> I don't know what the appropriate limit should be, so no patch, sorry.

Well it is legal (though unlikely) to pass very long options.
So the real fix would be to avoid copying the control message
at all and modify all the code involved to read user pointers
directly.  Someone with a lot of patience is required for this :)

Cheers,
Patch

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 4b15938..9fb49c3 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1105,6 +1105,18 @@  static inline int ip6_ufo_append_data(struct sock *sk,
 	return err;
 }
 
+static inline struct ipv6_opt_hdr *ip6_opt_dup(struct ipv6_opt_hdr *src,
+					       gfp_t gfp)
+{
+	return src ? kmemdup(src, (src->hdrlen + 1) * 8, gfp) : NULL;
+}
+
+static inline struct ipv6_rt_hdr *ip6_rthdr_dup(struct ipv6_rt_hdr *src,
+						gfp_t gfp)
+{
+	return src ? kmemdup(src, (src->hdrlen + 1) * 8, gfp) : NULL;
+}
+
 int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	int offset, int len, int odd, struct sk_buff *skb),
 	void *from, int length, int transhdrlen,
@@ -1130,17 +1142,37 @@  int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 		 * setup for corking
 		 */
 		if (opt) {
-			if (np->cork.opt == NULL) {
-				np->cork.opt = kmalloc(opt->tot_len,
-						       sk->sk_allocation);
-				if (unlikely(np->cork.opt == NULL))
-					return -ENOBUFS;
-			} else if (np->cork.opt->tot_len < opt->tot_len) {
-				printk(KERN_DEBUG "ip6_append_data: invalid option length\n");
+			if (WARN_ON(np->cork.opt))
 				return -EINVAL;
-			}
-			memcpy(np->cork.opt, opt, opt->tot_len);
-			inet->cork.flags |= IPCORK_OPT;
+
+			np->cork.opt = kmalloc(opt->tot_len, sk->sk_allocation);
+			if (unlikely(np->cork.opt == NULL))
+				return -ENOBUFS;
+
+			np->cork.opt->tot_len = opt->tot_len;
+			np->cork.opt->opt_flen = opt->opt_flen;
+			np->cork.opt->opt_nflen = opt->opt_nflen;
+
+			np->cork.opt->dst0opt = ip6_opt_dup(opt->dst0opt,
+							    sk->sk_allocation);
+			if (opt->dst0opt && !np->cork.opt->dst0opt)
+				return -ENOBUFS;
+
+			np->cork.opt->dst1opt = ip6_opt_dup(opt->dst1opt,
+							    sk->sk_allocation);
+			if (opt->dst1opt && !np->cork.opt->dst1opt)
+				return -ENOBUFS;
+
+			np->cork.opt->hopopt = ip6_opt_dup(opt->hopopt,
+							   sk->sk_allocation);
+			if (opt->hopopt && !np->cork.opt->hopopt)
+				return -ENOBUFS;
+
+			np->cork.opt->srcrt = ip6_rthdr_dup(opt->srcrt,
+							    sk->sk_allocation);
+			if (opt->srcrt && !np->cork.opt->srcrt)
+				return -ENOBUFS;
+
 			/* need source address above miyazawa*/
 		}
 		dst_hold(&rt->u.dst);
@@ -1167,8 +1199,7 @@  int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to,
 	} else {
 		rt = (struct rt6_info *)inet->cork.dst;
 		fl = &inet->cork.fl;
-		if (inet->cork.flags & IPCORK_OPT)
-			opt = np->cork.opt;
+		opt = np->cork.opt;
 		transhdrlen = 0;
 		exthdrlen = 0;
 		mtu = inet->cork.fragsize;
@@ -1407,9 +1438,15 @@  error:
 
 static void ip6_cork_release(struct inet_sock *inet, struct ipv6_pinfo *np)
 {
-	inet->cork.flags &= ~IPCORK_OPT;
-	kfree(np->cork.opt);
-	np->cork.opt = NULL;
+	if (np->cork.opt) {
+		kfree(np->cork.opt->dst0opt);
+		kfree(np->cork.opt->dst1opt);
+		kfree(np->cork.opt->hopopt);
+		kfree(np->cork.opt->srcrt);
+		kfree(np->cork.opt);
+		np->cork.opt = NULL;
+	}
+
 	if (inet->cork.dst) {
 		dst_release(inet->cork.dst);
 		inet->cork.dst = NULL;