Patchwork UNIX: Do not loop forever at unix_autobind().

login
register
mail settings
Submitter Tetsuo Handa
Date Sept. 4, 2010, 6:58 a.m.
Message ID <201009040658.o846wxnU028775@www262.sakura.ne.jp>
Download mbox | patch
Permalink /patch/63757/
State Superseded
Delegated to: David Miller
Headers show

Comments

Tetsuo Handa - Sept. 4, 2010, 6:58 a.m.
From a67ccbb8033993df29f26bde9944e37bffe4fc1b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Sat, 4 Sep 2010 15:22:22 +0900
Subject: [PATCH] UNIX: Do not loop forever at unix_autobind().

We assumed that unix_autobind() never fails if kzalloc() succeeded.
But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
consume all names using fork()/socket()/bind().

If all names are in use, those who call bind() with addr_len == sizeof(short)
or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue

  while (1)
  	yield();

loop at unix_autobind() till a name becomes available.
This patch changes unix_autobind() to fail if all names are in use.

Note that currently a local user can consume 2GB of kernel memory if the user
is allowed to create and autobind 1048576 UNIX domain sockets. We should
consider adding some restriction for autobind operation.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 net/unix/af_unix.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)
Eric Dumazet - Sept. 4, 2010, 7:11 a.m.
Le samedi 04 septembre 2010 à 15:58 +0900, Tetsuo Handa a écrit :
> From a67ccbb8033993df29f26bde9944e37bffe4fc1b Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Sat, 4 Sep 2010 15:22:22 +0900
> Subject: [PATCH] UNIX: Do not loop forever at unix_autobind().
> 
> We assumed that unix_autobind() never fails if kzalloc() succeeded.
> But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
> larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
> consume all names using fork()/socket()/bind().
> 
> If all names are in use, those who call bind() with addr_len == sizeof(short)
> or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue
> 
>   while (1)
>   	yield();
> 
> loop at unix_autobind() till a name becomes available.
> This patch changes unix_autobind() to fail if all names are in use.
> 
> Note that currently a local user can consume 2GB of kernel memory if the user
> is allowed to create and autobind 1048576 UNIX domain sockets. We should
> consider adding some restriction for autobind operation.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
>  net/unix/af_unix.c |    8 ++++++++
>  1 files changed, 8 insertions(+), 0 deletions(-)
> 
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 4414a18..46fc6b2 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -692,6 +692,7 @@ static int unix_autobind(struct socket *sock)
>  	static u32 ordernum = 1;
>  	struct unix_address *addr;
>  	int err;
> +	u32 stop_ordernum;
>  
>  	mutex_lock(&u->readlock);
>  
> @@ -706,6 +707,7 @@ static int unix_autobind(struct socket *sock)
>  
>  	addr->name->sun_family = AF_UNIX;
>  	atomic_set(&addr->refcnt, 1);
> +	stop_ordernum = ordernum;
>  
>  retry:
>  	addr->len = sprintf(addr->name->sun_path+1, "%05x", ordernum) + 1 + sizeof(short);
> @@ -720,6 +722,12 @@ retry:
>  		/* Sanity yield. It is unusual case, but yet... */
>  		if (!(ordernum&0xFF))
>  			yield();
> +		/* Give up if all names are in use. */
> +		if (ordernum == stop_ordernum) {
> +			err = -ENOMEM;
> +			kfree(addr);
> +			goto out;
> +		}
>  		goto retry;
>  	}
>  	addr->hash ^= sk->sk_type;


Sorry, this wont work very well if you have many processes using
autobind(). Some of them will loop many time before hitting
"stop_ordernum".

unsigned int counter;

...

if (++maxtries == 1<<20) {
	...
}


This is a pathological situation. We are not forced to give a successful
autobind() when so many sockets are in use, even if some slots are
available.

Thanks


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
=?ISO-8859-2?Q?Micha=B3_Miros=B3aw?= - Sept. 4, 2010, 11:52 a.m.
2010/9/4 Eric Dumazet <eric.dumazet@gmail.com>:
> Le samedi 04 septembre 2010 à 15:58 +0900, Tetsuo Handa a écrit :
>> From a67ccbb8033993df29f26bde9944e37bffe4fc1b Mon Sep 17 00:00:00 2001
>> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Date: Sat, 4 Sep 2010 15:22:22 +0900
>> Subject: [PATCH] UNIX: Do not loop forever at unix_autobind().
>>
>> We assumed that unix_autobind() never fails if kzalloc() succeeded.
>> But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
>> larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
>> consume all names using fork()/socket()/bind().
>>
>> If all names are in use, those who call bind() with addr_len == sizeof(short)
>> or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue
>>
>>   while (1)
>>       yield();
>>
>> loop at unix_autobind() till a name becomes available.
>> This patch changes unix_autobind() to fail if all names are in use.
>>
>> Note that currently a local user can consume 2GB of kernel memory if the user
>> is allowed to create and autobind 1048576 UNIX domain sockets. We should
>> consider adding some restriction for autobind operation.
[cut patch]
> Sorry, this wont work very well if you have many processes using
> autobind(). Some of them will loop many time before hitting
> "stop_ordernum".
>
> unsigned int counter;
>
> ...
>
> if (++maxtries == 1<<20) {
>        ...
> }
>
>
> This is a pathological situation. We are not forced to give a successful
> autobind() when so many sockets are in use, even if some slots are
> available.

Is there any specific requirement on generated names for auto-bind?
Wouldn't it be easier and more efficient to use some pseudo-random
name. i.e. derived from current time and/or owning process state?

Best Regards,
Michał Mirosław
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4414a18..46fc6b2 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -692,6 +692,7 @@  static int unix_autobind(struct socket *sock)
 	static u32 ordernum = 1;
 	struct unix_address *addr;
 	int err;
+	u32 stop_ordernum;
 
 	mutex_lock(&u->readlock);
 
@@ -706,6 +707,7 @@  static int unix_autobind(struct socket *sock)
 
 	addr->name->sun_family = AF_UNIX;
 	atomic_set(&addr->refcnt, 1);
+	stop_ordernum = ordernum;
 
 retry:
 	addr->len = sprintf(addr->name->sun_path+1, "%05x", ordernum) + 1 + sizeof(short);
@@ -720,6 +722,12 @@  retry:
 		/* Sanity yield. It is unusual case, but yet... */
 		if (!(ordernum&0xFF))
 			yield();
+		/* Give up if all names are in use. */
+		if (ordernum == stop_ordernum) {
+			err = -ENOMEM;
+			kfree(addr);
+			goto out;
+		}
 		goto retry;
 	}
 	addr->hash ^= sk->sk_type;