Message ID | 1290694299.2858.330.camel@edumazet-laptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Eric Dumazet wrote, at 11/25/2010 10:11 PM: > Le jeudi 25 novembre 2010 à 13:35 +0500, Марк Коренберг a écrit : >> quick and dirty fix will be not to allow to pass unix socket inside >> unix socket. I think it would not break much applications. > > Really, if it was not needed, net/unix/garbage.c would not exist at > all... > > It is needed by some apps. > > > [PATCH] af_unix: limit recursion level > > Its easy to eat all kernel memory and trigger NMI watchdog, using an > exploit program that queues unix sockets on top of others. > > lkml ref : http://lkml.org/lkml/2010/11/25/8 > > This mechanism is used in applications, one choice we have is to have a > recursion limit. > > Other limits might be needed as well (if we queue other types of files), > since the passfd mechanism is currently limited by socket receive queue > sizes only. > > Add a recursion_level to unix socket, allowing up to 4 levels. > > Each time we send an unix socket through sendfd mechanism, we copy its > recursion level (plus one) to receiver. This recursion level is cleared > when socket receive queue is emptied. > > Reported-by: Марк Коренберг <socketpair@gmail.com> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> This problem is same as that reported with title "Unix socket local DOS (OOM)", right? After applied this patch, this program can be killed now. but still eat 100% cpu.
Le vendredi 26 novembre 2010 à 12:38 +0800, Shan Wei a écrit : > Eric Dumazet wrote, at 11/25/2010 10:11 PM: > > Le jeudi 25 novembre 2010 à 13:35 +0500, Марк Коренберг a écrit : > >> quick and dirty fix will be not to allow to pass unix socket inside > >> unix socket. I think it would not break much applications. > > > > Really, if it was not needed, net/unix/garbage.c would not exist at > > all... > > > > It is needed by some apps. > > > > > > [PATCH] af_unix: limit recursion level > > > > Its easy to eat all kernel memory and trigger NMI watchdog, using an > > exploit program that queues unix sockets on top of others. > > > > lkml ref : http://lkml.org/lkml/2010/11/25/8 > > > > This mechanism is used in applications, one choice we have is to have a > > recursion limit. > > > > Other limits might be needed as well (if we queue other types of files), > > since the passfd mechanism is currently limited by socket receive queue > > sizes only. > > > > Add a recursion_level to unix socket, allowing up to 4 levels. > > > > Each time we send an unix socket through sendfd mechanism, we copy its > > recursion level (plus one) to receiver. This recursion level is cleared > > when socket receive queue is emptied. > > > > Reported-by: Марк Коренберг <socketpair@gmail.com> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > > This problem is same as that reported with title "Unix socket local DOS (OOM)", right? > After applied this patch, this program can be killed now. but still eat 100% cpu. > Not the same problem, but a different one. In this case, we queue files on top of another and never give a chance to free them, unless the program dies (and full memory eaten) And yes, its eating 100% cpu, since it has no sleep inside, like for (;;) ; -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric Dumazet wrote, at 11/25/2010 10:11 PM: > @@ -1845,6 +1871,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock, > unix_state_lock(sk); > skb = skb_dequeue(&sk->sk_receive_queue); > if (skb == NULL) { > + unix_sk(sk)->recursion_level = 0; For SOCK_SEQPACKET type, no need to clear recursion_level counter?
Eric Dumazet wrote, at 11/26/2010 02:23 PM: > Le vendredi 26 novembre 2010 à 12:38 +0800, Shan Wei a écrit : >> Eric Dumazet wrote, at 11/25/2010 10:11 PM: >>> Le jeudi 25 novembre 2010 à 13:35 +0500, Марк Коренберг a écrit : >>>> quick and dirty fix will be not to allow to pass unix socket inside >>>> unix socket. I think it would not break much applications. >>> >>> Really, if it was not needed, net/unix/garbage.c would not exist at >>> all... >>> >>> It is needed by some apps. >>> >>> >>> [PATCH] af_unix: limit recursion level >>> >>> Its easy to eat all kernel memory and trigger NMI watchdog, using an >>> exploit program that queues unix sockets on top of others. >>> >>> lkml ref : http://lkml.org/lkml/2010/11/25/8 >>> >>> This mechanism is used in applications, one choice we have is to have a >>> recursion limit. >>> >>> Other limits might be needed as well (if we queue other types of files), >>> since the passfd mechanism is currently limited by socket receive queue >>> sizes only. >>> >>> Add a recursion_level to unix socket, allowing up to 4 levels. >>> >>> Each time we send an unix socket through sendfd mechanism, we copy its >>> recursion level (plus one) to receiver. This recursion level is cleared >>> when socket receive queue is emptied. >>> >>> Reported-by: Марк Коренберг <socketpair@gmail.com> >>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> >> >> This problem is same as that reported with title "Unix socket local DOS (OOM)", right? >> After applied this patch, this program can be killed now. but still eat 100% cpu. >> > > Not the same problem, but a different one. > > In this case, we queue files on top of another and never give a chance > to free them, unless the program dies (and full memory eaten) > > And yes, its eating 100% cpu, since it has no sleep inside, like > > for (;;) ; Got it. Thanks. Have a out of topic question. There is some difficulty for me to understand this issue. :-( why can't we kill this program? When send fd[0] to ff[0] socket, fd[0] is in flight and will be add reference value. Athough we close fd[0], their references is still exist. The reason that can't be killed is about the references or about the latest sockets created by socketpair() but never be freeed.
Le vendredi 26 novembre 2010 à 15:41 +0800, Shan Wei a écrit : > Eric Dumazet wrote, at 11/25/2010 10:11 PM: > > @@ -1845,6 +1871,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock, > > unix_state_lock(sk); > > skb = skb_dequeue(&sk->sk_receive_queue); > > if (skb == NULL) { > > + unix_sk(sk)->recursion_level = 0; > > For SOCK_SEQPACKET type, no need to clear recursion_level counter? > > There is no need actually to clear it at all. If an application has a complex setup with a dependence tree of unix sockets, it will break if messages are not read fast enough. So, maybe I should remove this line so that underlying problem comes into surface immediately, rather than while in stress load. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le vendredi 26 novembre 2010 à 09:22 +0100, Eric Dumazet a écrit : > Le vendredi 26 novembre 2010 à 15:41 +0800, Shan Wei a écrit : > > Eric Dumazet wrote, at 11/25/2010 10:11 PM: > > > @@ -1845,6 +1871,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock, > > > unix_state_lock(sk); > > > skb = skb_dequeue(&sk->sk_receive_queue); > > > if (skb == NULL) { > > > + unix_sk(sk)->recursion_level = 0; > > > > For SOCK_SEQPACKET type, no need to clear recursion_level counter? > > > > > > There is no need actually to clear it at all. > > If an application has a complex setup with a dependence tree of unix > sockets, it will break if messages are not read fast enough. > > So, maybe I should remove this line so that underlying problem comes > into surface immediately, rather than while in stress load. > > The whole sendfd feature is fundamentally flawed, since its not a "give this file to another user", but "give a pointer to file structure" As soon as you can pass af_unix sockets, you cannot know if the intransit "refs to file structure" are going to be consumed by one or other user. So a per user limit is not possible. I am not sure it is fixable at all, unless adding a complete graph structure between af_unix sockets that used the sendfd() mechanism. (Its a NxN relationship... pretty hard to track) Yes, we can add limits (global wide), but they could break legacy apps, and a single user could lock in one fd all the tokens. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Eric Dumazet <eric.dumazet@gmail.com> Date: Thu, 25 Nov 2010 15:11:39 +0100 > [PATCH] af_unix: limit recursion level > > Its easy to eat all kernel memory and trigger NMI watchdog, using an > exploit program that queues unix sockets on top of others. > > lkml ref : http://lkml.org/lkml/2010/11/25/8 > > This mechanism is used in applications, one choice we have is to have a > recursion limit. > > Other limits might be needed as well (if we queue other types of files), > since the passfd mechanism is currently limited by socket receive queue > sizes only. > > Add a recursion_level to unix socket, allowing up to 4 levels. > > Each time we send an unix socket through sendfd mechanism, we copy its > recursion level (plus one) to receiver. This recursion level is cleared > when socket receive queue is emptied. > > Reported-by: Марк Коренберг <socketpair@gmail.com> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Ok, since such deep recursive AF_UNIX fd sends is pretty rediculious, it seems this is not likely to hit legitimate use cases and thus I've applied this. Also queued up for -stable. Thanks! -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Le lundi 29 novembre 2010 à 09:46 -0800, David Miller a écrit : > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Thu, 25 Nov 2010 15:11:39 +0100 > > > [PATCH] af_unix: limit recursion level > > > > Its easy to eat all kernel memory and trigger NMI watchdog, using an > > exploit program that queues unix sockets on top of others. > > > > lkml ref : http://lkml.org/lkml/2010/11/25/8 > > > > This mechanism is used in applications, one choice we have is to have a > > recursion limit. > > > > Other limits might be needed as well (if we queue other types of files), > > since the passfd mechanism is currently limited by socket receive queue > > sizes only. > > > > Add a recursion_level to unix socket, allowing up to 4 levels. > > > > Each time we send an unix socket through sendfd mechanism, we copy its > > recursion level (plus one) to receiver. This recursion level is cleared > > when socket receive queue is emptied. > > > > Reported-by: Марк Коренберг <socketpair@gmail.com> > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > > Ok, since such deep recursive AF_UNIX fd sends is pretty > rediculious, it seems this is not likely to hit legitimate > use cases and thus I've applied this. > > Also queued up for -stable. > > Thanks! I tested FreeBSD (latest) and got a kernel freeze as well with exploit program. I dont know yet how to fully fix this problem. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/af_unix.h b/include/net/af_unix.h index 90c9e28..18e5c3f 100644 --- a/include/net/af_unix.h +++ b/include/net/af_unix.h @@ -10,6 +10,7 @@ extern void unix_inflight(struct file *fp); extern void unix_notinflight(struct file *fp); extern void unix_gc(void); extern void wait_for_unix_gc(void); +extern struct sock *unix_get_socket(struct file *filp); #define UNIX_HASH_SIZE 256 @@ -56,6 +57,7 @@ struct unix_sock { spinlock_t lock; unsigned int gc_candidate : 1; unsigned int gc_maybe_cycle : 1; + unsigned char recursion_level; struct socket_wq peer_wq; }; #define unix_sk(__sk) ((struct unix_sock *)__sk) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 3c95304..2268e67 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1343,9 +1343,25 @@ static void unix_destruct_scm(struct sk_buff *skb) sock_wfree(skb); } +#define MAX_RECURSION_LEVEL 4 + static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) { int i; + unsigned char max_level = 0; + int unix_sock_count = 0; + + for (i = scm->fp->count - 1; i >= 0; i--) { + struct sock *sk = unix_get_socket(scm->fp->fp[i]); + + if (sk) { + unix_sock_count++; + max_level = max(max_level, + unix_sk(sk)->recursion_level); + } + } + if (unlikely(max_level > MAX_RECURSION_LEVEL)) + return -ETOOMANYREFS; /* * Need to duplicate file references for the sake of garbage @@ -1356,9 +1372,11 @@ static int unix_attach_fds(struct scm_cookie *scm, struct sk_buff *skb) if (!UNIXCB(skb).fp) return -ENOMEM; - for (i = scm->fp->count-1; i >= 0; i--) - unix_inflight(scm->fp->fp[i]); - return 0; + if (unix_sock_count) { + for (i = scm->fp->count - 1; i >= 0; i--) + unix_inflight(scm->fp->fp[i]); + } + return max_level; } static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds) @@ -1393,6 +1411,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock, struct sk_buff *skb; long timeo; struct scm_cookie tmp_scm; + int max_level; if (NULL == siocb->scm) siocb->scm = &tmp_scm; @@ -1431,8 +1450,9 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock, goto out; err = unix_scm_to_skb(siocb->scm, skb, true); - if (err) + if (err < 0) goto out_free; + max_level = err + 1; unix_get_secdata(siocb->scm, skb); skb_reset_transport_header(skb); @@ -1514,6 +1534,8 @@ restart: if (sock_flag(other, SOCK_RCVTSTAMP)) __net_timestamp(skb); skb_queue_tail(&other->sk_receive_queue, skb); + if (max_level > unix_sk(other)->recursion_level) + unix_sk(other)->recursion_level = max_level; unix_state_unlock(other); other->sk_data_ready(other, len); sock_put(other); @@ -1544,6 +1566,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock, int sent = 0; struct scm_cookie tmp_scm; bool fds_sent = false; + int max_level; if (NULL == siocb->scm) siocb->scm = &tmp_scm; @@ -1607,10 +1630,11 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock, /* Only send the fds in the first buffer */ err = unix_scm_to_skb(siocb->scm, skb, !fds_sent); - if (err) { + if (err < 0) { kfree_skb(skb); goto out_err; } + max_level = err + 1; fds_sent = true; err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size); @@ -1626,6 +1650,8 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock, goto pipe_err_free; skb_queue_tail(&other->sk_receive_queue, skb); + if (max_level > unix_sk(other)->recursion_level) + unix_sk(other)->recursion_level = max_level; unix_state_unlock(other); other->sk_data_ready(other, size); sent += size; @@ -1845,6 +1871,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock, unix_state_lock(sk); skb = skb_dequeue(&sk->sk_receive_queue); if (skb == NULL) { + unix_sk(sk)->recursion_level = 0; if (copied >= target) goto unlock; diff --git a/net/unix/garbage.c b/net/unix/garbage.c index c8df6fd..a2d99a5 100644 --- a/net/unix/garbage.c +++ b/net/unix/garbage.c @@ -96,7 +96,7 @@ static DECLARE_WAIT_QUEUE_HEAD(unix_gc_wait); unsigned int unix_tot_inflight; -static struct sock *unix_get_socket(struct file *filp) +struct sock *unix_get_socket(struct file *filp) { struct sock *u_sock = NULL; struct inode *inode = filp->f_path.dentry->d_inode;