Message ID | 53304186.4000909@redhat.com |
---|---|
State | New |
Headers | show |
On Mon, 24 Mar 2014, Florian Weimer wrote: > Check for syscall error in the SETXID implementation in NPTL (bug 13347). > > At this point, we can only abort the process because we have already switched > credentials on other threads. Returning an error would still leave the > process in an inconsistent state. This may be the best possible in the absence of a kernel interface for setting ids atomically for the whole process, but such an interface would be the desired long-term fix, with aborting from the present code just a fallback - is there ongoing work to agree such an interface?
On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote: > On Mon, 24 Mar 2014, Florian Weimer wrote: > > > Check for syscall error in the SETXID implementation in NPTL (bug 13347). > > > > At this point, we can only abort the process because we have already switched > > credentials on other threads. Returning an error would still leave the > > process in an inconsistent state. > > This may be the best possible in the absence of a kernel interface for > setting ids atomically for the whole process, but such an interface would > be the desired long-term fix, with aborting from the present code just a > fallback - is there ongoing work to agree such an interface? Are you sure you can't make it so that all setuid calls but the first can't fail? There are basically only 2 reasons it might fail: ulimit on old kernels (this was fixed in early 3.x series), and failure to allocate the kernel data structure to represent the new permissions (they're refcounted/shared, so this failure can only happen on the first one under normal circumstances, but might be possible still if setfsuid, etc. are in use). I'm very much in favor of pressing the kernel folks for a syscall that makes the change atomically for all threads, but this is vastly complicated by the fact that they're storing permissions which should be thread-local (fsuid) in the same object as ones which should be process-global. So implementing this correctly at the kernel level might require radical changes to how they represent permissions... :( Rich
On 03/24/2014 04:19 PM, Joseph S. Myers wrote: > On Mon, 24 Mar 2014, Florian Weimer wrote: > >> Check for syscall error in the SETXID implementation in NPTL (bug 13347). >> >> At this point, we can only abort the process because we have already switched >> credentials on other threads. Returning an error would still leave the >> process in an inconsistent state. > > This may be the best possible in the absence of a kernel interface for > setting ids atomically for the whole process, but such an interface would > be the desired long-term fix, with aborting from the present code just a > fallback - is there ongoing work to agree such an interface? As far as I know, the credentials switching work goes in the other direction, providing additional per-thread credentials to glibc-based userspace: <http://thread.gmane.org/gmane.linux.file-systems/81751> Probably like most kernel developers, I'm not convinced that the POSIX semantics are useful.
On 03/24/2014 04:32 PM, Rich Felker wrote: > On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote: >> On Mon, 24 Mar 2014, Florian Weimer wrote: >> >>> Check for syscall error in the SETXID implementation in NPTL (bug 13347). >>> >>> At this point, we can only abort the process because we have already switched >>> credentials on other threads. Returning an error would still leave the >>> process in an inconsistent state. >> >> This may be the best possible in the absence of a kernel interface for >> setting ids atomically for the whole process, but such an interface would >> be the desired long-term fix, with aborting from the present code just a >> fallback - is there ongoing work to agree such an interface? > > Are you sure you can't make it so that all setuid calls but the first > can't fail? We already are in this situation if application only ever uses the SETXID wrappers, I think. That's why the test has to resort to directly invoking the syscall.
Florian Weimer <fweimer@redhat.com> writes: > Probably like most kernel developers, I'm not convinced that the POSIX > semantics are useful. Has anybody raised this on the Austin Group? Andreas.
On Mon, Mar 24, 2014 at 04:44:04PM +0100, Florian Weimer wrote: > On 03/24/2014 04:32 PM, Rich Felker wrote: > >On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote: > >>On Mon, 24 Mar 2014, Florian Weimer wrote: > >> > >>>Check for syscall error in the SETXID implementation in NPTL (bug 13347). > >>> > >>>At this point, we can only abort the process because we have already switched > >>>credentials on other threads. Returning an error would still leave the > >>>process in an inconsistent state. > >> > >>This may be the best possible in the absence of a kernel interface for > >>setting ids atomically for the whole process, but such an interface would > >>be the desired long-term fix, with aborting from the present code just a > >>fallback - is there ongoing work to agree such an interface? > > > >Are you sure you can't make it so that all setuid calls but the first > >can't fail? > > We already are in this situation if application only ever uses the > SETXID wrappers, I think. That's why the test has to resort to > directly invoking the syscall. I don't understand how your message is an answer to the question you quoted. I was asking whether there might be a way to setup the conditions prior to making the setuid syscalls such that if the first one succeeds, the subsequent ones cannot fail. I'll admit it's unlikely that this is possible, but you can reduce the chances of failure by temporaily setting RLIMIT_NPROC to infinity before making the syscalls. Rich
On 03/24/2014 04:49 PM, Rich Felker wrote: > On Mon, Mar 24, 2014 at 04:44:04PM +0100, Florian Weimer wrote: >> On 03/24/2014 04:32 PM, Rich Felker wrote: >>> On Mon, Mar 24, 2014 at 03:19:59PM +0000, Joseph S. Myers wrote: >>>> On Mon, 24 Mar 2014, Florian Weimer wrote: >>>> >>>>> Check for syscall error in the SETXID implementation in NPTL (bug 13347). >>>>> >>>>> At this point, we can only abort the process because we have already switched >>>>> credentials on other threads. Returning an error would still leave the >>>>> process in an inconsistent state. >>>> >>>> This may be the best possible in the absence of a kernel interface for >>>> setting ids atomically for the whole process, but such an interface would >>>> be the desired long-term fix, with aborting from the present code just a >>>> fallback - is there ongoing work to agree such an interface? >>> >>> Are you sure you can't make it so that all setuid calls but the first >>> can't fail? >> >> We already are in this situation if application only ever uses the >> SETXID wrappers, I think. That's why the test has to resort to >> directly invoking the syscall. > > I don't understand how your message is an answer to the question you > quoted. The negations and quantifiers are confusing, so I wasn't sure what you were asking. > I was asking whether there might be a way to setup the > conditions prior to making the setuid syscalls such that if the first > one succeeds, the subsequent ones cannot fail. Not in general, no, because the kernel implementation calls into the Linux Security Module framework, whose modules typically implement additional preconditions we cannot check in glibc due to insufficient information.
On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote: > >I was asking whether there might be a way to setup the > >conditions prior to making the setuid syscalls such that if the first > >one succeeds, the subsequent ones cannot fail. > > Not in general, no, because the kernel implementation calls into the > Linux Security Module framework, whose modules typically implement > additional preconditions we cannot check in glibc due to > insufficient information. Yes, I'm well aware of the Linux Insecurity Modules framework. Any framework that can make standard functions with documented interface contracts violate their own interface contracts subtracts from the security of a system rather than adding to it, and I really have no problem with telling users this if they're running broken Insecurity Modules. But back to the topic, I was assuming correct behavior from the kernel. If the kernel misbehaves, aborting is a perfectly reasonable response (but if LSM's make the kernel lie, can you even tell if it misbehaved?). Rich
On Mon, Mar 24, 2014 at 04:41:59PM +0100, Florian Weimer wrote: > On 03/24/2014 04:19 PM, Joseph S. Myers wrote: > >On Mon, 24 Mar 2014, Florian Weimer wrote: > > > >>Check for syscall error in the SETXID implementation in NPTL (bug 13347). > >> > >>At this point, we can only abort the process because we have already switched > >>credentials on other threads. Returning an error would still leave the > >>process in an inconsistent state. > > > >This may be the best possible in the absence of a kernel interface for > >setting ids atomically for the whole process, but such an interface would > >be the desired long-term fix, with aborting from the present code just a > >fallback - is there ongoing work to agree such an interface? > > As far as I know, the credentials switching work goes in the other > direction, providing additional per-thread credentials to > glibc-based userspace: > > <http://thread.gmane.org/gmane.linux.file-systems/81751> > > Probably like most kernel developers, I'm not convinced that the > POSIX semantics are useful. It is a critical security flaw to have multiple tasks (threads/processes) running in the same virtual address space with different privileges. I have described potential attacks for this situation before; I could lookup the references if you care. BTW, normally it's a bad idea to call setuid in a multithreaded program anyway. However the situation does arise quite often, especially in programs written in higher-level languages like Java where the language runtime may have started threads for its own internal purposes before the program drops privileges via a FFI-like interface to the low-level libc functions like setuid. I've seen this issue before in software I've audited. So glibc needs to handle it correctly, Linux should too at the kernel level, and the POSIX-specified behavior is "right" from a security standpoint. Rich
On Mon 24 Mar 2014 12:26:32 Rich Felker wrote: > On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote: > > >I was asking whether there might be a way to setup the > > >conditions prior to making the setuid syscalls such that if the first > > >one succeeds, the subsequent ones cannot fail. > > > > Not in general, no, because the kernel implementation calls into the > > Linux Security Module framework, whose modules typically implement > > additional preconditions we cannot check in glibc due to > > insufficient information. > > Yes, I'm well aware of the Linux Insecurity Modules framework. Any > framework that can make standard functions with documented interface > contracts violate their own interface contracts subtracts from the > security of a system rather than adding to it, and I really have no > problem with telling users this if they're running broken Insecurity > Modules. > > But back to the topic, I was assuming correct behavior from the > kernel. If the kernel misbehaves, aborting is a perfectly reasonable > response (but if LSM's make the kernel lie, can you even tell if it > misbehaved?). trying to stack the deck against failure is a good idea, but that is orthogonal to checking the return value. there's no good reason at all to not check & abort when the call fails. -mike
On Mon, Mar 24, 2014 at 03:16:44PM -0400, Mike Frysinger wrote: > On Mon 24 Mar 2014 12:26:32 Rich Felker wrote: > > On Mon, Mar 24, 2014 at 04:57:23PM +0100, Florian Weimer wrote: > > > >I was asking whether there might be a way to setup the > > > >conditions prior to making the setuid syscalls such that if the first > > > >one succeeds, the subsequent ones cannot fail. > > > > > > Not in general, no, because the kernel implementation calls into the > > > Linux Security Module framework, whose modules typically implement > > > additional preconditions we cannot check in glibc due to > > > insufficient information. > > > > Yes, I'm well aware of the Linux Insecurity Modules framework. Any > > framework that can make standard functions with documented interface > > contracts violate their own interface contracts subtracts from the > > security of a system rather than adding to it, and I really have no > > problem with telling users this if they're running broken Insecurity > > Modules. > > > > But back to the topic, I was assuming correct behavior from the > > kernel. If the kernel misbehaves, aborting is a perfectly reasonable > > response (but if LSM's make the kernel lie, can you even tell if it > > misbehaved?). > > trying to stack the deck against failure is a good idea, but that is > orthogonal to checking the return value. there's no good reason at all to not > check & abort when the call fails. > -mike Agreed. Rich
On 03/25/2014 04:07 AM, Rich Felker wrote: > On Mon, Mar 24, 2014 at 03:16:44PM -0400, Mike Frysinger wrote: >>> But back to the topic, I was assuming correct behavior from the >>> kernel. If the kernel misbehaves, aborting is a perfectly reasonable >>> response (but if LSM's make the kernel lie, can you even tell if it >>> misbehaved?). >> >> trying to stack the deck against failure is a good idea, but that is >> orthogonal to checking the return value. there's no good reason at all to not >> check & abort when the call fails. >> -mike > > Agreed. So what about the patch? I have put kernel support on my to-do list, but I have other kernel items that I want to deal with first.
On 03/24/2014 07:22 PM, Rich Felker wrote: > It is a critical security flaw to have multiple tasks > (threads/processes) running in the same virtual address space with > different privileges. I have described potential attacks for this > situation before; I could lookup the references if you care. I think you are wrong—the kernel does it all the time. It is okay as long as you can control what code you run. We already support it through setfsuid/setfsgid, which is per-thread, not per-process.
On 03/24/2014 04:41 PM, Florian Weimer wrote: > As far as I know, the credentials switching work goes in the other > direction, providing additional per-thread credentials to glibc-based > userspace: > > <http://thread.gmane.org/gmane.linux.file-systems/81751> > > Probably like most kernel developers, I'm not convinced that the POSIX > semantics are useful. The credentials switching system call discussion just recommenced: <http://marc.info/?l=linux-fsdevel&m=139587987829677> I'll take a breath by sending more mail to that thread. :-/
On Thu, Mar 27, 2014 at 02:38:54PM +0100, Florian Weimer wrote: > On 03/24/2014 07:22 PM, Rich Felker wrote: > > >It is a critical security flaw to have multiple tasks > >(threads/processes) running in the same virtual address space with > >different privileges. I have described potential attacks for this > >situation before; I could lookup the references if you care. > > I think you are wrong—the kernel does it all the time. It is okay The kernel does a lot of things that are not good security practices. Appeal to authority is not a valid argument. > as long as you can control what code you run. > > We already support it through setfsuid/setfsgid, which is > per-thread, not per-process. The whole reason you're changing uids is because you can't be sure about what code you run; your code could (and probably does) have arbitrary code execution vulnerabilities you're not aware of. This is why you use privilege separation enforced by the kernel/mmu/etc. rather than just assuming your code is all safe and correct. If, after calling setuid with the intent of fully dropping root privileges, the code that subsequently runs has write access to memory used by a code that continues to run with root privileges, it's trivial for an exploit in the "unprivileged" code to elevate itself to compromise of the code that's still running with privileges. Rich
On 03/27/2014 04:21 PM, Rich Felker wrote: >> We already support it through setfsuid/setfsgid, which is >> per-thread, not per-process. > > The whole reason you're changing uids is because you can't be sure > about what code you run; Ah, no, you can also change credentials to impersonate a user and access resources with the privileges of that user. A file server does this, for example.
On Thu, Mar 27, 2014 at 04:27:46PM +0100, Florian Weimer wrote: > On 03/27/2014 04:21 PM, Rich Felker wrote: > > >>We already support it through setfsuid/setfsgid, which is > >>per-thread, not per-process. > > > >The whole reason you're changing uids is because you can't be sure > >about what code you run; > > Ah, no, you can also change credentials to impersonate a user and > access resources with the privileges of that user. A file server > does this, for example. That's what setfsuid is for. setuid is pretty much exclusively for dropping privileges. Rich
On Mon 24 Mar 2014 15:30:30 Florian Weimer wrote: > Check for syscall error in the SETXID implementation in NPTL (bug 13347). > > At this point, we can only abort the process because we have already > switched credentials on other threads. Returning an error would still > leave the process in an inconsistent state. > > The new xtest needs root privileges to run. patch looks OK to me. improving the kernel layer is an independent issue. -mike
On 03/24/2014 03:30 PM, Florian Weimer wrote: > Check for syscall error in the SETXID implementation in NPTL (bug 13347). > > At this point, we can only abort the process because we have already > switched credentials on other threads. Returning an error would still > leave the process in an inconsistent state. > > The new xtest needs root privileges to run. It turns out that the patch is wrong/incomplete. The abort needs to be restricted to cases where we actually see inconsistent failure/success behavior. I will work on a fix.
commit 06f66b8da0d3d2914f1d6f66ebf422007c5b00b7 Author: Florian Weimer <fweimer@redhat.com> Date: Mon Mar 24 15:24:02 2014 +0100 Check for syscall error in the SETXID implementation in NPTL (bug 13347). At this point, we can only abort the process because we have already switched credentials on other threads. Returning an error would still leave the process in an inconsistent state. The new xtest needs root privileges to run. diff --git a/ChangeLog b/ChangeLog index e9fdbe7..e797a20 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,10 @@ +2014-03-24 Florian Weimer <fweimer@redhat.com> + + [BZ #13347] + * nptl/nptl-init.c (sighandler_setxid): Check system call result. + * nptl/tst-setuid2.c: New file. + * nptl/Makefile (xtests): Add tst-setuid2. + 2014-03-24 Joseph Myers <joseph@codesourcery.com> [BZ #16284] diff --git a/nptl/Makefile b/nptl/Makefile index 897ac96..2876224 100644 --- a/nptl/Makefile +++ b/nptl/Makefile @@ -270,7 +270,8 @@ tests = tst-typesizes \ tst-vfork1 tst-vfork2 tst-vfork1x tst-vfork2x \ tst-getpid1 tst-getpid2 tst-getpid3 \ tst-initializers1 $(patsubst %,tst-initializers1-%,c89 gnu89 c99 gnu99) -xtests = tst-setuid1 tst-setuid1-static tst-mutexpp1 tst-mutexpp6 tst-mutexpp10 +xtests = tst-setuid1 tst-setuid1-static tst-setuid2 \ + tst-mutexpp1 tst-mutexpp6 tst-mutexpp10 test-srcs = tst-oddstacklimit # Files which must not be linked with libpthread. diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c index 794156b..2796dc5 100644 --- a/nptl/nptl-init.c +++ b/nptl/nptl-init.c @@ -232,6 +232,7 @@ sighandler_setxid (int sig, siginfo_t *si, void *ctx) /* Determine the process ID. It might be negative if the thread is in the middle of a fork() call. */ pid_t pid = THREAD_GETMEM (THREAD_SELF, pid); + int result; if (__glibc_unlikely (pid < 0)) pid = -pid; @@ -245,8 +246,12 @@ sighandler_setxid (int sig, siginfo_t *si, void *ctx) return; INTERNAL_SYSCALL_DECL (err); - INTERNAL_SYSCALL_NCS (__xidcmd->syscall_no, err, 3, __xidcmd->id[0], - __xidcmd->id[1], __xidcmd->id[2]); + result = INTERNAL_SYSCALL_NCS (__xidcmd->syscall_no, err, 3, __xidcmd->id[0], + __xidcmd->id[1], __xidcmd->id[2]); + if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result, err))) + /* Safety check. This should never happen if the setxid system + calls are only ever called through their glibc wrappers. */ + abort (); /* Reset the SETXID flag. */ struct pthread *self = THREAD_SELF; diff --git a/nptl/tst-setuid2.c b/nptl/tst-setuid2.c new file mode 100644 index 0000000..951aecc --- /dev/null +++ b/nptl/tst-setuid2.c @@ -0,0 +1,145 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <errno.h> +#include <pthread.h> +#include <signal.h> +#include <stdbool.h> +#include <stdio.h> +#include <sys/syscall.h> +#include <unistd.h> + +/* Check that a partial setuid failure aborts the process. */ + +static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; +static pthread_cond_t cond_send; +static void (*func_sent) (void); +static pthread_cond_t cond_recv; + +#define FAIL(fmt, ...) \ + do { printf ("FAIL: " fmt "\n", __VA_ARGS__); _exit (1); } while (0) + +static void * +thread_func (void *ctx __attribute__ ((unused))) +{ + int ret = pthread_mutex_lock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_lock (thread): %d", ret); + + while (true) + { + if (func_sent != NULL) + { + void (*func) (void) = func_sent; + ret = pthread_mutex_unlock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_unlock (thread): %d", ret); + func (); + ret = pthread_mutex_lock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_lock (thread): %d", ret); + func_sent = NULL; + ret = pthread_cond_signal (&cond_recv); + if (ret != 0) + FAIL ("pthread_cond_signal (recv): %d", ret); + } + ret = pthread_cond_wait (&cond_send, &mutex); + if (ret != 0) + FAIL ("pthread_cond_wait (send): %d", ret); + } + return NULL; +} + +static void +run_on_thread (void (*func) (void)) +{ + int ret = pthread_mutex_lock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_lock (%s): %d", __func__, ret); + func_sent = func; + ret = pthread_mutex_unlock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_unlock (%s): %d", __func__, ret); + + ret = pthread_cond_signal (&cond_send); + if (ret != 0) + FAIL ("pthread_mutex_lock (%s): %d", __func__, ret); + + ret = pthread_mutex_lock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_lock (%s): %d", __func__, ret); + + while (func_sent != NULL) + { + ret = pthread_cond_wait (&cond_recv, &mutex); + if (ret != 0) + FAIL ("pthread_mutex_wait (%s): %d", __func__, ret); + } + ret = pthread_mutex_unlock (&mutex); + if (ret != 0) + FAIL ("pthread_mutex_unlock (%s): %d", __func__, ret); +} + +static void +change_thread_ids (void) +{ + long ret = syscall (__NR_setresuid, 2001, 2002, 2003); + if (ret != 0) + FAIL ("setresuid (2001, 2002, 2003): %ld", ret); +} + +static uid_t ruid, euid, suid; + +static void +get_thread_ids (void) +{ + if (getresuid (&ruid, &euid, &suid) < 0) + FAIL ("getresuid: %m (%d)", errno); +} + +static void +abort_expected (int signal __attribute__ ((unused))) +{ + _exit (0); +} + +static int +do_test (void) +{ + pthread_t thread; + int ret = pthread_create (&thread, NULL, thread_func, NULL); + if (ret != 0) + FAIL ("pthread_create: %d", ret); + + run_on_thread (change_thread_ids); + + signal (SIGABRT, &abort_expected); + /* This should abort the process. */ + if (setresuid (1001, 1002, 1003) < 0) + FAIL ("setresuid: %m (%d)", errno); + signal (SIGABRT, SIG_DFL); + + /* If we get here, check that the kernel did the right thing. */ + run_on_thread (get_thread_ids); + if (ruid != 1001 || euid != 1002 || euid != 1003) + FAIL ("unexpected UIDs after setuid: %ld, %ld, %ld", + (long) ruid, (long) euid, (long) suid); + return 0; +} + +#define TEST_FUNCTION do_test () +#include "../test-skeleton.c"