Message ID | c72ecf1bce6c98235a76444ca7df38fdc095fbb8.1360062165.git.mprivozn@redhat.com |
---|---|
State | New |
Headers | show |
On 02/05/2013 06:02 AM, Michal Privoznik wrote: > Currently, it we call a not white listed system call, we get killed > immediately without reporting any error. It would be far more useful, > if we can at least shout something on stderr just before dying, so > users know it is because of sandbox, not just random quit. > > Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > --- > os-posix.c | 8 ++++++++ > qemu-seccomp.c | 4 +++- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/os-posix.c b/os-posix.c > index 5c64518..1d52306 100644 > --- a/os-posix.c > +++ b/os-posix.c > @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) > sigaction(SIGPIPE, &act, NULL); > } > > +static void syssig_handler(int signal, siginfo_t *info, void *c) > +{ > + fprintf(stderr, "Bad system call\n"); > + exit(1); > +} > + > static void termsig_handler(int signal, siginfo_t *info, void *c) > { > qemu_system_killed(info->si_signo, info->si_pid); > @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) > sigaction(SIGINT, &act, NULL); > sigaction(SIGHUP, &act, NULL); > sigaction(SIGTERM, &act, NULL); > + act.sa_sigaction = syssig_handler; > + sigaction(SIGSYS, &act, NULL); > } > > /* Find a likely location for support files using the location of the binary. > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index 031da1d..897d9b3 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -2,9 +2,11 @@ > * QEMU seccomp mode 2 support with libseccomp > * > * Copyright IBM, Corp. 2012 > + * Copyright (C) 2013 Red Hat, Inc. > * > * Authors: > * Eduardo Otubo <eotubo@br.ibm.com> > + * Michal Privoznik <mprivozn@redhat.com> > * > * This work is licensed under the terms of the GNU GPL, version 2. See > * the COPYING file in the top-level directory. > @@ -238,7 +240,7 @@ int seccomp_start(void) > unsigned int i = 0; > scmp_filter_ctx ctx; > > - ctx = seccomp_init(SCMP_ACT_KILL); > + ctx = seccomp_init(SCMP_ACT_TRAP); > if (ctx == NULL) { > goto seccomp_return; > } > I think this is going to be better solved in the kernel. I have a kernel patch sitting out there at: https://lkml.org/lkml/2013/1/7/313 Any public support of this patch could be useful to help get it in. Something is definitely needed to learn the syscall that is killing QEMU. But I don't think the signal handler approach is going to work. We tried that and ran into too many situations where signals were being blocked by libraries (spice is one example). And we didn't want to get in the business of patching third party libraries to allow SIGSYS.
On 05.02.2013 15:28, Corey Bryant wrote: > > On 02/05/2013 06:02 AM, Michal Privoznik wrote: >> Currently, it we call a not white listed system call, we get killed >> immediately without reporting any error. It would be far more useful, >> if we can at least shout something on stderr just before dying, so >> users know it is because of sandbox, not just random quit. >> >> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >> --- >> os-posix.c | 8 ++++++++ >> qemu-seccomp.c | 4 +++- >> 2 files changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/os-posix.c b/os-posix.c >> index 5c64518..1d52306 100644 >> --- a/os-posix.c >> +++ b/os-posix.c >> @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) >> sigaction(SIGPIPE, &act, NULL); >> } >> >> +static void syssig_handler(int signal, siginfo_t *info, void *c) >> +{ >> + fprintf(stderr, "Bad system call\n"); >> + exit(1); >> +} >> + >> static void termsig_handler(int signal, siginfo_t *info, void *c) >> { >> qemu_system_killed(info->si_signo, info->si_pid); >> @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) >> sigaction(SIGINT, &act, NULL); >> sigaction(SIGHUP, &act, NULL); >> sigaction(SIGTERM, &act, NULL); >> + act.sa_sigaction = syssig_handler; >> + sigaction(SIGSYS, &act, NULL); >> } >> >> /* Find a likely location for support files using the location of >> the binary. >> diff --git a/qemu-seccomp.c b/qemu-seccomp.c >> index 031da1d..897d9b3 100644 >> --- a/qemu-seccomp.c >> +++ b/qemu-seccomp.c >> @@ -2,9 +2,11 @@ >> * QEMU seccomp mode 2 support with libseccomp >> * >> * Copyright IBM, Corp. 2012 >> + * Copyright (C) 2013 Red Hat, Inc. >> * >> * Authors: >> * Eduardo Otubo <eotubo@br.ibm.com> >> + * Michal Privoznik <mprivozn@redhat.com> >> * >> * This work is licensed under the terms of the GNU GPL, version 2. >> See >> * the COPYING file in the top-level directory. >> @@ -238,7 +240,7 @@ int seccomp_start(void) >> unsigned int i = 0; >> scmp_filter_ctx ctx; >> >> - ctx = seccomp_init(SCMP_ACT_KILL); >> + ctx = seccomp_init(SCMP_ACT_TRAP); >> if (ctx == NULL) { >> goto seccomp_return; >> } >> > > I think this is going to be better solved in the kernel. I have a > kernel patch sitting out there at: https://lkml.org/lkml/2013/1/7/313 > Any public support of this patch could be useful to help get it in. > > Something is definitely needed to learn the syscall that is killing > QEMU. But I don't think the signal handler approach is going to work. > We tried that and ran into too many situations where signals were being > blocked by libraries (spice is one example). And we didn't want to get > in the business of patching third party libraries to allow SIGSYS. > My approach is slightly different here. While I agree that pr_info() is useful to find which syscalls are not whitelisted yet, it doesn't solve the different issue. That is - when qemu calls a not whitelisted syscall, currently it gets killed without any error being printed anywhere. So libvirt just sees the machine has gone away and it can be really hard to figure out the root cause is actually sandboxing. Re patch itself. Okay, there may be some situations where signal is not delivered or missed in which case we don't print any error message. But in these situations qemu should check / is checking return value of a syscall so it will die itself anyway. Michal
On Tue, Feb 05, 2013 at 09:28:51AM -0500, Corey Bryant wrote: > > On 02/05/2013 06:02 AM, Michal Privoznik wrote: > >Currently, it we call a not white listed system call, we get killed > >immediately without reporting any error. It would be far more useful, > >if we can at least shout something on stderr just before dying, so > >users know it is because of sandbox, not just random quit. > > > >Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > >--- > > os-posix.c | 8 ++++++++ > > qemu-seccomp.c | 4 +++- > > 2 files changed, 11 insertions(+), 1 deletion(-) > > > >diff --git a/os-posix.c b/os-posix.c > >index 5c64518..1d52306 100644 > >--- a/os-posix.c > >+++ b/os-posix.c > >@@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) > > sigaction(SIGPIPE, &act, NULL); > > } > > > >+static void syssig_handler(int signal, siginfo_t *info, void *c) > >+{ > >+ fprintf(stderr, "Bad system call\n"); > >+ exit(1); > >+} > >+ > > static void termsig_handler(int signal, siginfo_t *info, void *c) > > { > > qemu_system_killed(info->si_signo, info->si_pid); > >@@ -77,6 +83,8 @@ void os_setup_signal_handling(void) > > sigaction(SIGINT, &act, NULL); > > sigaction(SIGHUP, &act, NULL); > > sigaction(SIGTERM, &act, NULL); > >+ act.sa_sigaction = syssig_handler; > >+ sigaction(SIGSYS, &act, NULL); > > } > > > > /* Find a likely location for support files using the location of the binary. > >diff --git a/qemu-seccomp.c b/qemu-seccomp.c > >index 031da1d..897d9b3 100644 > >--- a/qemu-seccomp.c > >+++ b/qemu-seccomp.c > >@@ -2,9 +2,11 @@ > > * QEMU seccomp mode 2 support with libseccomp > > * > > * Copyright IBM, Corp. 2012 > >+ * Copyright (C) 2013 Red Hat, Inc. > > * > > * Authors: > > * Eduardo Otubo <eotubo@br.ibm.com> > >+ * Michal Privoznik <mprivozn@redhat.com> > > * > > * This work is licensed under the terms of the GNU GPL, version 2. See > > * the COPYING file in the top-level directory. > >@@ -238,7 +240,7 @@ int seccomp_start(void) > > unsigned int i = 0; > > scmp_filter_ctx ctx; > > > >- ctx = seccomp_init(SCMP_ACT_KILL); > >+ ctx = seccomp_init(SCMP_ACT_TRAP); > > if (ctx == NULL) { > > goto seccomp_return; > > } > > > > I think this is going to be better solved in the kernel. I have a > kernel patch sitting out there at: > https://lkml.org/lkml/2013/1/7/313 > Any public support of this patch could be useful to help get it in. While that "learning mode" is useful, I don't think that really solves the problem that Michael is looking at. We're running a QEMU guest in production mode and it gets killed with no indication of why. It is definitely desirable that we get a log message in QEMU's stderr for this scenario. > Something is definitely needed to learn the syscall that is killing > QEMU. But I don't think the signal handler approach is going to > work. We tried that and ran into too many situations where signals > were being blocked by libraries (spice is one example). And we > didn't want to get in the business of patching third party libraries > to allow SIGSYS. We absolutely should be fixing libraries not to screw up signal handling in applications. I see that the SPICE worker thread blocks every single signal except SEGV/FPE/ILL, which is just completely bogus. There's no acceptable reason for SPICE to block so many signals. Regards, Daniel
On 02/06/2013 06:13 AM, Michal Privoznik wrote: > On 05.02.2013 15:28, Corey Bryant wrote: >> >> On 02/05/2013 06:02 AM, Michal Privoznik wrote: >>> Currently, it we call a not white listed system call, we get killed >>> immediately without reporting any error. It would be far more useful, >>> if we can at least shout something on stderr just before dying, so >>> users know it is because of sandbox, not just random quit. >>> >>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>> --- >>> os-posix.c | 8 ++++++++ >>> qemu-seccomp.c | 4 +++- >>> 2 files changed, 11 insertions(+), 1 deletion(-) >>> >>> diff --git a/os-posix.c b/os-posix.c >>> index 5c64518..1d52306 100644 >>> --- a/os-posix.c >>> +++ b/os-posix.c >>> @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) >>> sigaction(SIGPIPE, &act, NULL); >>> } >>> >>> +static void syssig_handler(int signal, siginfo_t *info, void *c) >>> +{ >>> + fprintf(stderr, "Bad system call\n"); >>> + exit(1); >>> +} >>> + >>> static void termsig_handler(int signal, siginfo_t *info, void *c) >>> { >>> qemu_system_killed(info->si_signo, info->si_pid); >>> @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) >>> sigaction(SIGINT, &act, NULL); >>> sigaction(SIGHUP, &act, NULL); >>> sigaction(SIGTERM, &act, NULL); >>> + act.sa_sigaction = syssig_handler; >>> + sigaction(SIGSYS, &act, NULL); >>> } >>> >>> /* Find a likely location for support files using the location of >>> the binary. >>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c >>> index 031da1d..897d9b3 100644 >>> --- a/qemu-seccomp.c >>> +++ b/qemu-seccomp.c >>> @@ -2,9 +2,11 @@ >>> * QEMU seccomp mode 2 support with libseccomp >>> * >>> * Copyright IBM, Corp. 2012 >>> + * Copyright (C) 2013 Red Hat, Inc. >>> * >>> * Authors: >>> * Eduardo Otubo <eotubo@br.ibm.com> >>> + * Michal Privoznik <mprivozn@redhat.com> >>> * >>> * This work is licensed under the terms of the GNU GPL, version 2. >>> See >>> * the COPYING file in the top-level directory. >>> @@ -238,7 +240,7 @@ int seccomp_start(void) >>> unsigned int i = 0; >>> scmp_filter_ctx ctx; >>> >>> - ctx = seccomp_init(SCMP_ACT_KILL); >>> + ctx = seccomp_init(SCMP_ACT_TRAP); >>> if (ctx == NULL) { >>> goto seccomp_return; >>> } >>> >> >> I think this is going to be better solved in the kernel. I have a >> kernel patch sitting out there at: https://lkml.org/lkml/2013/1/7/313 >> Any public support of this patch could be useful to help get it in. >> >> Something is definitely needed to learn the syscall that is killing >> QEMU. But I don't think the signal handler approach is going to work. >> We tried that and ran into too many situations where signals were being >> blocked by libraries (spice is one example). And we didn't want to get >> in the business of patching third party libraries to allow SIGSYS. >> > > My approach is slightly different here. While I agree that pr_info() is > useful to find which syscalls are not whitelisted yet, it doesn't solve > the different issue. That is - when qemu calls a not whitelisted > syscall, currently it gets killed without any error being printed > anywhere. So libvirt just sees the machine has gone away and it can be > really hard to figure out the root cause is actually sandboxing. Have you looked at the audit_seccomp() call in the kernel's __secure_computing() function? It may already be doing what you're looking for. > > Re patch itself. Okay, there may be some situations where signal is not > delivered or missed in which case we don't print any error message. But > in these situations qemu should check / is checking return value of a > syscall so it will die itself anyway. This just sounds like it is defeating the purpose of seccomp, and a lot of things could go wrong with signal handling. The entire point of SCMP_ACT_KILL is that the process does not get control back after an non-whitelisted syscall is called. Now we'd be giving back control to what could be a malicious guest. > > Michal >
On 02/06/2013 09:08 AM, Corey Bryant wrote: > > > On 02/06/2013 06:13 AM, Michal Privoznik wrote: >> On 05.02.2013 15:28, Corey Bryant wrote: >>> >>> On 02/05/2013 06:02 AM, Michal Privoznik wrote: >>>> Currently, it we call a not white listed system call, we get killed >>>> immediately without reporting any error. It would be far more useful, >>>> if we can at least shout something on stderr just before dying, so >>>> users know it is because of sandbox, not just random quit. >>>> >>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> >>>> --- >>>> os-posix.c | 8 ++++++++ >>>> qemu-seccomp.c | 4 +++- >>>> 2 files changed, 11 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/os-posix.c b/os-posix.c >>>> index 5c64518..1d52306 100644 >>>> --- a/os-posix.c >>>> +++ b/os-posix.c >>>> @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) >>>> sigaction(SIGPIPE, &act, NULL); >>>> } >>>> >>>> +static void syssig_handler(int signal, siginfo_t *info, void *c) >>>> +{ >>>> + fprintf(stderr, "Bad system call\n"); >>>> + exit(1); >>>> +} >>>> + >>>> static void termsig_handler(int signal, siginfo_t *info, void *c) >>>> { >>>> qemu_system_killed(info->si_signo, info->si_pid); >>>> @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) >>>> sigaction(SIGINT, &act, NULL); >>>> sigaction(SIGHUP, &act, NULL); >>>> sigaction(SIGTERM, &act, NULL); >>>> + act.sa_sigaction = syssig_handler; >>>> + sigaction(SIGSYS, &act, NULL); >>>> } >>>> >>>> /* Find a likely location for support files using the location of >>>> the binary. >>>> diff --git a/qemu-seccomp.c b/qemu-seccomp.c >>>> index 031da1d..897d9b3 100644 >>>> --- a/qemu-seccomp.c >>>> +++ b/qemu-seccomp.c >>>> @@ -2,9 +2,11 @@ >>>> * QEMU seccomp mode 2 support with libseccomp >>>> * >>>> * Copyright IBM, Corp. 2012 >>>> + * Copyright (C) 2013 Red Hat, Inc. >>>> * >>>> * Authors: >>>> * Eduardo Otubo <eotubo@br.ibm.com> >>>> + * Michal Privoznik <mprivozn@redhat.com> >>>> * >>>> * This work is licensed under the terms of the GNU GPL, version 2. >>>> See >>>> * the COPYING file in the top-level directory. >>>> @@ -238,7 +240,7 @@ int seccomp_start(void) >>>> unsigned int i = 0; >>>> scmp_filter_ctx ctx; >>>> >>>> - ctx = seccomp_init(SCMP_ACT_KILL); >>>> + ctx = seccomp_init(SCMP_ACT_TRAP); >>>> if (ctx == NULL) { >>>> goto seccomp_return; >>>> } >>>> >>> >>> I think this is going to be better solved in the kernel. I have a >>> kernel patch sitting out there at: https://lkml.org/lkml/2013/1/7/313 >>> Any public support of this patch could be useful to help get it in. >>> >>> Something is definitely needed to learn the syscall that is killing >>> QEMU. But I don't think the signal handler approach is going to work. >>> We tried that and ran into too many situations where signals were being >>> blocked by libraries (spice is one example). And we didn't want to get >>> in the business of patching third party libraries to allow SIGSYS. >>> >> >> My approach is slightly different here. While I agree that pr_info() is >> useful to find which syscalls are not whitelisted yet, it doesn't solve >> the different issue. That is - when qemu calls a not whitelisted >> syscall, currently it gets killed without any error being printed >> anywhere. So libvirt just sees the machine has gone away and it can be >> really hard to figure out the root cause is actually sandboxing. > > Have you looked at the audit_seccomp() call in the kernel's > __secure_computing() function? It may already be doing what you're > looking for. > Fyi, here's an example of an audit_seccomp() audit log: type=SECCOMP msg=audit(1360162117.139:415): auid=1000 uid=1000 gid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=1808 comm="bpf-direct" sig=31 syscall=1 compat=0 ip=0x35dc8ede49 code=0x0
Il 06/02/2013 12:25, Daniel P. Berrange ha scritto: >> > Something is definitely needed to learn the syscall that is killing >> > QEMU. But I don't think the signal handler approach is going to >> > work. We tried that and ran into too many situations where signals >> > were being blocked by libraries (spice is one example). And we >> > didn't want to get in the business of patching third party libraries >> > to allow SIGSYS. > We absolutely should be fixing libraries not to screw up signal handling > in applications. I see that the SPICE worker thread blocks every single > signal except SEGV/FPE/ILL, which is just completely bogus. There's no > acceptable reason for SPICE to block so many signals. There definitely is. It is blocking the signals only in the worker thread, so that the library does not get in the way of applications that do not expect other threads to exist. By keeping the signals blocked, the application is free to handle them: 1) knowing that a signal will always be delivered to the thread running the event loop, and will interrupt select/poll; 2) without having to enforce _both_ thread-safety and async-signal-safety. The problem is that SIGSYS is a per-thread signal, like SIGSEGV/FPE/ILL. So it is impossible to block it, what the kernel does if you block it is treat it as SIG_DFL. Unfortunately, SIGSYS is obscure enough that nobody knows about it and/or realizes it is a companion to SIGSEGV/FPE/ILL. So _yes_, we should indeed get in the business of patching third-party libraries (and QEMU itself, see util/qemu-thread-posix.c) to allow SIGSIGV/FPE/ILL/SYS, but that's where the scope of the problem ends. Paolo
On 02/05/2013 06:02 AM, Michal Privoznik wrote: > Currently, it we call a not white listed system call, we get killed > immediately without reporting any error. It would be far more useful, > if we can at least shout something on stderr just before dying, so > users know it is because of sandbox, not just random quit. > > Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > --- > os-posix.c | 8 ++++++++ > qemu-seccomp.c | 4 +++- > 2 files changed, 11 insertions(+), 1 deletion(-) > > diff --git a/os-posix.c b/os-posix.c > index 5c64518..1d52306 100644 > --- a/os-posix.c > +++ b/os-posix.c > @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) > sigaction(SIGPIPE, &act, NULL); > } > > +static void syssig_handler(int signal, siginfo_t *info, void *c) > +{ > + fprintf(stderr, "Bad system call\n"); > + exit(1); > +} > + > static void termsig_handler(int signal, siginfo_t *info, void *c) > { > qemu_system_killed(info->si_signo, info->si_pid); > @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) > sigaction(SIGINT, &act, NULL); > sigaction(SIGHUP, &act, NULL); > sigaction(SIGTERM, &act, NULL); > + act.sa_sigaction = syssig_handler; > + sigaction(SIGSYS, &act, NULL); > } > > /* Find a likely location for support files using the location of the binary. > diff --git a/qemu-seccomp.c b/qemu-seccomp.c > index 031da1d..897d9b3 100644 > --- a/qemu-seccomp.c > +++ b/qemu-seccomp.c > @@ -2,9 +2,11 @@ > * QEMU seccomp mode 2 support with libseccomp > * > * Copyright IBM, Corp. 2012 > + * Copyright (C) 2013 Red Hat, Inc. > * > * Authors: > * Eduardo Otubo <eotubo@br.ibm.com> > + * Michal Privoznik <mprivozn@redhat.com> > * > * This work is licensed under the terms of the GNU GPL, version 2. See > * the COPYING file in the top-level directory. > @@ -238,7 +240,7 @@ int seccomp_start(void) > unsigned int i = 0; > scmp_filter_ctx ctx; > > - ctx = seccomp_init(SCMP_ACT_KILL); > + ctx = seccomp_init(SCMP_ACT_TRAP); > if (ctx == NULL) { > goto seccomp_return; > } > Another thought.. When seccomp kills a task the exit status of the task will be SIGSYS. Here's the kernel documentation: SECCOMP_RET_KILL: Results in the task exiting immediately without executing the system call. The exit status of the task (status & 0x7f) will be SIGSYS, not SIGKILL. Maybe the right solution is for libvirt to check qemu's exit status and issue a message based on it?
On Fri, Feb 08, 2013 at 09:44:10AM -0500, Corey Bryant wrote: > > > On 02/05/2013 06:02 AM, Michal Privoznik wrote: > >Currently, it we call a not white listed system call, we get killed > >immediately without reporting any error. It would be far more useful, > >if we can at least shout something on stderr just before dying, so > >users know it is because of sandbox, not just random quit. > > > >Signed-off-by: Michal Privoznik <mprivozn@redhat.com> > >--- > > os-posix.c | 8 ++++++++ > > qemu-seccomp.c | 4 +++- > > 2 files changed, 11 insertions(+), 1 deletion(-) > > > >diff --git a/os-posix.c b/os-posix.c > >index 5c64518..1d52306 100644 > >--- a/os-posix.c > >+++ b/os-posix.c > >@@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) > > sigaction(SIGPIPE, &act, NULL); > > } > > > >+static void syssig_handler(int signal, siginfo_t *info, void *c) > >+{ > >+ fprintf(stderr, "Bad system call\n"); > >+ exit(1); > >+} > >+ > > static void termsig_handler(int signal, siginfo_t *info, void *c) > > { > > qemu_system_killed(info->si_signo, info->si_pid); > >@@ -77,6 +83,8 @@ void os_setup_signal_handling(void) > > sigaction(SIGINT, &act, NULL); > > sigaction(SIGHUP, &act, NULL); > > sigaction(SIGTERM, &act, NULL); > >+ act.sa_sigaction = syssig_handler; > >+ sigaction(SIGSYS, &act, NULL); > > } > > > > /* Find a likely location for support files using the location of the binary. > >diff --git a/qemu-seccomp.c b/qemu-seccomp.c > >index 031da1d..897d9b3 100644 > >--- a/qemu-seccomp.c > >+++ b/qemu-seccomp.c > >@@ -2,9 +2,11 @@ > > * QEMU seccomp mode 2 support with libseccomp > > * > > * Copyright IBM, Corp. 2012 > >+ * Copyright (C) 2013 Red Hat, Inc. > > * > > * Authors: > > * Eduardo Otubo <eotubo@br.ibm.com> > >+ * Michal Privoznik <mprivozn@redhat.com> > > * > > * This work is licensed under the terms of the GNU GPL, version 2. See > > * the COPYING file in the top-level directory. > >@@ -238,7 +240,7 @@ int seccomp_start(void) > > unsigned int i = 0; > > scmp_filter_ctx ctx; > > > >- ctx = seccomp_init(SCMP_ACT_KILL); > >+ ctx = seccomp_init(SCMP_ACT_TRAP); > > if (ctx == NULL) { > > goto seccomp_return; > > } > > > > Another thought.. When seccomp kills a task the exit status of the > task will be SIGSYS. Here's the kernel documentation: > > SECCOMP_RET_KILL: > Results in the task exiting immediately without executing the > system call. The exit status of the task (status & 0x7f) will > be SIGSYS, not SIGKILL. > > Maybe the right solution is for libvirt to check qemu's exit status > and issue a message based on it? QEMU is daemonized, so libvirt doesn't get to see the exit status at all. Daniel
diff --git a/os-posix.c b/os-posix.c index 5c64518..1d52306 100644 --- a/os-posix.c +++ b/os-posix.c @@ -62,6 +62,12 @@ void os_setup_early_signal_handling(void) sigaction(SIGPIPE, &act, NULL); } +static void syssig_handler(int signal, siginfo_t *info, void *c) +{ + fprintf(stderr, "Bad system call\n"); + exit(1); +} + static void termsig_handler(int signal, siginfo_t *info, void *c) { qemu_system_killed(info->si_signo, info->si_pid); @@ -77,6 +83,8 @@ void os_setup_signal_handling(void) sigaction(SIGINT, &act, NULL); sigaction(SIGHUP, &act, NULL); sigaction(SIGTERM, &act, NULL); + act.sa_sigaction = syssig_handler; + sigaction(SIGSYS, &act, NULL); } /* Find a likely location for support files using the location of the binary. diff --git a/qemu-seccomp.c b/qemu-seccomp.c index 031da1d..897d9b3 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -2,9 +2,11 @@ * QEMU seccomp mode 2 support with libseccomp * * Copyright IBM, Corp. 2012 + * Copyright (C) 2013 Red Hat, Inc. * * Authors: * Eduardo Otubo <eotubo@br.ibm.com> + * Michal Privoznik <mprivozn@redhat.com> * * This work is licensed under the terms of the GNU GPL, version 2. See * the COPYING file in the top-level directory. @@ -238,7 +240,7 @@ int seccomp_start(void) unsigned int i = 0; scmp_filter_ctx ctx; - ctx = seccomp_init(SCMP_ACT_KILL); + ctx = seccomp_init(SCMP_ACT_TRAP); if (ctx == NULL) { goto seccomp_return; }
Currently, it we call a not white listed system call, we get killed immediately without reporting any error. It would be far more useful, if we can at least shout something on stderr just before dying, so users know it is because of sandbox, not just random quit. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- os-posix.c | 8 ++++++++ qemu-seccomp.c | 4 +++- 2 files changed, 11 insertions(+), 1 deletion(-)