From patchwork Wed Mar 13 09:49:03 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= X-Patchwork-Id: 1055984 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 44K6Y90mdkz9s47 for ; Wed, 13 Mar 2019 20:51:08 +1100 (AEDT) Received: from localhost ([127.0.0.1]:41259 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h40XS-0005gv-8X for incoming@patchwork.ozlabs.org; Wed, 13 Mar 2019 05:51:06 -0400 Received: from eggs.gnu.org ([209.51.188.92]:56544) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1h40Vb-0004in-RB for qemu-devel@nongnu.org; Wed, 13 Mar 2019 05:49:12 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1h40Va-0006Mk-U3 for qemu-devel@nongnu.org; Wed, 13 Mar 2019 05:49:11 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45788) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1h40Va-0006Kx-Je for qemu-devel@nongnu.org; Wed, 13 Mar 2019 05:49:10 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id CA901307CB2E; Wed, 13 Mar 2019 09:49:09 +0000 (UTC) Received: from localhost.localdomain.com (ovpn-112-69.ams2.redhat.com [10.36.112.69]) by smtp.corp.redhat.com (Postfix) with ESMTP id 62C8117C7B; Wed, 13 Mar 2019 09:49:05 +0000 (UTC) From: =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= To: qemu-devel@nongnu.org Date: Wed, 13 Mar 2019 09:49:03 +0000 Message-Id: <20190313094903.10006-1-berrange@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.42]); Wed, 13 Mar 2019 09:49:09 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH RFC] seccomp: don't kill process for resource control syscalls X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Eduardo Otubo , =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , =?utf-8?q?Mathias_Fr=C3=B6hlich?= Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The Mesa library tries to set process affinity on some of its threads in order to optimize its performance. Currently this results in QEMU being immediately terminated when seccomp is enabled. Mesa doesn't consider failure of the process affinity settings to be fatal to its operation, but our seccomp policy gives it no choice in gracefully handling this denial. It is reasonable to consider that malicious code using the resource control syscalls to be a less serious attack than if they were trying to spawn processes or change UIDs and other such things. Generally speaking changing the resource control setting will "merely" affect quality of service of processes on the host. With this in mind, rather than kill the process, we can relax the policy for these syscalls to return the EPERM errno value. This allows callers to detect that QEMU does not want them to change resource allocations, and apply some reasonable fallback logic. The main downside to this is for code which uses these syscalls but does not check the return value, blindly assuming they will always succeeed. Returning an errno could result in sub-optimal behaviour. Arguably though such code is already broken & needs fixing regardless. Signed-off-by: Daniel P. Berrangé Reviewed-by: Marc-André Lureau Acked-by: Eduardo Otubo --- qemu-seccomp.c | 32 +++++++++++++++++++++++++------- 1 file changed, 25 insertions(+), 7 deletions(-) diff --git a/qemu-seccomp.c b/qemu-seccomp.c index 36d5829831..9776c9ef40 100644 --- a/qemu-seccomp.c +++ b/qemu-seccomp.c @@ -121,20 +121,37 @@ qemu_seccomp(unsigned int operation, unsigned int flags, void *args) #endif } -static uint32_t qemu_seccomp_get_kill_action(void) +static uint32_t qemu_seccomp_get_kill_action(int set) { + switch (set) { + case QEMU_SECCOMP_SET_DEFAULT: + case QEMU_SECCOMP_SET_OBSOLETE: + case QEMU_SECCOMP_SET_PRIVILEGED: + case QEMU_SECCOMP_SET_SPAWN: { #if defined(SECCOMP_GET_ACTION_AVAIL) && defined(SCMP_ACT_KILL_PROCESS) && \ defined(SECCOMP_RET_KILL_PROCESS) - { - uint32_t action = SECCOMP_RET_KILL_PROCESS; + static int kill_process = -1; + if (kill_process == -1) { + uint32_t action = SECCOMP_RET_KILL_PROCESS; - if (qemu_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0) { + if (qemu_seccomp(SECCOMP_GET_ACTION_AVAIL, 0, &action) == 0) { + kill_process = 1; + } + kill_process = 0; + } + if (kill_process == 1) { return SCMP_ACT_KILL_PROCESS; } - } #endif + return SCMP_ACT_TRAP; + } + + case QEMU_SECCOMP_SET_RESOURCECTL: + return SCMP_ACT_ERRNO(EPERM); - return SCMP_ACT_TRAP; + default: + g_assert_not_reached(); + } } @@ -143,7 +160,6 @@ static int seccomp_start(uint32_t seccomp_opts) int rc = 0; unsigned int i = 0; scmp_filter_ctx ctx; - uint32_t action = qemu_seccomp_get_kill_action(); ctx = seccomp_init(SCMP_ACT_ALLOW); if (ctx == NULL) { @@ -157,10 +173,12 @@ static int seccomp_start(uint32_t seccomp_opts) } for (i = 0; i < ARRAY_SIZE(blacklist); i++) { + uint32_t action; if (!(seccomp_opts & blacklist[i].set)) { continue; } + action = qemu_seccomp_get_kill_action(blacklist[i].set); rc = seccomp_rule_add_array(ctx, action, blacklist[i].num, blacklist[i].narg, blacklist[i].arg_cmp); if (rc < 0) {