{"id":817182,"url":"http://patchwork.ozlabs.org/api/covers/817182/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/cover/20170921215636.11097-1-mahesh@bandewar.net/","project":{"id":7,"url":"http://patchwork.ozlabs.org/api/projects/7/?format=json","name":"Linux network development","link_name":"netdev","list_id":"netdev.vger.kernel.org","list_email":"netdev@vger.kernel.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<20170921215636.11097-1-mahesh@bandewar.net>","list_archive_url":null,"date":"2017-09-21T21:56:36","name":"[RFC,0/2] capability controlled user-namespaces","submitter":{"id":68197,"url":"http://patchwork.ozlabs.org/api/people/68197/?format=json","name":"Mahesh Bandewar","email":"mahesh@bandewar.net"},"mbox":"http://patchwork.ozlabs.org/project/netdev/cover/20170921215636.11097-1-mahesh@bandewar.net/mbox/","series":[{"id":4483,"url":"http://patchwork.ozlabs.org/api/series/4483/?format=json","web_url":"http://patchwork.ozlabs.org/project/netdev/list/?series=4483","date":"2017-09-21T21:56:36","name":"capability controlled user-namespaces","version":1,"mbox":"http://patchwork.ozlabs.org/series/4483/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/covers/817182/comments/","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":["ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=bandewar-net.20150623.gappssmtp.com\n\theader.i=@bandewar-net.20150623.gappssmtp.com\n\theader.b=\"lbjIjHK3\"; dkim-atps=neutral"],"Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 
3xyr7l3rVZz9t2Q\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri, 22 Sep 2017 07:57:39 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751830AbdIUV4w (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 21 Sep 2017 17:56:52 -0400","from mail-pf0-f196.google.com ([209.85.192.196]:33138 \"EHLO\n\tmail-pf0-f196.google.com\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1751763AbdIUV4v (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 21 Sep 2017 17:56:51 -0400","by mail-pf0-f196.google.com with SMTP id h4so2980031pfk.0\n\tfor <netdev@vger.kernel.org>; Thu, 21 Sep 2017 14:56:50 -0700 (PDT)","from localhost ([2620:15c:2cb:201:71b1:5d9c:e713:6f1c])\n\tby smtp.gmail.com with ESMTPSA id\n\tn2sm4677167pgs.89.2017.09.21.14.56.49\n\t(version=TLS1_2 cipher=AES128-SHA bits=128/128);\n\tThu, 21 Sep 2017 14:56:49 -0700 (PDT)"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=bandewar-net.20150623.gappssmtp.com; s=20150623;\n\th=from:to:cc:subject:date:message-id;\n\tbh=1AG709LdZu8QKG5vhUrgguoAtdHUJhhzzvG2q9c47h0=;\n\tb=lbjIjHK3Q9YX3cZV1HMcFqFtv4NXN+L9vnDKeuTP2I52h0XuA4zBl0x3hQ3zGSyXem\n\tRGaM2FZnS2zueS8IQfzyB56BPZ6j908yTYTlmhQ5UC5kySz+rn79qVRsQ45GxyN52YcR\n\t3xGVC8sV4lIijXdCgvhymV+YO0K8vqrr1G1W5l8Z9PF3YS2pWqkPDsrT0O6e0Y1FhucF\n\taZPGImh2Z3rMDl3LI6YC2knA2bd5mHKMnb7KGPZ7ex5LBs4+nUlN3tVRwh4EujIijSoC\n\tbwjXvtKOwUv6D9bD98as2x52F18CDTXBxgv+ZzrgvoHErlWF7ukohFL+pNB7IZ91chOA\n\tzMqw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:from:to:cc:subject:date:message-id;\n\tbh=1AG709LdZu8QKG5vhUrgguoAtdHUJhhzzvG2q9c47h0=;\n\tb=lyzIn2KKpq55qduNsQ4CYGgTKr+3pHAHLbK8ksmpKLPm98loQWFBgDVJWG0Z1VLR5t\n\tS1xk5Y/WRioyApFxaSy5OBK1RuU1D2fU8FXaomwNc/oA/4oO44bksoJd5OjMOcjmVrM3\n\tK/KM6UZGC9zKu/yTnu/6OIuimAXy/jT/BxGwq6ost5Jve6RXfL/RZoa9IaT4xShMaP7S\n\tG9NaNNJYuj6YXJ4kdQMy9NV1EIHAKUQIlr4VSszqx6T2shpoddhiqYD1GllMJrz/g8kN\n\tTW4wlEyEYhgAv3g6R4gFRT2ZcMzirfFCmolIr2oGDdivblLGJPahkvagowejQb2eyZ9j\n\ts44w==","X-Gm-Message-State":"AHPjjUiLaS3PcE9fT9iM4VSY/w3/b96G1E3FVUEhyDPQkXDzcIrKK3BQ\n\t4SucKMjA/itk4IpYd60X+yjT0Vlhr34=","X-Google-Smtp-Source":"AOwi7QBCSwfNxbS+c+/vW2AllXJrKeVKY8k8X0nZmcrmu11f7bL/xWlbVP/CbQBXSp8p6b6uLNi+AA==","X-Received":"by 10.99.37.66 with SMTP id l63mr7249821pgl.348.1506031010528;\n\tThu, 21 Sep 2017 14:56:50 -0700 (PDT)","From":"Mahesh Bandewar <mahesh@bandewar.net>","To":"LKML <linux-kernel@vger.kernel.org>, Netdev <netdev@vger.kernel.org>","Cc":"Kees Cook <keescook@chromium.org>, Serge Hallyn <serge@hallyn.com>,\n\t\"Eric W . Biederman\" <ebiederm@xmission.com>,\n\tEric Dumazet <edumazet@google.com>, David Miller <davem@davemloft.net>,\n\tMahesh Bandewar <mahesh@bandewar.net>,\n\tMahesh Bandewar <maheshb@google.com>","Subject":"[RFC PATCH 0/2] capability controlled user-namespaces","Date":"Thu, 21 Sep 2017 14:56:36 -0700","Message-Id":"<20170921215636.11097-1-mahesh@bandewar.net>","X-Mailer":"git-send-email 2.14.1.821.g8fa685d3b7-goog","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"},"content":"From: Mahesh Bandewar <maheshb@google.com>\n\nTL;DR version\n-------------\nCreating a sandbox environment with namespaces is challenging\nconsidering what these sandboxed processes can engage into. e.g.\nCVE-2017-6074, CVE-2017-7184, CVE-2017-7308 etc. 
just to name a few.\nThe current form of user-namespaces, however, if changed a bit, can\nallow us to create a sandbox environment without locking down\nuser-namespaces.\n\nDetailed version\n----------------\n\nProblem\n-------\nUser-namespaces in their current form have increased the attack surface, as\nany process can acquire capabilities which are not available to it (by\ndefault) by performing a combination of clone()/unshare()/setns() syscalls.\n\n    #define _GNU_SOURCE\n    #include <stdio.h>\n    #include <sched.h>\n    #include <unistd.h>\n    #include <sys/socket.h>\n    #include <netinet/in.h>\n\n    int main(int ac, char **av)\n    {\n        int sock = -1;\n\n        /* Expected to fail for an unprivileged process. */\n        printf(\"Attempting to open RAW socket before unshare()...\\n\");\n        sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);\n        if (sock < 0) {\n            perror(\"socket() SOCK_RAW failed\");\n        } else {\n            printf(\"Successfully opened RAW-Sock before unshare().\\n\");\n            close(sock);\n            sock = -1;\n        }\n\n        /* Enter a new user-ns and net-ns owned by this process. */\n        if (unshare(CLONE_NEWUSER | CLONE_NEWNET) < 0) {\n            perror(\"unshare() failed\");\n            return 1;\n        }\n\n        /* The same call now runs with capabilities in the new user-ns. */\n        printf(\"Attempting to open RAW socket after unshare()...\\n\");\n        sock = socket(AF_INET6, SOCK_RAW, IPPROTO_RAW);\n        if (sock < 0) {\n            perror(\"socket() SOCK_RAW failed\");\n        } else {\n            printf(\"Successfully opened RAW-Sock after unshare().\\n\");\n            close(sock);\n            sock = -1;\n        }\n\n        return 0;\n    }\n\nThe above example shows how easy it is to acquire the NET_RAW capability;\nonce acquired, such a process could exploit the above-mentioned or\nsimilar issues, discovered or undiscovered, with malicious intent. Note\nthat this is just an example and the problem/solution is not limited\nto the NET_RAW capability *only*.\n\nThe easiest fix one can apply here is to lock down user-namespaces, which\nmany of the distros do (i.e. 
don't allow users to create user namespaces),\nbut unfortunately that prevents everyone from using them.\n\nApproach\n--------\nIntroduce a notion of 'controlled' user-namespaces. Every process on\nthe host is allowed to create user-namespaces (governed by the limit\nimposed by per-ns sysctl); however, mark user-namespaces created by\nsandboxed processes as 'controlled'. Use this 'mark' at the time of\ncapability check in conjunction with a global capability whitelist.\nIf the capability is not whitelisted, processes that belong to\ncontrolled user-namespaces will not be allowed to use it.\n\nOnce a user-ns is marked as 'controlled', all its child\nuser-namespaces are marked as 'controlled' too.\n\nThe global whitelist is a list of capabilities governed by a\nsysctl that a (privileged) user in the init-ns can modify;\nit applies to all controlled user-namespaces on the host.\n\nMarking user-namespaces controlled without modifying the whitelist is\nequivalent to the current behavior. The default value of the whitelist\nincludes all capabilities so that compatibility is maintained. However, it\ngives admins a fine-grained ability to control various capabilities\nsystem-wide without locking down user-namespaces.\n\nPlease see the individual patches in this series.\n\nMahesh Bandewar (2):\n  capability: introduce sysctl for controlled user-ns capability\n    whitelist\n  userns: control capabilities of some user namespaces\n\n Documentation/sysctl/kernel.txt | 21 +++++++++++++++++\n include/linux/capability.h      |  4 ++++\n include/linux/user_namespace.h  | 20 ++++++++++++++++\n kernel/capability.c             | 52 +++++++++++++++++++++++++++++++++++++++++\n kernel/sysctl.c                 |  5 ++++\n kernel/user_namespace.c         |  3 +++\n security/commoncap.c            |  8 +++++++\n 7 files changed, 113 insertions(+)"}