From patchwork Fri Jun 10 20:50:23 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniele Di Proietto X-Patchwork-Id: 633958 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from archives.nicira.com (archives.nicira.com [96.126.127.54]) by ozlabs.org (Postfix) with ESMTP id 3rRDpG26M6z9syB for ; Sat, 11 Jun 2016 06:50:30 +1000 (AEST) Received: from archives.nicira.com (localhost [127.0.0.1]) by archives.nicira.com (Postfix) with ESMTP id 5B7BD22C42D; Fri, 10 Jun 2016 13:50:29 -0700 (PDT) X-Original-To: dev@openvswitch.org Delivered-To: dev@openvswitch.org Received: from mx1e3.cudamail.com (mx1.cudamail.com [69.90.118.67]) by archives.nicira.com (Postfix) with ESMTPS id 88CCD22C428 for ; Fri, 10 Jun 2016 13:50:27 -0700 (PDT) Received: from bar5.cudamail.com (localhost [127.0.0.1]) by mx1e3.cudamail.com (Postfix) with ESMTPS id 1589B42013B for ; Fri, 10 Jun 2016 14:50:27 -0600 (MDT) X-ASG-Debug-ID: 1465591826-09eadd16ee48070001-byXFYA Received: from mx1-pf2.cudamail.com ([192.168.24.2]) by bar5.cudamail.com with ESMTP id WEdKPiWvunlFBJxa (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Fri, 10 Jun 2016 14:50:26 -0600 (MDT) X-Barracuda-Envelope-From: diproiettod@vmware.com X-Barracuda-RBL-Trusted-Forwarder: 192.168.24.2 Received: from unknown (HELO smtp-outbound-1.vmware.com) (208.91.2.12) by mx1-pf2.cudamail.com with ESMTPS (DHE-RSA-AES256-SHA encrypted); 10 Jun 2016 20:50:25 -0000 Received-SPF: error (mx1-pf2.cudamail.com: error in processing during lookup of vmware.com: DNS problem) X-Barracuda-Apparent-Source-IP: 208.91.2.12 X-Barracuda-RBL-IP: 208.91.2.12 Received: from sc9-mailhost3.vmware.com (sc9-mailhost3.vmware.com [10.113.161.73]) by smtp-outbound-1.vmware.com (Postfix) with ESMTP id 3513998853; Fri, 10 Jun 2016 13:50:24 -0700 (PDT) Received: from EX13-CAS-006.vmware.com (ex13-cas-006.vmware.com [10.113.191.56]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id 0CED140476; Fri, 10 Jun 2016 13:50:25 -0700 (PDT) Received: from EX13-MBX-001.vmware.com (10.113.191.21) by EX13-MBX-025.vmware.com (10.113.191.45) with Microsoft SMTP Server (TLS) id 15.0.1156.6; Fri, 10 Jun 2016 13:50:24 -0700 Received: from EX13-MBX-001.vmware.com ([fe80::39f0:6ecb:de0:1de0]) by EX13-MBX-001.vmware.com ([fe80::39f0:6ecb:de0:1de0%15]) with mapi id 15.00.1156.000; Fri, 10 Jun 2016 13:50:24 -0700 X-CudaMail-Envelope-Sender: diproiettod@vmware.com From: Daniele Di Proietto To: Aaron Conole , Christian Ehrhardt X-CudaMail-Whitelist-To: dev@openvswitch.org X-CudaMail-MID: CM-E2-609067789 X-CudaMail-DTE: 061016 X-CudaMail-Originating-IP: 208.91.2.12 Thread-Topic: [ovs-dev] [PATCH v2 2/2] netdev-dpdk: Support user-defined socket attribs X-ASG-Orig-Subj: [##CM-E2-609067789##]Re: [ovs-dev] [PATCH v2 2/2] netdev-dpdk: Support user-defined socket attribs Thread-Index: AQHRw0DOJUKlvGN4GUuuu4781baf0J/jLO+A Date: Fri, 10 Jun 2016 20:50:23 +0000 Message-ID: <39F64FC4-A666-400B-8630-1DB4D3374BEC@vmware.com> References: <1463776326-13108-1-git-send-email-aconole@redhat.com> <1463776326-13108-3-git-send-email-aconole@redhat.com> <4EE46C5C-C554-4C84-A0D0-F8B011257E17@vmware.com> In-Reply-To: Accept-Language: en-US, en-GB Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ms-exchange-messagesentrepresentingtype: 1 x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.113.170.11] Content-ID: MIME-Version: 1.0 X-Barracuda-Connect: UNKNOWN[192.168.24.2] X-Barracuda-Start-Time: 1465591826 X-Barracuda-Encrypted: DHE-RSA-AES256-SHA X-Barracuda-URL: https://web.cudamail.com:443/cgi-mod/mark.cgi X-ASG-Whitelist: Header =?UTF-8?B?eFwtY3VkYW1haWxcLXdoaXRlbGlzdFwtdG8=?= X-Virus-Scanned: by bsmtpd at cudamail.com X-Barracuda-BRTS-Status: 1 X-ASG-Whitelist: EmailCat (corporate) Cc: "dev@openvswitch.org" , Ilya Maximets , Ansis Atteka Subject: Re: [ovs-dev] [PATCH v2 2/2] netdev-dpdk: Support user-defined socket attribs X-BeenThere: dev@openvswitch.org X-Mailman-Version: 2.1.16 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@openvswitch.org Sender: "dev" On 10/06/2016 10:51, "Aaron Conole" wrote: >Aaron Conole writes: > >> Christian Ehrhardt writes: >> >>> On Tue, May 24, 2016 at 4:10 PM, Aaron Conole wrote: >>> >>>> Daniele Di Proietto writes: >>>> >>>> > Hi Aaron, >>>> > >>>> > I'm still a little bit nervous about calling chown on a (partially) >>>> > user controlled file name. >>>> >>>> I agree, that always seems scary. >>>> >>>> > Before moving forward I wanted to discuss a couple of other options: >>>> > >>>> > * Ansis (in CC) suggested using -runas parameter in qemu. This way >>>> > qemu can open the socket as root and drop privileges before starting >>>> > guest execution. >>>> >>>> I'm not sure how to do this with libvirt, or via the OpenStack Neutron >>>> plugin. I also don't know if it would be an acceptable workaround for >>>> users. Additionally, I recall there being something of a "don't even >>>> know if this works" around it. Maybe Christian or Ansis (both in CC) >>>> can expound on it. >>>> >>> >>> Hi, >>> IIRC we kind of agree that long term a proper MAC will be much better but >>> most involved people needed something to get it working like "now". >>> Since they are complementary (other than the fix removing a bit of the >>> urgency for more MAC) it was kind of the least bad option. >>> >>> You have to be aware that I brought up the discussion on dev@dpdk.org - see >>> [1] and [2]: >>> But this will take time and eventually still be the applications task to >>> "do something" - no matter if via API or via the chmod's right now. >>> So Aaron is trying to get something that works now until the long term >>> things are in place, which I appreciate. >>> >>> FYI - I was even more in a hurry as it was clear that OVS-2.5 won't get >>> this in time I run with [3] for now. >>> I never intended to suggest that, but with the discussion in place, one >>> could ask if you (Aaron) want to pick up that instead. >>> That would keep OVS free for now until DPDK made up the API (see [2]) for >>> socket ownership control and this then could be implemented in OVS? >>> >>> (I hope) In some months/years we will all be happy to drop this bunch of >>> interim solutions, never the less we need it for now. >>> >>> [1]: http://dpdk.org/dev/patchwork/patch/12222/ >>> [2]: http://dpdk.org/ml/archives/dev/2015-December/030326.html >>> [3]: >>> https://git.launchpad.net/~ubuntu-server/dpdk/commit/?h=ubuntu-xenial-to-dpdk2.2&id=f3c7aa1b2ddea8e092ad4a89e41a0e19d01ed4e7 >>> >>> [...] >>> >>> >>>> I think originally we quickly discussed 4 possible solutions (and >>>> hopefully I captured them correctly): >>>> >>>> 1. OVS downgrades to the ovs user, and kvm runs under the ovs >>>> group. I don't actually like this solution because kvm could then >>>> pollute the ovs database. >>>> >>>> 2. OVS runs as some user and sets the user/group ownership of the socket >>>> via chown/chmod where permissions come from the database (the >>>> original context had ovs running as root - but as I described above >>>> it doesn't need to be root provided ovs+DPDK can start without root). >>>> >>>> 3. OVS runs as some user, kvm starts as root, opens the socket and >>>> downgrades. IIRC, this doesn't actually work, or it may have >>>> implications on other projects. I don't remember exactly what was >>>> not as great about this solution, TBH. >>>> >>>> 4. OVS and KVM run as whatever users; MAC is used to enforce the >>>> layering between them. >>>> >>>> I think solution 2 and solution 4 don't actually interfere with each >>>> other, and can be used to a complementary effect (if implemented >>>> properly) so that the MAC layer enforces access, but even without MAC, >>>> the DAC layer can provide appropriate whitelisting behavior. >>>> >>> >>> I also remember several complex changes needed for the #1 and #3 that >>> always would end up with huge effort and a high risk not being accepted. >>> Probably that is what you refer to with "implications on other projects". >>> >>> Also keep in mind the position of dpdk out of the last few discussions >>> which I'd like to summarize as "dpdk got this path from an app, so this app >>> OWNS that path". >> >> I'd like to continue on, but I am not sure what the concerns are right >> now. Is it possible to enumerate them point by point so that I can >> understand them? I think there are two outstanding concerns right now: >> >> 1. the proposed approach is not good enough (vis-a-vis DAC vs. MAC) >> >> 2. the proposed approach would be better implemented in the utility >> that wants access to the sockets (vis-a-vis the libvirt discussion) >> >> Am I understanding the concerns correctly? > >Ping? I found another theoretical problem with the chmod approach, let me try to explain: There's an extremely small race window between the socket creation and the chmod which could theoretically be exploited to change the owner of a socket (e.g. ovnsb_db.sock) in ovs rundir, by controlling the name of the port: 1. There's no southbound database running, because it's not yet been started or because it's being restarted. 2. The user creates a vhost port, naming it ovnsb_db.sock. rte_vhost_driver_register() succeeds and creates a socket in the file system. 3. The southbound database is started, it removes ovnsb_db.sock and recreates it. 4. Now OVS changes the owner and the permission of what it thinks is a vhost-user socket. If 3 manages to get between 2 and 4, we have a problem. It's a pretty small window, and it's unlikely that an attacker can control when the southbound database is restarted. I feel like I'm nitpicking, but I'm not sure how serious is the security impact of what I'm describing. I suggested an alternative approach, and I've tried implementing a quick POC on top on your patch: ---8<--- ---8<--- Compared to the chmod approach this has some limitation: 1. It doesn't support changing permissions, only the owner. This could be done with umask, but I couldn't find any system call to change the umask for a single thread. 2. Unless vhost-sock-dir is owned by the target owner, the socket cannot be created. I'm not sure whether this is a reasonable limitation for the use cases you have in mind. 3. setfsuid() is Linux specific and somehow deprecated according to the manpage: "Thus, setfsuid() is nowadays unneeded and should be avoided in new applications" I haven't used seteuid, because it changes the euid of the whole process and that may interfere with other operations on OVS. If these limitations are unacceptable, I can see how we can use chmod. After all, as you point out, it's probably better to do it in OVS than in some script. Thanks for your patience in solving this problem, Daniele diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 24ebb41..d7adc66 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -30,6 +30,7 @@ #include #include #include +#include #include "chutil.h" #include "dirs.h" @@ -891,6 +892,17 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev) */ snprintf(dev->vhost_id, sizeof(dev->vhost_id), "%s/%s", vhost_sock_dir, name); + uid_t orig_u = geteuid(); + gid_t orig_g = getegid(); + if (vhost_sock_def_owner) { + uid_t u; + gid_t g; + if (!ovs_strtousr(vhost_sock_def_owner, &u, NULL, &g, false)) { + VLOG_INFO("UID: %d GID: %d", u, g); + setfsuid(u); + setfsgid(g); + } + } err = rte_vhost_driver_register(dev->vhost_id); if (err) { @@ -903,16 +915,12 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev) err = vhost_construct_helper(netdev); } - ovs_mutex_unlock(&dpdk_mutex); - if (!err && vhost_sock_def_owner && - (err = ovs_chown(dev->vhost_id, vhost_sock_def_owner))) { - VLOG_ERR("vhost-user socket device ownership change failed."); + if (vhost_sock_def_owner) { + setfsuid(orig_u); + setfsgid(orig_g); } - if (!err && vhost_sock_def_perms && - (err = ovs_chmod(dev->vhost_id, vhost_sock_def_perms))) { - VLOG_ERR("vhost-user socket device permission change failed."); - } + ovs_mutex_unlock(&dpdk_mutex); return err; }