Patchwork [05/12] PCI: Require CAP_COMPROMISE_KERNEL for PCI BAR access

login
register
mail settings
Submitter Matthew Garrett
Date March 18, 2013, 9:32 p.m.
Message ID <1363642353-30749-5-git-send-email-matthew.garrett@nebula.com>
Download mbox | patch
Permalink /patch/228818/
State Not Applicable
Headers show

Comments

Matthew Garrett - March 18, 2013, 9:32 p.m.
Any hardware that can potentially generate DMA has to be locked down from
userspace in order to avoid it being possible for an attacker to cause
arbitrary kernel behaviour. Default to paranoid - in future we can
potentially relax this for sufficiently IOMMU-isolated devices.

Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
---
 drivers/pci/pci-sysfs.c | 9 +++++++++
 drivers/pci/proc.c      | 8 +++++++-
 drivers/pci/syscall.c   | 2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)
Josh Boyer - March 27, 2013, 3:03 p.m.
On Mon, Mar 18, 2013 at 5:32 PM, Matthew Garrett
<matthew.garrett@nebula.com> wrote:
> Any hardware that can potentially generate DMA has to be locked down from
> userspace in order to avoid it being possible for an attacker to cause
> arbitrary kernel behaviour. Default to paranoid - in future we can
> potentially relax this for sufficiently IOMMU-isolated devices.
>
> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>

As noted here:

https://bugzilla.redhat.com/show_bug.cgi?id=908888

this breaks pci passthru with QEMU.  The suggestion in the bug is to move
the check from read/write to open, but sysfs makes that somewhat
difficult.  The open code is part of the core sysfs functionality shared
with the majority of sysfs files, so adding a check there would restrict
things that clearly don't need to be restricted.

Kyle had the idea to add a cap field to the attribute structure, and do
a capable check if that is set.  That would allow for a more generic
usage of capabilities in sysfs code, at the cost of slightly increasing
the structure size and open path.  That seems somewhat promising if we
stick with capabilities.

I would love to just squarely blame capabilities for causing this, but we
can't just replace it with an efi_enabled(EFI_SECURE_BOOT) check because
of the sysfs open case.  I'm not sure there are great answers here.

josh
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 9c6e9bb..b966089 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -622,6 +622,9 @@  pci_write_config(struct file* filp, struct kobject *kobj,
 	loff_t init_off = off;
 	u8 *data = (u8*) buf;
 
+	if (!capable(CAP_COMPROMISE_KERNEL))
+		return -EPERM;
+
 	if (off > dev->cfg_size)
 		return 0;
 	if (off + count > dev->cfg_size) {
@@ -928,6 +931,9 @@  pci_mmap_resource(struct kobject *kobj, struct bin_attribute *attr,
 	resource_size_t start, end;
 	int i;
 
+	if (!capable(CAP_COMPROMISE_KERNEL))
+		return -EPERM;
+
 	for (i = 0; i < PCI_ROM_RESOURCE; i++)
 		if (res == &pdev->resource[i])
 			break;
@@ -1035,6 +1041,9 @@  pci_write_resource_io(struct file *filp, struct kobject *kobj,
 		      struct bin_attribute *attr, char *buf,
 		      loff_t off, size_t count)
 {
+	if (!capable(CAP_COMPROMISE_KERNEL))
+		return -EPERM;
+
 	return pci_resource_io(filp, kobj, attr, buf, off, count, true);
 }
 
diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c
index 0b00947..7639f68 100644
--- a/drivers/pci/proc.c
+++ b/drivers/pci/proc.c
@@ -139,6 +139,9 @@  proc_bus_pci_write(struct file *file, const char __user *buf, size_t nbytes, lof
 	int size = dp->size;
 	int cnt;
 
+	if (!capable(CAP_COMPROMISE_KERNEL))
+		return -EPERM;
+
 	if (pos >= size)
 		return 0;
 	if (nbytes >= size)
@@ -219,6 +222,9 @@  static long proc_bus_pci_ioctl(struct file *file, unsigned int cmd,
 #endif /* HAVE_PCI_MMAP */
 	int ret = 0;
 
+	if (!capable(CAP_COMPROMISE_KERNEL))
+		return -EPERM;
+
 	switch (cmd) {
 	case PCIIOC_CONTROLLER:
 		ret = pci_domain_nr(dev->bus);
@@ -259,7 +265,7 @@  static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma)
 	struct pci_filp_private *fpriv = file->private_data;
 	int i, ret;
 
-	if (!capable(CAP_SYS_RAWIO))
+	if (!capable(CAP_SYS_RAWIO) || !capable(CAP_COMPROMISE_KERNEL))
 		return -EPERM;
 
 	/* Make sure the caller is mapping a real resource for this device */
diff --git a/drivers/pci/syscall.c b/drivers/pci/syscall.c
index e1c1ec5..97e785f 100644
--- a/drivers/pci/syscall.c
+++ b/drivers/pci/syscall.c
@@ -92,7 +92,7 @@  SYSCALL_DEFINE5(pciconfig_write, unsigned long, bus, unsigned long, dfn,
 	u32 dword;
 	int err = 0;
 
-	if (!capable(CAP_SYS_ADMIN))
+	if (!capable(CAP_SYS_ADMIN) || !capable(CAP_COMPROMISE_KERNEL))
 		return -EPERM;
 
 	dev = pci_get_bus_and_slot(bus, dfn);