From patchwork Fri Oct 9 14:45:28 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joerg Roedel X-Patchwork-Id: 528242 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 18B191402A8 for ; Sat, 10 Oct 2015 01:45:39 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=8bytes.org header.i=@8bytes.org header.b=N/Akflr1; dkim-atps=neutral Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756498AbbJIOpc (ORCPT ); Fri, 9 Oct 2015 10:45:32 -0400 Received: from 8bytes.org ([81.169.241.247]:38372 "EHLO theia.8bytes.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754609AbbJIOpa (ORCPT ); Fri, 9 Oct 2015 10:45:30 -0400 Received: by theia.8bytes.org (Postfix, from userid 1000) id C625B44C; Fri, 9 Oct 2015 16:45:28 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=8bytes.org; s=mail-1; t=1444401928; bh=CB5DZDiO8op8BovPWdSJCtKVm5VTr+mA+lAEAXFWAuA=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=N/Akflr1D3ZzTOiTHRySwa42NN1N3p+ZGFHmN8xVP/GndT8utynlbDTAvL2RfZATA rR7Q31mfatbgZ61fD2zd3BabIpxKgJlsGT8sQ5aDPxqhzPiq85PNBxKcEM339/SIhS dt55qiTGEbAGTqpAjnwAU7qpHcrJUZ2BekCtUsmFEjypy4rF06v712QcGZWb2tShO5 BVBDQcpxJpBDeS39CiEuZVUwXPeAXpxoSIetbjsORBkGT6aiO6h9JNMJP6auVdYQqv AkiAk2drYURolv2iFTMx3VZho+8fRsafMHLfiA9vLouehQCYUflLEFuq5d0NMvFZYF nEitn2PHNUzvg== Date: Fri, 9 Oct 2015 16:45:28 +0200 From: Joerg Roedel To: Andreas Hartmann Cc: Mikulas Patocka , iommu@lists.linux-foundation.org, Leo Duran , Christoph Hellwig , device-mapper development , Milan Broz , Jens Axboe , linux-pci , Linus Torvalds Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach READ FPDMA QUEUED errors since Linux 4.0 Message-ID: <20151009144528.GA27420@8bytes.org> References: <20150929162042.GR3036@8bytes.org> <560BF73F.8000008@maya.org> <20151006101356.GE12506@8bytes.org> <56141507.7040103@maya.org> <20151007154005.GH28811@8bytes.org> <56155028.7060906@maya.org> <20151008173007.GL28811@8bytes.org> <5616BCF4.10104@maya.org> <5616C850.2000906@maya.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <5616C850.2000906@maya.org> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Hi Andreas, On Thu, Oct 08, 2015 at 09:47:28PM +0200, Andreas Hartmann wrote: > This time, the oops was caused by the second PCI card I'm passing > through to another VM (the ath9k card worked fine this time - chance?). > I added the lspci output to the attached file, too. I digged a little bit around here and found a 32bit PCI card and plugged it into the AMD IOMMU box. I could reproduce the problem and here is patch which fixes it for me. Can you test it too please? I'd like to send a pull-req with this fix included to Linus for rc5. Thanks, Joerg From d07307c04edffaaa045fb83713f8808e55ffa895 Mon Sep 17 00:00:00 2001 From: Joerg Roedel Date: Fri, 9 Oct 2015 16:23:33 +0200 Subject: [PATCH] iommu/amd: Fix NULL pointer deref on device detach When a device group is detached from its domain, the iommu core code calls into the iommu driver to detach each device individually. Before this functionality went into the iommu core code, it was implemented in the drivers, also in the AMD IOMMU driver as the device alias handling code. This code is still present, as there might be aliases that don't exist as real PCI devices (and are therefore invisible to the iommu core code). Unfortunatly it might happen now, that a device is unbound multiple times from its domain, first by the alias handling code and then by the iommu core code (or vice verca). This ends up in the do_detach function which dereferences the dev_data->domain pointer. When the device is already detached, this pointer is NULL and we get a kernel oops. Removing the alias code completly is not an option, as that would also remove the code which handles invisible aliases. The code could be simplified, but this is too big of a change outside the merge window. For now, just check the dev_data->domain pointer in do_detach and bail out if it is NULL. Andreas Hartmann Signed-off-by: Joerg Roedel --- drivers/iommu/amd_iommu.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c index f82060e7..08d2775 100644 --- a/drivers/iommu/amd_iommu.c +++ b/drivers/iommu/amd_iommu.c @@ -2006,6 +2006,15 @@ static void do_detach(struct iommu_dev_data *dev_data) { struct amd_iommu *iommu; + /* + * First check if the device is still attached. It might already + * be detached from its domain because the generic + * iommu_detach_group code detached it and we try again here in + * our alias handling. + */ + if (!dev_data->domain) + return; + iommu = amd_iommu_rlookup_table[dev_data->devid]; /* decrease reference counters */