From patchwork Wed May 3 20:32:42 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Manoj Iyer
X-Patchwork-Id: 758191
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19])
	by ozlabs.org (Postfix) with ESMTP id 3wJ8wz16mVz9ryT;
	Thu, 4 May 2017 06:32:51 +1000 (AEST)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com)
	by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from )
	id 1d60x6-0006MC-FR; Wed, 03 May 2017 20:32:48 +0000
Received: from youngberry.canonical.com ([91.189.89.112])
	by huckleberry.canonical.com with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.76) (envelope-from ) id 1d60x2-0006Ir-5k
	for kernel-team@lists.ubuntu.com; Wed, 03 May 2017 20:32:44 +0000
Received: from 1.general.manjo.us.vpn ([10.172.65.2] helo=canonical.com)
	by youngberry.canonical.com with esmtpsa (TLS1.0:RSA_AES_256_CBC_SHA1:32)
	(Exim 4.76) (envelope-from ) id 1d60x1-0001fq-KN
	for kernel-team@lists.ubuntu.com; Wed, 03 May 2017 20:32:43 +0000
From: Manoj Iyer
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 2/6] iommu/dma: Implement PCI allocation optimisation
Date: Wed, 3 May 2017 15:32:42 -0500
Message-Id: <20170503203242.32357-1-manoj.iyer@canonical.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20170503203150.32261-1-manoj.iyer@canonical.com>
References: <20170503203150.32261-1-manoj.iyer@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: kernel-team-bounces@lists.ubuntu.com

From: Robin Murphy

Whilst PCI devices may have 64-bit DMA masks, they still benefit from
using 32-bit addresses wherever possible in order to avoid DAC (PCI) or
longer address packets (PCIe), which may incur a performance overhead.
Implement the same optimisation as other allocators by trying to get a
32-bit address first, only falling back to the full mask if that fails.

BugLink: http://bugs.launchpad.net/bugs/1680549
Signed-off-by: Robin Murphy
Signed-off-by: Joerg Roedel
(cherry picked from commit 122fac030e912ed723fe94d8eb0d5d0f6b31535e)
Signed-off-by: Manoj Iyer
---
 drivers/iommu/dma-iommu.c | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9aa432e8fbd3..8adff5f83b38 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -281,19 +281,28 @@ int dma_direction_to_prot(enum dma_data_direction dir, bool coherent)
 }
 
 static struct iova *__alloc_iova(struct iommu_domain *domain, size_t size,
-		dma_addr_t dma_limit)
+		dma_addr_t dma_limit, struct device *dev)
 {
 	struct iova_domain *iovad = cookie_iovad(domain);
 	unsigned long shift = iova_shift(iovad);
 	unsigned long length = iova_align(iovad, size) >> shift;
+	struct iova *iova = NULL;
 
 	if (domain->geometry.force_aperture)
 		dma_limit = min(dma_limit, domain->geometry.aperture_end);
+
+	/* Try to get PCI devices a SAC address */
+	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
+		iova = alloc_iova(iovad, length, DMA_BIT_MASK(32) >> shift,
+				  true);
 	/*
 	 * Enforce size-alignment to be safe - there could perhaps be an
 	 * attribute to control this per-device, or at least per-domain...
	 */
-	return alloc_iova(iovad, length, dma_limit >> shift, true);
+	if (!iova)
+		iova = alloc_iova(iovad, length, dma_limit >> shift, true);
+
+	return iova;
 }
 
 /* The IOVA allocator knows what we mapped, so just unmap whatever that was */
@@ -446,7 +455,7 @@ struct page **iommu_dma_alloc(struct device *dev, size_t size, gfp_t gfp,
 	if (!pages)
 		return NULL;
 
-	iova = __alloc_iova(domain, size, dev->coherent_dma_mask);
+	iova = __alloc_iova(domain, size, dev->coherent_dma_mask, dev);
 	if (!iova)
 		goto out_free_pages;
 
@@ -517,7 +526,7 @@ static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
 	struct iova_domain *iovad = cookie_iovad(domain);
 	size_t iova_off = iova_offset(iovad, phys);
 	size_t len = iova_align(iovad, size + iova_off);
-	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev));
+	struct iova *iova = __alloc_iova(domain, len, dma_get_mask(dev), dev);
 
 	if (!iova)
 		return DMA_ERROR_CODE;
@@ -675,7 +684,7 @@ int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
 		prev = s;
 	}
 
-	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev));
+	iova = __alloc_iova(domain, iova_len, dma_get_mask(dev), dev);
 	if (!iova)
 		goto out_restore_sg;
 
@@ -755,7 +764,7 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
 	msi_page->phys = msi_addr;
 	if (iovad) {
-		iova = __alloc_iova(domain, size, dma_get_mask(dev));
+		iova = __alloc_iova(domain, size, dma_get_mask(dev), dev);
 		if (!iova)
 			goto out_free_page;
 		msi_page->iova = iova_dma_addr(iovad, iova);
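
For anyone skimming the archive, the snippet below is a minimal, self-contained
sketch of the allocation pattern the new __alloc_iova() follows: try a 32-bit
(SAC) limit first for PCI devices, and only fall back to the device's full DMA
mask if that fails. It is not part of the patch; fake_alloc_iova(), the fixed
4K granule, and the simulated exhaustion of the 32-bit space are illustrative
assumptions used so the example runs in userspace.

/* Illustrative sketch only -- not part of the patch. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Stand-in allocator: returns a page-frame "IOVA" below the limit, or 0 on failure. */
static uint64_t fake_alloc_iova(uint64_t limit_pfn)
{
	/* Pretend the 32-bit space is exhausted, so the fallback path is exercised. */
	static const bool low_space_exhausted = true;
	const unsigned shift = 12; /* assumed 4K IOVA granule */

	if (limit_pfn <= (DMA_BIT_MASK(32) >> shift) && low_space_exhausted)
		return 0;
	return limit_pfn; /* hand back the highest frame, purely for illustration */
}

static uint64_t alloc_iova_sac_first(uint64_t dma_limit, bool dev_is_pci)
{
	const unsigned shift = 12; /* assumed 4K IOVA granule */
	uint64_t iova = 0;

	/* Try to give PCI devices a 32-bit (SAC) address first ... */
	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci)
		iova = fake_alloc_iova(DMA_BIT_MASK(32) >> shift);

	/* ... and only fall back to the device's full mask if that fails. */
	if (!iova)
		iova = fake_alloc_iova(dma_limit >> shift);

	return iova;
}

int main(void)
{
	uint64_t iova = alloc_iova_sac_first(DMA_BIT_MASK(48), true);

	printf("allocated IOVA page frame: %#llx\n", (unsigned long long)iova);
	return 0;
}

With the simulated 32-bit exhaustion above, the program prints a frame above
the 32-bit boundary, showing the fallback; with plenty of low space free, PCI
devices would instead keep getting SAC-reachable addresses.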