From patchwork Thu Jul 14 04:25:36 2016
X-Patchwork-Submitter: Balbir Singh
X-Patchwork-Id: 648185
Date: Thu, 14 Jul 2016 14:25:36 +1000
From: Balbir Singh
To: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org, kvm@vger.kernel.org
Cc: "linux-mm@kvack.org", Benjamin Herrenschmidt, Alexey Kardashevskiy, Paul Mackerras, Michael Ellerman
Subject: [RESEND][v2][PATCH] KVM: PPC: Book3S HV: Migrate pinned pages out of CMA
Message-ID: <20160714042536.GG18277@balbir.ozlabs.ibm.com>
Reply-To: bsingharora@gmail.com

When PCI device pass-through is enabled via VFIO, KVM-PPC pins pages
using get_user_pages_fast(). One downside of this pinning is that the
page could be in the CMA region, which is also used for other
allocations such as the hash page table. Ideally we want the pinned
pages to come from outside the CMA region. This patch (currently only
for KVM PPC with VFIO) forcefully migrates such pages out (huge pages
are omitted for the moment).
There are more efficient ways of doing this, but they would be more
elaborate and would affect a wider audience than just the KVM PPC
implementation. The magic is in new_iommu_non_cma_page(), which
allocates the new page from outside the CMA region.

I've tested the patches lightly at my end, but there might be bugs.
For example, if the page is not isolated after lru_add_drain(), is
that a BUG?

Previous discussion was at
http://permalink.gmane.org/gmane.linux.kernel.mm/136738

Cc: Benjamin Herrenschmidt
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Alexey Kardashevskiy
Signed-off-by: Balbir Singh
Acked-by: Alexey Kardashevskiy
---
 arch/powerpc/include/asm/mmu_context.h |  1 +
 arch/powerpc/mm/mmu_context_iommu.c    | 80 ++++++++++++++++++++++++++++++++--
 2 files changed, 77 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 9d2cd0c..475d1be 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -18,6 +18,7 @@ extern void destroy_context(struct mm_struct *mm);
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 struct mm_iommu_table_group_mem_t;
 
+extern int isolate_lru_page(struct page *page); /* from internal.h */
 extern bool mm_iommu_preregistered(void);
 extern long mm_iommu_get(unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem);
diff --git a/arch/powerpc/mm/mmu_context_iommu.c b/arch/powerpc/mm/mmu_context_iommu.c
index da6a216..c18f742 100644
--- a/arch/powerpc/mm/mmu_context_iommu.c
+++ b/arch/powerpc/mm/mmu_context_iommu.c
@@ -15,6 +15,9 @@
 #include <linux/rculist.h>
 #include <linux/vmalloc.h>
 #include <linux/mutex.h>
+#include <linux/migrate.h>
+#include <linux/hugetlb.h>
+#include <linux/swap.h>
 #include <asm/mmu_context.h>
 
 static DEFINE_MUTEX(mem_list_mutex);
@@ -72,6 +75,54 @@ bool mm_iommu_preregistered(void)
 }
 EXPORT_SYMBOL_GPL(mm_iommu_preregistered);
 
+/*
+ * Taken from alloc_migrate_target with changes to remove CMA allocations
+ */
+struct page *new_iommu_non_cma_page(struct page *page, unsigned long private,
+				int **resultp)
+{
+	gfp_t gfp_mask = GFP_USER;
+	struct page *new_page;
+
+	if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))
+		return NULL;
+
+	if (PageHighMem(page))
+		gfp_mask |= __GFP_HIGHMEM;
+
+	/*
+	 * We don't want the allocation to force an OOM if possible
+	 */
+	new_page = alloc_page(gfp_mask | __GFP_NORETRY | __GFP_NOWARN);
+	return new_page;
+}
+
+static int mm_iommu_move_page_from_cma(struct page *page)
+{
+	int ret;
+	LIST_HEAD(cma_migrate_pages);
+
+	/* Ignore huge pages for now */
+	if (PageHuge(page) || PageTransHuge(page) || PageCompound(page))
+		return -EBUSY;
+
+	lru_add_drain();
+	ret = isolate_lru_page(page);
+	if (ret)
+		get_page(page); /* Potential BUG? */
+
+	list_add(&page->lru, &cma_migrate_pages);
+	put_page(page); /* Drop the gup reference */
+
+	ret = migrate_pages(&cma_migrate_pages, new_iommu_non_cma_page,
+				NULL, 0, MIGRATE_SYNC, MR_CMA);
+	if (ret) {
+		if (!list_empty(&cma_migrate_pages))
+			putback_movable_pages(&cma_migrate_pages);
+	}
+
+	return 0;
+}
+
 long mm_iommu_get(unsigned long ua, unsigned long entries,
 		struct mm_iommu_table_group_mem_t **pmem)
 {
@@ -124,15 +175,36 @@ long mm_iommu_get(unsigned long ua, unsigned long entries,
 	for (i = 0; i < entries; ++i) {
 		if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT),
 					1/* pages */, 1/* iswrite */, &page)) {
+			ret = -EFAULT;
 			for (j = 0; j < i; ++j)
-				put_page(pfn_to_page(
-						mem->hpas[j] >> PAGE_SHIFT));
+				put_page(pfn_to_page(mem->hpas[j] >>
+						PAGE_SHIFT));
 			vfree(mem->hpas);
 			kfree(mem);
-			ret = -EFAULT;
 			goto unlock_exit;
 		}
-
+		/*
+		 * If we get a page from the CMA zone, since we are going to
+		 * be pinning these entries, we might as well move them out
+		 * of the CMA zone if possible. NOTE: faulting in + migration
+		 * can be expensive. Batching can be considered later.
+		 */
+		if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
+			if (mm_iommu_move_page_from_cma(page))
+				goto populate;
+			if (1 != get_user_pages_fast(ua + (i << PAGE_SHIFT),
+						1/* pages */, 1/* iswrite */,
+						&page)) {
+				ret = -EFAULT;
+				for (j = 0; j < i; ++j)
+					put_page(pfn_to_page(mem->hpas[j] >>
+							PAGE_SHIFT));
+				vfree(mem->hpas);
+				kfree(mem);
+				goto unlock_exit;
+			}
+		}
+populate:
 		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
 	}