From patchwork Thu Sep 21 10:00:24 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexey Kardashevskiy
X-Patchwork-Id: 816799
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org;
 spf=none (mailfrom) smtp.mailfrom=vger.kernel.org
 (client-ip=209.132.180.67; helo=vger.kernel.org;
 envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id 3xyXRM10Lnz9t4Z
 for ; Thu, 21 Sep 2017 20:10:07 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751554AbdIUKKF (ORCPT );
 Thu, 21 Sep 2017 06:10:05 -0400
Received: from ozlabs.ru ([107.173.13.209]:34082 "EHLO ozlabs.ru"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751450AbdIUKKF (ORCPT );
 Thu, 21 Sep 2017 06:10:05 -0400
X-Greylist: delayed 573 seconds by postgrey-1.27 at vger.kernel.org;
 Thu, 21 Sep 2017 06:10:05 EDT
Received: from vpl1.ozlabs.ibm.com (localhost [IPv6:::1])
 by ozlabs.ru (Postfix) with ESMTP id 9790D3A60017;
 Thu, 21 Sep 2017 06:01:47 -0400 (EDT)
From: Alexey Kardashevskiy
To: kvm@vger.kernel.org
Cc: Alexey Kardashevskiy, David Gibson,
 kvm-ppc@vger.kernel.org, Alex Williamson
Subject: [PATCH kernel] vfio/spapr: Add cond_resched() for huge updates
Date: Thu, 21 Sep 2017 20:00:24 +1000
Message-Id: <20170921100024.44924-1-aik@ozlabs.ru>
X-Mailer: git-send-email 2.11.0
Sender: kvm-ppc-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm-ppc@vger.kernel.org

Clearing very big IOMMU tables can trigger soft lockups. This adds
a cond_resched() call after every million TCE updates.

The test case is a POWER9 box with a 264GB guest, 4 VFIO devices from
independent IOMMU groups, and 64K IOMMU pages.
This configuration produces 4325376 TCE entries, and each entry update
incurs 4 OPAL calls to update an individual PE TCE cache. Reducing the
table size to 4194304 entries (i.e. a 256GB guest) or removing one of
the 4 VFIO devices makes the problem go away, so doing cond_resched()
after every million TCE updates seems sufficient.

Signed-off-by: Alexey Kardashevskiy
Reviewed-by: David Gibson
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 63112c36ab2d..be3839ea3150 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -502,11 +502,15 @@ static int tce_iommu_clear(struct tce_container *container,
 		struct iommu_table *tbl,
 		unsigned long entry, unsigned long pages)
 {
-	unsigned long oldhpa;
+	unsigned long oldhpa, n;
 	long ret;
 	enum dma_data_direction direction;
 
-	for ( ; pages; --pages, ++entry) {
+	for (n = 0; pages; --pages, ++entry, ++n) {
+
+		if (n && (n % 1000000 == 0))
+			cond_resched();
+
 		direction = DMA_NONE;
 		oldhpa = 0;
 		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);
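
For readers who want to check the figures in the commit message, the
following userspace sketch reproduces the entry counts (guest size
divided by IOMMU page size) and counts how often the patched loop
would reach its yield point. It is not kernel code: tce_entries() and
clear_with_yield() are hypothetical helper names, sched_yield() is
only a stand-in for cond_resched(), and the sketch assumes the table
covers exactly the guest RAM size.

```c
#include <assert.h>
#include <sched.h>
#include <stdint.h>

/* Hypothetical helper: TCE entries needed to map guest_bytes of RAM. */
uint64_t tce_entries(uint64_t guest_bytes, uint64_t iommu_page_bytes)
{
	assert(iommu_page_bytes != 0);
	return guest_bytes / iommu_page_bytes;
}

/*
 * Hypothetical userspace stand-in for the patched loop. It walks
 * `pages` entries and yields once every `batch` iterations, skipping
 * n == 0 just as the patch does. Returns the number of yields.
 */
uint64_t clear_with_yield(uint64_t pages, uint64_t batch)
{
	uint64_t n, yields = 0;

	assert(batch != 0);
	for (n = 0; pages; --pages, ++n) {
		if (n && (n % batch == 0)) {
			sched_yield();	/* stand-in for cond_resched() */
			++yields;
		}
		/* the real loop clears one TCE entry here */
	}
	return yields;
}
```

With 64K pages, tce_entries(264ULL << 30, 64ULL << 10) gives 4325376
and tce_entries(256ULL << 30, 64ULL << 10) gives 4194304, matching the
numbers above. Under these assumptions clear_with_yield() reaches the
yield point 4 times for either table size with a batch of 1000000.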