[kernel,v3] vfio/spapr: Add cond_resched() for huge updates

Message ID: 20170928091612.20717-1-aik@ozlabs.ru
State: Not Applicable
Series
  • [kernel,v3] vfio/spapr: Add cond_resched() for huge updates

Commit Message

Alexey Kardashevskiy Sept. 28, 2017, 9:16 a.m.
Clearing very big IOMMU tables can trigger soft lockups. This adds
cond_resched() to allow the scheduler to do context switching when
it decides to.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

The test case is a POWER9 box with a 264GB guest, 4 VFIO devices from
independent IOMMU groups, and 64K IOMMU pages. This configuration produces
4325376 TCE entries; each entry update incurs 4 OPAL calls to update
an individual PE TCE cache. This produced lockups lasting more than 20s.
Reducing the table size to 4194304 entries (i.e. a 256GB guest) or removing
one of the 4 VFIO devices makes the problem go away.
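
The entry counts quoted above can be checked with a short calculation. The
following is only a sketch: it assumes "GB" means GiB and "64K" means 64 KiB,
and the helper names tce_entries() and opal_calls() are hypothetical, not
kernel functions.

```c
#include <stdint.h>

/* Number of TCE entries needed to map guest_bytes of guest RAM,
 * one entry per IOMMU page (hypothetical helper, not a kernel API). */
static uint64_t tce_entries(uint64_t guest_bytes, uint64_t iommu_page_bytes)
{
	return guest_bytes / iommu_page_bytes;
}

/* Total firmware calls needed to touch every entry once, given the
 * number of OPAL calls per entry update (4 in the test case above). */
static uint64_t opal_calls(uint64_t entries, uint64_t calls_per_entry)
{
	return entries * calls_per_entry;
}
```

Under these assumptions, tce_entries(264ULL << 30, 64ULL << 10) gives 4325376
and tce_entries(256ULL << 30, 64ULL << 10) gives 4194304, matching the numbers
in the description; clearing the larger table at 4 OPAL calls per entry comes
to 17301504 calls in a single uninterrupted loop.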

---
Changes:
v3:
* cond_resched() already checks should_resched(), so we just call
cond_resched() and let the CPU scheduler decide whether to switch or not

v2:
* replaced with a time-based solution
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

David Gibson Sept. 29, 2017, 12:39 a.m. | #1
On Thu, Sep 28, 2017 at 07:16:12PM +1000, Alexey Kardashevskiy wrote:
> Clearing very big IOMMU tables can trigger soft lockups. This adds
> cond_resched() to allow the scheduler to do context switching when
> it decides to.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> 
> The testcase is POWER9 box with 264GB guest, 4 VFIO devices from
> independent IOMMU groups, 64K IOMMU pages. This configuration produces
> 4325376 TCE entries, each entry update incurs 4 OPAL calls to update
> an individual PE TCE cache; this produced lockups for more than 20s.
> Reducing table size to 4194304 (i.e. 256GB guest) or removing one
> of 4 VFIO devices makes the problem go away.
> 
> ---
> Changes:
> v3:
> * cond_resched() checks for should_resched() so we just call resched()
> and let the cpu scheduler decide whether to switch or not
> 
> v2:
> * replaced with time based solution
> ---
>  drivers/vfio/vfio_iommu_spapr_tce.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 63112c36ab2d..759a5bdd40e1 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -507,6 +507,8 @@ static int tce_iommu_clear(struct tce_container *container,
>  	enum dma_data_direction direction;
>  
>  	for ( ; pages; --pages, ++entry) {
> +		cond_resched();
> +
>  		direction = DMA_NONE;
>  		oldhpa = 0;
>  		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);

Alex Williamson Sept. 29, 2017, 10:17 p.m. | #2
On Thu, 28 Sep 2017 19:16:12 +1000
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> Clearing very big IOMMU tables can trigger soft lockups. This adds
> cond_resched() to allow the scheduler to do context switching when
> it decides to.
> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> 
> The testcase is POWER9 box with 264GB guest, 4 VFIO devices from
> independent IOMMU groups, 64K IOMMU pages. This configuration produces
> 4325376 TCE entries, each entry update incurs 4 OPAL calls to update
> an individual PE TCE cache; this produced lockups for more than 20s.
> Reducing table size to 4194304 (i.e. 256GB guest) or removing one
> of 4 VFIO devices makes the problem go away.
> 
> ---
> Changes:
> v3:
> * cond_resched() checks for should_resched() so we just call resched()
> and let the cpu scheduler decide whether to switch or not
> 
> v2:
> * replaced with time based solution
> ---
>  drivers/vfio/vfio_iommu_spapr_tce.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 63112c36ab2d..759a5bdd40e1 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -507,6 +507,8 @@ static int tce_iommu_clear(struct tce_container *container,
>  	enum dma_data_direction direction;
>  
>  	for ( ; pages; --pages, ++entry) {
> +		cond_resched();
> +
>  		direction = DMA_NONE;
>  		oldhpa = 0;
>  		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);

This looks fine to me, I've applied it to my local next branch for
v4.15.  I'll push that branch next week, once I can rebase to
4.14-rc3.  Thanks,

Alex

Patch

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 63112c36ab2d..759a5bdd40e1 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -507,6 +507,8 @@  static int tce_iommu_clear(struct tce_container *container,
 	enum dma_data_direction direction;
 
 	for ( ; pages; --pages, ++entry) {
+		cond_resched();
+
 		direction = DMA_NONE;
 		oldhpa = 0;
 		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);
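
For readers who want to experiment with the loop shape outside the kernel, a
rough userspace analogue is sketched below. It is not the kernel code:
clear_one() is a hypothetical stand-in for iommu_tce_xchg(), and the POSIX
sched_yield() is used in place of cond_resched(). Note the difference:
sched_yield() always offers the CPU to another runnable thread, whereas
cond_resched() only reschedules when the scheduler has flagged that it is
needed.

```c
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for iommu_tce_xchg(): clears one table slot
 * and returns 0 on success. */
static int clear_one(unsigned long *table, size_t entry)
{
	table[entry] = 0;
	return 0;
}

/* Clears 'pages' slots starting at 'entry', offering the CPU to other
 * threads on every iteration. Returns the number of slots cleared. */
static size_t clear_range(unsigned long *table, size_t entry, size_t pages)
{
	size_t cleared = 0;

	for ( ; pages; --pages, ++entry) {
		/* Userspace approximation of the cond_resched() call. */
		sched_yield();

		if (clear_one(table, entry))
			break;
		++cleared;
	}

	return cleared;
}
```

Placing the yield point at the top of the loop body, as the patch does, means
every iteration passes through it regardless of how the rest of the body
exits.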