powerpc/mm: Add cond_resched() while removing hpte mappings

Message ID 20210310075938.361656-1-vaibhav@linux.ibm.com (mailing list archive)
State Superseded
Series powerpc/mm: Add cond_resched() while removing hpte mappings

Checks

Context                          Check    Description
snowpatch_ozlabs/apply_patch     success  Successfully applied on branch powerpc/merge (91966823812efbd175f904599e5cf2a854b39809)
snowpatch_ozlabs/build-ppc64le   success  Build succeeded
snowpatch_ozlabs/build-ppc64be   success  Build succeeded
snowpatch_ozlabs/build-ppc64e    success  Build succeeded
snowpatch_ozlabs/build-pmac32    success  Build succeeded
snowpatch_ozlabs/checkpatch      success  total: 0 errors, 0 warnings, 0 checks, 8 lines checked
snowpatch_ozlabs/needsstable     success  Patch has no Fixes tags

Commit Message

Vaibhav Jain March 10, 2021, 7:59 a.m. UTC
While removing a large number of mappings from hash page tables on
large memory systems, a soft lockup is reported because of the time
spent inside htab_remove_mapping(), like the one below:

 watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
 <snip>
 NIP plpar_hcall+0x38/0x58
 LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
 Call Trace:
  0x1fffffffffff000 (unreliable)
  pSeries_lpar_hpte_removebolted+0x9c/0x230
  hash__remove_section_mapping+0xec/0x1c0
  remove_section_mapping+0x28/0x3c
  arch_remove_memory+0xfc/0x150
  devm_memremap_pages_release+0x180/0x2f0
  devm_action_release+0x30/0x50
  release_nodes+0x28c/0x300
  device_release_driver_internal+0x16c/0x280
  unbind_store+0x124/0x170
  drv_attr_store+0x44/0x60
  sysfs_kf_write+0x64/0x90
  kernfs_fop_write+0x1b0/0x290
  __vfs_write+0x3c/0x70
  vfs_write+0xd4/0x270
  ksys_write+0xdc/0x130
  system_call+0x5c/0x70

Fix this by adding a cond_resched() to the loop in
htab_remove_mapping() that issues the hcall to remove each hpte mapping.
This should prevent the soft lockup from being reported.

Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/mm/book3s64/hash_utils.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Christophe Leroy March 10, 2021, 9:14 a.m. UTC | #1
On 10/03/2021 at 08:59, Vaibhav Jain wrote:
> While removing a large number of mappings from hash page tables on
> large memory systems, a soft lockup is reported because of the time
> spent inside htab_remove_mapping(), like the one below:
> 
>   watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
>   <snip>
>   NIP plpar_hcall+0x38/0x58
>   LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
>   Call Trace:
>    0x1fffffffffff000 (unreliable)
>    pSeries_lpar_hpte_removebolted+0x9c/0x230
>    hash__remove_section_mapping+0xec/0x1c0
>    remove_section_mapping+0x28/0x3c
>    arch_remove_memory+0xfc/0x150
>    devm_memremap_pages_release+0x180/0x2f0
>    devm_action_release+0x30/0x50
>    release_nodes+0x28c/0x300
>    device_release_driver_internal+0x16c/0x280
>    unbind_store+0x124/0x170
>    drv_attr_store+0x44/0x60
>    sysfs_kf_write+0x64/0x90
>    kernfs_fop_write+0x1b0/0x290
>    __vfs_write+0x3c/0x70
>    vfs_write+0xd4/0x270
>    ksys_write+0xdc/0x130
>    system_call+0x5c/0x70
> 
> Fix this by adding a cond_resched() to the loop in
> htab_remove_mapping() that issues the hcall to remove each hpte mapping.
> This should prevent the soft lockup from being reported.

Isn't it overkill to call it at each iteration?

Looking at a few other places, there is some mitigation. For instance, fadump_free_reserved_memory()
does it based on elapsed time. Another example is drmem_lmb_next(), which does it every 16 iterations.
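
To illustrate the "every 16 iterations" style of throttling, here is a minimal user-space sketch. It is not kernel code: sched_yield() stands in for cond_resched(), and the function name, the `batch` parameter and the placeholder per-entry work are hypothetical.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Walk `count` entries and yield the CPU once every `batch` entries.
 * sched_yield() is a user-space stand-in for the kernel's cond_resched().
 * Returns how many times the loop yielded.
 */
static size_t remove_entries_batched(size_t count, size_t batch)
{
	size_t yields = 0;

	assert(batch > 0);

	for (size_t i = 0; i < count; i++) {
		/* ... per-entry work (one hcall per mapping) would go here ... */

		/* Yield only after each full batch, not on every iteration. */
		if ((i + 1) % batch == 0) {
			sched_yield();
			yields++;
		}
	}

	return yields;
}
```

With a batch of 16, a walk over 100 entries yields 6 times instead of 100, which keeps the rescheduling overhead small while still bounding the time between yields.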


> 
> Suggested-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
>   arch/powerpc/mm/book3s64/hash_utils.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 581b20a2feaf..ea3945c70b18 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -359,6 +359,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>   		}
>   		if (rc < 0)
>   			return rc;
> +
> +		cond_resched();
>   	}
>   
>   	return ret;
> 

Christophe
Michael Ellerman March 10, 2021, 9:17 a.m. UTC | #2
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> While removing a large number of mappings from hash page tables on
> large memory systems, a soft lockup is reported because of the time
> spent inside htab_remove_mapping(), like the one below:
>
>  watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
>  <snip>
>  NIP plpar_hcall+0x38/0x58
>  LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
>  Call Trace:
>   0x1fffffffffff000 (unreliable)
>   pSeries_lpar_hpte_removebolted+0x9c/0x230
>   hash__remove_section_mapping+0xec/0x1c0
>   remove_section_mapping+0x28/0x3c
>   arch_remove_memory+0xfc/0x150
>   devm_memremap_pages_release+0x180/0x2f0
>   devm_action_release+0x30/0x50
>   release_nodes+0x28c/0x300
>   device_release_driver_internal+0x16c/0x280
>   unbind_store+0x124/0x170
>   drv_attr_store+0x44/0x60
>   sysfs_kf_write+0x64/0x90
>   kernfs_fop_write+0x1b0/0x290
>   __vfs_write+0x3c/0x70
>   vfs_write+0xd4/0x270
>   ksys_write+0xdc/0x130
>   system_call+0x5c/0x70
>
> Fix this by adding a cond_resched() to the loop in
> htab_remove_mapping() that issues the hcall to remove each hpte mapping.
> This should prevent the soft lockup from being reported.

Can/should we also/instead be using H_BLOCK_REMOVE?

cheers
Vaibhav Jain April 6, 2021, 4:30 a.m. UTC | #3
Hi Mpe,

Thanks for looking into this patch.

Michael Ellerman <mpe@ellerman.id.au> writes:

> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> While removing a large number of mappings from hash page tables on
>> large memory systems, a soft lockup is reported because of the time
>> spent inside htab_remove_mapping(), like the one below:
>>
>>  watchdog: BUG: soft lockup - CPU#8 stuck for 23s!
>>  <snip>
>>  NIP plpar_hcall+0x38/0x58
>>  LR  pSeries_lpar_hpte_invalidate+0x68/0xb0
>>  Call Trace:
>>   0x1fffffffffff000 (unreliable)
>>   pSeries_lpar_hpte_removebolted+0x9c/0x230
>>   hash__remove_section_mapping+0xec/0x1c0
>>   remove_section_mapping+0x28/0x3c
>>   arch_remove_memory+0xfc/0x150
>>   devm_memremap_pages_release+0x180/0x2f0
>>   devm_action_release+0x30/0x50
>>   release_nodes+0x28c/0x300
>>   device_release_driver_internal+0x16c/0x280
>>   unbind_store+0x124/0x170
>>   drv_attr_store+0x44/0x60
>>   sysfs_kf_write+0x64/0x90
>>   kernfs_fop_write+0x1b0/0x290
>>   __vfs_write+0x3c/0x70
>>   vfs_write+0xd4/0x270
>>   ksys_write+0xdc/0x130
>>   system_call+0x5c/0x70
>>
>> Fix this by adding a cond_resched() to the loop in
>> htab_remove_mapping() that issues the hcall to remove each hpte mapping.
>> This should prevent the soft lockup from being reported.
>
> Can/should we also/instead be using H_BLOCK_REMOVE?
>
> cheers

The current mmu_ops implementation seems to use H_BULK_REMOVE for hugepages,
so for removing mappings for regular pages I am looking into adding a
new mmu_op that can take a range to be unmapped.
I did try implementing a new mmu_op for this, which can reduce the number
of hash_pte lookups needed to invalidate the range. But that would need
some more work, so as a stop gap I have sent out a v2 with Christophe's
suggestion to add a cond_resched() every HZ interval.
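
For readers unfamiliar with time-based throttling, the following user-space sketch shows the general shape of "yield once per interval". It is an illustration under stated assumptions, not the v2 patch: sched_yield() stands in for cond_resched(), a monotonic clock in milliseconds stands in for jiffies and HZ, and the names now_ms(), remove_entries_timed() and `interval_ms` are hypothetical. In kernel code the same check would typically be written with jiffies and time_after().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

/* Monotonic clock reading in milliseconds. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Walk `count` entries and yield the CPU whenever at least `interval_ms`
 * milliseconds have passed since the last yield (or since the start).
 * Returns how many times the loop yielded.
 */
static size_t remove_entries_timed(size_t count, long long interval_ms)
{
	size_t yields = 0;
	long long deadline;

	assert(interval_ms > 0);
	deadline = now_ms() + interval_ms;

	for (size_t i = 0; i < count; i++) {
		/* ... per-entry work (one hcall per mapping) would go here ... */

		if (now_ms() >= deadline) {
			sched_yield();
			yields++;
			/* Re-arm the deadline relative to the current time. */
			deadline = now_ms() + interval_ms;
		}
	}

	return yields;
}
```

Compared with yielding every N iterations, the time-based check adapts to how long each entry actually takes: short loops never yield, while slow hcalls still cannot hold the CPU for much longer than one interval.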

Patch

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 581b20a2feaf..ea3945c70b18 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -359,6 +359,8 @@  int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 		}
 		if (rc < 0)
 			return rc;
+
+		cond_resched();
 	}
 
 	return ret;