
[v2,03/11] mm/gup: Apply counting method to monitor gup_pgd_range

Message ID 20190920195047.7703-4-leonardo@linux.ibm.com (mailing list archive)
State Superseded, archived
Series Introduces new count-based method for monitoring lockless pagetable walks

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch next (39d90246212423715005d06e4deac033174bae44)
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 19 lines checked

Commit Message

Leonardo Bras Sept. 20, 2019, 7:50 p.m. UTC
As described, gup_pgd_range is a lockless pagetable walk. So, in order to
monitor it against THP split/collapse with the counting method, it's
necessary to bound it with {start,end}_lockless_pgtbl_walk.

Dummy versions of these functions are provided, so this adds no overhead
on architectures that don't use this method.

Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
---
 mm/gup.c | 4 ++++
 1 file changed, 4 insertions(+)
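
For reference, a minimal sketch of what the counting helpers could look
like (illustrative only -- the real definitions are added earlier in this
series, and both the Kconfig symbol and the counter field below are
hypothetical names):

#include <linux/atomic.h>
#include <linux/mm_types.h>

#ifdef CONFIG_LOCKLESS_PGTBL_WALK_COUNTING	/* hypothetical symbol */
/*
 * Counting method: a per-mm atomic counter that THP split/collapse can
 * read to learn that a lockless walker is in flight.
 */
static inline void start_lockless_pgtbl_walk(struct mm_struct *mm)
{
	atomic_inc(&mm->lockless_pgtbl_walkers);	/* hypothetical field */
	/* Order the increment before the pagetable reads that follow. */
	smp_mb__after_atomic();
}

static inline void end_lockless_pgtbl_walk(struct mm_struct *mm)
{
	/* Order the pagetable reads before the decrement. */
	smp_mb__before_atomic();
	atomic_dec(&mm->lockless_pgtbl_walkers);
}
#else
/* Dummy versions: no overhead on archs that don't use this method. */
static inline void start_lockless_pgtbl_walk(struct mm_struct *mm) { }
static inline void end_lockless_pgtbl_walk(struct mm_struct *mm) { }
#endif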

Comments

John Hubbard Sept. 23, 2019, 8:27 p.m. UTC | #1
On 9/20/19 12:50 PM, Leonardo Bras wrote:
> As described, gup_pgd_range is a lockless pagetable walk. So, in order to
> monitor it against THP split/collapse with the counting method, it's
> necessary to bound it with {start,end}_lockless_pgtbl_walk.
> 
> Dummy versions of these functions are provided, so this adds no overhead
> on architectures that don't use this method.
> 
> Signed-off-by: Leonardo Bras <leonardo@linux.ibm.com>
> ---
>  mm/gup.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/mm/gup.c b/mm/gup.c
> index 98f13ab37bac..675e4be27082 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2404,6 +2404,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
>  			unsigned int gup_flags, struct page **pages)
>  {
>  	unsigned long addr, len, end;
> +	struct mm_struct *mm;
>  	int nr = 0, ret = 0;
>  
>  	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
> @@ -2421,9 +2422,12 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
>  
>  	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
>  	    gup_fast_permitted(start, end)) {
> +		mm = current->mm;
> +		start_lockless_pgtbl_walk(mm);
>  		local_irq_disable();

I'd also like a second opinion from the "core" -mm maintainers, but it seems like
there is now too much code around the gup_pgd_range() call. Especially since there
are two places where it's called--did you forget the other one in 
__get_user_pages_fast(), btw??

Maybe the irq handling and atomic counting should be moved into start/finish
calls, like this:

     start_gup_fast_walk()
     gup_pgd_range()
     finish_gup_fast_walk()
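
One possible shape for those wrappers (hypothetical, just to illustrate;
using local_irq_save()/restore() so the same pair would also fit the
__get_user_pages_fast() call site):

static inline void start_gup_fast_walk(struct mm_struct *mm,
				       unsigned long *flags)
{
	/* Announce the lockless walk, then shut out THP collapse IPIs. */
	start_lockless_pgtbl_walk(mm);
	local_irq_save(*flags);
}

static inline void finish_gup_fast_walk(struct mm_struct *mm,
					unsigned long flags)
{
	local_irq_restore(flags);
	end_lockless_pgtbl_walk(mm);
}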

>  		gup_pgd_range(addr, end, gup_flags, pages, &nr);
>  		local_irq_enable();
> +		end_lockless_pgtbl_walk(mm);
>  		ret = nr;
>  	}
>  

thanks,
Leonardo Bras Sept. 23, 2019, 9:01 p.m. UTC | #2
On Mon, 2019-09-23 at 13:27 -0700, John Hubbard wrote:
> I'd also like a second opinion from the "core" -mm maintainers, but it seems like
> there is now too much code around the gup_pgd_range() call. Especially since there
> are two places where it's called--did you forget the other one in 
> __get_user_pages_fast(), btw??
> 
Oh, sorry, I missed this one. I will put it on v3.
(Also I will make sure to include linux-mm on v3.)

> Maybe the irq handling and atomic counting should be moved into start/finish
> calls, like this:
> 
>      start_gup_fast_walk()
>      gup_pgd_range()
>      finish_gup_fast_walk()

There are cases where interrupt disable/enable is not done around the
lockless pagetable walk.
It may come from functions called higher up the stack; that's why I opted
to make it only the atomic operation.
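
As an illustration (sketch only, nothing from the tree; the function and
its walk are made up): a walk reached from a context that already disabled
interrupts would be bracketed like this, so folding local_irq_disable()
into the helper would be wrong:

/* Interrupts were already disabled further up the call stack. */
static void walk_pgtbl_irqs_already_off(struct mm_struct *mm)
{
	start_lockless_pgtbl_walk(mm);	/* only the atomic counter */
	/* ... lockless pagetable walk ... */
	end_lockless_pgtbl_walk(mm);
}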
John Hubbard Sept. 23, 2019, 9:09 p.m. UTC | #3
On 9/23/19 2:01 PM, Leonardo Bras wrote:
> On Mon, 2019-09-23 at 13:27 -0700, John Hubbard wrote:
>> I'd also like a second opinion from the "core" -mm maintainers, but it seems like
>> there is now too much code around the gup_pgd_range() call. Especially since there
>> are two places where it's called--did you forget the other one in 
>> __get_user_pages_fast(), btw??
>>
> Oh, sorry, I missed this one. I will put it on v3.
> (Also I will make sure to include linux-mm on v3.)
> 
>> Maybe the irq handling and atomic counting should be moved into start/finish
>> calls, like this:
>>
>>      start_gup_fast_walk()
>>      gup_pgd_range()
>>      finish_gup_fast_walk()
> 
> There are cases where interrupt disable/enable is not done around the
> lockless pagetable walk.
> It may come from functions called higher up the stack; that's why I opted
> to make it only the atomic operation.
> 

That doesn't prevent you from writing the above as shown, though, for mm/gup.c.

(Also, let's see on the other thread if it is even valid to be indicating
a lockless walk without also disabling interrupts.)

thanks,

Patch

diff --git a/mm/gup.c b/mm/gup.c
index 98f13ab37bac..675e4be27082 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2404,6 +2404,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 			unsigned int gup_flags, struct page **pages)
 {
 	unsigned long addr, len, end;
+	struct mm_struct *mm;
 	int nr = 0, ret = 0;
 
 	if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM)))
@@ -2421,9 +2422,12 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
 
 	if (IS_ENABLED(CONFIG_HAVE_FAST_GUP) &&
 	    gup_fast_permitted(start, end)) {
+		mm = current->mm;
+		start_lockless_pgtbl_walk(mm);
 		local_irq_disable();
 		gup_pgd_range(addr, end, gup_flags, pages, &nr);
 		local_irq_enable();
+		end_lockless_pgtbl_walk(mm);
 		ret = nr;
 	}