[v6,2/4] mm: Add functions to support extra actions on swap in/out

Message ID 4c4da87ff45b98e236cdfef66055b876074dabfb.1488232597.git.khalid.aziz@oracle.com
State Changes Requested
Delegated to: David Miller
Headers show

Commit Message

Khalid Aziz Feb. 28, 2017, 6:35 p.m.
If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in,
arch_unmap_one() to be called when a page is being unmapped for swap
out.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
---
v5:
	- Replaced set_swp_pte() function with new architecture
	  functions arch_do_swap_page() and arch_unmap_one()

 include/asm-generic/pgtable.h | 16 ++++++++++++++++
 mm/memory.c                   |  1 +
 mm/rmap.c                     |  2 ++
 3 files changed, 19 insertions(+)

Comments

jmarchan@redhat.com March 1, 2017, 9:24 a.m. | #1
On 02/28/2017 07:35 PM, Khalid Aziz wrote:
> If a processor supports special metadata for a page, for example ADI
> version tags on SPARC M7, this metadata must be saved when the page is
> swapped out. The same metadata must be restored when the page is swapped
> back in. This patch adds two new architecture specific functions -
> arch_do_swap_page() to be called when a page is swapped in,
> arch_unmap_one() to be called when a page is being unmapped for swap
> out.
> 
> Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: Khalid Aziz <khalid@gonehiking.org>

This looks much better than your original version.

Acked-by: Jerome Marchand <jmarchan@redhat.com>

Thanks,
Jerome

> ---
> v5:
> 	- Replaced set_swp_pte() function with new architecture
> 	  functions arch_do_swap_page() and arch_unmap_one()
> 
>  include/asm-generic/pgtable.h | 16 ++++++++++++++++
>  mm/memory.c                   |  1 +
>  mm/rmap.c                     |  2 ++
>  3 files changed, 19 insertions(+)
> 
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 18af2bc..5764d8f 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -282,6 +282,22 @@ static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif
>  
> +#ifndef __HAVE_ARCH_DO_SWAP_PAGE
> +static inline void arch_do_swap_page(struct mm_struct *mm, unsigned long addr,
> +				     pte_t pte, pte_t orig_pte)
> +{
> +
> +}
> +#endif
> +
> +#ifndef __HAVE_ARCH_UNMAP_ONE
> +static inline void arch_unmap_one(struct mm_struct *mm, unsigned long addr,
> +				  pte_t pte, pte_t orig_pte)
> +{
> +
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PGD_OFFSET_GATE
>  #define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
>  #endif
> diff --git a/mm/memory.c b/mm/memory.c
> index 6bf2b47..b086c76 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2658,6 +2658,7 @@ int do_swap_page(struct vm_fault *vmf)
>  	if (pte_swp_soft_dirty(vmf->orig_pte))
>  		pte = pte_mksoft_dirty(pte);
>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> +	arch_do_swap_page(vma->vm_mm, vmf->address, pte, vmf->orig_pte);
>  	vmf->orig_pte = pte;
>  	if (page == swapcache) {
>  		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 91619fd..192c41a 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1538,6 +1538,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		swp_pte = swp_entry_to_pte(entry);
>  		if (pte_soft_dirty(pteval))
>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
> +		arch_unmap_one(mm, address, swp_pte, pteval);
>  		set_pte_at(mm, address, pte, swp_pte);
>  	} else if (PageAnon(page)) {
>  		swp_entry_t entry = { .val = page_private(page) };
> @@ -1571,6 +1572,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		swp_pte = swp_entry_to_pte(entry);
>  		if (pte_soft_dirty(pteval))
>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
> +		arch_unmap_one(mm, address, swp_pte, pteval);
>  		set_pte_at(mm, address, pte, swp_pte);
>  	} else
>  		dec_mm_counter(mm, mm_counter_file(page));
>
Dave Hansen March 24, 2017, 6:45 p.m. | #2
On 02/28/2017 10:35 AM, Khalid Aziz wrote:
> diff --git a/mm/memory.c b/mm/memory.c
> index 6bf2b47..b086c76 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2658,6 +2658,7 @@ int do_swap_page(struct vm_fault *vmf)
>  	if (pte_swp_soft_dirty(vmf->orig_pte))
>  		pte = pte_mksoft_dirty(pte);
>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
> +	arch_do_swap_page(vma->vm_mm, vmf->address, pte, vmf->orig_pte);
>  	vmf->orig_pte = pte;
>  	if (page == swapcache) {
>  		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 91619fd..192c41a 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1538,6 +1538,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		swp_pte = swp_entry_to_pte(entry);
>  		if (pte_soft_dirty(pteval))
>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
> +		arch_unmap_one(mm, address, swp_pte, pteval);
>  		set_pte_at(mm, address, pte, swp_pte);
>  	} else if (PageAnon(page)) {
>  		swp_entry_t entry = { .val = page_private(page) };
> @@ -1571,6 +1572,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>  		swp_pte = swp_entry_to_pte(entry);
>  		if (pte_soft_dirty(pteval))
>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
> +		arch_unmap_one(mm, address, swp_pte, pteval);
>  		set_pte_at(mm, address, pte, swp_pte);
>  	} else
>  		dec_mm_counter(mm, mm_counter_file(page));

From a core VM perspective, I'm fine with these hooks.  It's minimally
invasive.  It is missing some explanation in the *code* of why sparc is
doing this and when/why other architectures might want to use these
hooks.  I think that would be awfully nice.

I still think the _current_ SPARC implementation of these hooks is
pretty broken because it doesn't allow more than one ADI tag within a
given page.  But, fixing that is confined to sparc code and shouldn't
affect the core VM or these hooks.

I suspect these hooks are still quite incomplete.  For instance, I do
not think KSM goes through these paths.  Couldn't a process *lose* its
ADI tags when KSM merges an underlying physical page?

I think you need to resolve your outstanding issues (from your 0/4
patch) before anyone can really ack these.  I suspect solving your
issues will change the number and placement of these hooks.

There is no mention in these patches of the effectively reduced virtual
address space.  Why?
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Khalid Aziz March 28, 2017, 10:08 p.m. | #3
On 03/24/2017 12:45 PM, Dave Hansen wrote:
> On 02/28/2017 10:35 AM, Khalid Aziz wrote:
>> diff --git a/mm/memory.c b/mm/memory.c
>> index 6bf2b47..b086c76 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -2658,6 +2658,7 @@ int do_swap_page(struct vm_fault *vmf)
>>  	if (pte_swp_soft_dirty(vmf->orig_pte))
>>  		pte = pte_mksoft_dirty(pte);
>>  	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
>> +	arch_do_swap_page(vma->vm_mm, vmf->address, pte, vmf->orig_pte);
>>  	vmf->orig_pte = pte;
>>  	if (page == swapcache) {
>>  		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 91619fd..192c41a 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1538,6 +1538,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>>  		swp_pte = swp_entry_to_pte(entry);
>>  		if (pte_soft_dirty(pteval))
>>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
>> +		arch_unmap_one(mm, address, swp_pte, pteval);
>>  		set_pte_at(mm, address, pte, swp_pte);
>>  	} else if (PageAnon(page)) {
>>  		swp_entry_t entry = { .val = page_private(page) };
>> @@ -1571,6 +1572,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
>>  		swp_pte = swp_entry_to_pte(entry);
>>  		if (pte_soft_dirty(pteval))
>>  			swp_pte = pte_swp_mksoft_dirty(swp_pte);
>> +		arch_unmap_one(mm, address, swp_pte, pteval);
>>  		set_pte_at(mm, address, pte, swp_pte);
>>  	} else
>>  		dec_mm_counter(mm, mm_counter_file(page));
>
> From a core VM perspective, I'm fine with these hooks.  It's minimally
> invasive.  It is missing some explanation in the *code* of why sparc is
> doing this and when/why other architectures might want to use these
> hooks.  I think that would be awfully nice.

Hi Dave,

Thanks for the review.

I will add explanation for these hooks.

>
> I still think the _current_ SPARC implementation of these hooks is
> pretty broken because it doesn't allow more than one ADI tag within a
> given page.  But, fixing that is confined to sparc code and shouldn't
> affect the core VM or these hooks.

Yes, this initial implementation is limited and can be expanded to 
support more than one tag per page in sparc code.

>
> I suspect these hooks are still quite incomplete.  For instance, I do
> not think KSM goes through these paths.  Couldn't a process *lose* its
> ADI tags when KSM merges an underlying physical page?

Good point. I will look into KSM integration. KSM could possibly merge 
two physical pages that have identical contents but different ADI tags 
although that comes into play only if userspace sets the VM_MERGEABLE 
flag on pages it has enabled ADI on. It should be addressed nevertheless.

>
> I think you need to resolve your outstanding issues (from your 0/4
> patch) before anyone can really ack these.  I suspect solving your
> issues will change the number and placement of these hooks.
>
> There is no mention in these patches of the effectively reduced virtual
> address space.  Why?

ADI uses bits 63-60 of VA. VA is already limited to only 56 bits by the 
MMU, so virtual address space is not reduced by this patch. I will add 
this explanation to the patch.

Thanks,
Khalid


--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 18af2bc..5764d8f 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -282,6 +282,22 @@  static inline int pmd_same(pmd_t pmd_a, pmd_t pmd_b)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
+#ifndef __HAVE_ARCH_DO_SWAP_PAGE
+static inline void arch_do_swap_page(struct mm_struct *mm, unsigned long addr,
+				     pte_t pte, pte_t orig_pte)
+{
+
+}
+#endif
+
+#ifndef __HAVE_ARCH_UNMAP_ONE
+static inline void arch_unmap_one(struct mm_struct *mm, unsigned long addr,
+				  pte_t pte, pte_t orig_pte)
+{
+
+}
+#endif
+
 #ifndef __HAVE_ARCH_PGD_OFFSET_GATE
 #define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 6bf2b47..b086c76 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2658,6 +2658,7 @@  int do_swap_page(struct vm_fault *vmf)
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
+	arch_do_swap_page(vma->vm_mm, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
 	if (page == swapcache) {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
diff --git a/mm/rmap.c b/mm/rmap.c
index 91619fd..192c41a 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1538,6 +1538,7 @@  static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		swp_pte = swp_entry_to_pte(entry);
 		if (pte_soft_dirty(pteval))
 			swp_pte = pte_swp_mksoft_dirty(swp_pte);
+		arch_unmap_one(mm, address, swp_pte, pteval);
 		set_pte_at(mm, address, pte, swp_pte);
 	} else if (PageAnon(page)) {
 		swp_entry_t entry = { .val = page_private(page) };
@@ -1571,6 +1572,7 @@  static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		swp_pte = swp_entry_to_pte(entry);
 		if (pte_soft_dirty(pteval))
 			swp_pte = pte_swp_mksoft_dirty(swp_pte);
+		arch_unmap_one(mm, address, swp_pte, pteval);
 		set_pte_at(mm, address, pte, swp_pte);
 	} else
 		dec_mm_counter(mm, mm_counter_file(page));