[v12,02/11] mm, swap: Add infrastructure for saving page metadata on swap

Message ID f5316c71e645d99ffdd52963f1e9675de3fc6386.1519227112.git.khalid.aziz@oracle.com
State Accepted
Delegated to: David Miller
Series Application Data Integrity feature introduced by SPARC M7

Commit Message

Khalid Aziz Feb. 21, 2018, 5:15 p.m. UTC
If a processor supports special metadata for a page, for example ADI
version tags on SPARC M7, this metadata must be saved when the page is
swapped out. The same metadata must be restored when the page is swapped
back in. This patch adds two new architecture specific functions -
arch_do_swap_page() to be called when a page is swapped in, and
arch_unmap_one() to be called when a page is being unmapped for swap
out. These architecture hooks allow page metadata to be saved if the
architecture supports it.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
---
v8:
	- Fixed an erroneous "}"
v6:
	- Updated parameter list for arch_do_swap_page() and
	  arch_unmap_one()
v5:
	- Replaced set_swp_pte() function with new architecture
	  functions arch_do_swap_page() and arch_unmap_one()

 include/asm-generic/pgtable.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/memory.c                   |  1 +
 mm/rmap.c                     | 14 ++++++++++++++
 3 files changed, 51 insertions(+)
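
The save-on-unmap / restore-on-swap-in contract described in the commit message can be modelled in user space. The sketch below is only an illustration, not the SPARC implementation: every `ex_`/`EX_` name, the `ex_tag_store` structure, and the choice of `-ENOMEM` as the failure code are hypothetical. It mirrors two things from the patch: the "generic no-op unless the architecture defines the hook" guard, and the rule that a negative return from the unmap hook leaves the page mapped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define EX_NBLOCKS 128            /* tags per page, figure taken from the thread */

struct ex_page {                  /* hypothetical stand-in for a mapped page */
    uint8_t tags[EX_NBLOCKS];     /* one metadata tag per block */
    bool mapped;
};

struct ex_tag_store {             /* hypothetical per-page save area */
    uint8_t saved[EX_NBLOCKS];
    bool in_use;
};

/* An "architecture" that has metadata announces its own hooks. */
#define EX_HAVE_ARCH_UNMAP_ONE
static int ex_arch_unmap_one(struct ex_tag_store *store,
                             const struct ex_page *pg)
{
    if (store == NULL || store->in_use)
        return -ENOMEM;           /* nowhere to save the tags */
    memcpy(store->saved, pg->tags, sizeof(store->saved));
    store->in_use = true;
    return 0;
}

#define EX_HAVE_ARCH_DO_SWAP_PAGE
static void ex_arch_do_swap_page(struct ex_tag_store *store,
                                 struct ex_page *pg)
{
    if (store == NULL || !store->in_use)
        return;                   /* nothing was saved for this page */
    memcpy(pg->tags, store->saved, sizeof(pg->tags));
    store->in_use = false;
}

/* Generic fallbacks: compiled only when no override was announced. */
#ifndef EX_HAVE_ARCH_UNMAP_ONE
static int ex_arch_unmap_one(struct ex_tag_store *store,
                             const struct ex_page *pg)
{
    (void)store; (void)pg;
    return 0;
}
#endif
#ifndef EX_HAVE_ARCH_DO_SWAP_PAGE
static void ex_arch_do_swap_page(struct ex_tag_store *store,
                                 struct ex_page *pg)
{
    (void)store; (void)pg;
}
#endif

/* Swap-out path: if the hook fails, the page stays mapped. */
static bool ex_try_to_unmap(struct ex_tag_store *store, struct ex_page *pg)
{
    assert(pg->mapped);
    if (ex_arch_unmap_one(store, pg) < 0)
        return false;
    memset(pg->tags, 0, sizeof(pg->tags));  /* model: tags are lost on swap */
    pg->mapped = false;
    return true;
}

/* Swap-in path: map the page first, then let the hook restore metadata. */
static void ex_swap_in(struct ex_tag_store *store, struct ex_page *pg)
{
    pg->mapped = true;
    ex_arch_do_swap_page(store, pg);
}
```

In this model a page that goes through `ex_try_to_unmap()` and then `ex_swap_in()` gets back all 128 tags, while a failed save leaves the page mapped with its tags untouched.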

Comments

Dave Hansen March 5, 2018, 7:20 p.m. UTC | #1
On 02/21/2018 09:15 AM, Khalid Aziz wrote:
> If a processor supports special metadata for a page, for example ADI
> version tags on SPARC M7, this metadata must be saved when the page is
> swapped out. The same metadata must be restored when the page is swapped
> back in. This patch adds two new architecture specific functions -
> arch_do_swap_page() to be called when a page is swapped in, and
> arch_unmap_one() to be called when a page is being unmapped for swap
> out. These architecture hooks allow page metadata to be saved if the
> architecture supports it.

I still think silently squishing cacheline-level hardware data into
page-level software data structures is dangerous.

But, you seem rather determined to do it this way.  I don't think this
will _hurt_ anyone else, though, other than needlessly cluttering up the
code.
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Khalid Aziz March 5, 2018, 7:29 p.m. UTC | #2
On 03/05/2018 12:20 PM, Dave Hansen wrote:
> On 02/21/2018 09:15 AM, Khalid Aziz wrote:
>> If a processor supports special metadata for a page, for example ADI
>> version tags on SPARC M7, this metadata must be saved when the page is
>> swapped out. The same metadata must be restored when the page is swapped
>> back in. This patch adds two new architecture specific functions -
>> arch_do_swap_page() to be called when a page is swapped in, and
>> arch_unmap_one() to be called when a page is being unmapped for swap
>> out. These architecture hooks allow page metadata to be saved if the
>> architecture supports it.
> 
> I still think silently squishing cacheline-level hardware data into
> page-level software data structures is dangerous.
> 
> But, you seem rather determined to do it this way.  I don't think this
> will _hurt_ anyone else, though other than needlessly cluttering up the
> code.

Hello Dave,

Thanks for taking the time to look at this patch and providing feedback.

ADI data is per page data and is held in the spare bits in the RAM. It 
is loaded into the cache when data is loaded from RAM and flushed out to 
spare bits in the RAM when data is flushed from cache. Sparc allows one 
tag for each ADI block size of data and ADI block size is same as 
cacheline size. When a page is loaded into RAM from swap space, all of 
the associated ADI data for the page must also be loaded into the RAM, 
so it looks like page level data and storing it in page level software 
data structure makes sense. I am open to other suggestions though.

Thanks,
Khalid
Dave Hansen March 5, 2018, 7:35 p.m. UTC | #3
On 03/05/2018 11:29 AM, Khalid Aziz wrote:
> ADI data is per page data and is held in the spare bits in the RAM. It
> is loaded into the cache when data is loaded from RAM and flushed out to
> spare bits in the RAM when data is flushed from cache. Sparc allows one
> tag for each ADI block size of data and ADI block size is same as
> cacheline size.

Which does not square with your earlier assertion "ADI data is per page
data".  It's per-cacheline data.  Right?

> When a page is loaded into RAM from swap space, all of
> the associated ADI data for the page must also be loaded into the RAM,
> so it looks like page level data and storing it in page level software
> data structure makes sense. I am open to other suggestions though.

Do you have a way to tell that data is not being thrown away?  Like if
the ADI metadata is different for two different cachelines within a
single page?
Khalid Aziz March 5, 2018, 8:28 p.m. UTC | #4
On 03/05/2018 12:35 PM, Dave Hansen wrote:
> On 03/05/2018 11:29 AM, Khalid Aziz wrote:
>> ADI data is per page data and is held in the spare bits in the RAM. It
>> is loaded into the cache when data is loaded from RAM and flushed out to
>> spare bits in the RAM when data is flushed from cache. Sparc allows one
>> tag for each ADI block size of data and ADI block size is same as
>> cacheline size.
> 
> Which does not square with your earlier assertion "ADI data is per page
> data".  It's per-cacheline data.  Right?

That is one way to look at it. Current sparc processors do implement the
same ADI block size as the cacheline size, but the architecture does not
require the ADI block size to be the same as the cacheline size. If those
two sizes were different, we wouldn't call it cacheline data.

> 
>> When a page is loaded into RAM from swap space, all of
>> the associated ADI data for the page must also be loaded into the RAM,
>> so it looks like page level data and storing it in page level software
>> data structure makes sense. I am open to other suggestions though.
> 
> Do you have a way to tell that data is not being thrown away?  Like if
> the ADI metadata is different for two different cachelines within a
> single page?

Yes, since access to tagged data is made using pointers with ADI tag 
embedded in the top bits, any mismatch between what app thinks the ADI 
tags should be and what is stored in the RAM for corresponding page will 
result in exception. If ADI data gets thrown away, we will get an ADI 
tag mismatch exception. If ADI tags for two different ADI blocks on a 
page are different when app expected them to be the same, we will see an 
exception on access to the block with wrong ADI data.

--
Khalid
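
The mismatch check described in the reply above can be sketched as a small user-space model. It is a simplification under stated assumptions: a 4-bit tag held in pointer bits 63..60 (the thread only says "top bits"), a 64-byte block (derived from the 8K page / 128 tags figure given elsewhere in the thread), and no special-case tag values. All `ex_`/`EX_` names are hypothetical and none of this is kernel or hardware code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_TAG_SHIFT  60                 /* assumed position of the tag bits */
#define EX_TAG_MASK   UINT64_C(0xF)      /* assumed 4-bit tag */
#define EX_BLOCK_SIZE 64                 /* 8192 / 128, from the thread */
#define EX_PAGE_SIZE  8192
#define EX_NBLOCKS    (EX_PAGE_SIZE / EX_BLOCK_SIZE)

struct ex_page {
    uint8_t tags[EX_NBLOCKS];            /* tag stored with each block */
};

/* Embed a tag in the top bits of an address, replacing any previous tag. */
static uint64_t ex_tag_pointer(uint64_t addr, uint8_t tag)
{
    assert(tag <= EX_TAG_MASK);
    return (addr & ~(EX_TAG_MASK << EX_TAG_SHIFT)) |
           ((uint64_t)tag << EX_TAG_SHIFT);
}

/*
 * Model of the check: the access is allowed only when the tag carried by
 * the pointer equals the tag stored for the block being touched. A false
 * result stands in for the tag mismatch exception.
 */
static bool ex_access_ok(const struct ex_page *pg, uint64_t tagged_ptr)
{
    uint64_t ptr_tag = (tagged_ptr >> EX_TAG_SHIFT) & EX_TAG_MASK;
    uint64_t offset = tagged_ptr & (EX_PAGE_SIZE - 1);

    return pg->tags[offset / EX_BLOCK_SIZE] == ptr_tag;
}
```

With block 0 tagged 3 and block 1 tagged 5, a pointer tagged 3 passes the check in block 0 and fails it in block 1. If the stored tag for block 0 is lost (reset to 0), the same pointer then fails in block 0 as well, which is the case the swap hooks are meant to prevent.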
Dave Hansen March 5, 2018, 9:04 p.m. UTC | #5
On 03/05/2018 12:28 PM, Khalid Aziz wrote:
>> Do you have a way to tell that data is not being thrown away?  Like if
>> the ADI metadata is different for two different cachelines within a
>> single page?
> 
> Yes, since access to tagged data is made using pointers with ADI tag
> embedded in the top bits, any mismatch between what app thinks the ADI
> tags should be and what is stored in the RAM for corresponding page will
> result in exception. If ADI data gets thrown away, we will get an ADI
> tag mismatch exception. If ADI tags for two different ADI blocks on a
> page are different when app expected them to be the same, we will see an
> exception on access to the block with wrong ADI data.

So, when an app has two different ADI tags on two parts of a page, the
page gets swapped, and the ADI block size is under PAGE_SIZE, the app
will get an ADI exception after swap-in through no fault of its own?

Khalid Aziz March 5, 2018, 9:14 p.m. UTC | #6
On 03/05/2018 02:04 PM, Dave Hansen wrote:
> On 03/05/2018 12:28 PM, Khalid Aziz wrote:
>>> Do you have a way to tell that data is not being thrown away?  Like if
>>> the ADI metadata is different for two different cachelines within a
>>> single page?
>>
>> Yes, since access to tagged data is made using pointers with ADI tag
>> embedded in the top bits, any mismatch between what app thinks the ADI
>> tags should be and what is stored in the RAM for corresponding page will
>> result in exception. If ADI data gets thrown away, we will get an ADI
>> tag mismatch exception. If ADI tags for two different ADI blocks on a
>> page are different when app expected them to be the same, we will see an
>> exception on access to the block with wrong ADI data.
> 
> So, when an app has two different ADI tags on two parts of a page, the
> page gets swapped, and the ADI block size is under PAGE_SIZE, the app
> will get an ADI exception after swap-in through no fault of its own?
> 

Only if the kernel fails to re-establish ADI tags on the swapped-in page,
which is why I added infrastructure to save the ADI tags for a page
before it is swapped out and then re-establish those tags when the page
is swapped back in. The kernel needs to save as many ADI tags as may
exist on each page, not just one tag per page. On sparc M7 with 8K pages,
there are 128 ADI tags per page, so the kernel will store and restore
128 ADI tags for each page on swap-out and swap-in. If the kernel restores
only one ADI tag for the page on swap-in, the app will get an exception and
it will be the kernel's fault.

--
Khalid
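
The 128-tags-per-page figure above follows from simple arithmetic, worked through here as a short sketch. The 8K page size and the count of 128 come from the message. The 64-byte block size is what those two numbers imply. The 4-bit tag width is an assumption made only to estimate the save-area size, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    EX_PAGE_SIZE      = 8192,  /* 8K page, from the thread */
    EX_ADI_BLOCK_SIZE = 64,    /* implied by 8192 / 128 */
    EX_TAG_BITS       = 4      /* assumed tag width, for illustration only */
};

/* Number of tags the kernel must save and restore for one page. */
static size_t ex_tags_per_page(size_t page_size, size_t block_size)
{
    assert(block_size != 0 && page_size % block_size == 0);
    return page_size / block_size;
}

/* Bytes needed to store ntags tags of tag_bits each, packed and rounded up. */
static size_t ex_tag_storage_bytes(size_t ntags, size_t tag_bits)
{
    return (ntags * tag_bits + 7) / 8;
}
```

Under these assumptions, `ex_tags_per_page(EX_PAGE_SIZE, EX_ADI_BLOCK_SIZE)` gives 128 tags, and packing them at 4 bits each takes 64 bytes per swapped page.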
Andrew Morton March 6, 2018, 10:47 p.m. UTC | #7
On Wed, 21 Feb 2018 10:15:44 -0700 Khalid Aziz <khalid.aziz@oracle.com> wrote:

> If a processor supports special metadata for a page, for example ADI
> version tags on SPARC M7, this metadata must be saved when the page is
> swapped out. The same metadata must be restored when the page is swapped
> back in. This patch adds two new architecture specific functions -
> arch_do_swap_page() to be called when a page is swapped in, and
> arch_unmap_one() to be called when a page is being unmapped for swap
> out. These architecture hooks allow page metadata to be saved if the
> architecture supports it.
> 
> Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
> Cc: Khalid Aziz <khalid@gonehiking.org>
> Acked-by: Jerome Marchand <jmarchan@redhat.com>
> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>

Acked-by: Andrew Morton <akpm@linux-foundation.org>

Patch

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 2cfa3075d148..6fbbc0b6c05e 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -400,6 +400,42 @@  static inline int pud_same(pud_t pud_a, pud_t pud_b)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 #endif
 
+#ifndef __HAVE_ARCH_DO_SWAP_PAGE
+/*
+ * Some architectures support metadata associated with a page. When a
+ * page is being swapped out, this metadata must be saved so it can be
+ * restored when the page is swapped back in. SPARC M7 and newer
+ * processors support an ADI (Application Data Integrity) tag for the
+ * page as metadata for the page. arch_do_swap_page() can restore this
+ * metadata when a page is swapped back in.
+ */
+static inline void arch_do_swap_page(struct mm_struct *mm,
+				     struct vm_area_struct *vma,
+				     unsigned long addr,
+				     pte_t pte, pte_t oldpte)
+{
+
+}
+#endif
+
+#ifndef __HAVE_ARCH_UNMAP_ONE
+/*
+ * Some architectures support metadata associated with a page. When a
+ * page is being swapped out, this metadata must be saved so it can be
+ * restored when the page is swapped back in. SPARC M7 and newer
+ * processors support an ADI (Application Data Integrity) tag for the
+ * page as metadata for the page. arch_unmap_one() can save this
+ * metadata on a swap-out of a page.
+ */
+static inline int arch_unmap_one(struct mm_struct *mm,
+				  struct vm_area_struct *vma,
+				  unsigned long addr,
+				  pte_t orig_pte)
+{
+	return 0;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PGD_OFFSET_GATE
 #define pgd_offset_gate(mm, addr)	pgd_offset(mm, addr)
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index 5fcfc24904d1..aed37325d94e 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3053,6 +3053,7 @@  int do_swap_page(struct vm_fault *vmf)
 	if (pte_swp_soft_dirty(vmf->orig_pte))
 		pte = pte_mksoft_dirty(pte);
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
+	arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte);
 	vmf->orig_pte = pte;
 
 	/* ksm created a completely new copy */
diff --git a/mm/rmap.c b/mm/rmap.c
index 47db27f8049e..144c66e688a9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1497,6 +1497,14 @@  static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				(flags & (TTU_MIGRATION|TTU_SPLIT_FREEZE))) {
 			swp_entry_t entry;
 			pte_t swp_pte;
+
+			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
+				set_pte_at(mm, address, pvmw.pte, pteval);
+				ret = false;
+				page_vma_mapped_walk_done(&pvmw);
+				break;
+			}
+
 			/*
 			 * Store the pfn of the page in a special migration
 			 * pte. do_swap_page() will wait until the migration
@@ -1556,6 +1564,12 @@  static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				page_vma_mapped_walk_done(&pvmw);
 				break;
 			}
+			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
+				set_pte_at(mm, address, pvmw.pte, pteval);
+				ret = false;
+				page_vma_mapped_walk_done(&pvmw);
+				break;
+			}
 			if (list_empty(&mm->mmlist)) {
 				spin_lock(&mmlist_lock);
 				if (list_empty(&mm->mmlist))