Patchwork [3.5.y.z,extended,stable] Patch "hugetlbfs: add swap entry check in follow_hugetlb_page()" has been added to staging queue

Submitter Luis Henriques
Date April 22, 2013, 12:38 p.m.
Message ID <>
Permalink /patch/238880/
State New


Luis Henriques - April 22, 2013, 12:38 p.m.
This is a note to let you know that I have just added a patch titled

    hugetlbfs: add swap entry check in follow_hugetlb_page()

to the linux-3.5.y-queue branch of the 3.5.y.z extended stable tree 
which can be found at:

 ;a=shortlog;h=refs/heads/linux-3.5.y-queue

If you, or anyone else, feel it should not be added to this tree, please
reply to this email.

For more information about the 3.5.y.z tree, see



From 62709d30044278c7f850b13afd02254fee2beb6a Mon Sep 17 00:00:00 2001
From: Naoya Horiguchi <>
Date: Wed, 17 Apr 2013 15:58:30 -0700
Subject: [PATCH] hugetlbfs: add swap entry check in follow_hugetlb_page()

commit 9cc3a5bd40067b9a0fbd49199d0780463fc2140f upstream.

After applying the previous patch, "hugetlbfs: stop setting VM_DONTDUMP in
initializing vma(VM_HUGETLB)", to re-enable hugepage coredump, if a memory
error happens on a hugepage and an affected process tries to access that
hugepage, we hit VM_BUG_ON(atomic_read(&page->_count) <= 0) in get_page().

The reason for this bug is that the coredump-related code does not
recognise the "hugepage hwpoison entry", which replaces a pmd entry when a
memory error occurs on a hugepage.

In other words, a hugepage hwpoison entry stores its physical address
information in a different bit layout than a normal pmd entry does, so
follow_hugetlb_page(), which is called from get_dump_page(), returns the
wrong page for a given address.
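
To illustrate the layout mismatch, here is a small userspace sketch; the
bit widths and the hwpoison type value are invented for illustration and
do not match any real architecture (the kernel's actual encoding lives in
include/linux/swapops.h and per-arch headers):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT    12
#define SWP_TYPE_BITS 5            /* invented width, not a real arch */
#define SWP_HWPOISON  0x1fu        /* invented "hwpoison" swap type */

/* A present pte keeps the page frame number in its high bits. */
static uint64_t mk_pte(uint64_t pfn)
{
	return (pfn << PAGE_SHIFT) | 1;    /* bit 0: present */
}

/* A swap-style entry keeps a (type, offset) pair instead. */
static uint64_t mk_swp_entry(unsigned int type, uint64_t offset)
{
	return (offset << (SWP_TYPE_BITS + 1)) | ((uint64_t)type << 1);
}

int main(void)
{
	uint64_t pte = mk_pte(0x1234);
	uint64_t hwp = mk_swp_entry(SWP_HWPOISON, 0x1234);

	/* Decoding a hwpoison entry as if it were a present pte
	 * yields a bogus frame number - the bug described above. */
	printf("real pfn : 0x%llx\n", (unsigned long long)(pte >> PAGE_SHIFT));
	printf("bogus pfn: 0x%llx\n", (unsigned long long)(hwp >> PAGE_SHIFT));
	return 0;
}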

The expected behavior is like this:

  absent   is_swap_pte   FOLL_DUMP   Expected behavior
  -------------------------------------------------------------------
   true     false         false       hugetlb_fault
   false    true          false       hugetlb_fault
   false    false         false       return page
   true     false         true        skip page (to avoid allocation)
   false    true          true        hugetlb_fault
   false    false         true        return page
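
For reference, the table restates as a small decision helper; this is an
illustrative sketch only, not the kernel code, and the parameter names
merely mirror the variables used in follow_hugetlb_page():

#include <stdbool.h>

enum action { DO_HUGETLB_FAULT, RETURN_PAGE, SKIP_PAGE };

/* Illustrative-only restatement of the decision table above. */
static enum action expected_behavior(bool absent, bool is_swap_pte,
				     bool foll_dump)
{
	if (absent)
		return foll_dump ? SKIP_PAGE	/* avoid allocation */
				 : DO_HUGETLB_FAULT;
	if (is_swap_pte)
		return DO_HUGETLB_FAULT;	/* migration or hwpoison */
	return RETURN_PAGE;
}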

With this patch, we call hugetlb_fault() and take the proper action (we
wait on migration entries, and fail with VM_FAULT_HWPOISON_LARGE on
hwpoisoned entries), and as a result we can dump all hugepages except the
hwpoisoned ones.
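
As a rough sketch of how one might exercise this path (not taken from the
patch discussion): the program below assumes CONFIG_MEMORY_FAILURE=y,
CAP_SYS_ADMIN, a reserved 2MB hugepage (via /proc/sys/vm/nr_hugepages),
and core dumps enabled; depending on vm.memory_failure_early_kill, the
process may instead be killed by SIGBUS at injection time.

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define HPAGE_SIZE (2UL * 1024 * 1024)   /* assumes 2MB default hugepages */

int main(void)
{
	char *p = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(p, 0xaa, HPAGE_SIZE);      /* fault the hugepage in */

	/* Replace the mapping's entry with a hwpoison swap entry;
	 * needs CAP_SYS_ADMIN and CONFIG_MEMORY_FAILURE. */
	if (madvise(p, HPAGE_SIZE, MADV_HWPOISON)) {
		perror("madvise");
		return 1;
	}

	abort();   /* the coredump now walks the poisoned hugepage range */
}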

Signed-off-by: Naoya Horiguchi <>
Cc: Rik van Riel <>
Acked-by: Michal Hocko <>
Cc: HATAYAMA Daisuke <>
Acked-by: KOSAKI Motohiro <>
Acked-by: David Rientjes <>
Signed-off-by: Andrew Morton <>
Signed-off-by: Linus Torvalds <>
Signed-off-by: Luis Henriques <>
---
 mm/hugetlb.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)



diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f30e463..74b8327 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2907,7 +2907,17 @@ int follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			break;
 		}
 
-		if (absent ||
+		/*
+		 * We need call hugetlb_fault for both hugepages under migration
+		 * (in which case hugetlb_fault waits for the migration,) and
+		 * hwpoisoned hugepages (in which case we need to prevent the
+		 * caller from accessing to them.) In order to do this, we use
+		 * here is_swap_pte instead of is_hugetlb_entry_migration and
+		 * is_hugetlb_entry_hwpoisoned. This is because it simply covers
+		 * both cases, and because we can't follow correct pages
+		 * directly from any kind of swap entries.
+		 */
+		if (absent || is_swap_pte(huge_ptep_get(pte)) ||
 		    ((flags & FOLL_WRITE) && !pte_write(huge_ptep_get(pte)))) {
 			int ret;