diff mbox

[v2,20/20] powerpc/mm: Add speculative page fault

Message ID 1503007519-26777-21-git-send-email-ldufour@linux.vnet.ibm.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Laurent Dufour Aug. 17, 2017, 10:05 p.m. UTC
This patch enable the speculative page fault on the PowerPC
architecture.

This will try a speculative page fault without holding the mmap_sem,
if it returns with WM_FAULT_RETRY, the mmap_sem is acquired and the
traditional page fault processing is done.

Support is only provide for BOOK3S_64 currently because:
- require CONFIG_PPC_STD_MMU because checks done in
  set_access_flags_filter()
- require BOOK3S because we can't support for book3e_hugetlb_preload()
  called by update_mmu_cache()

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h |  5 +++++
 arch/powerpc/mm/fault.c                      | 30 +++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

Comments

Anshuman Khandual Aug. 21, 2017, 6:58 a.m. UTC | #1
On 08/18/2017 03:35 AM, Laurent Dufour wrote:
> This patch enable the speculative page fault on the PowerPC
> architecture.
> 
> This will try a speculative page fault without holding the mmap_sem,
> if it returns with WM_FAULT_RETRY, the mmap_sem is acquired and the

s/WM_FAULT_RETRY/VM_FAULT_RETRY/

> traditional page fault processing is done.
> 
> Support is only provide for BOOK3S_64 currently because:
> - require CONFIG_PPC_STD_MMU because checks done in
>   set_access_flags_filter()

What checks are done in set_access_flags_filter() ? We are just
adding the code block in do_page_fault().


> - require BOOK3S because we can't support for book3e_hugetlb_preload()
>   called by update_mmu_cache()
> 
> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h |  5 +++++
>  arch/powerpc/mm/fault.c                      | 30 +++++++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 818a58fc3f4f..897f8b9f67e6 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -313,6 +313,11 @@ extern unsigned long pci_io_base;
>  /* Advertise support for _PAGE_SPECIAL */
>  #define __HAVE_ARCH_PTE_SPECIAL
>  
> +/* Advertise that we call the Speculative Page Fault handler */
> +#if defined(CONFIG_PPC_BOOK3S_64)
> +#define __HAVE_ARCH_CALL_SPF
> +#endif
> +
>  #ifndef __ASSEMBLY__
>  
>  /*
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 4c422632047b..7b3cc4c30eab 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -291,9 +291,36 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>  	if (is_write && is_user)
>  		store_update_sp = store_updates_sp(regs);
>  
> -	if (is_user)
> +	if (is_user) {
>  		flags |= FAULT_FLAG_USER;
>  
> +#if defined(__HAVE_ARCH_CALL_SPF)
> +		/* let's try a speculative page fault without grabbing the
> +		 * mmap_sem.
> +		 */
> +
> +		/*
> +		 * flags is set later based on the VMA's flags, for the common
> +		 * speculative service, we need some flags to be set.
> +		 */
> +		if (is_write)
> +			flags |= FAULT_FLAG_WRITE;
> +
> +		fault = handle_speculative_fault(mm, address, flags);
> +		if (!(fault & VM_FAULT_RETRY || fault & VM_FAULT_ERROR)) {
> +			perf_sw_event(PERF_COUNT_SW_SPF_DONE, 1,
> +				      regs, address);
> +			goto done;

Why we should retry with classical page fault on VM_FAULT_ERROR ?
We should always return VM_FAULT_RETRY in case there is a clear
collision some where which requires retry with classical method
and return VM_FAULT_ERROR in cases where we know that it cannot
be retried and fail for good. Should not handle_speculative_fault()
be changed to accommodate this ?

> +		}
> +
> +		/*
> +		 * Resetting flags since the following code assumes
> +		 * FAULT_FLAG_WRITE is not set.
> +		 */
> +		flags &= ~FAULT_FLAG_WRITE;
> +#endif /* defined(__HAVE_ARCH_CALL_SPF) */

Setting and resetting of FAULT_FLAG_WRITE seems confusing. Why you
say that some flags need to be set for handle_speculative_fault()
function. Could you elaborate on this ?
Laurent Dufour Aug. 29, 2017, 3:13 p.m. UTC | #2
On 21/08/2017 08:58, Anshuman Khandual wrote:
> On 08/18/2017 03:35 AM, Laurent Dufour wrote:
>> This patch enable the speculative page fault on the PowerPC
>> architecture.
>>
>> This will try a speculative page fault without holding the mmap_sem,
>> if it returns with WM_FAULT_RETRY, the mmap_sem is acquired and the
> 
> s/WM_FAULT_RETRY/VM_FAULT_RETRY/

Good catch ;)

>> traditional page fault processing is done.
>>
>> Support is only provide for BOOK3S_64 currently because:
>> - require CONFIG_PPC_STD_MMU because checks done in
>>   set_access_flags_filter()
> 
> What checks are done in set_access_flags_filter() ? We are just
> adding the code block in do_page_fault().

set_access_flags_filter() is checking for vm_flags & VM_EXEC which may be
changed in our back, leading to a spurious WARN displayed.
This being said, I focused on the BOOK3S as this meaningful for large
system, and I didn't get time to check for embedded systems.

> 
>> - require BOOK3S because we can't support for book3e_hugetlb_preload()
>>   called by update_mmu_cache()
>>
>> Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/include/asm/book3s/64/pgtable.h |  5 +++++
>>  arch/powerpc/mm/fault.c                      | 30 +++++++++++++++++++++++++++-
>>  2 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 818a58fc3f4f..897f8b9f67e6 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -313,6 +313,11 @@ extern unsigned long pci_io_base;
>>  /* Advertise support for _PAGE_SPECIAL */
>>  #define __HAVE_ARCH_PTE_SPECIAL
>>  
>> +/* Advertise that we call the Speculative Page Fault handler */
>> +#if defined(CONFIG_PPC_BOOK3S_64)
>> +#define __HAVE_ARCH_CALL_SPF
>> +#endif
>> +
>>  #ifndef __ASSEMBLY__
>>  
>>  /*
>> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
>> index 4c422632047b..7b3cc4c30eab 100644
>> --- a/arch/powerpc/mm/fault.c
>> +++ b/arch/powerpc/mm/fault.c
>> @@ -291,9 +291,36 @@ int do_page_fault(struct pt_regs *regs, unsigned long address,
>>  	if (is_write && is_user)
>>  		store_update_sp = store_updates_sp(regs);
>>  
>> -	if (is_user)
>> +	if (is_user) {
>>  		flags |= FAULT_FLAG_USER;
>>  
>> +#if defined(__HAVE_ARCH_CALL_SPF)
>> +		/* let's try a speculative page fault without grabbing the
>> +		 * mmap_sem.
>> +		 */
>> +
>> +		/*
>> +		 * flags is set later based on the VMA's flags, for the common
>> +		 * speculative service, we need some flags to be set.
>> +		 */
>> +		if (is_write)
>> +			flags |= FAULT_FLAG_WRITE;
>> +
>> +		fault = handle_speculative_fault(mm, address, flags);
>> +		if (!(fault & VM_FAULT_RETRY || fault & VM_FAULT_ERROR)) {
>> +			perf_sw_event(PERF_COUNT_SW_SPF_DONE, 1,
>> +				      regs, address);
>> +			goto done;
> 
> Why we should retry with classical page fault on VM_FAULT_ERROR ?
> We should always return VM_FAULT_RETRY in case there is a clear
> collision some where which requires retry with classical method
> and return VM_FAULT_ERROR in cases where we know that it cannot
> be retried and fail for good. Should not handle_speculative_fault()
> be changed to accommodate this ?

There is no need to change handle_speculative_fault(), it should return
VM_FAULT_RETRY when a retry is required. If VM_FAULT_ERROR is return, we
should be able to jump to the block dealing with VM_FAULT_ERROR and calling
vm_fault_error().


> 
>> +		}
>> +
>> +		/*
>> +		 * Resetting flags since the following code assumes
>> +		 * FAULT_FLAG_WRITE is not set.
>> +		 */
>> +		flags &= ~FAULT_FLAG_WRITE;
>> +#endif /* defined(__HAVE_ARCH_CALL_SPF) */
> 
> Setting and resetting of FAULT_FLAG_WRITE seems confusing. Why you
> say that some flags need to be set for handle_speculative_fault()
> function. Could you elaborate on this ?

FAULT_FLAG_WRITE is required to handle write access. In the case we retry
with the classical path, the flag is reset and will be set later if
!is_exec and is_write.
diff mbox

Patch

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 818a58fc3f4f..897f8b9f67e6 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -313,6 +313,11 @@  extern unsigned long pci_io_base;
 /* Advertise support for _PAGE_SPECIAL */
 #define __HAVE_ARCH_PTE_SPECIAL
 
+/* Advertise that we call the Speculative Page Fault handler */
+#if defined(CONFIG_PPC_BOOK3S_64)
+#define __HAVE_ARCH_CALL_SPF
+#endif
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4c422632047b..7b3cc4c30eab 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -291,9 +291,36 @@  int do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (is_write && is_user)
 		store_update_sp = store_updates_sp(regs);
 
-	if (is_user)
+	if (is_user) {
 		flags |= FAULT_FLAG_USER;
 
+#if defined(__HAVE_ARCH_CALL_SPF)
+		/* let's try a speculative page fault without grabbing the
+		 * mmap_sem.
+		 */
+
+		/*
+		 * flags is set later based on the VMA's flags, for the common
+		 * speculative service, we need some flags to be set.
+		 */
+		if (is_write)
+			flags |= FAULT_FLAG_WRITE;
+
+		fault = handle_speculative_fault(mm, address, flags);
+		if (!(fault & VM_FAULT_RETRY || fault & VM_FAULT_ERROR)) {
+			perf_sw_event(PERF_COUNT_SW_SPF_DONE, 1,
+				      regs, address);
+			goto done;
+		}
+
+		/*
+		 * Resetting flags since the following code assumes
+		 * FAULT_FLAG_WRITE is not set.
+		 */
+		flags &= ~FAULT_FLAG_WRITE;
+#endif /* defined(__HAVE_ARCH_CALL_SPF) */
+	}
+
 	/* When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in the
 	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
@@ -479,6 +506,7 @@  int do_page_fault(struct pt_regs *regs, unsigned long address,
 			rc = 0;
 	}
 
+done:
 	/*
 	 * Major/minor page fault accounting.
 	 */