
[v5,22/22] powerpc/mm: Add speculative page fault

Message ID: 1507729966-10660-23-git-send-email-ldufour@linux.vnet.ibm.com (mailing list archive)
State: Not Applicable
Series: Speculative page faults

Commit Message

Laurent Dufour Oct. 11, 2017, 1:52 p.m. UTC
This patch enables the speculative page fault on the PowerPC
architecture.

It first tries a speculative page fault without holding the mmap_sem;
if that returns VM_FAULT_RETRY, the mmap_sem is acquired and the
traditional page fault processing is done.

This path is built in only if CONFIG_SPF is defined (currently for
BOOK3S_64 && SMP).

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
---
 arch/powerpc/mm/fault.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
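
The call site added below relies on handle_speculative_fault() from the
core of the series. Its caller-side contract, inferred from this call site
alone (the exact declaration lives in an earlier patch of the series and
may differ), is roughly:

	/*
	 * Sketch: try to service a user-space fault without taking mmap_sem.
	 * A return value containing VM_FAULT_RETRY means the speculative path
	 * could not be used (or lost a race), and the caller must fall back
	 * to the classic path under mmap_sem.
	 */
	extern int handle_speculative_fault(struct mm_struct *mm,
					    unsigned long address,
					    unsigned int flags);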

Comments

kemi Oct. 26, 2017, 8:14 a.m. UTC | #1
Some regressions were found by LKP-tools (Linux Kernel Performance) on this patch series,
tested on Intel 2-socket/4-socket Skylake platforms.
The regression results are sorted by the metric will-it-scale.per_process_ops.

Branch: Laurent-Dufour/Speculative-page-faults/20171011-213456 (V4 patch series)
Commit id:
     base: 9a4b4dd1d8700dd5771f11dd2c048e4363efb493
     head: 56a4a8962fb32555a42eefdc9a19eeedd3e8c2e6
Benchmark suite: will-it-scale
Download link: https://github.com/antonblanchard/will-it-scale/tree/master/tests
Metrics:
     will-it-scale.per_process_ops=processes/nr_cpu
     will-it-scale.per_thread_ops=threads/nr_cpu

tbox:lkp-skl-4sp1(nr_cpu=192,memory=768G)
kconfig:CONFIG_TRANSPARENT_HUGEPAGE is not set
testcase        base            change          head            metric                   
brk1            2251803         -18.1%          1843535         will-it-scale.per_process_ops
                341101          -17.5%          281284          will-it-scale.per_thread_ops
malloc1         48833           -9.2%           44343           will-it-scale.per_process_ops
                31555           +2.9%           32473           will-it-scale.per_thread_ops
page_fault3     913019          -8.5%           835203          will-it-scale.per_process_ops
                233978          -18.1%          191593          will-it-scale.per_thread_ops
mmap2           95892           -6.6%           89536           will-it-scale.per_process_ops
                90180           -13.7%          77803           will-it-scale.per_thread_ops
mmap1           109586          -4.7%           104414          will-it-scale.per_process_ops
                104477          -12.4%          91484           will-it-scale.per_thread_ops
sched_yield     4964649         -2.1%           4859927         will-it-scale.per_process_ops
                4946759         -1.7%           4864924         will-it-scale.per_thread_ops
write1          1345159         -1.3%           1327719         will-it-scale.per_process_ops
                1228754         -2.2%           1201915         will-it-scale.per_thread_ops
page_fault2     202519          -1.0%           200545          will-it-scale.per_process_ops
                96573           -10.4%          86526           will-it-scale.per_thread_ops
page_fault1     225608          -0.9%           223585          will-it-scale.per_process_ops
                105945          +14.4%          121199          will-it-scale.per_thread_ops

tbox:lkp-skl-4sp1(nr_cpu=192,memory=768G)
kconfig:CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
testcase        base            change          head            metric                   
context_switch1 333780          -23.0%          256927          will-it-scale.per_process_ops
brk1            2263539         -18.8%          1837462         will-it-scale.per_process_ops
                325854          -15.7%          274752          will-it-scale.per_thread_ops
malloc1         48746           -13.5%          42148           will-it-scale.per_process_ops
mmap1           106860          -12.4%          93634           will-it-scale.per_process_ops
                98082           -18.9%          79506           will-it-scale.per_thread_ops
mmap2           92468           -11.3%          82059           will-it-scale.per_process_ops
                80468           -8.9%           73343           will-it-scale.per_thread_ops
page_fault3     900709          -9.1%           818851          will-it-scale.per_process_ops
                229837          -18.3%          187769          will-it-scale.per_thread_ops
write1          1327409         -1.7%           1305048         will-it-scale.per_process_ops
                1215658         -1.6%           1196479         will-it-scale.per_thread_ops
writeseek3      300639          -1.6%           295882          will-it-scale.per_process_ops
                231118          -2.2%           225929          will-it-scale.per_thread_ops
signal1         122011          -1.5%           120155          will-it-scale.per_process_ops
futex1          5123778         -1.2%           5062087         will-it-scale.per_process_ops
page_fault2     202321          -1.0%           200289          will-it-scale.per_process_ops
                93073           -9.8%           83927           will-it-scale.per_thread_ops

tbox:lkp-skl-2sp2(nr_cpu=112,memory=64G)
kconfig:CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
testcase        base            change          head            metric                   
brk1            2177903         -20.0%          1742054         will-it-scale.per_process_ops
                434558          -15.3%          367896          will-it-scale.per_thread_ops
malloc1         64871           -10.3%          58174           will-it-scale.per_process_ops
page_fault3     882435          -9.0%           802892          will-it-scale.per_process_ops
                299176          -15.7%          252170          will-it-scale.per_thread_ops
mmap2           124567          -8.3%           114214          will-it-scale.per_process_ops
                110674          -12.1%          97272           will-it-scale.per_thread_ops
mmap1           137205          -7.8%           126440          will-it-scale.per_process_ops
                128973          -15.1%          109560          will-it-scale.per_thread_ops
context_switch1 343790          -7.2%           319209          will-it-scale.per_process_ops
page_fault2     161891          -2.1%           158458          will-it-scale.per_process_ops
                123278          -5.4%           116629          will-it-scale.per_thread_ops
malloc2         14354856        -1.8%           14096856        will-it-scale.per_process_ops
read2           1204838         -1.7%           1183993         will-it-scale.per_process_ops
futex1          5017718         -1.6%           4938677         will-it-scale.per_process_ops
                1408250         -1.0%           1394022         will-it-scale.per_thread_ops
writeseek3      399651          -1.4%           393935          will-it-scale.per_process_ops
signal1         157952          -1.0%           156302          will-it-scale.per_process_ops

Laurent Dufour Nov. 2, 2017, 2:11 p.m. UTC | #2
On 26/10/2017 10:14, kemi wrote:
> Some regressions were found by LKP-tools (Linux Kernel Performance) on this patch series,
> tested on Intel 2-socket/4-socket Skylake platforms.
> The regression results are sorted by the metric will-it-scale.per_process_ops.

Hi Kemi,

Thanks for reporting this; I'll try to address it by turning some features
of the SPF path off when the process is single-threaded.

Laurent.
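
For illustration, one way to turn the speculative path off for
single-threaded processes could look like the sketch below. This is
hypothetical: the mm_users test is an assumed (and approximate) way to
detect that other threads share the mm, not code from this series.

	#ifdef CONFIG_SPF
		/*
		 * Hypothetical gate: a single-threaded process cannot race
		 * with itself on the mmap_sem, so the classic path is cheaper
		 * there; only go speculative when other users of the mm exist.
		 */
		if (is_user && atomic_read(&mm->mm_users) > 1) {
			fault = handle_speculative_fault(mm, address, flags);
			if (!(fault & VM_FAULT_RETRY)) {
				perf_sw_event(PERF_COUNT_SW_SPF, 1,
					      regs, address);
				goto done;
			}
		}
	#endif /* CONFIG_SPF */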


Sergey Senozhatsky Nov. 6, 2017, 10:27 a.m. UTC | #3
On (11/02/17 15:11), Laurent Dufour wrote:
> On 26/10/2017 10:14, kemi wrote:
> > Some regressions were found by LKP-tools (Linux Kernel Performance) on this patch series,
> > tested on Intel 2-socket/4-socket Skylake platforms.
> > The regression results are sorted by the metric will-it-scale.per_process_ops.
> 
> Hi Kemi,
> 
> Thanks for reporting this; I'll try to address it by turning some features
> of the SPF path off when the process is single-threaded.

make them madvise()-able?
Not all multi-threaded apps will necessarily benefit from SPF, right?
Just an idea.

	-ss
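
For illustration, the madvise()-style opt-in Sergey suggests might look
like the sketch below. Everything here is invented for the example:
MADV_SPF, the MMF_SPF_ENABLE mm flag, and the gating helper are not part
of this series or of the mainline kernel.

	/* Invented flag bit that a new madvise(MADV_SPF) hint would set. */
	#define MMF_SPF_ENABLE	30

	static inline bool spf_enabled(struct mm_struct *mm)
	{
		return test_bit(MMF_SPF_ENABLE, &mm->flags);
	}

	/* The fault path would then only go speculative on request: */
	if (is_user && spf_enabled(mm)) {
		fault = handle_speculative_fault(mm, address, flags);
		/* fall back to the classic path on VM_FAULT_RETRY, as above */
	}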

Patch

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4797d08581ce..c018c2554cc8 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -442,6 +442,20 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 	if (is_exec)
 		flags |= FAULT_FLAG_INSTRUCTION;
 
+#ifdef CONFIG_SPF
+	if (is_user) {
+		/* let's try a speculative page fault without grabbing the
+		 * mmap_sem.
+		 */
+		fault = handle_speculative_fault(mm, address, flags);
+		if (!(fault & VM_FAULT_RETRY)) {
+			perf_sw_event(PERF_COUNT_SW_SPF, 1,
+				      regs, address);
+			goto done;
+		}
+	}
+#endif /* CONFIG_SPF */
+
 	/* When running in the kernel we expect faults to occur only to
 	 * addresses in user space.  All other faults represent errors in the
 	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
@@ -526,6 +540,9 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
 	up_read(&current->mm->mmap_sem);
 
+#ifdef CONFIG_SPF
+done:
+#endif
 	if (unlikely(fault & VM_FAULT_ERROR))
 		return mm_fault_error(regs, address, fault);
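
For observation, the patch counts successful speculative faults through the
new PERF_COUNT_SW_SPF software event. A minimal userspace sketch reading
that counter could look like this (it assumes a kernel with this series
applied, whose uapi headers define PERF_COUNT_SW_SPF):

	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/perf_event.h>

	int main(void)
	{
		struct perf_event_attr attr;
		long long count;
		int fd;

		memset(&attr, 0, sizeof(attr));
		attr.type = PERF_TYPE_SOFTWARE;
		attr.size = sizeof(attr);
		attr.config = PERF_COUNT_SW_SPF;	/* added by this series */

		/* Count SPF events for this task, on any CPU. */
		fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
		if (fd < 0) {
			perror("perf_event_open");
			return 1;
		}

		/* ... touch freshly mmap()ed memory here to trigger faults ... */

		read(fd, &count, sizeof(count));
		printf("speculative page faults: %lld\n", count);
		close(fd);
		return 0;
	}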