
[v2,1/7] powerpc/64s/radix: do not flush TLB on spurious fault

Message ID: 20180520004347.19508-2-npiggin@gmail.com (mailing list archive)
State: Superseded
Series: Various TLB and PTE improvements

Commit Message

Nicholas Piggin May 20, 2018, 12:43 a.m. UTC
In the case of a spurious fault (which can happen due to a race with
another thread that changes the page table), the default Linux mm code
calls flush_tlb_page for that address. This is not required because
the pte will be re-fetched. Hash does not wire this up to a hardware
TLB flush for this reason. This patch avoids the flush for radix.

From Power ISA v3.0B, p.1090:

    Setting a Reference or Change Bit or Upgrading Access Authority
    (PTE Subject to Atomic Hardware Updates)

    If the only change being made to a valid PTE that is subject to
    atomic hardware updates is to set the Reference or Change bit to
    1 or to add access authorities, a simpler sequence suffices
    because the translation hardware will refetch the PTE if an access
    is attempted for which the only problems were reference and/or
    change bits needing to be set or insufficient access authority.

The nest MMU on POWER9 does not re-fetch the PTE after such an access
attempt before faulting, so address spaces with a coprocessor
attached will continue to flush in these cases.

This reduces tlbies for a kernel compile workload from 0.95M to 0.90M.

fork --fork --exec benchmark improved 0.5% (12300->12400).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
Since v1:
- Added NMMU handling

 arch/powerpc/include/asm/book3s/64/tlbflush.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
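
For reference, the generic fallback that this patch overrides looks roughly
like the following (a sketch of the asm-generic definition of this era; the
exact file and wording may differ):

    /*
     * Generic fallback: if an architecture does not provide its own
     * flush_tlb_fix_spurious_fault(), a spurious fault flushes the TLB
     * entry for the faulting address.
     */
    #ifndef flush_tlb_fix_spurious_fault
    #define flush_tlb_fix_spurious_fault(vma, address) flush_tlb_page(vma, address)
    #endif

The patch below provides a book3s64 definition, so radix skips the flush
unless a nest MMU coprocessor is attached to the address space.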

Comments

Aneesh Kumar K.V May 21, 2018, 6:06 a.m. UTC | #1
Nicholas Piggin <npiggin@gmail.com> writes:

> In the case of a spurious fault (which can happen due to a race with
> another thread that changes the page table), the default Linux mm code
> calls flush_tlb_page for that address. This is not required because
> the pte will be re-fetched. Hash does not wire this up to a hardware
> TLB flush for this reason. This patch avoids the flush for radix.
>
> From Power ISA v3.0B, p.1090:
>
>     Setting a Reference or Change Bit or Upgrading Access Authority
>     (PTE Subject to Atomic Hardware Updates)
>
>     If the only change being made to a valid PTE that is subject to
>     atomic hardware updates is to set the Reference or Change bit to
>     1 or to add access authorities, a simpler sequence suffices
>     because the translation hardware will refetch the PTE if an access
>     is attempted for which the only problems were reference and/or
>     change bits needing to be set or insufficient access authority.
>
> The nest MMU on POWER9 does not re-fetch the PTE after such an access
> attempt before faulting, so address spaces with a coprocessor
> attached will continue to flush in these cases.
>
> This reduces tlbies for a kernel compile workload from 0.95M to 0.90M.
>
> fork --fork --exec benchmark improved 0.5% (12300->12400).
>


Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Do you want to use flush_tlb_fix_spurious_fault() in
ptep_set_access_flags() also? That would bring it closer to the generic
version.

> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> Since v1:
> - Added NMMU handling
>
>  arch/powerpc/include/asm/book3s/64/tlbflush.h | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> index 0cac17253513..ebf572ea621e 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> @@ -4,7 +4,7 @@
>  
>  #define MMU_NO_CONTEXT	~0UL
>  
> -
> +#include <linux/mm_types.h>
>  #include <asm/book3s/64/tlbflush-hash.h>
>  #include <asm/book3s/64/tlbflush-radix.h>
>  
> @@ -137,6 +137,16 @@ static inline void flush_all_mm(struct mm_struct *mm)
>  #define flush_tlb_page(vma, addr)	local_flush_tlb_page(vma, addr)
>  #define flush_all_mm(mm)		local_flush_all_mm(mm)
>  #endif /* CONFIG_SMP */
> +
> +#define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
> +static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> +						unsigned long address)
> +{
> +	/* See ptep_set_access_flags comment */
> +	if (atomic_read(&vma->vm_mm->context.copros) > 0)
> +		flush_tlb_page(vma, address);
> +}
> +
>  /*
>   * flush the page walk cache for the address
>   */
> -- 
> 2.17.0
Nicholas Piggin May 24, 2018, 10:37 a.m. UTC | #2
On Mon, 21 May 2018 11:36:12 +0530
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:

> Nicholas Piggin <npiggin@gmail.com> writes:
> 
> > In the case of a spurious fault (which can happen due to a race with
> > another thread that changes the page table), the default Linux mm code
> > calls flush_tlb_page for that address. This is not required because
> > the pte will be re-fetched. Hash does not wire this up to a hardware
> > TLB flush for this reason. This patch avoids the flush for radix.
> >
> > From Power ISA v3.0B, p.1090:
> >
> >     Setting a Reference or Change Bit or Upgrading Access Authority
> >     (PTE Subject to Atomic Hardware Updates)
> >
> >     If the only change being made to a valid PTE that is subject to
> >     atomic hardware updates is to set the Reference or Change bit to
> >     1 or to add access authorities, a simpler sequence suffices
> >     because the translation hardware will refetch the PTE if an access
> >     is attempted for which the only problems were reference and/or
> >     change bits needing to be set or insufficient access authority.
> >
> > The nest MMU on POWER9 does not re-fetch the PTE after such an access
> > attempt before faulting, so address spaces with a coprocessor
> > attached will continue to flush in these cases.
> >
> > This reduces tlbies for a kernel compile workload from 0.95M to 0.90M.
> >
> > fork --fork --exec benchmark improved 0.5% (12300->12400).
> >  
> 
> 
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> 
> Do you want to use flush_tlb_fix_spurious_fault() in
> ptep_set_access_flags() also? That would bring it closer to the generic
> version.

I'm not sure it adds much. It does bring it closer to the generic
version, and the two do happen to do the same thing, but it's not really
fixing a spurious fault in ptep_set_access_flags(). I think adding it
there would just mean following another indirection to work out what it
does.

Thanks,
Nick
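
As an illustration of the suggestion being discussed above, reusing the hook
from a ptep_set_access_flags()-style helper would look roughly like this. The
function name and body are hypothetical, not the actual powerpc code; the
sketch only shows how the copros test could be shared:

    /*
     * Hypothetical sketch only: a set-access-flags style helper that reuses
     * flush_tlb_fix_spurious_fault() rather than open-coding the nest MMU
     * (copros) check. The real powerpc code keeps the check open-coded.
     */
    static inline void example_set_access_flags(struct vm_area_struct *vma,
                                                unsigned long address)
    {
            /* ... set the more permissive PTE bits here ... */

            /*
             * Flush only when a coprocessor is attached; the core MMU will
             * refetch the PTE on its own (Power ISA v3.0B, p.1090).
             */
            flush_tlb_fix_spurious_fault(vma, address);
    }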

Patch

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 0cac17253513..ebf572ea621e 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -4,7 +4,7 @@ 
 
 #define MMU_NO_CONTEXT	~0UL
 
-
+#include <linux/mm_types.h>
 #include <asm/book3s/64/tlbflush-hash.h>
 #include <asm/book3s/64/tlbflush-radix.h>
 
@@ -137,6 +137,16 @@ static inline void flush_all_mm(struct mm_struct *mm)
 #define flush_tlb_page(vma, addr)	local_flush_tlb_page(vma, addr)
 #define flush_all_mm(mm)		local_flush_all_mm(mm)
 #endif /* CONFIG_SMP */
+
+#define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
+static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
+						unsigned long address)
+{
+	/* See ptep_set_access_flags comment */
+	if (atomic_read(&vma->vm_mm->context.copros) > 0)
+		flush_tlb_page(vma, address);
+}
+
 /*
  * flush the page walk cache for the address
  */
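
For completeness, the core mm call site served by this hook looks roughly
like the following (paraphrased from handle_pte_fault() in mm/memory.c of
this era; variable names are abbreviated from the vm_fault structure):

    /*
     * Paraphrased: if ptep_set_access_flags() made no change, the fault was
     * spurious (another CPU raced and updated the PTE first), and the
     * architecture hook decides whether a TLB flush is still needed.
     */
    if (ptep_set_access_flags(vma, address, pte, entry, write)) {
            update_mmu_cache(vma, address, pte);
    } else {
            if (write)
                    flush_tlb_fix_spurious_fault(vma, address);
    }

With this patch applied, the radix flush here becomes a no-op unless the mm
has a nest MMU coprocessor attached.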