
[10/15] powerpc/mm: Add hooks for cxl

Message ID 1411028820-29933-11-git-send-email-mikey@neuling.org (mailing list archive)
State Superseded

Commit Message

Michael Neuling Sept. 18, 2014, 8:26 a.m. UTC
From: Ian Munsie <imunsie@au1.ibm.com>

This adds a hook into tlbie() so that we use global invalidations when there are
cxl contexts active.

Normally cxl snoops broadcast tlbie.  cxl can have TLB entries invalidated via
MMIO, but we aren't doing that yet.  So for now we are just disabling local
tlbies when cxl contexts are active.  In future we can make tlbie() local mode
smarter so that it invalidates cxl contexts explicitly when it needs to.

This also adds hooks for when SLBs are invalidated to ensure any
corresponding SLBs in cxl are also invalidated at the same time.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
---
 arch/powerpc/mm/hash_native_64.c | 6 +++++-
 arch/powerpc/mm/hash_utils_64.c  | 3 +++
 arch/powerpc/mm/slice.c          | 3 +++
 3 files changed, 11 insertions(+), 1 deletion(-)
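
For reference, the two hooks this patch calls, cxl_ctx_in_use() and
cxl_slbia(), come from misc/cxl.h.  A minimal sketch of how such
declarations could be shaped, assuming no-op fallbacks when the cxl
driver is not configured (the fallback shape and the CONFIG symbol are
assumptions, not taken from this series):

    struct mm_struct;

    #ifdef CONFIG_CXL_BASE
    /* True while any cxl context may hold cached translations;
     * forces tlbie() onto the global path. */
    bool cxl_ctx_in_use(void);
    /* Ask the cxl driver to invalidate SLB entries cached for this mm. */
    void cxl_slbia(struct mm_struct *mm);
    #else
    static inline bool cxl_ctx_in_use(void) { return false; }
    static inline void cxl_slbia(struct mm_struct *mm) {}
    #endif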

Comments

Anton Blanchard Sept. 26, 2014, 4:33 a.m. UTC | #1
> From: Ian Munsie <imunsie@au1.ibm.com>
> 
> This adds a hook into tlbie() so that we use global invalidations when
> there are cxl contexts active.
> 
> Normally cxl snoops broadcast tlbie.  cxl can have TLB entries
> invalidated via MMIO, but we aren't doing that yet.  So for now we
> are just disabling local tlbies when cxl contexts are active.  In
> future we can make tlbie() local mode smarter so that it invalidates
> cxl contexts explicitly when it needs to.
> 
> This also adds hooks for when SLBs are invalidated to ensure any
> corresponding SLBs in cxl are also invalidated at the same time.
> 
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> Signed-off-by: Michael Neuling <mikey@neuling.org>

> +	use_local = local && mmu_has_feature(MMU_FTR_TLBIEL) && !cxl_ctx_in_use();

Seems reasonable until we can get the MMIO-based optimisation in.

Will all CAPI cached translations be invalidated before we finish using
a CAPI context? And conversely, could CAPI cache any translations when a
context isn't active? I'm mostly concerned to ensure we can't have a
situation where badly behaved userspace could result in a stale
translation.

>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);

>  			spu_flush_all_slbs(mm);
>  #endif
> +			cxl_slbia(mm);

>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);

>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);

Should we combine the SPU vs CXL callouts into something common -
perhaps copro_flush_all_slbs()?

Anton
Michael Neuling Sept. 26, 2014, 11:33 a.m. UTC | #2
On Fri, 2014-09-26 at 14:33 +1000, Anton Blanchard wrote:
> > From: Ian Munsie <imunsie@au1.ibm.com>
> > 
> > This adds a hook into tlbie() so that we use global invalidations when
> > there are cxl contexts active.
> > 
> > Normally cxl snoops broadcast tlbie.  cxl can have TLB entries
> > invalidated via MMIO, but we aren't doing that yet.  So for now we
> > are just disabling local tlbies when cxl contexts are active.  In
> > future we can make tlbie() local mode smarter so that it invalidates
> > cxl contexts explicitly when it needs to.
> > 
> > This also adds hooks for when SLBs are invalidated to ensure any
> > corresponding SLBs in cxl are also invalidated at the same time.
> > 
> > Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> 
> > +	use_local = local && mmu_has_feature(MMU_FTR_TLBIEL) && !cxl_ctx_in_use();
> 
> Seems reasonable until we can get the MMIO based optimisation in.
> 
> Will all CAPI cached translations be invalidated before we finish using
> a CAPI context? 

I'm not sure I understand. Can you elaborate?

> And conversely, could CAPI cache any translations when a
> context isn't active? 

The kernel invalidates all translations when the file descriptor is
closed.  So no, unless the PSL was badly behaving and ignoring the
invalidations... but if we can't trust the PSL we're screwed.
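
To illustrate the lifetime argument, a purely hypothetical sketch (the
function names here are illustrative, not taken from the cxl driver):
context teardown hangs off the file descriptor, so nothing the PSL has
cached can outlive the process's mappings.

    /* Hypothetical sketch only: teardown is driven by fd release. */
    static int cxl_file_release(struct inode *inode, struct file *file)
    {
    	struct cxl_context *ctx = file->private_data;

    	/* Quiesce the PSL and invalidate anything it has cached
    	 * for this context before freeing it. */
    	cxl_context_detach(ctx);
    	cxl_context_free(ctx);
    	return 0;
    }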

> I'm mostly concerned to ensure we can't have a
> situation where badly behaved userspace could result in a stale
> translation.

We only map what a user process maps, and we tear it down when the
process is torn down (on file descriptor release).  So I think we are
ok.

Unless there's some lazy teardown you're alluding to that I'm missing?

> 
> >  	spu_flush_all_slbs(mm);
> >  #endif
> > +	cxl_slbia(mm);
> 
> >  			spu_flush_all_slbs(mm);
> >  #endif
> > +			cxl_slbia(mm);
> 
> >  	spu_flush_all_slbs(mm);
> >  #endif
> > +	cxl_slbia(mm);
> 
> >  	spu_flush_all_slbs(mm);
> >  #endif
> > +	cxl_slbia(mm);
> 
> Should we combine the SPU vs CXL callouts into something common -
> perhaps copro_flush_all_slbs()?

Sounds good.  I'll update.
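
Something like this, presumably (a sketch only; the name follows Anton's
copro_flush_all_slbs() suggestion and its final home isn't settled in
this thread):

    void copro_flush_all_slbs(struct mm_struct *mm)
    {
    #ifdef CONFIG_SPU_BASE
    	spu_flush_all_slbs(mm);
    #endif
    	cxl_slbia(mm);
    }

Each call site then collapses to a single copro_flush_all_slbs(mm) call
with no #ifdef.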

Mikey
Anton Blanchard Sept. 26, 2014, 1:24 p.m. UTC | #3
Hi Mikey,

> We only map what a user process maps, and we tear it down when the
> process is torn down (on file descriptor release).  So I think we are
> ok.
> 
> Unless there's some lazy teardown you're alluding to that I'm missing?

I was trying to make sure things like the TLB batching code won't allow
a tlbie to be postponed until after a CAPI mapping is destroyed. It's
been ages since I looked at that part of the mm code.
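
For anyone following along: if memory serves, the batched invalidations
end up in native_flush_hash_range(), which makes its own local-vs-global
decision rather than calling tlbie() per entry, so it would presumably
want the same guard.  A sketch of that check (my assumption, not part of
the posted patch):

    /* In native_flush_hash_range(): take the global path whenever a
     * cxl context could be caching translations (sketch). */
    local = local && mmu_has_feature(MMU_FTR_TLBIEL) &&
    	mmu_psize_defs[psize].tlbiel && !cxl_ctx_in_use();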

Anton
Aneesh Kumar K.V Sept. 29, 2014, 9:10 a.m. UTC | #4
Michael Neuling <mikey@neuling.org> writes:

> From: Ian Munsie <imunsie@au1.ibm.com>
>
> This adds a hook into tlbie() so that we use global invalidations when there are
> cxl contexts active.
>
> Normally cxl snoops broadcast tlbie.  cxl can have TLB entries invalidated via
> MMIO, but we aren't doing that yet.  So for now we are just disabling local
> tlbies when cxl contexts are active.  In future we can make tlbie() local mode
> smarter so that it invalidates cxl contexts explicitly when it needs to.
>
> This also adds hooks for when SLBs are invalidated to ensure any
> corresponding SLBs in cxl are also invalidated at the same time.

We are not really invalidating the cxl SLBs when we are doing
slb_flush_and_rebolt().  Maybe add some code documentation around this to
explain when we are invalidating the cxl SLB here?
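
Something along these lines at the call site, perhaps (a sketch of the
kind of comment being asked for):

    /*
     * slb_flush_and_rebolt() below only deals with the CPU's own SLB;
     * cxl_slbia() is what tells the cxl driver to drop any SLB entries
     * it has cached for this mm.
     */
    cxl_slbia(mm);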

>
> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
> Signed-off-by: Michael Neuling <mikey@neuling.org>
> ---
>  arch/powerpc/mm/hash_native_64.c | 6 +++++-
>  arch/powerpc/mm/hash_utils_64.c  | 3 +++
>  arch/powerpc/mm/slice.c          | 3 +++
>  3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
> index afc0a82..ae4962a 100644
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -29,6 +29,8 @@
>  #include <asm/kexec.h>
>  #include <asm/ppc-opcode.h>
>  
> +#include <misc/cxl.h>
> +
>  #ifdef DEBUG_LOW
>  #define DBG_LOW(fmt...) udbg_printf(fmt)
>  #else
> @@ -149,9 +151,11 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
>  static inline void tlbie(unsigned long vpn, int psize, int apsize,
>  			 int ssize, int local)
>  {
> -	unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
> +	unsigned int use_local;
>  	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
>  
> +	use_local = local && mmu_has_feature(MMU_FTR_TLBIEL) && !cxl_ctx_in_use();
> +
>  	if (use_local)
>  		use_local = mmu_psize_defs[psize].tlbiel;
>  	if (lock_tlbie && !use_local)
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 66071af..be40ff7 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -34,6 +34,7 @@
>  #include <linux/signal.h>
>  #include <linux/memblock.h>
>  #include <linux/context_tracking.h>
> +#include <misc/cxl.h>
>  
>  #include <asm/processor.h>
>  #include <asm/pgtable.h>
> @@ -906,6 +907,7 @@ void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
>  #ifdef CONFIG_SPU_BASE
>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);
>  	if (get_paca_psize(addr) != MMU_PAGE_4K) {
>  		get_paca()->context = mm->context;
>  		slb_flush_and_rebolt();
> @@ -1145,6 +1147,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, unsigned long access, u
>  #ifdef CONFIG_SPU_BASE
>  			spu_flush_all_slbs(mm);
>  #endif
> +			cxl_slbia(mm);
>  		}
>  	}
>  
> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index b0c75cc..4d3a34b 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -30,6 +30,7 @@
>  #include <linux/err.h>
>  #include <linux/spinlock.h>
>  #include <linux/export.h>
> +#include <misc/cxl.h>
>  #include <asm/mman.h>
>  #include <asm/mmu.h>
>  #include <asm/spu.h>
> @@ -235,6 +236,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
>  #ifdef CONFIG_SPU_BASE
>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);
>  }
>  
>  /*
> @@ -674,6 +676,7 @@ void slice_set_psize(struct mm_struct *mm, unsigned long address,
>  #ifdef CONFIG_SPU_BASE
>  	spu_flush_all_slbs(mm);
>  #endif
> +	cxl_slbia(mm);
>  }
>  
>  void slice_set_range_psize(struct mm_struct *mm, unsigned long start,
> -- 
> 1.9.1

Patch

diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index afc0a82..ae4962a 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -29,6 +29,8 @@ 
 #include <asm/kexec.h>
 #include <asm/ppc-opcode.h>
 
+#include <misc/cxl.h>
+
 #ifdef DEBUG_LOW
 #define DBG_LOW(fmt...) udbg_printf(fmt)
 #else
@@ -149,9 +151,11 @@ static inline void __tlbiel(unsigned long vpn, int psize, int apsize, int ssize)
 static inline void tlbie(unsigned long vpn, int psize, int apsize,
 			 int ssize, int local)
 {
-	unsigned int use_local = local && mmu_has_feature(MMU_FTR_TLBIEL);
+	unsigned int use_local;
 	int lock_tlbie = !mmu_has_feature(MMU_FTR_LOCKLESS_TLBIE);
 
+	use_local = local && mmu_has_feature(MMU_FTR_TLBIEL) && !cxl_ctx_in_use();
+
 	if (use_local)
 		use_local = mmu_psize_defs[psize].tlbiel;
 	if (lock_tlbie && !use_local)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 66071af..be40ff7 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -34,6 +34,7 @@ 
 #include <linux/signal.h>
 #include <linux/memblock.h>
 #include <linux/context_tracking.h>
+#include <misc/cxl.h>
 
 #include <asm/processor.h>
 #include <asm/pgtable.h>
@@ -906,6 +907,7 @@ void demote_segment_4k(struct mm_struct *mm, unsigned long addr)
 #ifdef CONFIG_SPU_BASE
 	spu_flush_all_slbs(mm);
 #endif
+	cxl_slbia(mm);
 	if (get_paca_psize(addr) != MMU_PAGE_4K) {
 		get_paca()->context = mm->context;
 		slb_flush_and_rebolt();
@@ -1145,6 +1147,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, unsigned long access, u
 #ifdef CONFIG_SPU_BASE
 			spu_flush_all_slbs(mm);
 #endif
+			cxl_slbia(mm);
 		}
 	}
 
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index b0c75cc..4d3a34b 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -30,6 +30,7 @@ 
 #include <linux/err.h>
 #include <linux/spinlock.h>
 #include <linux/export.h>
+#include <misc/cxl.h>
 #include <asm/mman.h>
 #include <asm/mmu.h>
 #include <asm/spu.h>
@@ -235,6 +236,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 #ifdef CONFIG_SPU_BASE
 	spu_flush_all_slbs(mm);
 #endif
+	cxl_slbia(mm);
 }
 
 /*
@@ -674,6 +676,7 @@ void slice_set_psize(struct mm_struct *mm, unsigned long address,
 #ifdef CONFIG_SPU_BASE
 	spu_flush_all_slbs(mm);
 #endif
+	cxl_slbia(mm);
 }
 
 void slice_set_range_psize(struct mm_struct *mm, unsigned long start,