[v5,4/7] powerpc/mm: Use UV_WRITE_PATE ucall to register a PATE

Message ID 20190808040555.2371-5-cclaudio@linux.ibm.com
State New
Series
  • kvmppc: Paravirtualize KVM to support ultravisor

Commit Message

Claudio Carvalho Aug. 8, 2019, 4:05 a.m. UTC
From: Michael Anderson <andmike@linux.ibm.com>

In ultravisor enabled systems, the ultravisor creates and maintains the
partition table in secure memory, which the hypervisor cannot access.
Therefore, the hypervisor has to do a UV_WRITE_PATE ucall whenever it
wants to set a partition table entry (PATE).

This patch adds the UV_WRITE_PATE ucall and uses it to set a PATE if
ultravisor is enabled. Additionally, this also keeps a copy of the
partition table because the nestMMU does not have access to secure
memory. Such copy has entries for nonsecure and hypervisor partition.

Signed-off-by: Michael Anderson <andmike@linux.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
[ cclaudio: Write the PATE in HV's table before doing that in UV's ]
Signed-off-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Reviewed-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/ultravisor-api.h |  5 ++
 arch/powerpc/include/asm/ultravisor.h     |  8 +++
 arch/powerpc/mm/book3s64/pgtable.c        | 60 ++++++++++++++++-------
 3 files changed, 56 insertions(+), 17 deletions(-)

Comments

Michael Ellerman Aug. 14, 2019, 11:33 a.m. UTC | #1
Hi Claudio,

Claudio Carvalho <cclaudio@linux.ibm.com> writes:
> From: Michael Anderson <andmike@linux.ibm.com>
>
> In ultravisor enabled systems, the ultravisor creates and maintains the
> partition table in secure memory where the hypervisor cannot access, and
                                   ^
                                   which?

> therefore, the hypervisor have to do the UV_WRITE_PATE ucall whenever it
                            ^          ^
                            has        a
> wants to set a partition table entry (PATE).
>
> This patch adds the UV_WRITE_PATE ucall and uses it to set a PATE if
> ultravisor is enabled. Additionally, this also also keeps a copy of the
> partition table because the nestMMU does not have access to secure
> memory. Such copy has entries for nonsecure and hypervisor partition.

I'm having trouble parsing the last sentence there.

Or at least it doesn't seem to match the code, or I don't understand
either the code or the comment. More below.

> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> index 85bc81abd286..033731f5dbaa 100644
> --- a/arch/powerpc/mm/book3s64/pgtable.c
> +++ b/arch/powerpc/mm/book3s64/pgtable.c
> @@ -213,34 +223,50 @@ void __init mmu_partition_table_init(void)
>  	powernv_set_nmmu_ptcr(ptcr);
>  }
>  
> -void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
> -				   unsigned long dw1)
> +/*
> + * Global flush of TLBs and partition table caches for this lpid. The type of
> + * flush (hash or radix) depends on what the previous use of this partition ID
> + * was, not the new use.
> + */
> +static void flush_partition(unsigned int lpid, unsigned long old_patb0)

A nicer API would be for the 2nd param to be a "bool radix", and have
the caller worry about the fact that it comes from (patb0 & PATB_HR).
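
A rough userspace sketch of that suggestion, not the actual kernel change:
the tlbie sequences are stubbed out, PATB_HR is assumed to be bit 63 of
patb0 as in the kernel headers, and flush_kind, last_flush and
set_entry_sketch() are made-up names used only to make the effect visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: mirrors the kernel's PATB_HR (host radix) bit in patb0. */
#define PATB_HR (1ULL << 63)

/* Hypothetical bookkeeping so the chosen flush type can be observed. */
enum flush_kind { FLUSH_HASH, FLUSH_RADIX };
static enum flush_kind last_flush;

/* Takes a plain bool, so it no longer needs to know the patb0 layout. */
static void flush_partition(unsigned int lpid, bool radix)
{
	/* The real function would issue ptesync, tlbie and tlbsync here. */
	last_flush = radix ? FLUSH_RADIX : FLUSH_HASH;
	printf("lpid %u: %s flush\n", lpid, radix ? "radix" : "hash");
}

/* Hypothetical caller: decodes the *old* entry before flushing. */
static void set_entry_sketch(unsigned int lpid, unsigned long long old_patb0)
{
	/* Compare against 0 so bit 63 is not lost in a narrowing conversion. */
	flush_partition(lpid, (old_patb0 & PATB_HR) != 0);
}
```

The explicit comparison matters only if the parameter ever becomes an int
rather than a bool, but it keeps the call site obviously correct.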

>  {
> -	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
> -
> -	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
> -	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
> -
> -	/*
> -	 * Global flush of TLBs and partition table caches for this lpid.
> -	 * The type of flush (hash or radix) depends on what the previous
> -	 * use of this partition ID was, not the new use.
> -	 */
>  	asm volatile("ptesync" : : : "memory");
> -	if (old & PATB_HR) {
> -		asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
> +	if (old_patb0 & PATB_HR) {
> +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : :
>  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
> -		asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
> +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : :

That looks like an unrelated whitespace change.

>  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
>  		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1);
>  	} else {
> -		asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
> +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : :

Ditto.

>  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
>  		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
>  	}
>  	/* do we need fixup here ?*/
>  	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
>  }
> +
> +void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
> +				  unsigned long dw1)
> +{
> +	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
> +
> +	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
> +	partition_tb[lpid].patb1 = cpu_to_be64(dw1);

ie. here we always update the copy of the partition table, regardless of
whether we're running under an ultravisor or not. So the copy is a
complete copy isn't it?

> +	/*
> +	 * In ultravisor enabled systems, the ultravisor maintains the partition
> +	 * table in secure memory where we don't have access, therefore, we have
> +	 * to do a ucall to set an entry.
> +	 */
> +	if (firmware_has_feature(FW_FEATURE_ULTRAVISOR)) {
> +		uv_register_pate(lpid, dw0, dw1);
> +		pr_info("PATE registered by ultravisor: dw0 = 0x%lx, dw1 = 0x%lx\n",
> +			dw0, dw1);
> +	} else {
> +		flush_partition(lpid, old);
> +	}

What is different is whether we flush or not.

And don't we still need to do the flush for the nestMMU? I assume we're
saying the ultravisor will broadcast a flush for us, which will also
handle the nestMMU case?

cheers

Sukadev Bhattiprolu Aug. 21, 2019, 12:04 a.m. UTC | #2
Michael Ellerman [mpe@ellerman.id.au] wrote:

> Hi Claudio,
> 
> Claudio Carvalho <cclaudio@linux.ibm.com> writes:
> > From: Michael Anderson <andmike@linux.ibm.com>
> >
> > In ultravisor enabled systems, the ultravisor creates and maintains the
> > partition table in secure memory where the hypervisor cannot access, and
>                                    ^
>                                    which?
> 
> > therefore, the hypervisor have to do the UV_WRITE_PATE ucall whenever it
>                             ^          ^
>                             has        a
> > wants to set a partition table entry (PATE).
> >
> > This patch adds the UV_WRITE_PATE ucall and uses it to set a PATE if
> > ultravisor is enabled. Additionally, this also also keeps a copy of the
> > partition table because the nestMMU does not have access to secure
> > memory. Such copy has entries for nonsecure and hypervisor partition.
> 
> I'm having trouble parsing the last sentence there.
> 
> Or at least it doesn't seem to match the code, or I don't understand
> either the code or the comment. More below.

Yes, good catch. We could drop the last sentence. Or maybe change the
last para to:

	This patch adds the UV_WRITE_PATE ucall which is used to update
	the partition table entry (PATE) for a VM (both normal and secure).

	When UV is enabled, the partition table is stored in secure memory
	and can only be accessed via the UV. The HV however maintains a
	copy of the partition table in normal memory to allow NMMU
	translations to occur (for normal VMs). The HV copy includes PATEs
	for secure VMs which would currently be unused (NMMU translations
	cannot access secure memory) but they would be needed as we add
	functionality.

Basically, with UV, PTCR is controlled by the UV and address translations
occur based on the UV's copy of the partition table. (See also:
try_set_ptcr() in "PATCH 5/7 powerpc/mm: Write to PTCR only if ultravisor
disabled")
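
To make the two-table arrangement concrete, here is a small userspace model
of it. This is only a sketch under assumptions: mock_uv_write_pate() stands
in for the UV_WRITE_PATE ucall, uv_table stands in for secure memory,
NR_LPIDS, uv_enabled and hv_flushes are invented for illustration, and the
big-endian conversion done by the kernel is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_LPIDS 4	/* hypothetical, tiny table for illustration */

struct patb_entry {
	uint64_t patb0;
	uint64_t patb1;
};

static struct patb_entry hv_copy[NR_LPIDS];	/* normal memory, visible to the NMMU */
static struct patb_entry uv_table[NR_LPIDS];	/* models the table in secure memory */
static bool uv_enabled;				/* models FW_FEATURE_ULTRAVISOR */
static unsigned int hv_flushes;			/* counts flushes done by the HV itself */

/* Hypothetical stand-in for the UV_WRITE_PATE ucall; 0 models U_SUCCESS. */
static int mock_uv_write_pate(unsigned int lpid, uint64_t dw0, uint64_t dw1)
{
	uv_table[lpid].patb0 = dw0;
	uv_table[lpid].patb1 = dw1;
	return 0;
}

/* Models mmu_partition_table_set_entry(): the HV copy is always written. */
static int set_pate_model(unsigned int lpid, uint64_t dw0, uint64_t dw1)
{
	if (lpid >= NR_LPIDS)
		return -1;

	/* Write the HV copy first, then hand the entry to the UV. */
	hv_copy[lpid].patb0 = dw0;
	hv_copy[lpid].patb1 = dw1;

	if (uv_enabled)
		return mock_uv_write_pate(lpid, dw0, dw1);

	/* Without a UV, the HV does the flush itself. */
	hv_flushes++;
	return 0;
}
```

In this model the HV copy ends up with every entry whether or not the UV is
present; only the flush and the write to the hardware-visible table move.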

> 
> > diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> > index 85bc81abd286..033731f5dbaa 100644
> > --- a/arch/powerpc/mm/book3s64/pgtable.c
> > +++ b/arch/powerpc/mm/book3s64/pgtable.c
> > @@ -213,34 +223,50 @@ void __init mmu_partition_table_init(void)
> >  	powernv_set_nmmu_ptcr(ptcr);
> >  }
> >  
> > -void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
> > -				   unsigned long dw1)
> > +/*
> > + * Global flush of TLBs and partition table caches for this lpid. The type of
> > + * flush (hash or radix) depends on what the previous use of this partition ID
> > + * was, not the new use.
> > + */
> > +static void flush_partition(unsigned int lpid, unsigned long old_patb0)
> 
> A nicer API would be for the 2nd param to be a "bool radix", and have
> the caller worry about the fact that it comes from (patb0 & PATB_HR).

Agree

> 
> >  {
> > -	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
> > -
> > -	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
> > -	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
> > -
> > -	/*
> > -	 * Global flush of TLBs and partition table caches for this lpid.
> > -	 * The type of flush (hash or radix) depends on what the previous
> > -	 * use of this partition ID was, not the new use.
> > -	 */
> >  	asm volatile("ptesync" : : : "memory");
> > -	if (old & PATB_HR) {
> > -		asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
> > +	if (old_patb0 & PATB_HR) {
> > +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : :
> >  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
> > -		asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
> > +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : :
> 
> That looks like an unrelated whitespace change.
> 
> >  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
> >  		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1);
> >  	} else {
> > -		asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
> > +		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : :
> 
> Ditto.
> 
> >  			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
> >  		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
> >  	}
> >  	/* do we need fixup here ?*/
> >  	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
> >  }
> > +
> > +void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
> > +				  unsigned long dw1)
> > +{
> > +	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
> > +
> > +	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
> > +	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
> 
> ie. here we always update the copy of the partition table, regardless of
> whether we're running under an ultravisor or not. So the copy is a
> complete copy isn't it?

Yes.
> 
> > +	/*
> > +	 * In ultravisor enabled systems, the ultravisor maintains the partition
> > +	 * table in secure memory where we don't have access, therefore, we have
> > +	 * to do a ucall to set an entry.
> > +	 */
> > +	if (firmware_has_feature(FW_FEATURE_ULTRAVISOR)) {
> > +		uv_register_pate(lpid, dw0, dw1);
> > +		pr_info("PATE registered by ultravisor: dw0 = 0x%lx, dw1 = 0x%lx\n",
> > +			dw0, dw1);
> > +	} else {
> > +		flush_partition(lpid, old);
> > +	}
> 
> What is different is whether we flush or not.

The only differences are where the partition table used by hardware is
stored (in secure memory) and where it is updated (in the UV, with higher
privilege).

> 
> And don't we still need to do the flush for the nestMMU? I assume we're
> saying the ultravisor will broadcast a flush for us, which will also
> handle the nestMMU case?

The same sequence of instructions (as in the HV) is used in
uv_register_pate() to flush partition-scoped and process-scoped entries
(so the nest MMU would also be covered when the NMMU sees the tlbie?)

Thanks,

Sukadev

Patch

diff --git a/arch/powerpc/include/asm/ultravisor-api.h b/arch/powerpc/include/asm/ultravisor-api.h
index 88ffa78f9d61..8cd49abff4f3 100644
--- a/arch/powerpc/include/asm/ultravisor-api.h
+++ b/arch/powerpc/include/asm/ultravisor-api.h
@@ -11,6 +11,7 @@ 
 #include <asm/hvcall.h>
 
 /* Return codes */
+#define U_BUSY			H_BUSY
 #define U_FUNCTION		H_FUNCTION
 #define U_NOT_AVAILABLE		H_NOT_AVAILABLE
 #define U_P2			H_P2
@@ -18,6 +19,10 @@ 
 #define U_P4			H_P4
 #define U_P5			H_P5
 #define U_PARAMETER		H_PARAMETER
+#define U_PERMISSION		H_PERMISSION
 #define U_SUCCESS		H_SUCCESS
 
+/* opcodes */
+#define UV_WRITE_PATE			0xF104
+
 #endif /* _ASM_POWERPC_ULTRAVISOR_API_H */
diff --git a/arch/powerpc/include/asm/ultravisor.h b/arch/powerpc/include/asm/ultravisor.h
index dc6e1ea198f2..6fe1f365dec8 100644
--- a/arch/powerpc/include/asm/ultravisor.h
+++ b/arch/powerpc/include/asm/ultravisor.h
@@ -8,7 +8,15 @@ 
 #ifndef _ASM_POWERPC_ULTRAVISOR_H
 #define _ASM_POWERPC_ULTRAVISOR_H
 
+#include <asm/asm-prototypes.h>
+#include <asm/ultravisor-api.h>
+
 int early_init_dt_scan_ultravisor(unsigned long node, const char *uname,
 				  int depth, void *data);
 
+static inline int uv_register_pate(u64 lpid, u64 dw0, u64 dw1)
+{
+	return ucall_norets(UV_WRITE_PATE, lpid, dw0, dw1);
+}
+
 #endif	/* _ASM_POWERPC_ULTRAVISOR_H */
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 85bc81abd286..033731f5dbaa 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -16,6 +16,8 @@ 
 #include <asm/tlb.h>
 #include <asm/trace.h>
 #include <asm/powernv.h>
+#include <asm/firmware.h>
+#include <asm/ultravisor.h>
 
 #include <mm/mmu_decl.h>
 #include <trace/events/thp.h>
@@ -198,7 +200,15 @@  void __init mmu_partition_table_init(void)
 	unsigned long ptcr;
 
 	BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 36), "Partition table size too large.");
-	/* Initialize the Partition Table with no entries */
+	/*
+	 * Initialize the Partition Table with no entries, even in the presence
+	 * of an ultravisor firmware.
+	 *
+	 * In ultravisor enabled systems, the ultravisor creates and maintains
+	 * the partition table in secure memory. However, we keep a copy of the
+	 * partition table because nestMMU cannot access secure memory. Our copy
+	 * contains entries for nonsecure and hypervisor partition.
+	 */
 	partition_tb = memblock_alloc(patb_size, patb_size);
 	if (!partition_tb)
 		panic("%s: Failed to allocate %lu bytes align=0x%lx\n",
@@ -213,34 +223,50 @@  void __init mmu_partition_table_init(void)
 	powernv_set_nmmu_ptcr(ptcr);
 }
 
-void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
-				   unsigned long dw1)
+/*
+ * Global flush of TLBs and partition table caches for this lpid. The type of
+ * flush (hash or radix) depends on what the previous use of this partition ID
+ * was, not the new use.
+ */
+static void flush_partition(unsigned int lpid, unsigned long old_patb0)
 {
-	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
-
-	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
-	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
-
-	/*
-	 * Global flush of TLBs and partition table caches for this lpid.
-	 * The type of flush (hash or radix) depends on what the previous
-	 * use of this partition ID was, not the new use.
-	 */
 	asm volatile("ptesync" : : : "memory");
-	if (old & PATB_HR) {
-		asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
+	if (old_patb0 & PATB_HR) {
+		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 1) : :
 			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
-		asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
+		asm volatile(PPC_TLBIE_5(%0, %1, 2, 1, 1) : :
 			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
 		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1);
 	} else {
-		asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
+		asm volatile(PPC_TLBIE_5(%0, %1, 2, 0, 0) : :
 			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
 		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
 	}
 	/* do we need fixup here ?*/
 	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
 }
+
+void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
+				  unsigned long dw1)
+{
+	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
+
+	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
+	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
+
+	/*
+	 * In ultravisor enabled systems, the ultravisor maintains the partition
+	 * table in secure memory where we don't have access, therefore, we have
+	 * to do a ucall to set an entry.
+	 */
+	if (firmware_has_feature(FW_FEATURE_ULTRAVISOR)) {
+		uv_register_pate(lpid, dw0, dw1);
+		pr_info("PATE registered by ultravisor: dw0 = 0x%lx, dw1 = 0x%lx\n",
+			dw0, dw1);
+	} else {
+		flush_partition(lpid, old);
+	}
+}
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
 
 static pmd_t *get_pmd_from_cache(struct mm_struct *mm)