diff mbox series

[v2,04/14] spapr: nested: Introduce cap-nested-papr for Nested PAPR API

Message ID 20231012104951.194876-5-harshpb@linux.ibm.com
State New
Series Nested PAPR API (KVM on PowerVM)

Commit Message

Harsh Prateek Bora Oct. 12, 2023, 10:49 a.m. UTC
Introduce a SPAPR capability cap-nested-papr which provides a nested
HV facility to the guest. This is similar to cap-nested-hv, but uses
a different (incompatible) API and so they are mutually exclusive.
This new API enables support for KVM on PowerVM, and the Linux
kernel-side patches were recently accepted upstream as well [1].
Support for the related hcalls is added in the next set of patches.

[1]
https://lore.kernel.org/linuxppc-dev/169528846875.874757.8861595746180557787.b4-ty@ellerman.id.au/
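
For readers skimming the series, the mutual exclusion described above can be
modelled in isolation. The following is only a minimal sketch, not the code
in this patch: NestedState and nested_claim_api() are hypothetical stand-ins
for spapr->nested and the checks done in the two cap apply functions, and
NESTED_API_NONE is an assumed name for the zero value that the patch uses to
mean "no API claimed yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Values mirror the patch; NESTED_API_NONE is an assumed name for 0. */
#define NESTED_API_NONE    0
#define NESTED_API_KVM_HV  1
#define NESTED_API_PAPR    2

/* Hypothetical stand-in for the nested part of SpaprMachineState. */
typedef struct {
    uint8_t api;
} NestedState;

/*
 * Hypothetical helper: claim the single nested API slot for one capability.
 * Returns true on success, false if another capability already claimed it.
 */
static bool nested_claim_api(NestedState *nested, uint8_t api)
{
    assert(nested != NULL);
    if (nested->api != NESTED_API_NONE) {
        fprintf(stderr, "Nested-HV APIs are mutually exclusive\n");
        return false;
    }
    nested->api = api;
    return true;
}
```

With this model, whichever capability is applied first wins, and applying
the second one fails without changing the recorded API.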

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 hw/ppc/spapr.c                |  9 +++++-
 hw/ppc/spapr_caps.c           | 61 +++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_nested.c         | 15 +++++++++
 include/hw/ppc/spapr.h        |  5 ++-
 include/hw/ppc/spapr_nested.h |  5 +++
 5 files changed, 93 insertions(+), 2 deletions(-)

Comments

Nicholas Piggin Nov. 29, 2023, 4:01 a.m. UTC | #1
On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> Introduce a SPAPR capability cap-nested-papr which provides a nested
> HV facility to the guest. This is similar to cap-nested-hv, but uses
> a different (incompatible) API and so they are mutually exclusive.
> This new API is to enable support for KVM on PowerVM and recently the
> Linux kernel side patches have been accepted upstream as well [1].
> Support for related hcalls is being added in next set of patches.

We do want to be able to support both APIs on a per-guest basis. It
doesn't look like the vmstate bits will be a problem, both could be
enabled if the logic permitted it and that wouldn't cause a
compatibility problem I think?

And it's a bit of a nitpick, but the capability should not be permitted
before the actual APIs are supported IMO. You could split this into
adding .api first, so the implementation can test it, and add the spapr
caps at the end.

Thanks,
Nick
Harsh Prateek Bora Nov. 30, 2023, 6:19 a.m. UTC | #2
On 11/29/23 09:31, Nicholas Piggin wrote:
> On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
>> Introduce a SPAPR capability cap-nested-papr which provides a nested
>> HV facility to the guest. This is similar to cap-nested-hv, but uses
>> a different (incompatible) API and so they are mutually exclusive.
>> This new API is to enable support for KVM on PowerVM and recently the
>> Linux kernel side patches have been accepted upstream as well [1].
>> Support for related hcalls is being added in next set of patches.
> 
> We do want to be able to support both APIs on a per-guest basis. It
> doesn't look like the vmstate bits will be a problem, both could be
> enabled if the logic permitted it and that wouldn't cause a
> compatibility problem I think?
> 

I am not sure it makes sense to have both APIs working in parallel
for a nested guest. The former uses h_enter_guest and expects L1 to store
most of the regs, and it has no concept like the GSB, where communication
between L1 and L0 takes place in a standard format that is also used at
nested guest exit. Here, we have separate APIs for guest/vcpu
create, followed by a run_vcpu for a specific vcpu. So we can't really
use both APIs interchangeably while running a nested guest. BTW, the L1
kernel uses only one of the APIs at a time, preferably this one if
supported.
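
To illustrate the create-then-run ordering described above, here is a toy
model of the lifecycle. It is a sketch under stated assumptions only: ToyL0,
ToyGuest, the toy_* functions, the RC_* codes and the table sizes are all
hypothetical and do not correspond to the real hcall names, arguments or
status values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_GUESTS 4
#define MAX_VCPUS  4

/* Hypothetical return codes; the real hcall status values differ. */
enum { RC_OK = 0, RC_NO_GUEST = -1, RC_NO_VCPU = -2, RC_FULL = -3 };

typedef struct {
    bool created;
    bool vcpu_created[MAX_VCPUS];
} ToyGuest;

/* Hypothetical L0 bookkeeping: which guests and vcpus L1 has created. */
typedef struct {
    ToyGuest guests[MAX_GUESTS];
} ToyL0;

static bool toy_guest_valid(const ToyL0 *l0, int guest_id)
{
    return guest_id >= 0 && guest_id < MAX_GUESTS &&
           l0->guests[guest_id].created;
}

/* Step 1: create a guest; returns its id, or RC_FULL if no slot is free. */
static int toy_guest_create(ToyL0 *l0)
{
    assert(l0 != NULL);
    for (int id = 0; id < MAX_GUESTS; id++) {
        if (!l0->guests[id].created) {
            l0->guests[id].created = true;
            return id;
        }
    }
    return RC_FULL;
}

/* Step 2: create a vcpu inside an existing guest. */
static int toy_vcpu_create(ToyL0 *l0, int guest_id, int vcpu_id)
{
    assert(l0 != NULL);
    if (!toy_guest_valid(l0, guest_id)) {
        return RC_NO_GUEST;
    }
    if (vcpu_id < 0 || vcpu_id >= MAX_VCPUS) {
        return RC_NO_VCPU;
    }
    l0->guests[guest_id].vcpu_created[vcpu_id] = true;
    return RC_OK;
}

/* Step 3: running a vcpu is only valid after steps 1 and 2. */
static int toy_run_vcpu(const ToyL0 *l0, int guest_id, int vcpu_id)
{
    assert(l0 != NULL);
    if (!toy_guest_valid(l0, guest_id)) {
        return RC_NO_GUEST;
    }
    if (vcpu_id < 0 || vcpu_id >= MAX_VCPUS ||
        !l0->guests[guest_id].vcpu_created[vcpu_id]) {
        return RC_NO_VCPU;
    }
    return RC_OK;
}
```

The point of the model is that L0 holds per-guest and per-vcpu state between
calls, which a single enter-guest style call does not require.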

> And it's a bit of a nitpick, but the capability should not be permitted
> before the actual APIs are supported IMO. You could split this into
> adding .api first, so the implementation can test it, and add the spapr
> caps at the end.
> 

Agree, I shall update as suggested.

regards,
Harsh

> Thanks,
> Nick
Nicholas Piggin Nov. 30, 2023, 11:11 a.m. UTC | #3
On Thu Nov 30, 2023 at 4:19 PM AEST, Harsh Prateek Bora wrote:
>
>
> On 11/29/23 09:31, Nicholas Piggin wrote:
> > On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
> >> Introduce a SPAPR capability cap-nested-papr which provides a nested
> >> HV facility to the guest. This is similar to cap-nested-hv, but uses
> >> a different (incompatible) API and so they are mutually exclusive.
> >> This new API is to enable support for KVM on PowerVM and recently the
> >> Linux kernel side patches have been accepted upstream as well [1].
> >> Support for related hcalls is being added in next set of patches.
> > 
> > We do want to be able to support both APIs on a per-guest basis. It
> > doesn't look like the vmstate bits will be a problem, both could be
> > enabled if the logic permitted it and that wouldn't cause a
> > compatibility problem I think?
> > 
>
> I am not sure if it makes sense to have both APIs working in parallel 
> for a nested guest.

Not for the nested guest, but for the nested KVM host (i.e., the direct
pseries guest running QEMU). QEMU doesn't know ahead of time which API
might be used by the OS.

> Former uses h_enter_guest and expects L1 to store 
> most of the regs, and has no concept like GSB where the communication 
> between L1 and L0 takes place in a standard format which is used at 
> nested guest exit also. Here, we have separate APIs for guest/vcpu 
> create and then do a run_vcpu for a specific vcpu. So, we cant really 
> use both APIs interchangeably while running a nested guest. BTW, L1 
> kernel uses only either of the APIs at a time, preferably this one if 
> supported.

Yeah not on the same guest. And it's less about running two different
APIs on different guests with the same L1 simultaneously (although we
could probably change KVM to support that fairly easily, and we might
want to for testing purposes), but more about compatibility. What if
we boot or exec into an old kernel that doesn't support the new API?
>
> > And it's a bit of a nitpick, but the capability should not be permitted
> > before the actual APIs are supported IMO. You could split this into
> > adding .api first, so the implementation can test it, and add the spapr
> > caps at the end.
> > 
>
> Agree, I shall update as suggested.

Thanks,
Nick
Harsh Prateek Bora Dec. 1, 2023, 5:34 a.m. UTC | #4
On 11/30/23 16:41, Nicholas Piggin wrote:
> On Thu Nov 30, 2023 at 4:19 PM AEST, Harsh Prateek Bora wrote:
>>
>>
>> On 11/29/23 09:31, Nicholas Piggin wrote:
>>> On Thu Oct 12, 2023 at 8:49 PM AEST, Harsh Prateek Bora wrote:
>>>> Introduce a SPAPR capability cap-nested-papr which provides a nested
>>>> HV facility to the guest. This is similar to cap-nested-hv, but uses
>>>> a different (incompatible) API and so they are mutually exclusive.
>>>> This new API is to enable support for KVM on PowerVM and recently the
>>>> Linux kernel side patches have been accepted upstream as well [1].
>>>> Support for related hcalls is being added in next set of patches.
>>>
>>> We do want to be able to support both APIs on a per-guest basis. It
>>> doesn't look like the vmstate bits will be a problem, both could be
>>> enabled if the logic permitted it and that wouldn't cause a
>>> compatibility problem I think?
>>>
>>
>> I am not sure if it makes sense to have both APIs working in parallel
>> for a nested guest.
> 
> Not for the nested guest, but for the nested KVM host (i.e., the direct
> pseries guest running QEMU). QEMU doesn't know ahead of time which API
> might be used by the OS.
> 
>> Former uses h_enter_guest and expects L1 to store
>> most of the regs, and has no concept like GSB where the communication
>> between L1 and L0 takes place in a standard format which is used at
>> nested guest exit also. Here, we have separate APIs for guest/vcpu
>> create and then do a run_vcpu for a specific vcpu. So, we cant really
>> use both APIs interchangeably while running a nested guest. BTW, L1
>> kernel uses only either of the APIs at a time, preferably this one if
>> supported.
> 
> Yeah not on the same guest. And it's less about running two different
> APIs on different guests with the same L1 simultaneously (although we
> could probably change KVM to support that fairly easily, and we might
> want to for testing purposes), but more about compatibility. What if
> we boot or exec into an old kernel that doesn't support the new API?

Hmm, ok, that's a possible use case, will drop the mutual exclusion in v3.

regards,
Harsh

>>
>>> And it's a bit of a nitpick, but the capability should not be permitted
>>> before the actual APIs are supported IMO. You could split this into
>>> adding .api first, so the implementation can test it, and add the spapr
>>> caps at the end.
>>>
>>
>> Agree, I shall update as suggested.
> 
> Thanks,
> Nick

Patch

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index a2c69d0f4f..14196fdd11 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1355,7 +1355,11 @@  static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
         entry->dw1 = spapr->patb_entry;
         return true;
     } else {
-        return spapr_get_pate_nested(spapr, cpu, lpid, entry);
+        assert(spapr->nested.api);
+        if (spapr->nested.api == NESTED_API_KVM_HV) {
+            return spapr_get_pate_nested(spapr, cpu, lpid, entry);
+        }
+        return false;
     }
 }
 
@@ -2093,6 +2097,7 @@  static const VMStateDescription vmstate_spapr = {
         &vmstate_spapr_cap_fwnmi,
         &vmstate_spapr_fwnmi,
         &vmstate_spapr_cap_rpt_invalidate,
+        &vmstate_spapr_cap_nested_papr,
         NULL
     }
 };
@@ -3437,6 +3442,7 @@  static void spapr_instance_init(Object *obj)
         spapr_get_host_serial, spapr_set_host_serial);
     object_property_set_description(obj, "host-serial",
         "Host serial number to advertise in guest device tree");
+    spapr_nested_init(spapr);
 }
 
 static void spapr_machine_finalizefn(Object *obj)
@@ -4675,6 +4681,7 @@  static void spapr_machine_class_init(ObjectClass *oc, void *data)
     smc->default_caps.caps[SPAPR_CAP_IBS] = SPAPR_CAP_WORKAROUND;
     smc->default_caps.caps[SPAPR_CAP_HPT_MAXPAGESIZE] = 16; /* 64kiB */
     smc->default_caps.caps[SPAPR_CAP_NESTED_KVM_HV] = SPAPR_CAP_OFF;
+    smc->default_caps.caps[SPAPR_CAP_NESTED_PAPR] = SPAPR_CAP_OFF;
     smc->default_caps.caps[SPAPR_CAP_LARGE_DECREMENTER] = SPAPR_CAP_ON;
     smc->default_caps.caps[SPAPR_CAP_CCF_ASSIST] = SPAPR_CAP_ON;
     smc->default_caps.caps[SPAPR_CAP_FWNMI] = SPAPR_CAP_ON;
diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 5a0755d34f..9b53f19ec8 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -454,6 +454,14 @@  static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
         return;
     }
 
+    if (!spapr->nested.api) {
+        spapr->nested.api = NESTED_API_KVM_HV;
+    } else {
+        error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
+        error_append_hint(errp, "Please use either cap-nested-hv or "
+                                 "cap-nested-papr to proceed.\n");
+        return;
+    }
     if (kvm_enabled()) {
         if (!ppc_check_compat(cpu, CPU_POWERPC_LOGICAL_3_00, 0,
                               spapr->max_compat_pvr)) {
@@ -490,6 +498,49 @@  static void cap_nested_kvm_hv_apply(SpaprMachineState *spapr,
     }
 }
 
+static void cap_nested_papr_apply(SpaprMachineState *spapr,
+                                    uint8_t val, Error **errp)
+{
+    ERRP_GUARD();
+    PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
+    CPUPPCState *env = &cpu->env;
+
+    if (!val) {
+        /* capability disabled by default */
+        return;
+    }
+
+    if (!spapr->nested.api) {
+        spapr->nested.api = NESTED_API_PAPR;
+        spapr_register_nested_papr();
+    } else {
+        error_setg(errp, "Nested-HV APIs are mutually exclusive/incompatible");
+        error_append_hint(errp, "Please use either cap-nested-hv or "
+                                 "cap-nested-papr to proceed.\n");
+        return;
+    }
+
+    if (tcg_enabled()) {
+        if (!(env->insns_flags2 & PPC2_ISA300)) {
+            error_setg(errp, "Nested-PAPR only supported on POWER9 and later");
+            error_append_hint(errp,
+                              "Try appending -machine cap-nested-papr=off\n");
+            return;
+        }
+    } else if (kvm_enabled()) {
+        /*
+         * this gets executed in L1 qemu when L2 is launched,
+         * needs kvm-hv support in L1 kernel.
+         */
+        if (!kvmppc_has_cap_nested_kvm_hv()) {
+            error_setg(errp,
+                       "KVM implementation does not support Nested-HV");
+        } else if (kvmppc_set_cap_nested_kvm_hv(val) < 0) {
+            error_setg(errp, "Error enabling Nested-HV with KVM");
+        }
+    }
+}
+
 static void cap_large_decr_apply(SpaprMachineState *spapr,
                                  uint8_t val, Error **errp)
 {
@@ -735,6 +786,15 @@  SpaprCapabilityInfo capability_table[SPAPR_CAP_NUM] = {
         .type = "bool",
         .apply = cap_nested_kvm_hv_apply,
     },
+    [SPAPR_CAP_NESTED_PAPR] = {
+        .name = "nested-papr",
+        .description = "Allow Nested HV (PAPR API)",
+        .index = SPAPR_CAP_NESTED_PAPR,
+        .get = spapr_cap_get_bool,
+        .set = spapr_cap_set_bool,
+        .type = "bool",
+        .apply = cap_nested_papr_apply,
+    },
     [SPAPR_CAP_LARGE_DECREMENTER] = {
         .name = "large-decr",
         .description = "Allow Large Decrementer",
@@ -919,6 +979,7 @@  SPAPR_CAP_MIG_STATE(sbbc, SPAPR_CAP_SBBC);
 SPAPR_CAP_MIG_STATE(ibs, SPAPR_CAP_IBS);
 SPAPR_CAP_MIG_STATE(hpt_maxpagesize, SPAPR_CAP_HPT_MAXPAGESIZE);
 SPAPR_CAP_MIG_STATE(nested_kvm_hv, SPAPR_CAP_NESTED_KVM_HV);
+SPAPR_CAP_MIG_STATE(nested_papr, SPAPR_CAP_NESTED_PAPR);
 SPAPR_CAP_MIG_STATE(large_decr, SPAPR_CAP_LARGE_DECREMENTER);
 SPAPR_CAP_MIG_STATE(ccf_assist, SPAPR_CAP_CCF_ASSIST);
 SPAPR_CAP_MIG_STATE(fwnmi, SPAPR_CAP_FWNMI);
diff --git a/hw/ppc/spapr_nested.c b/hw/ppc/spapr_nested.c
index db47c1196f..87a0db22a5 100644
--- a/hw/ppc/spapr_nested.c
+++ b/hw/ppc/spapr_nested.c
@@ -8,6 +8,11 @@ 
 #include "hw/ppc/spapr_nested.h"
 #include "mmu-book3s-v3.h"
 
+void spapr_nested_init(SpaprMachineState *spapr)
+{
+    spapr->nested.api = 0;
+}
+
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry)
 {
@@ -411,6 +416,11 @@  void spapr_register_nested(void)
     spapr_register_hypercall(KVMPPC_H_TLB_INVALIDATE, h_tlb_invalidate);
     spapr_register_hypercall(KVMPPC_H_COPY_TOFROM_GUEST, h_copy_tofrom_guest);
 }
+
+void spapr_register_nested_papr(void)
+{
+    /* register hcalls here */
+}
 #else
 void spapr_exit_nested(PowerPCCPU *cpu, int excp)
 {
@@ -421,4 +431,9 @@  void spapr_register_nested(void)
 {
     /* DO NOTHING */
 }
+
+void spapr_register_nested_papr(void)
+{
+    /* DO NOTHING */
+}
 #endif
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 3e825f2787..e33ee87ba4 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -81,8 +81,10 @@  typedef enum {
 #define SPAPR_CAP_RPT_INVALIDATE        0x0B
 /* Support for AIL modes */
 #define SPAPR_CAP_AIL_MODE_3            0x0C
+/* Nested PAPR */
+#define SPAPR_CAP_NESTED_PAPR           0x0D
 /* Num Caps */
-#define SPAPR_CAP_NUM                   (SPAPR_CAP_AIL_MODE_3 + 1)
+#define SPAPR_CAP_NUM                   (SPAPR_CAP_NESTED_PAPR + 1)
 
 /*
  * Capability Values
@@ -982,6 +984,7 @@  extern const VMStateDescription vmstate_spapr_cap_sbbc;
 extern const VMStateDescription vmstate_spapr_cap_ibs;
 extern const VMStateDescription vmstate_spapr_cap_hpt_maxpagesize;
 extern const VMStateDescription vmstate_spapr_cap_nested_kvm_hv;
+extern const VMStateDescription vmstate_spapr_cap_nested_papr;
 extern const VMStateDescription vmstate_spapr_cap_large_decr;
 extern const VMStateDescription vmstate_spapr_cap_ccf_assist;
 extern const VMStateDescription vmstate_spapr_cap_fwnmi;
diff --git a/include/hw/ppc/spapr_nested.h b/include/hw/ppc/spapr_nested.h
index 0722b999cd..efdfc78200 100644
--- a/include/hw/ppc/spapr_nested.h
+++ b/include/hw/ppc/spapr_nested.h
@@ -6,6 +6,9 @@ 
 
 typedef struct SpaprMachineStateNested {
     uint64_t ptcr;
+    uint8_t api;
+#define NESTED_API_KVM_HV  1
+#define NESTED_API_PAPR    2
 } SpaprMachineStateNested;
 
 /*
@@ -105,4 +108,6 @@  void spapr_exit_nested(PowerPCCPU *cpu, int excp);
 typedef struct SpaprMachineState SpaprMachineState;
 bool spapr_get_pate_nested(SpaprMachineState *spapr, PowerPCCPU *cpu,
                            target_ulong lpid, ppc_v3_pate_t *entry);
+void spapr_register_nested_papr(void);
+void spapr_nested_init(SpaprMachineState *spapr);
 #endif /* HW_SPAPR_NESTED_H */