Patchwork [v2] ppc: introduce CPUPPCState::cpu_dt_id

login
register
mail settings
Submitter Alexey Kardashevskiy
Date Nov. 1, 2013, 3:21 a.m.
Message ID <1383276076-4630-1-git-send-email-aik@ozlabs.ru>
Download mbox | patch
Permalink /patch/287695/
State New
Headers show

Comments

Alexey Kardashevskiy - Nov. 1, 2013, 3:21 a.m.
Normally CPUState::cpu_index is used to pick the right CPU for various
operations. However default consecutive numbering does not always work
for POWERPC.

For example, on POWER7 (which supports 4 threads per core),
"-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
"-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.

These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
and used to call KVM VCPU's ioctls. In order to achieve this,
kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
cpu_index by the number of threads per core.

This approach has disadvantages such as:
1. NUMA configuration stays broken after the fixup;
2. CPU-related commands from QEMU Monitor do not work properly as
the accept fixed CPU indexes and the user does not really know
what they are after fixup as the number of threads per core changes
between CPU versions and via QEMU command line.

This introduces a new @cpu_dt_id field in the CPUPPCState struct which
is set from @cpu_index by default but can be fixed later to the value
which a hypervisor can accept. This also introduces two POWERPC-arch
specific functions:
1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
for a CPU;
2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
a device-tree CPU ID.

This uses the new functions to:
1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
2. fix XICS-KVM to enable in-kernel XICS on right CPU;
3. compose correct device-tree.

This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
can accept command-line CPU indexes again.

Cc: Badari Pulavarty  <pbadari@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v2:
* added PPC-specific ppc_get_vcpu_dt_id() and ppc_get_vcpu_by_dt_id()
* fixed kvm_arch_vcpu_id() to use ppc_get_vcpu_dt_id()
* fixed emulated XICS
* removed kvm_arch_vcpu_id() stub for non-KVM case
---
 hw/intc/xics.c              | 15 +++++++++++++--
 hw/intc/xics_kvm.c          | 15 +++++++++------
 hw/ppc/ppc.c                | 25 +++++++++++++++++++++++++
 hw/ppc/spapr.c              |  9 +++++----
 hw/ppc/spapr_hcall.c        |  2 +-
 hw/ppc/spapr_rtas.c         |  4 ++--
 target-ppc/cpu.h            |  6 ++++++
 target-ppc/kvm.c            |  4 ++--
 target-ppc/translate_init.c |  1 +
 9 files changed, 64 insertions(+), 17 deletions(-)
Paolo Bonzini - Nov. 4, 2013, 10 a.m.
Il 01/11/2013 04:21, Alexey Kardashevskiy ha scritto:
> Normally CPUState::cpu_index is used to pick the right CPU for various
> operations. However default consecutive numbering does not always work
> for POWERPC.
> 
> For example, on POWER7 (which supports 4 threads per core),
> "-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
> "-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.
> 
> These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
> and used to call KVM VCPU's ioctls. In order to achieve this,
> kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
> cpu_index by the number of threads per core.
> 
> This approach has disadvantages such as:
> 1. NUMA configuration stays broken after the fixup;
> 2. CPU-related commands from QEMU Monitor do not work properly as
> the accept fixed CPU indexes and the user does not really know
> what they are after fixup as the number of threads per core changes
> between CPU versions and via QEMU command line.
> 
> This introduces a new @cpu_dt_id field in the CPUPPCState struct which
> is set from @cpu_index by default but can be fixed later to the value
> which a hypervisor can accept. This also introduces two POWERPC-arch
> specific functions:
> 1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
> for a CPU;
> 2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
> a device-tree CPU ID.
> 
> This uses the new functions to:
> 1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
> 2. fix XICS-KVM to enable in-kernel XICS on right CPU;
> 3. compose correct device-tree.
> 
> This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
> can accept command-line CPU indexes again.
> 
> Cc: Badari Pulavarty  <pbadari@linux.vnet.ibm.com>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: David Gibson <david@gibson.dropbear.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v2:
> * added PPC-specific ppc_get_vcpu_dt_id() and ppc_get_vcpu_by_dt_id()
> * fixed kvm_arch_vcpu_id() to use ppc_get_vcpu_dt_id()
> * fixed emulated XICS
> * removed kvm_arch_vcpu_id() stub for non-KVM case

Not having non-PPC code in the patch is definitely a good sign!

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Alexey Kardashevskiy - Nov. 4, 2013, 11:20 a.m.
On 11/04/2013 09:00 PM, Paolo Bonzini wrote:
> Il 01/11/2013 04:21, Alexey Kardashevskiy ha scritto:
>> Normally CPUState::cpu_index is used to pick the right CPU for various
>> operations. However default consecutive numbering does not always work
>> for POWERPC.
>>
>> For example, on POWER7 (which supports 4 threads per core),
>> "-smp 8,threads=4" should create CPUs with indexes 0,1,2,3,4,5,6,7 and
>> "-smp 8,threads=1" should create CPUs with indexes 0,4,8,12,16,20,24,28.
>>
>> These indexes are reflected in /proc/device-tree/cpus/PowerPC,POWER7@XX
>> and used to call KVM VCPU's ioctls. In order to achieve this,
>> kvmppc_fixup_cpu() was introduced. Roughly speaking, it multiplies
>> cpu_index by the number of threads per core.
>>
>> This approach has disadvantages such as:
>> 1. NUMA configuration stays broken after the fixup;
>> 2. CPU-related commands from QEMU Monitor do not work properly as
>> the accept fixed CPU indexes and the user does not really know
>> what they are after fixup as the number of threads per core changes
>> between CPU versions and via QEMU command line.
>>
>> This introduces a new @cpu_dt_id field in the CPUPPCState struct which
>> is set from @cpu_index by default but can be fixed later to the value
>> which a hypervisor can accept. This also introduces two POWERPC-arch
>> specific functions:
>> 1. int ppc_get_vcpu_dt_id(CPUState *cs) - returns a device-tree ID
>> for a CPU;
>> 2. CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id) - finds CPUState by
>> a device-tree CPU ID.
>>
>> This uses the new functions to:
>> 1. fix emulated XICS hypercall handlers as they receive fixed CPU indexes;
>> 2. fix XICS-KVM to enable in-kernel XICS on right CPU;
>> 3. compose correct device-tree.
>>
>> This removes @cpu_index fixup as @cpu_dt_id is used instead so QEMU monitor
>> can accept command-line CPU indexes again.
>>
>> Cc: Badari Pulavarty  <pbadari@linux.vnet.ibm.com>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: David Gibson <david@gibson.dropbear.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> Changes:
>> v2:
>> * added PPC-specific ppc_get_vcpu_dt_id() and ppc_get_vcpu_by_dt_id()
>> * fixed kvm_arch_vcpu_id() to use ppc_get_vcpu_dt_id()
>> * fixed emulated XICS
>> * removed kvm_arch_vcpu_id() stub for non-KVM case
> 
> Not having non-PPC code in the patch is definitely a good sign!
> 
> Acked-by: Paolo Bonzini <pbonzini@redhat.com>


Heh. Mr. Graf has objections against v3 of this patch in another thread :)

Patch

diff --git a/hw/intc/xics.c b/hw/intc/xics.c
index a333305..866ee08 100644
--- a/hw/intc/xics.c
+++ b/hw/intc/xics.c
@@ -33,6 +33,17 @@ 
 #include "qemu/error-report.h"
 #include "qapi/visitor.h"
 
+static int get_cpu_index_by_dt_id(int cpu_dt_id)
+{
+    CPUState *cs = ppc_get_vcpu_by_dt_id(cpu_dt_id);
+
+    if (cs) {
+        return cs->cpu_index;
+    }
+
+    return -1;
+}
+
 void xics_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
 {
     CPUState *cs = CPU(cpu);
@@ -659,7 +670,7 @@  static target_ulong h_cppr(PowerPCCPU *cpu, sPAPREnvironment *spapr,
 static target_ulong h_ipi(PowerPCCPU *cpu, sPAPREnvironment *spapr,
                           target_ulong opcode, target_ulong *args)
 {
-    target_ulong server = args[0];
+    target_ulong server = get_cpu_index_by_dt_id(args[0]);
     target_ulong mfrr = args[1];
 
     if (server >= spapr->icp->nr_servers) {
@@ -728,7 +739,7 @@  static void rtas_set_xive(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     }
 
     nr = rtas_ld(args, 0);
-    server = rtas_ld(args, 1);
+    server = get_cpu_index_by_dt_id(rtas_ld(args, 1));
     priority = rtas_ld(args, 2);
 
     if (!ics_valid_irq(ics, nr) || (server >= ics->icp->nr_servers)
diff --git a/hw/intc/xics_kvm.c b/hw/intc/xics_kvm.c
index c203646..091fcca 100644
--- a/hw/intc/xics_kvm.c
+++ b/hw/intc/xics_kvm.c
@@ -65,7 +65,8 @@  static void icp_get_kvm_state(ICPState *ss)
     ret = kvm_vcpu_ioctl(ss->cs, KVM_GET_ONE_REG, &reg);
     if (ret != 0) {
         error_report("Unable to retrieve KVM interrupt controller state"
-                " for CPU %d: %s", ss->cs->cpu_index, strerror(errno));
+                     " for CPU %ld: %s", kvm_arch_vcpu_id(ss->cs),
+                     strerror(errno));
         exit(1);
     }
 
@@ -97,8 +98,8 @@  static int icp_set_kvm_state(ICPState *ss, int version_id)
     ret = kvm_vcpu_ioctl(ss->cs, KVM_SET_ONE_REG, &reg);
     if (ret != 0) {
         error_report("Unable to restore KVM interrupt controller state (0x%"
-                PRIx64 ") for CPU %d: %s", state, ss->cs->cpu_index,
-                strerror(errno));
+                     PRIx64 ") for CPU %ld: %s", state,
+                     kvm_arch_vcpu_id(ss->cs), strerror(errno));
         return ret;
     }
 
@@ -311,8 +312,10 @@  static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
     CPUState *cs;
     ICPState *ss;
     KVMXICSState *icpkvm = KVM_XICS(icp);
+    unsigned long cpu_dt_id;
 
     cs = CPU(cpu);
+    cpu_dt_id = kvm_arch_vcpu_id(cs);
     ss = &icp->ss[cs->cpu_index];
 
     assert(cs->cpu_index < icp->nr_servers);
@@ -325,15 +328,15 @@  static void xics_kvm_cpu_setup(XICSState *icp, PowerPCCPU *cpu)
         struct kvm_enable_cap xics_enable_cap = {
             .cap = KVM_CAP_IRQ_XICS,
             .flags = 0,
-            .args = {icpkvm->kernel_xics_fd, cs->cpu_index, 0, 0},
+            .args = {icpkvm->kernel_xics_fd, cpu_dt_id, 0, 0},
         };
 
         ss->cs = cs;
 
         ret = kvm_vcpu_ioctl(ss->cs, KVM_ENABLE_CAP, &xics_enable_cap);
         if (ret < 0) {
-            error_report("Unable to connect CPU%d to kernel XICS: %s",
-                    cs->cpu_index, strerror(errno));
+            error_report("Unable to connect CPU%ld to kernel XICS: %s",
+                         cpu_dt_id, strerror(errno));
             exit(1);
         }
     }
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index bf2d3d4..ebe2c5d 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -1362,3 +1362,28 @@  int PPC_NVRAM_set_params (nvram_t *nvram, uint16_t NVRAM_size,
 
     return 0;
 }
+
+/* CPU device-tree ID helpers */
+int ppc_get_vcpu_dt_id(CPUState *cs)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+
+    return env->cpu_dt_id;
+}
+
+CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id)
+{
+    CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+        CPUPPCState *env = &cpu->env;
+
+        if (env->cpu_dt_id == cpu_dt_id) {
+            return cs;
+        }
+    }
+
+    return NULL;
+}
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index f76b355..09dc635 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -206,19 +206,20 @@  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 
     CPU_FOREACH(cpu) {
         DeviceClass *dc = DEVICE_GET_CLASS(cpu);
+        int cpu_dt_id = ppc_get_vcpu_dt_id(cpu);
         uint32_t associativity[] = {cpu_to_be32(0x5),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(0x0),
                                     cpu_to_be32(cpu->numa_node),
-                                    cpu_to_be32(cpu->cpu_index)};
+                                    cpu_to_be32(cpu_dt_id)};
 
-        if ((cpu->cpu_index % smt) != 0) {
+        if ((cpu_dt_id % smt) != 0) {
             continue;
         }
 
         snprintf(cpu_model, 32, "/cpus/%s@%x", dc->fw_name,
-                 cpu->cpu_index);
+                 cpu_dt_id);
 
         offset = fdt_path_offset(fdt, cpu_model);
         if (offset < 0) {
@@ -367,7 +368,7 @@  static void *spapr_create_fdt_skel(hwaddr initrd_base,
         CPUPPCState *env = &cpu->env;
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-        int index = cs->cpu_index;
+        int index = ppc_get_vcpu_dt_id(cs);
         uint32_t servers_prop[smp_threads];
         uint32_t gservers_prop[smp_threads * 2];
         char *nodename;
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index f755a53..1acb0d0 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -466,7 +466,7 @@  static target_ulong h_register_vpa(PowerPCCPU *cpu, sPAPREnvironment *spapr,
     CPUPPCState *tenv;
     CPUState *tcpu;
 
-    tcpu = qemu_get_cpu(procno);
+    tcpu = ppc_get_vcpu_by_dt_id(procno);
     if (!tcpu) {
         return H_PARAMETER;
     }
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index eb542f2..f56dfca 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -139,7 +139,7 @@  static void rtas_query_cpu_stopped_state(PowerPCCPU *cpu_,
     }
 
     id = rtas_ld(args, 0);
-    cpu = qemu_get_cpu(id);
+    cpu = ppc_get_vcpu_by_dt_id(id);
     if (cpu != NULL) {
         if (cpu->halted) {
             rtas_st(rets, 1, 0);
@@ -172,7 +172,7 @@  static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPREnvironment *spapr,
     start = rtas_ld(args, 1);
     r3 = rtas_ld(args, 2);
 
-    cs = qemu_get_cpu(id);
+    cs = ppc_get_vcpu_by_dt_id(id);
     if (cs != NULL) {
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         CPUPPCState *env = &cpu->env;
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index 26acdba..6d04b6e 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -1068,6 +1068,9 @@  struct CPUPPCState {
      */
     uint8_t fit_period[4];
     uint8_t wdt_period[4];
+
+    /* The CPU index used in the device tree. KVM uses this index too */
+    int cpu_dt_id;
 };
 
 #define SET_FIT_PERIOD(a_, b_, c_, d_)          \
@@ -2147,4 +2150,7 @@  static inline bool cpu_has_work(CPUState *cpu)
 
 void dump_mmu(FILE *f, fprintf_function cpu_fprintf, CPUPPCState *env);
 
+int ppc_get_vcpu_dt_id(CPUState *cs);
+CPUState *ppc_get_vcpu_by_dt_id(int cpu_dt_id);
+
 #endif /* !defined (__CPU_PPC_H__) */
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 10d0cd9..c579d82 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -401,7 +401,7 @@  static inline void kvm_fixup_page_sizes(PowerPCCPU *cpu)
 
 unsigned long kvm_arch_vcpu_id(CPUState *cpu)
 {
-    return cpu->cpu_index;
+    return ppc_get_vcpu_dt_id(cpu);
 }
 
 int kvm_arch_init_vcpu(CPUState *cs)
@@ -1773,7 +1773,7 @@  int kvmppc_fixup_cpu(PowerPCCPU *cpu)
 
     /* Adjust cpu index for SMT */
     smt = kvmppc_smt_threads();
-    cs->cpu_index = (cs->cpu_index / smp_threads) * smt
+    cpu->env.cpu_dt_id = (cs->cpu_index / smp_threads) * smt
         + (cs->cpu_index % smp_threads);
 
     return 0;
diff --git a/target-ppc/translate_init.c b/target-ppc/translate_init.c
index 3d3952c..c8f28bc 100644
--- a/target-ppc/translate_init.c
+++ b/target-ppc/translate_init.c
@@ -8557,6 +8557,7 @@  static void ppc_cpu_initfn(Object *obj)
 
     cs->env_ptr = env;
     cpu_exec_init(env);
+    env->cpu_dt_id = cs->cpu_index;
 
     env->msr_mask = pcc->msr_mask;
     env->mmu_model = pcc->mmu_model;