[v2,1/4] accel/kvm: Extract common KVM vCPU {creation, parking} code

Message ID 20240516053211.145504-2-harshpb@linux.ibm.com
State New
Series target/ppc: vcpu hotplug failure handling fixes

Commit Message

Harsh Prateek Bora May 16, 2024, 5:32 a.m. UTC
From: Salil Mehta <salil.mehta@huawei.com>

KVM vCPU creation is done once during the vCPU realization when the QEMU vCPU
thread is spawned. This is common to all the architectures as of now.

Hot-unplug of a vCPU results in destruction of the vCPU object in QOM, but the
corresponding KVM vCPU object in the host KVM is not destroyed, as KVM doesn't
support vCPU removal. Therefore, its representative KVM vCPU object/context in
QEMU is parked.

Refactor the architecture-common logic so that some APIs can be reused by the
vCPU hotplug code of architectures like ARM, Loongson, etc. Update new/old APIs
with trace events instead of DPRINTF. No functional change is intended here.

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Tested-by: Xianglai Li <lixianglai@loongson.cn>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
[harshpb: fixed rebase failures in include/sysemu/kvm.h]
Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
---
 include/sysemu/kvm.h   | 15 ++++++++++
 accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
 accel/kvm/trace-events |  5 +++-
 3 files changed, 68 insertions(+), 16 deletions(-)
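
For readers following the refactoring, the park/reuse flow described in the
commit message can be modeled in plain C without KVM. This is only a minimal
sketch under stated assumptions: park_vcpu(), get_parked_vcpu(), create_vcpu(),
struct parked_vcpu and the fake fd counter are hypothetical stand-ins, not QEMU
APIs. In the patch itself the corresponding roles are played by
kvm_park_vcpu(), kvm_get_vcpu(), kvm_create_vcpu() and the KVM_CREATE_VCPU
ioctl.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parked-vCPU list entry (not the QEMU type). */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

/* Models the per-VM list of parked vCPUs. */
struct parked_vcpu *parked_list;

/* Models the fds that the KVM_CREATE_VCPU ioctl would hand out. */
int next_fake_fd = 100;

/* Park: remember the fd under its vCPU id instead of closing it. */
void park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = calloc(1, sizeof(*p));

    if (!p) {
        abort();
    }
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_list;
    parked_list = p;
}

/* Lookup only: unlink the entry and return its fd, or -ENOENT if absent. */
int get_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_list; *pp; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *p = *pp;
            int fd = p->fd;

            *pp = p->next;
            free(p);
            return fd;
        }
    }
    return -ENOENT;
}

/* Get-or-create: reuse a parked fd if there is one, else make a new one. */
int create_vcpu(unsigned long vcpu_id)
{
    int fd = get_parked_vcpu(vcpu_id);

    if (fd < 0) {
        /* Not parked: this is where the real code issues KVM_CREATE_VCPU. */
        fd = next_fake_fd++;
    }
    return fd;
}
```

In this model, creating vCPU 3, parking it, and creating vCPU 3 again returns
the same fd, because the lookup finds the parked entry. The lookup helper only
reports -ENOENT and leaves the decision to create a new vCPU to its caller,
which mirrors the split the patch makes between kvm_get_vcpu() and
kvm_create_vcpu().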

Comments

Harsh Prateek Bora May 16, 2024, 10:15 a.m. UTC | #1
Hi Salil,

Thanks for your email.
Your patch 1/8 is included here based on review comments on my previous
patch from one of the maintainers in the community. I had therefore kept
you in CC so that you would be aware of the desire to get this independent
patch merged earlier, even if the other patches in your series go through
further reviews.

I am hoping to see your v9 soon; thereafter the maintainer(s) may choose
to pick the latest independent patch if it needs to be merged earlier.

Thanks for your work and let's be hopeful it gets merged soon.

regards,
Harsh

On 5/16/24 14:00, Salil Mehta wrote:
> Hi Harsh,
> 
> Thanks for your interest in the patch-set, but taking away patches like
> this from other series without any discussion can disrupt others' work
> and its acceptance on time. This is because we will have to put a lot of
> effort into rebasing the bigger series, and testing overhead comes along
> with it.
> 
> The patch-set (from which this patch has been taken) is part of an even
> bigger series, and many people and companies have been toiling for years
> to fix the bugs in that series collectively.
> 
> I'm about to float the V9 version of the arch-agnostic series which this
> patch is part of, and you can rebase your patch-set on top of it. I'm
> hopeful that it will get accepted in this cycle.
> 
> 
> Many thanks
> Salil.
> 
>>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>   Sent: Thursday, May 16, 2024 6:32 AM
>>   
>>   From: Salil Mehta <salil.mehta@huawei.com>
>>   
>>   KVM vCPU creation is done once during the vCPU realization when the QEMU
>>   vCPU thread is spawned. This is common to all the architectures as of now.
>>   
>>   Hot-unplug of a vCPU results in destruction of the vCPU object in QOM, but
>>   the corresponding KVM vCPU object in the host KVM is not destroyed, as
>>   KVM doesn't support vCPU removal. Therefore, its representative KVM
>>   vCPU object/context in QEMU is parked.
>>   
>>   Refactor the architecture-common logic so that some APIs can be reused by
>>   the vCPU hotplug code of architectures like ARM, Loongson, etc. Update
>>   new/old APIs with trace events instead of DPRINTF. No functional change is
>>   intended here.
>>   
>>   Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>>   Reviewed-by: Gavin Shan <gshan@redhat.com>
>>   Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>>   Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>   Tested-by: Xianglai Li <lixianglai@loongson.cn>
>>   Tested-by: Miguel Luis <miguel.luis@oracle.com>
>>   Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
>>   [harshpb: fixed rebase failures in include/sysemu/kvm.h]
>>   Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>   ---
>>    include/sysemu/kvm.h   | 15 ++++++++++
>>    accel/kvm/kvm-all.c    | 64 ++++++++++++++++++++++++++++++++----------
>>    accel/kvm/trace-events |  5 +++-
>>    3 files changed, 68 insertions(+), 16 deletions(-)
>>   
>>   diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>>   index eaf801bc93..fa3ec74442 100644
>>   --- a/include/sysemu/kvm.h
>>   +++ b/include/sysemu/kvm.h
>>   @@ -434,6 +434,21 @@ void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len);
>>   
>>    int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
>>                                           hwaddr *phys_addr);
>>   +/**
>>   + * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU
>>   + * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created.
>>   + *
>>   + * @returns: 0 when success, errno (<0) when failed.
>>   + */
>>   +int kvm_create_vcpu(CPUState *cpu);
>>   +
>>   +/**
>>   + * kvm_park_vcpu - Park QEMU KVM vCPU context
>>   + * @cpu: QOM CPUState object for which QEMU KVM vCPU context has to be parked.
>>   + *
>>   + * @returns: none
>>   + */
>>   +void kvm_park_vcpu(CPUState *cpu);
>>   
>>    #endif /* COMPILING_PER_TARGET */
>>   
>>   diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>>   index d7281b93f3..30d42847de 100644
>>   --- a/accel/kvm/kvm-all.c
>>   +++ b/accel/kvm/kvm-all.c
>>   @@ -128,6 +128,7 @@ static QemuMutex kml_slots_lock;
>>    #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>>   
>>    static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
>>   +static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
>>   
>>    static inline void kvm_resample_fd_remove(int gsi)
>>    {
>>   @@ -340,14 +341,53 @@ err:
>>        return ret;
>>    }
>>   
>>   +void kvm_park_vcpu(CPUState *cpu)
>>   +{
>>   +    struct KVMParkedVcpu *vcpu;
>>   +
>>   +    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>>   +
>>   +    vcpu = g_malloc0(sizeof(*vcpu));
>>   +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>>   +    vcpu->kvm_fd = cpu->kvm_fd;
>>   +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>>   +}
>>   +
>>   +int kvm_create_vcpu(CPUState *cpu)
>>   +{
>>   +    unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
>>   +    KVMState *s = kvm_state;
>>   +    int kvm_fd;
>>   +
>>   +    trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>>   +
>>   +    /* check if the KVM vCPU already exist but is parked */
>>   +    kvm_fd = kvm_get_vcpu(s, vcpu_id);
>>   +    if (kvm_fd < 0) {
>>   +        /* vCPU not parked: create a new KVM vCPU */
>>   +        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
>>   +        if (kvm_fd < 0) {
>>   +            error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
>>   +            return kvm_fd;
>>   +        }
>>   +    }
>>   +
>>   +    cpu->kvm_fd = kvm_fd;
>>   +    cpu->kvm_state = s;
>>   +    cpu->vcpu_dirty = true;
>>   +    cpu->dirty_pages = 0;
>>   +    cpu->throttle_us_per_full = 0;
>>   +
>>   +    return 0;
>>   +}
>>   +
>>    static int do_kvm_destroy_vcpu(CPUState *cpu)
>>    {
>>        KVMState *s = kvm_state;
>>        long mmap_size;
>>   -    struct KVMParkedVcpu *vcpu = NULL;
>>        int ret = 0;
>>   
>>   -    trace_kvm_destroy_vcpu();
>>   +    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>>   
>>        ret = kvm_arch_destroy_vcpu(cpu);
>>        if (ret < 0) {
>>   @@ -373,10 +413,7 @@ static int do_kvm_destroy_vcpu(CPUState *cpu)
>>            }
>>        }
>>   
>>   -    vcpu = g_malloc0(sizeof(*vcpu));
>>   -    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>>   -    vcpu->kvm_fd = cpu->kvm_fd;
>>   -    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
>>   +    kvm_park_vcpu(cpu);
>>    err:
>>        return ret;
>>    }
>>   @@ -397,6 +434,8 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>>            if (cpu->vcpu_id == vcpu_id) {
>>                int kvm_fd;
>>   
>>   +            trace_kvm_get_vcpu(vcpu_id);
>>   +
>>                QLIST_REMOVE(cpu, node);
>>                kvm_fd = cpu->kvm_fd;
>>                g_free(cpu);
>>   @@ -404,7 +443,7 @@ static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
>>            }
>>        }
>>   
>>   -    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
>>   +    return -ENOENT;
>>    }
>>   
>>    int kvm_init_vcpu(CPUState *cpu, Error **errp)
>>   @@ -415,19 +454,14 @@ int kvm_init_vcpu(CPUState *cpu, Error **errp)
>>   
>>        trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
>>   
>>   -    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>>   +    ret = kvm_create_vcpu(cpu);
>>        if (ret < 0) {
>>   -        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
>>   +        error_setg_errno(errp, -ret,
>>   +                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
>>                             kvm_arch_vcpu_id(cpu));
>>            goto err;
>>        }
>>   
>>   -    cpu->kvm_fd = ret;
>>   -    cpu->kvm_state = s;
>>   -    cpu->vcpu_dirty = true;
>>   -    cpu->dirty_pages = 0;
>>   -    cpu->throttle_us_per_full = 0;
>>   -
>>        mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
>>        if (mmap_size < 0) {
>>            ret = mmap_size;
>>   diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
>>   index 681ccb667d..75c1724e78 100644
>>   --- a/accel/kvm/trace-events
>>   +++ b/accel/kvm/trace-events
>>   @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
>>    kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
>>    kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
>>    kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
>>   +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
>>   +kvm_get_vcpu(unsigned long arch_cpu_id) "id: %lu"
>>   +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
>>   +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
>>    kvm_irqchip_commit_routes(void) ""
>>    kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
>>    kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
>>   @@ -25,7 +29,6 @@ kvm_dirty_ring_reaper(const char *s) "%s"
>>    kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
>>    kvm_dirty_ring_reaper_kick(const char *reason) "%s"
>>    kvm_dirty_ring_flush(int finished) "%d"
>>   -kvm_destroy_vcpu(void) ""
>>    kvm_failed_get_vcpu_mmap_size(void) ""
>>    kvm_cpu_exec(void) ""
>>    kvm_interrupt_exit_request(void) ""
>>   --
>>   2.39.3
>
Harsh Prateek Bora May 16, 2024, 1:06 p.m. UTC | #2
Hi Salil,

On 5/16/24 17:42, Salil Mehta wrote:
> Hi Harsh,
> 
>>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>   Sent: Thursday, May 16, 2024 11:15 AM
>>   
>>   Hi Salil,
>>   
>>   Thanks for your email.
>>   Your patch 1/8 is included here based on review comments on my previous
>>   patch from one of the maintainers in the community and therefore I had
>>   kept you in CC to be aware of the desire of having this independent patch to
>>   get merged earlier even if your other patches in the series may go through
>>   further reviews.
> 
> I really don’t know which discussion you are pointing at. Please understand
> that you are fixing a bug and we are pushing a feature which has a large series.
> It will break the patch-set which is about to be merged.
> 
> There will be significant testing overhead on us for the work we have been
> carrying forward for a long time. This will be disruptive. Please don’t!
> 

I was referring to the review discussion on my prev patch here:
https://lore.kernel.org/qemu-devel/D191D2JFAR7L.2EH4S445M4TGK@gmail.com/

Your patch was included with this series only to facilitate review of the
additional patches, which depend on just one of your patches.

I am not sure what appears disruptive here. It is common practice in the
community for maintainer(s) to pick individual patches from a series once
they have been vetted by a significant number of reviewers.

However, in this case, since you have mentioned that you will post the next
version soon, you need not worry about it, as that would be the preferred
version for both of the series.

> 
>>   
>>   I am hoping to see your v9 soon and thereafter maintainer(s) may choose to
>>   pick the latest independent patch if needs to be merged earlier.
> 
> 
> I don’t think you are understanding what problem it is causing. For your
> small bug fix you are causing significant delays at our end.
> 

I hope I clarified above that including your patch here doesn't delay
anything. Hoping to see your v9 soon!

Thanks
Harsh
> 
> Thanks
> Salil.
>>   
>>   Thanks for your work and let's be hopeful it gets merged soon.
>>   
>>   regards,
>>   Harsh
>>   
Salil Mehta May 16, 2024, 2:19 p.m. UTC | #3
[+] Adding this email address to the conversation.

(sorry for the noise)

>  From: Salil Mehta
>  Sent: Thursday, May 16, 2024 2:36 PM
>
>  >  From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>  >  Sent: Thursday, May 16, 2024 2:07 PM
>  >
>  >  Hi Salil,
>  >
>  >  On 5/16/24 17:42, Salil Mehta wrote:
>  >  > Hi Harsh,
>  >  >
>  >  >>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>  >  >>   Sent: Thursday, May 16, 2024 11:15 AM
>  >  >>
>  >  >>   Hi Salil,
>  >  >>
>  >  >>   Thanks for your email.
>  >  >>   Your patch 1/8 is included here based on review comments on my previous
>  >  >>   patch from one of the maintainers in the community and therefore I had
>  >  >>   kept you in CC to be aware of the desire of having this independent patch to
>  >  >>   get merged earlier even if your other patches in the series may go through
>  >  >>   further reviews.
>  >  >
>  >  > I really don’t know which discussion you are pointing at. Please understand
>  >  > that you are fixing a bug and we are pushing a feature which has a large series.
>  >  > It will break the patch-set which is about to be merged.
>  >  >
>  >  > There will be significant testing overhead on us for the work we have been
>  >  > carrying forward for a long time. This will be disruptive. Please don’t!
>  >  >
>  >
>  >  I was referring to the review discussion on my prev patch here:
>  >  https://lore.kernel.org/qemu-devel/D191D2JFAR7L.2EH4S445M4TGK@gmail.com/
>
>
>  OK, I'm not sure what this means.
>
>
>  >  Your patch was included with this series only to facilitate review of
>  >  the additional patches, which depend on just one of your patches.
>
>
>  Generally, you rebase your patch-set over the other one and clearly state
>  in the cover letter that this patch-set depends upon such and such
>  patch-set. Just imagine: if everyone starts to unilaterally pick up patches
>  from each other's patch-sets, it will create chaos not only for the feature
>  owners but also for the maintainers.
>
>
>  >
>  >  I am not sure what appears disruptive here. It is common practice in
>  >  the community for maintainer(s) to pick individual patches from a
>  >  series once they have been vetted by a significant number of reviewers.
>
>
>  Don’t you think this patch-set is asking for acceptance of a patch that is
>  already part of another patch-set which is about to be accepted and is a
>  bigger feature? Will it cause maintenance overhead at the last moment? Yes,
>  of course!
>
>
>  >  However, in this case, since you have mentioned that you will post the
>  >  next version soon, you need not worry about it, as that would be the
>  >  preferred version for both of the series.
>
>
>  Yes, but please understand we are working for the benefit of overall
>  community. Please cooperate here.
>
>  >
>  >  >
>  >  >>
>  >  >>   I am hoping to see your v9 soon and thereafter maintainer(s) may choose to
>  >  >>   pick the latest independent patch if needs to be merged earlier.
>  >  >
>  >  >
>  >  > I don’t think you are understanding what problem it is causing. For
>  >  > your small bug fix you are causing significant delays at our end.
>  >  >
>  >
>  >  I hope I clarified above that including your patch here doesn't delay
>  >  anything. Hoping to see your v9 soon!
>  >
>  >  Thanks
>  >  Harsh
>  >  >
>  >  > Thanks
>  >  > Salil.
>  >  >>
>  >  >>   Thanks for your work and let's be hopeful it gets merged soon.
>  >  >>
>  >  >>   regards,
>  >  >>   Harsh
>  >  >>
>  >  >>   On 5/16/24 14:00, Salil Mehta wrote:
>  >  >>   > Hi Harsh,
>  >  >>   >
>  >  >>   > Thanks for your interest in the patch-set but taking away
patches like
>  >  >>   > this from other series without any discussion can disrupt
others work
>  >  >>   > and its acceptance on time. This is because we will have to
put lot of
>  >  >>   > effort in rebasing bigger series and then testing overhead
comes along
>  >  >>   > with it.
>  >  >>   >
>  >  >>   > The patch-set (from where this  patch has been taken) is part
of even
>  >  >>   > bigger series and there have been many people and companies
toiling to
>  >  >>   > fix the bugs collectively in that series and for years.
>  >  >>   >
>  >  >>   > I'm about float the V9 version of the Arch agnostic series
which this
>  >  >>   > patch is part of and you can rebase your patch-set from there.
I'm
>  >  >>   > hopeful that it will get accepted in this cycle.
>  >  >>   >
>  >  >>   >
>  >  >>   > Many thanks
>  >  >>   > Salil.
>  >  >>   >
>  >  >>   >>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>  >  >>   >>   Sent: Thursday, May 16, 2024 6:32 AM
>  >  >>   >>
>  >  >>   >>   From: Salil Mehta <salil.mehta@huawei.com>
>  >  >>   >>
>  >  >>   >>   KVM vCPU creation is done once during the vCPU realization
when Qemu
>  >  >>   >>   vCPU thread is spawned. This is common to all the
architectures as of now.
>  >  >>   >>
>  >  >>   >>   Hot-unplug of vCPU results in destruction of the vCPU
object in QOM but
>  >  >>   >>   the corresponding KVM vCPU object in the Host KVM is not
destroyed as
>  >  >>   >>   KVM doesn't support vCPU removal. Therefore, its
representative KVM
>  >  >>   >>   vCPU object/context in Qemu is parked.
>  >  >>   >>
>  >  >>   >>   Refactor architecture common logic so that some APIs could
be reused by
>  >  >>   >>   vCPU Hotplug code of some architectures likes ARM, Loongson
etc. Update
>  >  >>   >>   new/old APIs with trace events instead of DPRINTF. No
functional change is
>  >  >>   >>   intended here.
>  >  >>   >>
>  >  >>   >>   Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
>  >  >>   >>   Reviewed-by: Gavin Shan <gshan@redhat.com>
>  >  >>   >>   Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
>  >  >>   >>   Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>  >  >>   >>   Tested-by: Xianglai Li <lixianglai@loongson.cn>
>  >  >>   >>   Tested-by: Miguel Luis <miguel.luis@oracle.com>
>  >  >>   >>   Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
>  >  >>   >>   [harshpb: fixed rebase failures in include/sysemu/kvm.h]
>  >  >>   >>   Signed-off-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
>  >  >>   >>   ---
>  >  >>   >>    include/sysemu/kvm.h   | 15 ++++++++++
>  >  >>   >>    accel/kvm/kvm-all.c    | 64
>  >  ++++++++++++++++++++++++++++++++---
>  >  >>   -----
>  >  >>   >>   --
>  >  >>   >>    accel/kvm/trace-events |  5 +++-
>  >  >>   >>    3 files changed, 68 insertions(+), 16 deletions(-)
>  >  >>   >>
>  >  >>   >>   diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
>  index
>  >  >>   >>   eaf801bc93..fa3ec74442 100644
>  >  >>   >>   --- a/include/sysemu/kvm.h
>  >  >>   >>   +++ b/include/sysemu/kvm.h
>  >  >>   >>   @@ -434,6 +434,21 @@ void kvm_set_sigmask_len(KVMState *s,
>  >  >>   unsigned
>  >  >>   >>   int sigmask_len);
>  >  >>   >>
>  >  >>   >>    int kvm_physical_memory_addr_from_host(KVMState *s, void
>  >  >>   >>   *ram_addr,
>  >  >>   >>                                           hwaddr *phys_addr);
>  >  >>   >>   +/**
>  >  >>   >>   + * kvm_create_vcpu - Gets a parked KVM vCPU or creates a
KVM
>  >  >>   vCPU
>  >  >>   >>   + * @cpu: QOM CPUState object for which KVM vCPU has to be
>  >  >>   >>   fetched/created.
>  >  >>   >>   + *
>  >  >>   >>   + * @returns: 0 when success, errno (<0) when failed.
>  >  >>   >>   + */
>  >  >>   >>   +int kvm_create_vcpu(CPUState *cpu);
>  >  >>   >>   +
>  >  >>   >>   +/**
>  >  >>   >>   + * kvm_park_vcpu - Park QEMU KVM vCPU context
>  >  >>   >>   + * @cpu: QOM CPUState object for which QEMU KVM vCPU
context has to
>  >  >>   >>   be parked.
>  >  >>   >>   + *
>  >  >>   >>   + * @returns: none
>  >  >>   >>   + */
>  >  >>   >>   +void kvm_park_vcpu(CPUState *cpu);
>  >  >>   >>
>  >  >>   >>    #endif /* COMPILING_PER_TARGET */
>  >  >>   >>
>  >  >>   >>   diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index
>  >  >>   >>   d7281b93f3..30d42847de 100644
>  >  >>   >>   --- a/accel/kvm/kvm-all.c
>  >  >>   >>   +++ b/accel/kvm/kvm-all.c
>  >  >>   >>   @@ -128,6 +128,7 @@ static QemuMutex kml_slots_lock;
>  #define
>  >  >>   >>   kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
>  >  >>   >>
>  >  >>   >>    static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
>  >  >>   >>   +static int kvm_get_vcpu(KVMState *s, unsigned long
vcpu_id);
>  >  >>   >>
>  >  >>   >>    static inline void kvm_resample_fd_remove(int gsi)  { @@ -
>  340,14
>  >  >>   +341,53
>  >  >>   >>   @@ err:
>  >  >>   >>        return ret;
>  >  >>   >>    }
>  >  >>   >>
>  >  >>   >>   +void kvm_park_vcpu(CPUState *cpu)
>  >  >>   >>   +{
>  >  >>   >>   +    struct KVMParkedVcpu *vcpu;
>  >  >>   >>   +
>  >  >>   >>   +    trace_kvm_park_vcpu(cpu->cpu_index,
kvm_arch_vcpu_id(cpu));
>  >  >>   >>   +
>  >  >>   >>   +    vcpu = g_malloc0(sizeof(*vcpu));
>  >  >>   >>   +    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  >  >>   >>   +    vcpu->kvm_fd = cpu->kvm_fd;
>  >  >>   >>   +    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu,
node); }
>  >  >>   >>   +
>  >  >>   >>   +int kvm_create_vcpu(CPUState *cpu)
>  >  >>   >>   +{
>  >  >>   >>   +    unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
>  >  >>   >>   +    KVMState *s = kvm_state;
>  >  >>   >>   +    int kvm_fd;
>  >  >>   >>   +
>  >  >>   >>   +    trace_kvm_create_vcpu(cpu->cpu_index,
kvm_arch_vcpu_id(cpu));
>  >  >>   >>   +
>  >  >>   >>   +    /* check if the KVM vCPU already exist but is parked */
>  >  >>   >>   +    kvm_fd = kvm_get_vcpu(s, vcpu_id);
>  >  >>   >>   +    if (kvm_fd < 0) {
>  >  >>   >>   +        /* vCPU not parked: create a new KVM vCPU */
>  >  >>   >>   +        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
>  >  >>   >>   +        if (kvm_fd < 0) {
>  >  >>   >>   +            error_report("KVM_CREATE_VCPU IOCTL failed for
vCPU %lu", vcpu_id);
>  >  >>   >>   +            return kvm_fd;
>  >  >>   >>   +        }
>  >  >>   >>   +    }
>  >  >>   >>   +
>  >  >>   >>   +    cpu->kvm_fd = kvm_fd;
>  >  >>   >>   +    cpu->kvm_state = s;
>  >  >>   >>   +    cpu->vcpu_dirty = true;
>  >  >>   >>   +    cpu->dirty_pages = 0;
>  >  >>   >>   +    cpu->throttle_us_per_full = 0;
>  >  >>   >>   +
>  >  >>   >>   +    return 0;
>  >  >>   >>   +}
>  >  >>   >>   +
>  >  >>   >>    static int do_kvm_destroy_vcpu(CPUState *cpu)  {
>  >  >>   >>        KVMState *s = kvm_state;
>  >  >>   >>        long mmap_size;
>  >  >>   >>   -    struct KVMParkedVcpu *vcpu = NULL;
>  >  >>   >>        int ret = 0;
>  >  >>   >>
>  >  >>   >>   -    trace_kvm_destroy_vcpu();
>  >  >>   >>   +    trace_kvm_destroy_vcpu(cpu->cpu_index,
>  >  >>   kvm_arch_vcpu_id(cpu));
>  >  >>   >>
>  >  >>   >>        ret = kvm_arch_destroy_vcpu(cpu);
>  >  >>   >>        if (ret < 0) {
>  >  >>   >>   @@ -373,10 +413,7 @@ static int
do_kvm_destroy_vcpu(CPUState *cpu)
>  >  >>   >>            }
>  >  >>   >>        }
>  >  >>   >>
>  >  >>   >>   -    vcpu = g_malloc0(sizeof(*vcpu));
>  >  >>   >>   -    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
>  >  >>   >>   -    vcpu->kvm_fd = cpu->kvm_fd;
>  >  >>   >>   -    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu,
>  >  >>   node);
>  >  >>   >>   +    kvm_park_vcpu(cpu);
>  >  >>   >>    err:
>  >  >>   >>        return ret;
>  >  >>   >>    }
>  >  >>   >>   @@ -397,6 +434,8 @@ static int kvm_get_vcpu(KVMState *s,
>  >  unsigned
>  >  >>   long
>  >  >>   >>   vcpu_id)
>  >  >>   >>            if (cpu->vcpu_id == vcpu_id) {
>  >  >>   >>                int kvm_fd;
>  >  >>   >>
>  >  >>   >>   +            trace_kvm_get_vcpu(vcpu_id);
>  >  >>   >>   +
>  >  >>   >>                QLIST_REMOVE(cpu, node);
>  >  >>   >>                kvm_fd = cpu->kvm_fd;
>  >  >>   >>                g_free(cpu);
>  >  >>   >>   @@ -404,7 +443,7 @@ static int kvm_get_vcpu(KVMState *s,
>  >  unsigned
>  >  >>   long
>  >  >>   >>   vcpu_id)
>  >  >>   >>            }
>  >  >>   >>        }
>  >  >>   >>
>  >  >>   >>   -    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void
>  *)vcpu_id);
>  >  >>   >>   +    return -ENOENT;
>  >  >>   >>    }
>  >  >>   >>
>  >  >>   >>    int kvm_init_vcpu(CPUState *cpu, Error **errp) @@ -415,19
>  >  +454,14
>  >  >>   @@
>  >  >>   >>   int kvm_init_vcpu(CPUState *cpu, Error **errp)
>  >  >>   >>
>  >  >>   >>        trace_kvm_init_vcpu(cpu->cpu_index,
>  kvm_arch_vcpu_id(cpu));
>  >  >>   >>
>  >  >>   >>   -    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
>  >  >>   >>   +    ret = kvm_create_vcpu(cpu);
>  >  >>   >>        if (ret < 0) {
>  >  >>   >>   -        error_setg_errno(errp, -ret, "kvm_init_vcpu:
kvm_get_vcpu
>  >  failed
>  >  >>   >>   (%lu)",
>  >  >>   >>   +        error_setg_errno(errp, -ret,
>  >  >>   >>   +                         "kvm_init_vcpu: kvm_create_vcpu
failed (%lu)",
>  >  >>   >>                             kvm_arch_vcpu_id(cpu));
>  >  >>   >>            goto err;
>  >  >>   >>        }
>  >  >>   >>
>  >  >>   >>   -    cpu->kvm_fd = ret;
>  >  >>   >>   -    cpu->kvm_state = s;
>  >  >>   >>   -    cpu->vcpu_dirty = true;
>  >  >>   >>   -    cpu->dirty_pages = 0;
>  >  >>   >>   -    cpu->throttle_us_per_full = 0;
>  >  >>   >>   -
>  >  >>   >>        mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
>  >  >>   >>        if (mmap_size < 0) {
>  >  >>   >>            ret = mmap_size;
>  >  >>   >>   diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
>  index
>  >  >>   >>   681ccb667d..75c1724e78 100644
>  >  >>   >>   --- a/accel/kvm/trace-events
>  >  >>   >>   +++ b/accel/kvm/trace-events
>  >  >>   >>   @@ -9,6 +9,10 @@ kvm_device_ioctl(int fd, int type, void
*arg)
>  >  "dev fd
>  >  >>   %d,
>  >  >>   >>   type 0x%x, arg %p"
>  >  >>   >>    kvm_failed_reg_get(uint64_t id, const char *msg) "Warning:
>  >  Unable to
>  >  >>   >>   retrieve ONEREG %" PRIu64 " from KVM: %s"
>  >  >>   >>    kvm_failed_reg_set(uint64_t id, const char *msg) "Warning:
>  >  Unable to
>  >  >>   set
>  >  >>   >>   ONEREG %" PRIu64 " to KVM: %s"
>  >  >>   >>    kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id)
>  "index:
>  >  %d
>  >  >>   id:
>  >  >>   >>   %lu"
>  >  >>   >>   +kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id)
>  >  "index:
>  >  >>   %d
>  >  >>   >>   id: %lu"
>  >  >>   >>   +kvm_get_vcpu(unsigned long arch_cpu_id) "id: %lu"
>  >  >>   >>   +kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id)
>  >  "index:
>  >  >>   %d
>  >  >>   >>   id: %lu"
>  >  >>   >>   +kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id)
>  >  "index: %d
>  >  >>   id:
>  >  >>   >>   %lu"
>  >  >>   >>    kvm_irqchip_commit_routes(void) ""
>  >  >>   >>    kvm_irqchip_add_msi_route(char *name, int vector, int
virq) "dev
>  >  %s
>  >  >>   >>   vector %d virq %d"
>  >  >>   >>    kvm_irqchip_update_msi_route(int virq) "Updating MSI route
>  >  >>   virq=%d"
>  >  >>   >>   @@ -25,7 +29,6 @@ kvm_dirty_ring_reaper(const char *s) "%s"
>  >  >>   >>    kvm_dirty_ring_reap(uint64_t count, int64_t t)
"reaped %"PRIu64"
>  >  >>   pages
>  >  >>   >>   (took %"PRIi64" us)"
>  >  >>   >>    kvm_dirty_ring_reaper_kick(const char *reason) "%s"
>  >  >>   >>    kvm_dirty_ring_flush(int finished) "%d"
>  >  >>   >>   -kvm_destroy_vcpu(void) ""
>  >  >>   >>    kvm_failed_get_vcpu_mmap_size(void) ""
>  >  >>   >>    kvm_cpu_exec(void) ""
>  >  >>   >>    kvm_interrupt_exit_request(void) ""
>  >  >>   >>   --
>  >  >>   >>   2.39.3
>  >  >>   >
Harsh Prateek Bora May 16, 2024, 2:53 p.m. UTC | #4
Hi Salil,

On 5/16/24 19:05, Salil Mehta wrote:
> 
>>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>   Sent: Thursday, May 16, 2024 2:07 PM
>>   
>>   Hi Salil,
>>   
>>   On 5/16/24 17:42, Salil Mehta wrote:
>>   > Hi Harsh,
>>   >
>>   >>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
>>   >>   Sent: Thursday, May 16, 2024 11:15 AM
>>   >>
>>   >>   Hi Salil,
>>   >>
>>   >>   Thanks for your email.
>>   >>   Your patch 1/8 is included here based on review comments on my previous
>>   >>   patch from one of the maintainers in the community and therefore I had
>>   >>   kept you in CC to be aware of the desire of having this independent patch to
>>   >>   get merged earlier even if your other patches in the series may go through
>>   >>   further reviews.
>>   >
>>   > I really don’t know which discussion you are pointing at. Please
>>   > understand you are fixing a bug and we are pushing a feature which has a large series.
>>   > It will break the patch-set which is about to be merged.
>>   >
>>   > There will be significant testing overhead on us for the work we
>>   > have been carrying forward for a long time. This will be disruptive. Please don't!
>>   >
>>   
>>   I was referring to the review discussion on my prev patch here:
>>   https://lore.kernel.org/qemu-devel/D191D2JFAR7L.2EH4S445M4TGK@gmail.com/
> 
> 
> Sure, I'm not sure what this means.
> 

No worries. As the conversation at the review link I shared makes clear,
we are expecting an updated patch from you; it is included here only to
facilitate review of the additional patches on top of it.


> 
>>   Your patch was included with this series only to facilitate review of
>>   the additional patches, which depend on just one of your patches.
> 
> 
> Generally you rebase your patch-set over the other and clearly state on the cover
> letter that this patch-set is dependent upon such and such patch-set. Just imagine
> if everyone starts to unilaterally pick up patches from each other's patch-set it will
> create a chaos not only for the feature owners but also for the maintainers.
> 

Please go through the review discussion on the link I shared above. It
was included on the suggestion of one of the maintainers. However, if
you are going to send v9 soon, everyone would be happy to wait.

> 
>>   
>>   I am not sure what appears disruptive here. It is common practice in
>>   the community for maintainer(s) to pick individual patches from a
>>   series if they have been vetted by a significant number of reviewers.
> 
> 
> Don’t you think this patch-set is asking for acceptance for a patch already
> part of another patch-set which is about to be accepted and is a bigger feature?
> Will it cause maintenance overhead at the last moment? Yes, of course!

No, I don't think so.

> 
> 
>>   However, in this case, since you have mentioned that you will post the next version soon,
>>   you need not worry about it, as that would be the preferred version for both
>>   of the series.
> 
> 
> Yes, but please understand we are working for the benefit of overall community.
> Please cooperate here.
> 

I hope that clears up the confusion. We look forward to seeing your v9 soon.

>>   
>>   >
>>   >>
>>   >>   I am hoping to see your v9 soon and thereafter maintainer(s) may choose to
>>   >>   pick the latest independent patch if it needs to be merged earlier.
>>   >
>>   >
>>   > I don’t think you are understanding what problem it is causing. For
>>   > your small bug fix you are causing significant delays at our end.
>>   >
>>   
>>   I hope I clarified above that including your patch here doesn't delay anything.
>>   Hoping to see your v9 soon!
>>   
>>   Thanks
>>   Harsh
>>   >
>>   > Thanks
>>   > Salil.
>>   >>
>>   >>   Thanks for your work and let's be hopeful it gets merged soon.
>>   >>
>>   >>   regards,
>>   >>   Harsh
>>   >>
>>   >>   On 5/16/24 14:00, Salil Mehta wrote:
>>   >>   > Hi Harsh,
>>   >>   >
>>   >>   > Thanks for your interest in the patch-set, but taking patches like
>>   >>   > this from another series without any discussion can disrupt others' work
>>   >>   > and its timely acceptance. This is because we will have to put a lot of
>>   >>   > effort into rebasing the bigger series, and testing overhead comes along
>>   >>   > with it.
>>   >>   >
>>   >>   > The patch-set (from which this patch has been taken) is part of an even
>>   >>   > bigger series, and many people and companies have been toiling
>>   >>   > collectively for years to fix the bugs in that series.
>>   >>   >
>>   >>   > I'm about to float the V9 version of the Arch agnostic series which this
>>   >>   > patch is part of and you can rebase your patch-set from there. I'm
>>   >>   > hopeful that it will get accepted in this cycle.
>>   >>   >
>>   >>   >
>>   >>   > Many thanks
>>   >>   > Salil.
>>   >>   >
Nicholas Piggin May 17, 2024, 3:44 a.m. UTC | #5
On Thu May 16, 2024 at 11:35 PM AEST, Salil Mehta wrote:
>
> >  From: Harsh Prateek Bora <harshpb@linux.ibm.com>
> >  Sent: Thursday, May 16, 2024 2:07 PM
> >  
> >  Hi Salil,
> >  
> >  On 5/16/24 17:42, Salil Mehta wrote:
> >  > Hi Harsh,
> >  >
> >  >>   From: Harsh Prateek Bora <harshpb@linux.ibm.com>
> >  >>   Sent: Thursday, May 16, 2024 11:15 AM
> >  >>
> >  >>   Hi Salil,
> >  >>
> >  >>   Thanks for your email.
> >  >>   Your patch 1/8 is included here based on review comments on my previous
> >  >>   patch from one of the maintainers in the community and therefore I had
> >  >>   kept you in CC to be aware of the desire of having this independent patch to
> >  >>   get merged earlier even if your other patches in the series may go through
> >  >>   further reviews.
> >  >
> >  > I really don’t know which discussion you are pointing at. Please
> >  > understand you are fixing a bug and we are pushing a feature which has a large series.
> >  > It will break the patch-set which is about to be merged.
> >  >
> >  > There will be significant testing overhead on us for the work we
> >  > have been carrying forward for a long time. This will be disruptive. Please don't!
> >  >
> >  
> >  I was referring to the review discussion on my prev patch here:
> >  https://lore.kernel.org/qemu-devel/D191D2JFAR7L.2EH4S445M4TGK@gmail.com/
>
>
> Sure, I'm not sure what this means.
>
>
> >  Your patch was included with this series only to facilitate review of
> >  the additional patches, which depend on just one of your patches.
>
>
> Generally you rebase your patch-set over the other and clearly state on the cover
> letter that this patch-set is dependent upon such and such patch-set. Just imagine
> if everyone starts to unilaterally pick up patches from each other's patch-set it will
> create a chaos not only for the feature owners but also for the maintainers.
>
>
> >  
> >  I am not sure what appears disruptive here. It is common practice in
> >  the community for maintainer(s) to pick individual patches from a
> >  series if they have been vetted by a significant number of reviewers.
>
>
> Don’t you think this patch-set is asking for acceptance for a patch already 
> part of another patch-set which is about to be accepted and is a bigger feature?
> Will it cause maintenance overhead at the last moment? Yes, of course!
>
>
> >  However, in this case, since you have mentioned that you will post the next version soon,
> >  you need not worry about it, as that would be the preferred version for both
> >  of the series.
>
>
> Yes, but please understand we are working for the benefit of overall community.
> Please cooperate here.

There might be a misunderstanding. Harsh just said there had not been
much progress on your series for a while and he wasn't sure what the
status was. I mentioned that we *could* take your patch 1 (with your
blessing) if there was a hold-up with the rest of the series. He was
going to check in with you to see how it was going.

This patch 1 was not intended to be merged as-is without syncing up with
you first, but it's understandable you were concerned because that was
probably not communicated to you clearly.

I appreciate you bringing up your concerns; we'll try to do better.

Thanks,
Nick
Patch

diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index eaf801bc93..fa3ec74442 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -434,6 +434,21 @@  void kvm_set_sigmask_len(KVMState *s, unsigned int sigmask_len);
 
 int kvm_physical_memory_addr_from_host(KVMState *s, void *ram_addr,
                                        hwaddr *phys_addr);
+/**
+ * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU
+ * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created.
+ *
+ * @returns: 0 when success, errno (<0) when failed.
+ */
+int kvm_create_vcpu(CPUState *cpu);
+
+/**
+ * kvm_park_vcpu - Park QEMU KVM vCPU context
+ * @cpu: QOM CPUState object for which QEMU KVM vCPU context has to be parked.
+ *
+ * @returns: none
+ */
+void kvm_park_vcpu(CPUState *cpu);
 
 #endif /* COMPILING_PER_TARGET */
 
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index d7281b93f3..30d42847de 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -128,6 +128,7 @@  static QemuMutex kml_slots_lock;
 #define kvm_slots_unlock()  qemu_mutex_unlock(&kml_slots_lock)
 
 static void kvm_slot_init_dirty_bitmap(KVMSlot *mem);
+static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id);
 
 static inline void kvm_resample_fd_remove(int gsi)
 {
@@ -340,14 +341,53 @@  err:
     return ret;
 }
 
+void kvm_park_vcpu(CPUState *cpu)
+{
+    struct KVMParkedVcpu *vcpu;
+
+    trace_kvm_park_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+    vcpu = g_malloc0(sizeof(*vcpu));
+    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
+    vcpu->kvm_fd = cpu->kvm_fd;
+    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+}
+
+int kvm_create_vcpu(CPUState *cpu)
+{
+    unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
+    KVMState *s = kvm_state;
+    int kvm_fd;
+
+    trace_kvm_create_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
+
+    /* check if the KVM vCPU already exist but is parked */
+    kvm_fd = kvm_get_vcpu(s, vcpu_id);
+    if (kvm_fd < 0) {
+        /* vCPU not parked: create a new KVM vCPU */
+        kvm_fd = kvm_vm_ioctl(s, KVM_CREATE_VCPU, vcpu_id);
+        if (kvm_fd < 0) {
+            error_report("KVM_CREATE_VCPU IOCTL failed for vCPU %lu", vcpu_id);
+            return kvm_fd;
+        }
+    }
+
+    cpu->kvm_fd = kvm_fd;
+    cpu->kvm_state = s;
+    cpu->vcpu_dirty = true;
+    cpu->dirty_pages = 0;
+    cpu->throttle_us_per_full = 0;
+
+    return 0;
+}
+
 static int do_kvm_destroy_vcpu(CPUState *cpu)
 {
     KVMState *s = kvm_state;
     long mmap_size;
-    struct KVMParkedVcpu *vcpu = NULL;
     int ret = 0;
 
-    trace_kvm_destroy_vcpu();
+    trace_kvm_destroy_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
     ret = kvm_arch_destroy_vcpu(cpu);
     if (ret < 0) {
@@ -373,10 +413,7 @@  static int do_kvm_destroy_vcpu(CPUState *cpu)
         }
     }
 
-    vcpu = g_malloc0(sizeof(*vcpu));
-    vcpu->vcpu_id = kvm_arch_vcpu_id(cpu);
-    vcpu->kvm_fd = cpu->kvm_fd;
-    QLIST_INSERT_HEAD(&kvm_state->kvm_parked_vcpus, vcpu, node);
+    kvm_park_vcpu(cpu);
 err:
     return ret;
 }
@@ -397,6 +434,8 @@  static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
         if (cpu->vcpu_id == vcpu_id) {
             int kvm_fd;
 
+            trace_kvm_get_vcpu(vcpu_id);
+
             QLIST_REMOVE(cpu, node);
             kvm_fd = cpu->kvm_fd;
             g_free(cpu);
@@ -404,7 +443,7 @@  static int kvm_get_vcpu(KVMState *s, unsigned long vcpu_id)
         }
     }
 
-    return kvm_vm_ioctl(s, KVM_CREATE_VCPU, (void *)vcpu_id);
+    return -ENOENT;
 }
 
 int kvm_init_vcpu(CPUState *cpu, Error **errp)
@@ -415,19 +454,14 @@  int kvm_init_vcpu(CPUState *cpu, Error **errp)
 
     trace_kvm_init_vcpu(cpu->cpu_index, kvm_arch_vcpu_id(cpu));
 
-    ret = kvm_get_vcpu(s, kvm_arch_vcpu_id(cpu));
+    ret = kvm_create_vcpu(cpu);
     if (ret < 0) {
-        error_setg_errno(errp, -ret, "kvm_init_vcpu: kvm_get_vcpu failed (%lu)",
+        error_setg_errno(errp, -ret,
+                         "kvm_init_vcpu: kvm_create_vcpu failed (%lu)",
                          kvm_arch_vcpu_id(cpu));
         goto err;
     }
 
-    cpu->kvm_fd = ret;
-    cpu->kvm_state = s;
-    cpu->vcpu_dirty = true;
-    cpu->dirty_pages = 0;
-    cpu->throttle_us_per_full = 0;
-
     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
         ret = mmap_size;
diff --git a/accel/kvm/trace-events b/accel/kvm/trace-events
index 681ccb667d..75c1724e78 100644
--- a/accel/kvm/trace-events
+++ b/accel/kvm/trace-events
@@ -9,6 +9,10 @@  kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
 kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
 kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
 kvm_init_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
+kvm_create_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
+kvm_get_vcpu(unsigned long arch_cpu_id) "id: %lu"
+kvm_destroy_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
+kvm_park_vcpu(int cpu_index, unsigned long arch_cpu_id) "index: %d id: %lu"
 kvm_irqchip_commit_routes(void) ""
 kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
 kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
@@ -25,7 +29,6 @@  kvm_dirty_ring_reaper(const char *s) "%s"
 kvm_dirty_ring_reap(uint64_t count, int64_t t) "reaped %"PRIu64" pages (took %"PRIi64" us)"
 kvm_dirty_ring_reaper_kick(const char *reason) "%s"
 kvm_dirty_ring_flush(int finished) "%d"
-kvm_destroy_vcpu(void) ""
 kvm_failed_get_vcpu_mmap_size(void) ""
 kvm_cpu_exec(void) ""
 kvm_interrupt_exit_request(void) ""
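
The park-and-reuse flow that the patch factors out can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the QEMU implementation: the names park_vcpu(), get_parked_vcpu() and create_vcpu() are hypothetical, a plain singly linked list stands in for QLIST, and a counter stands in for the fd that the KVM_CREATE_VCPU ioctl would return. Unlike kvm_create_vcpu() in the patch, which stores the fd in cpu->kvm_fd and returns 0, the model simply returns the fd.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for QEMU's KVMParkedVcpu list entry. */
struct parked_vcpu {
    unsigned long vcpu_id;
    int fd;
    struct parked_vcpu *next;
};

/* Models kvm_state->kvm_parked_vcpus (assumption: one global list). */
static struct parked_vcpu *parked_head;

/* Models the fds a real KVM_CREATE_VCPU ioctl would hand out. */
static int next_fake_fd = 100;

/* Park: remember the fd so a later hot-plug of the same id can reuse it. */
static void park_vcpu(unsigned long vcpu_id, int fd)
{
    struct parked_vcpu *p = calloc(1, sizeof(*p));

    assert(p != NULL); /* g_malloc0() in QEMU aborts on failure as well */
    p->vcpu_id = vcpu_id;
    p->fd = fd;
    p->next = parked_head;
    parked_head = p;
}

/*
 * Lookup only: unlink and return the parked fd, or -ENOENT if the id was
 * never parked. After the patch, kvm_get_vcpu() no longer creates a vCPU.
 */
static int get_parked_vcpu(unsigned long vcpu_id)
{
    struct parked_vcpu **pp;

    for (pp = &parked_head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->vcpu_id == vcpu_id) {
            struct parked_vcpu *hit = *pp;
            int fd = hit->fd;

            *pp = hit->next;
            free(hit);
            return fd;
        }
    }
    return -ENOENT;
}

/* Create: reuse a parked vCPU if one exists, otherwise make a new one. */
static int create_vcpu(unsigned long vcpu_id)
{
    int fd = get_parked_vcpu(vcpu_id);

    if (fd < 0) {
        fd = next_fake_fd++; /* stand-in for the KVM_CREATE_VCPU ioctl */
    }
    return fd;
}
```

In this model, creating vCPU 3, parking it, and creating it again yields the same fd, while a never-parked id gets a fresh one. That mirrors why the lookup (returning -ENOENT) and the creation step are separated in the patch.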