Message ID | 20230323131914.35583-1-zhoujiajing.vergil@bytedance.com |
---|---|
State | New |
Headers | show |
Series | [1/1] accel/kvm/kvm-all: fix vm crash when set dirty ring and memorybacking | expand |
On Thu, Mar 23, 2023 at 09:19:15PM +0800, Jiajing Zhou wrote: > From: "zhoujiajing.vergil" <zhoujiajing.vergil@bytedance.com> > > It is possible enter this function when the cpu not finished creating but > is already in the cpu list. The value of dirty_gfns is null, causing vm > crash here. > > When both dirty-ring and memorybacking are set, creating a vm will assert > on kvm_dirty_ring_reap_one. Part of the xml as follows: > > > <domain type='kvm' id='9'> > ... > <memoryBacking> > <hugepages> > <page size='2048' unit='KiB' memAccess='shared'/> > </hugepages> > </memoryBacking> > ... > <features> > <acpi/> > <kvm> > <dirty-ring state='on' size='4096'/> > </kvm> > </features> > ... > <domain/> > > The kvm-reaper thread was created before vcpu thread, and the value of > cpu->kvm_dirty_gfns is assigned at cpu thread. In the x86_cpu_realizefn > function, the cpu is inserted into the cpu list first, and then the cpu > thread is created for initialization. The entry functions are > cpu_exec_realizefn and qemu_init_vcpu. In the existing logic, the > kvm-reaper thread traverses the cpu list every second and finally call > kvm_dirty_ring_reap_one for each cpu in the list. If cpu has been inserted > into cpu list but has not been initialized so that the value of dirty_gfns > is null, kvm-reaper thread call kvm_dirty_ring_reap_one will cause vm crash. > > The call stack is as follows: > kvm_dirty_ring_reaper_thread > -> kvm_dirty_ring_reap > ->kvm_dirty_ring_reap_locked > ->kvm_dirty_ring_reap_one > > > Signed-off-by: zhoujiajing.vergil <zhoujiajing.vergil@bytedance.com> Acked-by: Peter Xu <peterx@redhat.com> And there's a prior fix last year: https://lore.kernel.org/r/20220927154653.77296-1-peterx@redhat.com The most recent post that I'm aware of is by Yong: https://lore.kernel.org/r/1d14deb6684bcb7de1c9633c5bd21113988cc698.1676563222.git.huangy81@chinatelecom.cn A bunch of people hit this already. Paolo, ping yet again - would you consider merging any of the versions? For this one I'd think it'll be good to have even for 8.0. Thanks!
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index f2a6ea6a68..698a59162a 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -685,6 +685,11 @@ static uint32_t kvm_dirty_ring_reap_one(KVMState *s, CPUState *cpu) uint32_t ring_size = s->kvm_dirty_ring_size; uint32_t count = 0, fetch = cpu->kvm_fetch_index; + /* return 0 when cpu not finished creating */ + if (!cpu->created) { + return 0; + } + assert(dirty_gfns && ring_size); trace_kvm_dirty_ring_reap_vcpu(cpu->cpu_index);