Message ID | 54116F40.8090704@redhat.com |
---|---|
State | New |
Headers | show |
Paolo Bonzini <pbonzini@redhat.com> wrote: > This is a hypercall that should have kicked VCPU 3 (see rcx). > > Can you please apply this patch and gather a trace of the host > (using "trace-cmd -e kvm qemu-kvm <arguments>")? Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and have put the resulting trace.dat at http://cdw.me.uk/tmp/trace.dat This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having trouble persuading the -smp 4 qemu to crash as reliably under tracing. (Something timing related?) Otherwise the qemu-system-x86 command line is exactly as before. The guest kernel crash message which corresponds to this trace was: divide error: 0000 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 618 Comm: mkdir Not tainted 3.16.2-guest #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 task: ffff88007c997080 ti: ffff88007c614000 task.ti: ffff88007c614000 RIP: 0010:[<ffffffff81037fe2>] [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80 RSP: 0018:ffff88007c617d40 EFLAGS: 00010046 RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 0000000000000001 RSI: ffff88007fd11c40 RDI: 0000000000000000 RBP: ffff88007fd11c40 R08: ffffffff81b98940 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000007 R12: 00000000000000f6 R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000011c40 FS: 00007f43eb1ed700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f43eace0a30 CR3: 0000000001a12000 CR4: 00000000000406f0 Stack: ffff88007c994380 ffff88007c9949aa 0000000000000046 ffffffff81689715 ffffffff810f3174 0000000000000001 ffffea0001f16320 ffffea0001f17860 0000000000000000 ffff88007c99e1e8 ffff88007c997080 0000000000000001 Call Trace: [<ffffffff81689715>] ? _raw_spin_unlock+0x45/0x70 [<ffffffff810f3174>] ? try_to_wake_up+0x2a4/0x330 [<ffffffff81101e2c>] ? __wake_up_common+0x4c/0x80 [<ffffffff81102418>] ? __wake_up_sync_key+0x38/0x60 [<ffffffff810d873a>] ? do_notify_parent+0x19a/0x280 [<ffffffff810f4d56>] ? sched_move_task+0xb6/0x190 [<ffffffff810cb4fc>] ? do_exit+0xa1c/0xab0 [<ffffffff810cc344>] ? do_group_exit+0x34/0xb0 [<ffffffff810cc3cb>] ? SyS_exit_group+0xb/0x10 [<ffffffff8168a16d>] ? system_call_fastpath+0x1a/0x1f Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 <0f> 01 c1 0f 1f 00 5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 RIP [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80 RSP <ffff88007c617d40> ---[ end trace bf5a4445f9decdbb ]--- Fixing recursive fault but reboot is needed! BUG: scheduling while atomic: mkdir/618/0x00000006 Modules linked in: CPU: 0 PID: 618 Comm: mkdir Tainted: G D 3.16.2-guest #2 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 0000000000000000 00000000c022d302 ffffffff81684029 0000000000000000 ffffffff810ee956 ffffffff81686266 0000000000011c40 ffff88007c617fd8 0000000000011c40 ffff88007c997080 0000000000000006 0000000000000046 Call Trace: [<ffffffff81684029>] ? dump_stack+0x49/0x6a [<ffffffff810ee956>] ? __schedule_bug+0x46/0x60 [<ffffffff81686266>] ? __schedule+0x5a6/0x7c0 [<ffffffff816828cd>] ? printk+0x59/0x75 [<ffffffff810cb33b>] ? do_exit+0x85b/0xab0 [<ffffffff816828cd>] ? printk+0x59/0x75 [<ffffffff8100614a>] ? oops_end+0x7a/0x100 [<ffffffff810033e5>] ? do_error_trap+0x85/0x110 [<ffffffff81037fe2>] ? kvm_unlock_kick+0x72/0x80 [<ffffffff8114a358>] ? __alloc_pages_nodemask+0x108/0xa60 [<ffffffff8168b57e>] ? divide_error+0x1e/0x30 [<ffffffff81037fe2>] ? kvm_unlock_kick+0x72/0x80 [<ffffffff81689715>] ? _raw_spin_unlock+0x45/0x70 [<ffffffff810f3174>] ? try_to_wake_up+0x2a4/0x330 [<ffffffff81101e2c>] ? __wake_up_common+0x4c/0x80 [<ffffffff81102418>] ? __wake_up_sync_key+0x38/0x60 [<ffffffff810d873a>] ? do_notify_parent+0x19a/0x280 [<ffffffff810f4d56>] ? sched_move_task+0xb6/0x190 [<ffffffff810cb4fc>] ? do_exit+0xa1c/0xab0 [<ffffffff810cc344>] ? do_group_exit+0x34/0xb0 [<ffffffff810cc3cb>] ? SyS_exit_group+0xb/0x10 [<ffffffff8168a16d>] ? system_call_fastpath+0x1a/0x1f Best wishes, Chris.
Il 11/09/2014 19:03, Chris Webb ha scritto: > Paolo Bonzini <pbonzini@redhat.com> wrote: > >> This is a hypercall that should have kicked VCPU 3 (see rcx). >> >> Can you please apply this patch and gather a trace of the host >> (using "trace-cmd -e kvm qemu-kvm <arguments>")? > > Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and > have put the resulting trace.dat at > > http://cdw.me.uk/tmp/trace.dat > > This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having > trouble persuading the -smp 4 qemu to crash as reliably under tracing. > (Something timing related?) Otherwise the qemu-system-x86 command line is > exactly as before. Do you by chance have CONFIG_DEBUG_RODATA set? In that case, the fix is simply not to set it. Paolo > The guest kernel crash message which corresponds to this trace was: > > divide error: 0000 [#1] PREEMPT SMP > Modules linked in: > CPU: 0 PID: 618 Comm: mkdir Not tainted 3.16.2-guest #2 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 > task: ffff88007c997080 ti: ffff88007c614000 task.ti: ffff88007c614000 > RIP: 0010:[<ffffffff81037fe2>] [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80 > RSP: 0018:ffff88007c617d40 EFLAGS: 00010046 > RAX: 0000000000000005 RBX: 0000000000000000 RCX: 0000000000000001 > RDX: 0000000000000001 RSI: ffff88007fd11c40 RDI: 0000000000000000 > RBP: ffff88007fd11c40 R08: ffffffff81b98940 R09: 0000000000000001 > R10: 0000000000000000 R11: 0000000000000007 R12: 00000000000000f6 > R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000011c40 > FS: 00007f43eb1ed700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > CR2: 00007f43eace0a30 CR3: 0000000001a12000 CR4: 00000000000406f0 > Stack: > ffff88007c994380 ffff88007c9949aa 0000000000000046 ffffffff81689715 > ffffffff810f3174 0000000000000001 ffffea0001f16320 ffffea0001f17860 > 0000000000000000 ffff88007c99e1e8 ffff88007c997080 0000000000000001 > Call Trace: > [<ffffffff81689715>] ? _raw_spin_unlock+0x45/0x70 > [<ffffffff810f3174>] ? try_to_wake_up+0x2a4/0x330 > [<ffffffff81101e2c>] ? __wake_up_common+0x4c/0x80 > [<ffffffff81102418>] ? __wake_up_sync_key+0x38/0x60 > [<ffffffff810d873a>] ? do_notify_parent+0x19a/0x280 > [<ffffffff810f4d56>] ? sched_move_task+0xb6/0x190 > [<ffffffff810cb4fc>] ? do_exit+0xa1c/0xab0 > [<ffffffff810cc344>] ? do_group_exit+0x34/0xb0 > [<ffffffff810cc3cb>] ? SyS_exit_group+0xb/0x10 > [<ffffffff8168a16d>] ? system_call_fastpath+0x1a/0x1f > Code: c0 ca a7 81 48 8d 04 0b 48 8b 30 48 39 ee 75 c9 0f b6 40 08 44 38 e0 75 c0 48 c7 c0 22 b0 00 00 31 db 0f b7 0c 08 b8 05 00 00 00 <0f> 01 c1 0f 1f 00 5b 5d 41 5c c3 0f 1f 00 48 c7 c0 10 cf 00 00 > RIP [<ffffffff81037fe2>] kvm_unlock_kick+0x72/0x80 > RSP <ffff88007c617d40> > ---[ end trace bf5a4445f9decdbb ]--- > Fixing recursive fault but reboot is needed! > BUG: scheduling while atomic: mkdir/618/0x00000006 > Modules linked in: > CPU: 0 PID: 618 Comm: mkdir Tainted: G D 3.16.2-guest #2 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 > 0000000000000000 00000000c022d302 ffffffff81684029 0000000000000000 > ffffffff810ee956 ffffffff81686266 0000000000011c40 ffff88007c617fd8 > 0000000000011c40 ffff88007c997080 0000000000000006 0000000000000046 > Call Trace: > [<ffffffff81684029>] ? dump_stack+0x49/0x6a > [<ffffffff810ee956>] ? __schedule_bug+0x46/0x60 > [<ffffffff81686266>] ? __schedule+0x5a6/0x7c0 > [<ffffffff816828cd>] ? printk+0x59/0x75 > [<ffffffff810cb33b>] ? do_exit+0x85b/0xab0 > [<ffffffff816828cd>] ? printk+0x59/0x75 > [<ffffffff8100614a>] ? oops_end+0x7a/0x100 > [<ffffffff810033e5>] ? do_error_trap+0x85/0x110 > [<ffffffff81037fe2>] ? kvm_unlock_kick+0x72/0x80 > [<ffffffff8114a358>] ? __alloc_pages_nodemask+0x108/0xa60 > [<ffffffff8168b57e>] ? divide_error+0x1e/0x30 > [<ffffffff81037fe2>] ? kvm_unlock_kick+0x72/0x80 > [<ffffffff81689715>] ? _raw_spin_unlock+0x45/0x70 > [<ffffffff810f3174>] ? try_to_wake_up+0x2a4/0x330 > [<ffffffff81101e2c>] ? __wake_up_common+0x4c/0x80 > [<ffffffff81102418>] ? __wake_up_sync_key+0x38/0x60 > [<ffffffff810d873a>] ? do_notify_parent+0x19a/0x280 > [<ffffffff810f4d56>] ? sched_move_task+0xb6/0x190 > [<ffffffff810cb4fc>] ? do_exit+0xa1c/0xab0 > [<ffffffff810cc344>] ? do_group_exit+0x34/0xb0 > [<ffffffff810cc3cb>] ? SyS_exit_group+0xb/0x10 > [<ffffffff8168a16d>] ? system_call_fastpath+0x1a/0x1f > > Best wishes, > > Chris. >
Paolo Bonzini <pbonzini@redhat.com> wrote: > Il 11/09/2014 19:03, Chris Webb ha scritto: >> Paolo Bonzini <pbonzini@redhat.com> wrote: >> >>> This is a hypercall that should have kicked VCPU 3 (see rcx). >>> >>> Can you please apply this patch and gather a trace of the host >>> (using "trace-cmd -e kvm qemu-kvm <arguments>")? >> >> Sure, no problem. I've built the trace-cmd tool against udis86 (I hope) and >> have put the resulting trace.dat at >> >> http://cdw.me.uk/tmp/trace.dat >> >> This is actually for a -smp 2 qemu (failing to kick VCPU 1?) as I was having >> trouble persuading the -smp 4 qemu to crash as reliably under tracing. >> (Something timing related?) Otherwise the qemu-system-x86 command line is >> exactly as before. > > Do you by chance have CONFIG_DEBUG_RODATA set? In that case, the fix is > simply not to set it. Absolutely right: my host and guest kernels do have CONFIG_DEBUG_RODATA set! Your patch to use alternatives for VMCALL vs VMMCALL definitely fixed the divide-by-zero crashes I saw. Given that I can easily use either (or both) of these solutions, is it be more efficient to turn off CONFIG_DEBUG_RODATA in the guest kernel so kvm can fix up the instructions in-place, or is using alternatives for VMCALL/VMMCALL as implemented by your patch just as good? Best wishes, Chris.
Il 22/09/2014 21:08, Chris Webb ha scritto: > > Do you by chance have CONFIG_DEBUG_RODATA set? In that case, the fix is > > simply not to set it. > > Absolutely right: my host and guest kernels do have CONFIG_DEBUG_RODATA set! > > Your patch to use alternatives for VMCALL vs VMMCALL definitely fixed the > divide-by-zero crashes I saw. > > Given that I can easily use either (or both) of these solutions, is it be > more efficient to turn off CONFIG_DEBUG_RODATA in the guest kernel so kvm > can fix up the instructions in-place, or is using alternatives for > VMCALL/VMMCALL as implemented by your patch just as good? I posted a patch to use alternatives if CONFIG_DEBUG_RODATA is enabled, but the bug is in KVM that explicitly documents "you can use any of VMCALL or VMMCALL". I'll also see to fix KVM, but the patch is still useful because a) KVM would not patch a read-only text segment, so there would be a small performance benefit; b) you cannot control already deployed hypervisors. However, since there is a workaround, I won't push it into 3.17 so late in the cycle. Also, there's a chance that it is NACKed since it touches non-KVM files. Paolo
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index fb919c574e23..25ed29f68419 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -709,6 +709,8 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode, int result = 0; struct kvm_vcpu *vcpu = apic->vcpu; + trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode, + trig_mode, vector, false); switch (delivery_mode) { case APIC_DM_LOWEST: vcpu->arch.apic_arb_prio++; @@ -730,8 +732,6 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode, kvm_make_request(KVM_REQ_EVENT, vcpu); kvm_vcpu_kick(vcpu); } - trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode, - trig_mode, vector, false); break; case APIC_DM_REMRD: