Message ID | 20221202002256.39243-1-dongli.zhang@oracle.com |
---|---|
Series | target/i386/kvm: fix two svm pmu virtualization bugs |
Can I get feedback on this patchset, especially [PATCH v2 2/2]?

Regarding [PATCH v2 2/2], the issue currently affects the use of PMUs on AMD
VMs, especially in the following case:

1. Enable panic on NMI.
2. Use perf to monitor the performance of the VM. Although I have not tested
   it, I think the NMI watchdog has the same effect.
3. A sudden system reset or a kernel panic (kdump/kexec) occurs.
4. After the reboot, there are random unknown NMIs.
5. Unfortunately, "panic on NMI" may then panic the VM randomly at any time.

Thank you very much!

Dongli Zhang

On 12/1/22 16:22, Dongli Zhang wrote:
> This patchset is to fix two svm pmu virtualization bugs, x86 only.
>
> version 1:
> https://lore.kernel.org/all/20221119122901.2469-1-dongli.zhang@oracle.com/
>
> 1. The 1st bug is that "-cpu,-pmu" cannot disable svm pmu virtualization.
>
> Using "-cpu EPYC" or "-cpu host,-pmu" does not disable the pmu
> virtualization. The VM linux side still prints the below ...
>
> [ 0.510611] Performance Events: Fam17h+ core perfctr, AMD PMU driver.
>
> ... although we expect something like the below.
>
> [ 0.596381] Performance Events: PMU not available due to virtualization, using software events only.
> [ 0.600972] NMI watchdog: Perf NMI watchdog permanently disabled
>
> The 1st patch introduces a new x86-only accel/kvm property
> "pmu-cap-disabled=true" to disable the pmu virtualization via
> KVM_PMU_CAP_DISABLE.
>
> I initially considered 'KVM_X86_SET_MSR_FILTER' before patchset v1.
> Since both KVM_X86_SET_MSR_FILTER and KVM_PMU_CAP_DISABLE are VM ioctls, I
> finally used the latter because it is easier to use.
>
>
> 2. The 2nd bug is that un-reclaimed perf events (after QEMU system_reset)
> at the KVM side may inject random unwanted/unknown NMIs to the VM.
>
> The svm pmu registers are not reset during QEMU system_reset.
>
> (1). The VM resets (e.g., via QEMU system_reset or VM kdump/kexec) while it
> is running "perf top". The pmu registers are not disabled gracefully.
>
> (2). Although x86_cpu_reset() resets many registers to zero,
> kvm_put_msrs() does not put the AMD pmu registers to the KVM side. As a
> result, some pmu events are still enabled at the KVM side.
>
> (3). The KVM pmc_speculative_in_use() always returns true so that the events
> will not be reclaimed. The kvm_pmc->perf_event is still active.
>
> (4). After the reboot, the VM kernel reports the below error:
>
> [ 0.092011] Performance Events: Fam17h+ core perfctr, Broken BIOS detected, complain to your hardware vendor.
> [ 0.092023] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR c0010200 is 530076)
>
> (5). In a worse case, the active kvm_pmc->perf_event is still able to
> inject unknown NMIs randomly to the VM kernel.
>
> [...] Uhhuh. NMI received for unknown reason 30 on CPU 0.
>
> The 2nd patch fixes the issue by resetting the AMD pmu registers as well as
> the Intel registers.
>
>
> This patchset does not cover PerfMonV2 until the below patchset is merged
> into the KVM side.
>
> [PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support
> https://lore.kernel.org/all/20221111102645.82001-1-likexu@tencent.com/
>
>
> Dongli Zhang (2):
>   target/i386/kvm: introduce 'pmu-cap-disabled' to set KVM_PMU_CAP_DISABLE
>   target/i386/kvm: get and put AMD pmu registers
>
>  accel/kvm/kvm-all.c      |   1 +
>  include/sysemu/kvm_int.h |   1 +
>  qemu-options.hx          |   7 +++
>  target/i386/cpu.h        |   5 ++
>  target/i386/kvm/kvm.c    | 129 +++++++++++++++++++++++++++++++++++++++++-
>  5 files changed, 141 insertions(+), 2 deletions(-)
>
> Thank you very much!
>
> Dongli Zhang

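To make the "[Firmware Bug]" line in step (4) easier to read, the stale value can be decoded by hand. The following is only a minimal sketch, not code from the patchset: it assumes the standard AMD core PERF_CTL bit layout (event select in bits 7:0 and 35:32, unit mask in bits 15:8, USR/OS/EDGE/INT/EN/INV flags in bits 16 to 23), and the helper names `decode_perf_ctl` and `still_armed` are hypothetical, chosen for illustration.

```python
"""Decode the AMD PERF_CTL value reported in the guest's "[Firmware Bug]" line."""

# Flag positions in an AMD core event-select (PERF_CTL) MSR.
# Assumption: standard layout; only the flags relevant to this thread are listed.
FLAG_BITS = {
    "USR": 16,   # count while in user mode
    "OS": 17,    # count while in kernel mode
    "EDGE": 18,  # edge detect
    "INT": 20,   # raise an interrupt when the counter overflows
    "EN": 22,    # counter is enabled
    "INV": 23,   # invert the counter-mask comparison
}

PERF_CTL0 = 0xC0010200  # the MSR named in the guest log


def decode_perf_ctl(value):
    """Split a PERF_CTL value into event select, unit mask and flag bits."""
    # The event select is split: bits 7:0 hold the low byte,
    # bits 35:32 hold the high nibble.
    event_select = (value & 0xFF) | (((value >> 32) & 0xF) << 8)
    unit_mask = (value >> 8) & 0xFF
    flags = {name: bool((value >> bit) & 1) for name, bit in FLAG_BITS.items()}
    return {"event_select": event_select, "unit_mask": unit_mask, "flags": flags}


def still_armed(value):
    """True if the counter is enabled and will interrupt on overflow."""
    flags = decode_perf_ctl(value)["flags"]
    return flags["EN"] and flags["INT"]


if __name__ == "__main__":
    stale = 0x530076  # "MSR c0010200 is 530076" from the guest dmesg
    decoded = decode_perf_ctl(stale)
    print(f"MSR {PERF_CTL0:#x}: event={decoded['event_select']:#x} "
          f"umask={decoded['unit_mask']:#x}")
    print("set flags:", [name for name, on in decoded["flags"].items() if on])
    print("armed with stale value:", still_armed(stale))
    print("armed once zeroed:", still_armed(0))
```

Under that layout, 0x530076 decodes to event select 0x76 with USR, OS, INT and EN set, that is, a counter that is still enabled and still configured to interrupt on overflow. This matches the description above of an event left over from "perf top" that keeps raising NMIs after the reset, and it shows why writing zero to the register during reset is enough to disarm it.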
Ping?

Regarding [PATCH v2 2/2], the bad part is that the customer will not
immediately notice the issue, that is, the "Broken BIOS detected" message in
dmesg.

As a result, once the issue has been encountered, the customer VM may panic
randomly at any time in the future if "/proc/sys/kernel/unknown_nmi_panic" is
enabled.

Thank you very much!

Dongli Zhang

On 12/19/22 06:45, Dongli Zhang wrote:
> Can I get feedback on this patchset, especially [PATCH v2 2/2]?
>
> Regarding [PATCH v2 2/2], the issue currently affects the use of PMUs on AMD
> VMs, especially in the following case:
>
> 1. Enable panic on NMI.
> 2. Use perf to monitor the performance of the VM. Although I have not tested
>    it, I think the NMI watchdog has the same effect.
> 3. A sudden system reset or a kernel panic (kdump/kexec) occurs.
> 4. After the reboot, there are random unknown NMIs.
> 5. Unfortunately, "panic on NMI" may then panic the VM randomly at any time.
>
> Thank you very much!
>
> Dongli Zhang

I think we've been stuck here too long. Sorry, Dongli.

+zhenyu, could you get someone to follow up on this? Otherwise I will start
working on it.

On 9/1/2023 9:19 am, Dongli Zhang wrote:
> Ping?
>
> Regarding [PATCH v2 2/2], the bad part is that the customer will not
> immediately notice the issue, that is, the "Broken BIOS detected" message in
> dmesg.
>
> As a result, once the issue has been encountered, the customer VM may panic
> randomly at any time in the future if "/proc/sys/kernel/unknown_nmi_panic" is
> enabled.
>
> Thank you very much!
>
> Dongli Zhang

Hi Like and zhenyu,

Thank you very much! That will be very helpful.

To help the review, I will rebase the patchset on top of the most recent
QEMU.

Thank you very much!

Dongli Zhang

On 6/19/23 01:52, Like Xu wrote:
> I think we've been stuck here too long. Sorry Dongli.
>
> +zhenyu, could you get someone to follow up on this, or I will start working on
> that.