diff mbox series

[v2,5/5] hvf: save away type as well as vector so we can reinject them

Message ID e07e6085d8ab9054e58f85ae58e112df6adc024d.1574625592.git.dirty@apple.com
State New
Headers show
Series hvf: stability fixes for HVF | expand

Commit Message

Cameron Esfahani via Nov. 24, 2019, 8:05 p.m. UTC
Save away type as well as vector in hvf_store_events() so we can
correctly reinject both in hvf_inject_interrupts().

Make sure to clear ins_len and has_error_code when ins_len isn't
valid and error_code isn't set.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
---
 target/i386/hvf/hvf.c    | 18 ++++++++++++++----
 target/i386/hvf/x86hvf.c | 22 ++++++++++------------
 2 files changed, 24 insertions(+), 16 deletions(-)

Comments

Paolo Bonzini Nov. 25, 2019, 10:26 a.m. UTC | #1
On 24/11/19 21:05, Cameron Esfahani wrote:
> Save away type as well as vector in hvf_store_events() so we can
> correctly reinject both in hvf_inject_interrupts().
> 
> Make sure to clear ins_len and has_error_code when ins_len isn't
> valid and error_code isn't set.

Do you have a testcase for this?  (I could guess it's about the INT1
instruction).

Paolo
Cameron Esfahani via Nov. 26, 2019, 8:04 p.m. UTC | #2
Our test case was booting many concurrent macOS VMs under heavy system load.  I don't know if I could create one to replicate that.

Cameron Esfahani
dirty@apple.com

"In the elder days of Art, Builders wrought with greatest care each minute and unseen part; For the gods see everywhere."

"The Builders", H. W. Longfellow



> On Nov 25, 2019, at 2:26 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 24/11/19 21:05, Cameron Esfahani wrote:
>> Save away type as well as vector in hvf_store_events() so we can
>> correctly reinject both in hvf_inject_interrupts().
>> 
>> Make sure to clear ins_len and has_error_code when ins_len isn't
>> valid and error_code isn't set.
> 
> Do you have a testcase for this?  (I could guess it's about the INT1
> instruction).
> 
> Paolo
>
Cameron Esfahani via Nov. 28, 2019, 5:57 a.m. UTC | #3
I added some asserts to our internal version of QEMU.  It's a few months off of master and, specifically, doesn't have fd13f23b8c95311eff74426921557eee592b0ed3.

With the previous version of hvf_inject_interrupts(), before our fix, the code looked like the following:

>     if (env->interrupt_injected != -1) {
>         vector = env->interrupt_injected;
>         intr_type = VMCS_INTR_T_SWINTR;
>     } else if (env->exception_injected != -1) {


What we were seeing was, under heavy loads, running many concurrent macOS VMs, that we would get spurious interrupts.  Tracking it down, we discovered that VMCS_INTR_T_SWINTR was getting injected when VMCS_INTR_T_HWINTR was expected.

If I take our proposed patch code, which built on top of master from a few days ago, and has fd13f23b8c95311eff74426921557eee592b0ed3,  and add an assert, like the following:

>     if (env->interrupt_injected != -1) {
>         /* Type and vector are both saved in interrupt_injected. */
>         vector = env->interrupt_injected & VMCS_IDT_VEC_VECNUM;
>         intr_type = env->interrupt_injected & VMCS_IDT_VEC_TYPE;
>         if (VMCS_INTR_T_SWINTR != intr_type) {
>             printf("VMCS_INTR_T_SWINTR (%x) != intr_type (%llx)\n", VMCS_INTR_T_SWINTR, intr_type);
>             assert(VMCS_INTR_T_SWINTR == intr_type);
>         }
>     } else if (env->exception_nr != -1) {


Then we will see the assert trigger and get the following output:

> VMCS_INTR_T_SWINTR (400) != intr_type (0)
> Assertion failed: (VMCS_INTR_T_SWINTR == intr_type), function hvf_inject_interrupts, file qemu_upstream/target/i386/hvf/x86hvf.c, line 362.


So, as far as I can see, the proposed changes are still necessary.

Cameron Esfahani
dirty@apple.com

"Americans are very skilled at creating a custom meaning from something that's mass-produced."

Ann Powers


> On Nov 26, 2019, at 12:04 PM, Cameron Esfahani via <qemu-devel@nongnu.org> wrote:
> 
> Our test case was booting many concurrent macOS VMs under heavy system load.  I don't know if I could create one to replicate that.
> 
> Cameron Esfahani
> dirty@apple.com
> 
> "In the elder days of Art, Builders wrought with greatest care each minute and unseen part; For the gods see everywhere."
> 
> "The Builders", H. W. Longfellow
> 
> 
> 
>> On Nov 25, 2019, at 2:26 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> 
>> On 24/11/19 21:05, Cameron Esfahani wrote:
>>> Save away type as well as vector in hvf_store_events() so we can
>>> correctly reinject both in hvf_inject_interrupts().
>>> 
>>> Make sure to clear ins_len and has_error_code when ins_len isn't
>>> valid and error_code isn't set.
>> 
>> Do you have a testcase for this?  (I could guess it's about the INT1
>> instruction).
>> 
>> Paolo
>> 
> 
>
Paolo Bonzini Nov. 28, 2019, 1:52 p.m. UTC | #4
On 28/11/19 06:57, Cameron Esfahani wrote:
> 
> What we were seeing was, under heavy loads, running many concurrent
macOS VMs, that we would get spurious interrupts.  Tracking it down, we
discovered that VMCS_INTR_T_SWINTR was getting injected when
VMCS_INTR_T_HWINTR was expected.
> 
> If I take our proposed patch code, which built on top of master from a
> few days ago, and has fd13f23b8c95311eff74426921557eee592b0ed3,  and add
> an assert, like the following:
> 
>>     if (env->interrupt_injected != -1) {
>>         /* Type and vector are both saved in interrupt_injected. */
>>         vector = env->interrupt_injected & VMCS_IDT_VEC_VECNUM;
>>         intr_type = env->interrupt_injected & VMCS_IDT_VEC_TYPE;
>>         if (VMCS_INTR_T_SWINTR != intr_type) {
>>             printf("VMCS_INTR_T_SWINTR (%x) != intr_type (%llx)\n", VMCS_INTR_T_SWINTR, intr_type);
>>             assert(VMCS_INTR_T_SWINTR == intr_type);
>>         }
>>     } else if (env->exception_nr != -1) {
> 
> Then we will see the assert trigger and get the following output:
> 
>> VMCS_INTR_T_SWINTR (400) != intr_type (0)
>> Assertion failed: (VMCS_INTR_T_SWINTR == intr_type), function hvf_inject_interrupts, file qemu_upstream/target/i386/hvf/x86hvf.c, line 362.

Great, thanks.  It's good to know that it's only software vs. hardware
interrupt.  I'll compare the KVM and QEMU source code to see why KVM
does not lose software vs. hardware interrupt, since the QEMU event
injection code was modeled against KVM's.

Paolo
Paolo Bonzini Nov. 28, 2019, 1:56 p.m. UTC | #5
On 26/11/19 21:04, Cameron Esfahani wrote:
> Our test case was booting many concurrent macOS VMs under heavy
> system load.  I don't know if I could create one to replicate that.

Does this work?

diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index 1485b95776..26c6c3a49f 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -357,7 +357,11 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
     bool have_event = true;
     if (env->interrupt_injected != -1) {
         vector = env->interrupt_injected;
-        intr_type = VMCS_INTR_T_SWINTR;
+        if (env->ins_len) {
+            intr_type = VMCS_INTR_T_SWINTR;
+        } else {
+            intr_type = VMCS_INTR_T_HWINTR;
+        }
     } else if (env->exception_nr != -1) {
         vector = env->exception_nr;
         if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {

Thanks,

Paolo
Paolo Bonzini Nov. 28, 2019, 1:59 p.m. UTC | #6
On 28/11/19 14:56, Paolo Bonzini wrote:
> On 26/11/19 21:04, Cameron Esfahani wrote:
>> Our test case was booting many concurrent macOS VMs under heavy
>> system load.  I don't know if I could create one to replicate that.
> 
> Does this work?
> 
> diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
> index 1485b95776..26c6c3a49f 100644
> --- a/target/i386/hvf/x86hvf.c
> +++ b/target/i386/hvf/x86hvf.c
> @@ -357,7 +357,11 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
>      bool have_event = true;
>      if (env->interrupt_injected != -1) {
>          vector = env->interrupt_injected;
> -        intr_type = VMCS_INTR_T_SWINTR;
> +        if (env->ins_len) {
> +            intr_type = VMCS_INTR_T_SWINTR;
> +        } else {
> +            intr_type = VMCS_INTR_T_HWINTR;
> +        }
>      } else if (env->exception_nr != -1) {
>          vector = env->exception_nr;
>          if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {

Better include this too:

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 784e67d77e..5dc7515841 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -637,6 +637,7 @@ static void hvf_store_events(CPUState *cpu, uint32_t
ins_len, uint64_t idtvec_in
     env->exception_injected = 0;
     env->interrupt_injected = -1;
     env->nmi_injected = false;
+    env->ins_len = 0;
     if (idtvec_info & VMCS_IDT_VEC_VALID) {
         switch (idtvec_info & VMCS_IDT_VEC_TYPE) {
         case VMCS_IDT_VEC_HWINTR:

Paolo
Cameron Esfahani via Nov. 30, 2019, 8:31 a.m. UTC | #7
So far so good.  Without any workaround, I could get it to fail within a few seconds.  With your change, I've been running for a few minutes without a problem.  But, this is on my laptop, so I'll wait until I can test it on a wider-range of machines at work next week.  If it continues to work, I'll update my patch to include this fix.

Now, can you help me understand why this approach is better than what I had written?  When we're in hvf_store_events(), we have vector type and number.  All the information we need to reinject later.  So why not save vector type away, instead of attempting to reconstruct it from other information (env->ins_len) in hvf_inject_interrupts()?

Cameron Esfahani
dirty@apple.com

"There are times in the life of a nation when the only place a decent man can find himself is in prison."



> On Nov 28, 2019, at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 26/11/19 21:04, Cameron Esfahani wrote:
>> Our test case was booting many concurrent macOS VMs under heavy
>> system load.  I don't know if I could create one to replicate that.
> 
> Does this work?
> 
> diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
> index 1485b95776..26c6c3a49f 100644
> --- a/target/i386/hvf/x86hvf.c
> +++ b/target/i386/hvf/x86hvf.c
> @@ -357,7 +357,11 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
>     bool have_event = true;
>     if (env->interrupt_injected != -1) {
>         vector = env->interrupt_injected;
> -        intr_type = VMCS_INTR_T_SWINTR;
> +        if (env->ins_len) {
> +            intr_type = VMCS_INTR_T_SWINTR;
> +        } else {
> +            intr_type = VMCS_INTR_T_HWINTR;
> +        }
>     } else if (env->exception_nr != -1) {
>         vector = env->exception_nr;
>         if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {
> 
> Thanks,
> 
> Paolo
>
Paolo Bonzini Nov. 30, 2019, 8:46 a.m. UTC | #8
On 30/11/19 09:31, Cameron Esfahani wrote:
> So far so good.  Without any workaround, I could get it to fail
> within a few seconds.  With your change, I've been running for a few
> minutes without a problem.  But, this is on my laptop, so I'll wait
> until I can test it on a wider-range of machines at work next week.
> If it continues to work, I'll update my patch to include this fix.

Great!  Note that there was a follow-up hunk in another email.

> Now, can you help me understand why this approach is better than what
> I had written?  When we're in hvf_store_events(), we have vector type
> and number.  All the information we need to reinject later.  So why
> not save vector type away, instead of attempting to reconstruct it
> from other information (env->ins_len) in hvf_inject_interrupts()?

No huge reason, I agree.  However, the event injection code in the
Android emulator was broken for pretty much every corner case, so when
we rewrote it in QEMU for Summer of Code we took KVM as a model (commit
b7394c8394, "i386: hvf: refactor event injection code for hvf",
2017-12-22).  Keeping the code similar can help with debugging, so I
prefer having the same meaning for env->interrupt_injected and
env->exception_nr across HVF and KVM.

I didn't write the code for KVM, so I cannot really say why they chose
to not save the event type.

Thanks,

Paolo

> 
> Cameron Esfahani
> dirty@apple.com
> 
> "There are times in the life of a nation when the only place a decent man can find himself is in prison."
> 
> 
> 
>> On Nov 28, 2019, at 5:56 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>> On 26/11/19 21:04, Cameron Esfahani wrote:
>>> Our test case was booting many concurrent macOS VMs under heavy
>>> system load.  I don't know if I could create one to replicate that.
>>
>> Does this work?
>>
>> diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
>> index 1485b95776..26c6c3a49f 100644
>> --- a/target/i386/hvf/x86hvf.c
>> +++ b/target/i386/hvf/x86hvf.c
>> @@ -357,7 +357,11 @@ bool hvf_inject_interrupts(CPUState *cpu_state)
>>     bool have_event = true;
>>     if (env->interrupt_injected != -1) {
>>         vector = env->interrupt_injected;
>> -        intr_type = VMCS_INTR_T_SWINTR;
>> +        if (env->ins_len) {
>> +            intr_type = VMCS_INTR_T_SWINTR;
>> +        } else {
>> +            intr_type = VMCS_INTR_T_HWINTR;
>> +        }
>>     } else if (env->exception_nr != -1) {
>>         vector = env->exception_nr;
>>         if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {
>>
>> Thanks,
>>
>> Paolo
>>
>
diff mbox series

Patch

diff --git a/target/i386/hvf/hvf.c b/target/i386/hvf/hvf.c
index 784e67d77e..8a8aee4495 100644
--- a/target/i386/hvf/hvf.c
+++ b/target/i386/hvf/hvf.c
@@ -641,14 +641,18 @@  static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_in
         switch (idtvec_info & VMCS_IDT_VEC_TYPE) {
         case VMCS_IDT_VEC_HWINTR:
         case VMCS_IDT_VEC_SWINTR:
-            env->interrupt_injected = idtvec_info & VMCS_IDT_VEC_VECNUM;
+            /* Save event type as well so we can inject the correct type. */
+            env->interrupt_injected =
+                idtvec_info & (VMCS_IDT_VEC_TYPE | VMCS_IDT_VEC_VECNUM);
             break;
         case VMCS_IDT_VEC_NMI:
             env->nmi_injected = true;
             break;
         case VMCS_IDT_VEC_HWEXCEPTION:
         case VMCS_IDT_VEC_SWEXCEPTION:
-            env->exception_nr = idtvec_info & VMCS_IDT_VEC_VECNUM;
+            /* Save event type as well so we can inject the correct type. */
+            env->exception_nr =
+                idtvec_info & (VMCS_IDT_VEC_TYPE | VMCS_IDT_VEC_VECNUM);
             env->exception_injected = 1;
             break;
         case VMCS_IDT_VEC_PRIV_SWEXCEPTION:
@@ -658,10 +662,16 @@  static void hvf_store_events(CPUState *cpu, uint32_t ins_len, uint64_t idtvec_in
         if ((idtvec_info & VMCS_IDT_VEC_TYPE) == VMCS_IDT_VEC_SWEXCEPTION ||
             (idtvec_info & VMCS_IDT_VEC_TYPE) == VMCS_IDT_VEC_SWINTR) {
             env->ins_len = ins_len;
+        } else {
+            /* Clear ins_len when it isn't valid. */
+            env->ins_len = 0;
         }
-        if (idtvec_info & VMCS_INTR_DEL_ERRCODE) {
+        if (idtvec_info & VMCS_IDT_VEC_ERRCODE_VALID) {
             env->has_error_code = true;
             env->error_code = rvmcs(cpu->hvf_fd, VMCS_IDT_VECTORING_ERROR);
+        } else {
+            /* Clear has_error_code when error_code isn't valid. */
+            env->has_error_code = false;
         }
     }
     if ((rvmcs(cpu->hvf_fd, VMCS_GUEST_INTERRUPTIBILITY) &
@@ -942,7 +952,7 @@  int hvf_vcpu_exec(CPUState *cpu)
             macvm_set_rip(cpu, rip + ins_len);
             break;
         case VMX_REASON_VMCALL:
-            env->exception_nr = EXCP0D_GPF;
+            env->exception_nr = VMCS_INTR_T_HWEXCEPTION | EXCP0D_GPF;
             env->exception_injected = 1;
             env->has_error_code = true;
             env->error_code = 0;
diff --git a/target/i386/hvf/x86hvf.c b/target/i386/hvf/x86hvf.c
index 1485b95776..d25ae4585b 100644
--- a/target/i386/hvf/x86hvf.c
+++ b/target/i386/hvf/x86hvf.c
@@ -345,8 +345,6 @@  void vmx_clear_int_window_exiting(CPUState *cpu)
              ~VMCS_PRI_PROC_BASED_CTLS_INT_WINDOW_EXITING);
 }
 
-#define NMI_VEC 2
-
 bool hvf_inject_interrupts(CPUState *cpu_state)
 {
     X86CPU *x86cpu = X86_CPU(cpu_state);
@@ -356,17 +354,15 @@  bool hvf_inject_interrupts(CPUState *cpu_state)
     uint64_t intr_type;
     bool have_event = true;
     if (env->interrupt_injected != -1) {
-        vector = env->interrupt_injected;
-        intr_type = VMCS_INTR_T_SWINTR;
+        /* Type and vector are both saved in interrupt_injected. */
+        vector = env->interrupt_injected & VMCS_IDT_VEC_VECNUM;
+        intr_type = env->interrupt_injected & VMCS_IDT_VEC_TYPE;
     } else if (env->exception_nr != -1) {
-        vector = env->exception_nr;
-        if (vector == EXCP03_INT3 || vector == EXCP04_INTO) {
-            intr_type = VMCS_INTR_T_SWEXCEPTION;
-        } else {
-            intr_type = VMCS_INTR_T_HWEXCEPTION;
-        }
+        /* Type and vector are both saved in exception_nr. */
+        vector = env->exception_nr & VMCS_IDT_VEC_VECNUM;
+        intr_type = env->exception_nr & VMCS_IDT_VEC_TYPE;
     } else if (env->nmi_injected) {
-        vector = NMI_VEC;
+        vector = EXCP02_NMI;
         intr_type = VMCS_INTR_T_NMI;
     } else {
         have_event = false;
@@ -390,6 +386,8 @@  bool hvf_inject_interrupts(CPUState *cpu_state)
             if (env->has_error_code) {
                 wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_EXCEPTION_ERROR,
                       env->error_code);
+                /* Indicate that VMCS_ENTRY_EXCEPTION_ERROR is valid */
+                info |= VMCS_INTR_DEL_ERRCODE;
             }
             /*printf("reinject  %lx err %d\n", info, err);*/
             wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_INTR_INFO, info);
@@ -399,7 +397,7 @@  bool hvf_inject_interrupts(CPUState *cpu_state)
     if (cpu_state->interrupt_request & CPU_INTERRUPT_NMI) {
         if (!(env->hflags2 & HF2_NMI_MASK) && !(info & VMCS_INTR_VALID)) {
             cpu_state->interrupt_request &= ~CPU_INTERRUPT_NMI;
-            info = VMCS_INTR_VALID | VMCS_INTR_T_NMI | NMI_VEC;
+            info = VMCS_INTR_VALID | VMCS_INTR_T_NMI | EXCP02_NMI;
             wvmcs(cpu_state->hvf_fd, VMCS_ENTRY_INTR_INFO, info);
         } else {
             vmx_set_nmi_window_exiting(cpu_state);