Patchwork KVM: Windows 64-bit troubles with user space irqchip

login
register
mail settings
Submitter Gleb Natapov
Date Feb. 3, 2011, 2:15 p.m.
Message ID <20110203141537.GY14984@redhat.com>
Download mbox | patch
Permalink /patch/81655/
State New
Headers show

Comments

Gleb Natapov - Feb. 3, 2011, 2:15 p.m.
On Thu, Feb 03, 2011 at 11:11:23AM +0100, Jan Kiszka wrote:
> On 2011-02-03 11:04, Marcelo Tosatti wrote:
> > On Thu, Feb 03, 2011 at 10:32:25AM +0100, Jan Kiszka wrote:
> >> On 2011-02-03 09:18, Avi Kivity wrote:
> >>> On 02/02/2011 05:52 PM, Jan Kiszka wrote:
> >>>>>>
> >>>>>  If there is no problem in the logic of this commit (and I do not see
> >>>>>  one yet) then we somewhere miss kicking vcpu when interrupt, that should be
> >>>>>  handled, arrives?
> >>>>
> >>>> I'm not yet confident about the logic of the kernel patch: mov to cr8 is
> >>>> serializing. If the guest raises the tpr and then signals this with a
> >>>> succeeding, non vm-exiting instruction to the other vcpus, one of those
> >>>> could inject an interrupt with a higher priority than the previous tpr,
> >>>> but a lower one than current tpr. QEMU user space would accept this
> >>>> interrupt - and would likely surprise the guest. Do I miss something?
> >>>
> >>> apic_get_interrupt() is only called from the vcpu thread, so it should 
> >>> see a correct tpr.
> >>>
> >>> The only difference I can see with the patch is that we may issue a 
> >>> spurious cpu_interrupt().  But that shouldn't do anything bad, should it?
> >>
> >> I tested this yesterday, and it doesn't confuse Windows. It actually
> >> receives quite a few spurious IRQs in normal operation, w/ or w/o the
> >> kernel's tpr optimization.
> > 
> > http://www.mail-archive.com/kvm@vger.kernel.org/msg41681.html
> 
> Don't get the scenario yet: We do not inject (or set isr) over the
> context of apic_set_irq caller.
> 
> > 
> > tpr of a vcpu should always be inspected in vcpu context, instead of 
> > iothread context?
> 
> Maybe this is true for the in-kernel model, but I don't see the issue
> (anymore) for the way user space works.
> 
With patch below I can boot Windows7.

--
			Gleb.
Avi Kivity - Feb. 6, 2011, 10:26 a.m.
On 02/03/2011 04:15 PM, Gleb Natapov wrote:
> >
> >  Maybe this is true for the in-kernel model, but I don't see the issue
> >  (anymore) for the way user space works.
> >
> With patch below I can boot Windows7.
>
> diff --git a/hw/apic.c b/hw/apic.c
> index 146deca..fdcac88 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
>       intno = get_highest_priority_int(s->irr);
>       if (intno<  0)
>           return -1;
> -    if (s->tpr&&  intno<= s->tpr)
> +    if ((s->tpr>>  4)&&  (intno>>  4)<= (s->tpr>>  4))
>           return s->spurious_vec&  0xff;
>       reset_bit(s->irr, intno);
>       set_bit(s->isr, intno);

That still allows interrupts that have higher priority than the TPR, but 
lower priority than interrupts in the ISR to be injected.  I think we 
need to use the PPR here (same as apic_update_irq()).
Gleb Natapov - Feb. 6, 2011, 10:28 a.m.
On Sun, Feb 06, 2011 at 12:26:40PM +0200, Avi Kivity wrote:
> On 02/03/2011 04:15 PM, Gleb Natapov wrote:
> >>
> >>  Maybe this is true for the in-kernel model, but I don't see the issue
> >>  (anymore) for the way user space works.
> >>
> >With patch below I can boot Windows7.
> >
> >diff --git a/hw/apic.c b/hw/apic.c
> >index 146deca..fdcac88 100644
> >--- a/hw/apic.c
> >+++ b/hw/apic.c
> >@@ -600,7 +600,7 @@ int apic_get_interrupt(DeviceState *d)
> >      intno = get_highest_priority_int(s->irr);
> >      if (intno<  0)
> >          return -1;
> >-    if (s->tpr&&  intno<= s->tpr)
> >+    if ((s->tpr>>  4)&&  (intno>>  4)<= (s->tpr>>  4))
> >          return s->spurious_vec&  0xff;
> >      reset_bit(s->irr, intno);
> >      set_bit(s->isr, intno);
> 
> That still allows interrupts that have higher priority than the TPR,
> but lower priority than interrupts in the ISR to be injected.  I
> think we need to use the PPR here (same as apic_update_irq()).
> 
We shouldn't get here if isr is non-empty, but see the patch I posted
today to qemu-devel. It does what you say anyway.

--
			Gleb.

Patch

diff --git a/hw/apic.c b/hw/apic.c
index 146deca..fdcac88 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -600,7 +600,7 @@  int apic_get_interrupt(DeviceState *d)
     intno = get_highest_priority_int(s->irr);
     if (intno < 0)
         return -1;
-    if (s->tpr && intno <= s->tpr)
+    if ((s->tpr >> 4) && (intno >> 4) <= (s->tpr >> 4))
         return s->spurious_vec & 0xff;
     reset_bit(s->irr, intno);
     set_bit(s->isr, intno);