Patchwork MSI-X bug with ivshmem since msix_reset moved to PCI

login
register
mail settings
Submitter Michael S. Tsirkin
Date Aug. 24, 2012, 8:36 a.m.
Message ID <20120824083640.GD7830@redhat.com>
Download mbox | patch
Permalink /patch/179789/
State New
Headers show

Comments

Michael S. Tsirkin - Aug. 24, 2012, 8:36 a.m.
On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
> > On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> >> On 2012-08-24 01:13, Cam Macdonell wrote:
> >>> Hi Jan,
> >>>
> >>> I've bisected a bug in which MSI interrupts are not being delivered to
> >>> the following patch, where msix_reset was moved in tot he PCI core.
> >>>
> >>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> >>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>> Date:   Tue May 15 20:09:56 2012 -0300
> >>>
> >>>     msi: Invoke msi/msix_reset from PCI core
> >>>
> >>>     There is no point in pushing this burden to the devices, they tend to
> >>>     forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> >>>     functions are now called from pci_device_reset. They do nothing if
> >>>     MSI/MSI-X is not in use.
> >>>
> >>> I've been debugging and it seems that when msix_notify() is triggered
> >>> the second test in the "if" fails
> >>>
> >>> /* Send an MSI-X message */
> >>> void msix_notify(PCIDevice *dev, unsigned vector)
> >>> {
> >>>     MSIMessage msg;
> >>>
> >>>     if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> >>>         return;
> >>>
> >>>    …
> >>> }
> >>>
> >>> here is some MSI-X debugging statements
> >>>
> >>> msix_init
> >>> IVSHMEM: msix initialized (1 vectors)
> >>> IVSHMEM: using vector 0
> >>> IVSHMEM: ivshmem_reset
> >>> IVSHMEM: using vector 0
> >>> msix_reset
> >>> msix_free_irq_entries 0x7fd52d1cea20
> >>>
> >>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> >>> may be the cause.
> >>
> >> I suppose you mean it sets the msix_entry_used array to 0.
> >>
> >>>
> >>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> >>> by the msix_reset?
> >>
> >> Actually, the whole msix vector usage tracking is useless today, this
> >> just shows its downsides (in the absence of benefits). Megasas is
> >> affected by this problem as well, virtio not as it calls msix_vector_use
> >> during the configuration process the guest driver triggers.
> >>
> >> Two options:
> >>  - I can send my removal patch for msix_vector_use/unuse that I was
> >>    only planning for 1.3 so far, and we kill this pitfall earlier.
> >>  - We re-add msix_vector_use calls to the affected device models for
> >>    1.2 and drop them later again for 1.3 when removing usage tracking.
> >> [The third option to keep the usage tracking is a non-option for me. ;)]
> >>
> >> Michael?
> > 
> > Second option seems more prudent to me. Can you send a patch pls?
> 
> In fact, it's not as easy. ivshmem already tries to restore the usage
> flag but fails due to reset handler ordering. I do not see ATM where
> there is some "enable device" during re-initialization, at least for
> ivshmem. Haven't checked megasas yet.
> 
> I tend to think removing is simpler and less risky. Please have a look
> at the patch I sent earlier.
> 
> Jan
> 
> 

Actually virtio is the only one that changes vector use
so the only one that needs to reset it.

I am thinking we should find a way to move vector use
out from core to virtio-pci.
But for now something like the below should do the trick.
Untested unfortunately, will test virtio Sunday
but would appreciate review and testing reports
esp with ivshmem etc.

-->

msix: avoid need to use/unuse vectors for most devices

The facility to use/unuse vectors is helpful for virtio
but little else since everyone just seems to use
vectors in their init function.

Avoid clearing msix vector use info on reset and load.
For virtio, clear it explicitly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---
Jan Kiszka - Aug. 24, 2012, 8:39 a.m.
On 2012-08-24 10:36, Michael S. Tsirkin wrote:
> On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
>> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
>>> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
>>>> On 2012-08-24 01:13, Cam Macdonell wrote:
>>>>> Hi Jan,
>>>>>
>>>>> I've bisected a bug in which MSI interrupts are not being delivered to
>>>>> the following patch, where msix_reset was moved in tot he PCI core.
>>>>>
>>>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Date:   Tue May 15 20:09:56 2012 -0300
>>>>>
>>>>>     msi: Invoke msi/msix_reset from PCI core
>>>>>
>>>>>     There is no point in pushing this burden to the devices, they tend to
>>>>>     forget to call them (like intel-hda, ahci, xhci did). Instead, reset
>>>>>     functions are now called from pci_device_reset. They do nothing if
>>>>>     MSI/MSI-X is not in use.
>>>>>
>>>>> I've been debugging and it seems that when msix_notify() is triggered
>>>>> the second test in the "if" fails
>>>>>
>>>>> /* Send an MSI-X message */
>>>>> void msix_notify(PCIDevice *dev, unsigned vector)
>>>>> {
>>>>>     MSIMessage msg;
>>>>>
>>>>>     if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
>>>>>         return;
>>>>>
>>>>>    …
>>>>> }
>>>>>
>>>>> here is some MSI-X debugging statements
>>>>>
>>>>> msix_init
>>>>> IVSHMEM: msix initialized (1 vectors)
>>>>> IVSHMEM: using vector 0
>>>>> IVSHMEM: ivshmem_reset
>>>>> IVSHMEM: using vector 0
>>>>> msix_reset
>>>>> msix_free_irq_entries 0x7fd52d1cea20
>>>>>
>>>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
>>>>> may be the cause.
>>>>
>>>> I suppose you mean it sets the msix_entry_used array to 0.
>>>>
>>>>>
>>>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
>>>>> by the msix_reset?
>>>>
>>>> Actually, the whole msix vector usage tracking is useless today, this
>>>> just shows its downsides (in the absence of benefits). Megasas is
>>>> affected by this problem as well, virtio not as it calls msix_vector_use
>>>> during the configuration process the guest driver triggers.
>>>>
>>>> Two options:
>>>>  - I can send my removal patch for msix_vector_use/unuse that I was
>>>>    only planning for 1.3 so far, and we kill this pitfall earlier.
>>>>  - We re-add msix_vector_use calls to the affected device models for
>>>>    1.2 and drop them later again for 1.3 when removing usage tracking.
>>>> [The third option to keep the usage tracking is a non-option for me. ;)]
>>>>
>>>> Michael?
>>>
>>> Second option seems more prudent to me. Can you send a patch pls?
>>
>> In fact, it's not as easy. ivshmem already tries to restore the usage
>> flag but fails due to reset handler ordering. I do not see ATM where
>> there is some "enable device" during re-initialization, at least for
>> ivshmem. Haven't checked megasas yet.
>>
>> I tend to think removing is simpler and less risky. Please have a look
>> at the patch I sent earlier.
>>
>> Jan
>>
>>
> 
> Actually virtio is the only one that changes vector use
> so the only one that needs to reset it.
> 
> I am thinking we should find a way to move vector use
> out from core to virtio-pci.
> But for now something like the below should do the trick.
> Untested unfortunately, will test virtio Sunday
> but would appreciate review and testing reports
> esp with ivshmem etc.
> 
> -->
> 
> msix: avoid need to use/unuse vectors for most devices
> 
> The facility to use/unuse vectors is helpful for virtio
> but little else since everyone just seems to use
> vectors in their init function.
> 
> Avoid clearing msix vector use info on reset and load.
> For virtio, clear it explicitly.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 
> diff --git a/hw/msix.c b/hw/msix.c
> index 800fc32..d040cc2 100644
> --- a/hw/msix.c
> +++ b/hw/msix.c
> @@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
>      }
>  }
>  
> +static void msix_clear_all_vectors(PCIDevice *dev)
> +{
> +    int vector;
> +
> +    for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
> +        msix_clr_pending(dev, vector);
> +    }
> +}
> +
>  /* Clean up resources for the device. */
>  void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
>  {
> @@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
>          return;
>      }
>  
> -    msix_free_irq_entries(dev);
> +    msix_clear_all_vectors(dev);
>      qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
>      qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
>      msix_update_function_masked(dev);
> @@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
>      if (!msix_present(dev)) {
>          return;
>      }
> -    msix_free_irq_entries(dev);
> +    msix_clear_all_vectors(dev);
>      dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
>  	    ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
>      memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
> diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> index 125eded..ca0b204 100644
> --- a/hw/virtio-pci.c
> +++ b/hw/virtio-pci.c
> @@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
>      if (ret) {
>          return ret;
>      }
> +    msix_unuse_all_vectors(&proxy->pci_dev);
>      msix_load(&proxy->pci_dev, f);
>      if (msix_present(&proxy->pci_dev)) {
>          qemu_get_be16s(f, &proxy->vdev->config_vector);
> @@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
>      VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
>      virtio_pci_stop_ioeventfd(proxy);
>      virtio_reset(proxy->vdev);
> +    msix_unuse_all_vectors(&proxy->pci_dev);
>      proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
>  }
>  
> 

This just prolongs the life of a wrong API, but I'm fine for 1.2 if it
helps. However, I will push on removing all this for 1.3.

Jan
Michael S. Tsirkin - Aug. 24, 2012, 9:20 a.m.
On Fri, Aug 24, 2012 at 10:39:15AM +0200, Jan Kiszka wrote:
> On 2012-08-24 10:36, Michael S. Tsirkin wrote:
> > On Fri, Aug 24, 2012 at 10:15:33AM +0200, Jan Kiszka wrote:
> >> On 2012-08-24 10:11, Michael S. Tsirkin wrote:
> >>> On Fri, Aug 24, 2012 at 07:59:06AM +0200, Jan Kiszka wrote:
> >>>> On 2012-08-24 01:13, Cam Macdonell wrote:
> >>>>> Hi Jan,
> >>>>>
> >>>>> I've bisected a bug in which MSI interrupts are not being delivered to
> >>>>> the following patch, where msix_reset was moved in tot he PCI core.
> >>>>>
> >>>>> commit cbd2d4342b3d42ab33baa99f5b7a23491b5692f2
> >>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>> Date:   Tue May 15 20:09:56 2012 -0300
> >>>>>
> >>>>>     msi: Invoke msi/msix_reset from PCI core
> >>>>>
> >>>>>     There is no point in pushing this burden to the devices, they tend to
> >>>>>     forget to call them (like intel-hda, ahci, xhci did). Instead, reset
> >>>>>     functions are now called from pci_device_reset. They do nothing if
> >>>>>     MSI/MSI-X is not in use.
> >>>>>
> >>>>> I've been debugging and it seems that when msix_notify() is triggered
> >>>>> the second test in the "if" fails
> >>>>>
> >>>>> /* Send an MSI-X message */
> >>>>> void msix_notify(PCIDevice *dev, unsigned vector)
> >>>>> {
> >>>>>     MSIMessage msg;
> >>>>>
> >>>>>     if (vector >= dev->msix_entries_nr || !dev->msix_entry_used[vector])
> >>>>>         return;
> >>>>>
> >>>>>    …
> >>>>> }
> >>>>>
> >>>>> here is some MSI-X debugging statements
> >>>>>
> >>>>> msix_init
> >>>>> IVSHMEM: msix initialized (1 vectors)
> >>>>> IVSHMEM: using vector 0
> >>>>> IVSHMEM: ivshmem_reset
> >>>>> IVSHMEM: using vector 0
> >>>>> msix_reset
> >>>>> msix_free_irq_entries 0x7fd52d1cea20
> >>>>>
> >>>>> msix_free_irq_entries() sets dev->msix_entries_nr to 0, so I think it
> >>>>> may be the cause.
> >>>>
> >>>> I suppose you mean it sets the msix_entry_used array to 0.
> >>>>
> >>>>>
> >>>>> Shouldn't ivshmem's reset (which reenables the vectors) be triggered
> >>>>> by the msix_reset?
> >>>>
> >>>> Actually, the whole msix vector usage tracking is useless today, this
> >>>> just shows its downsides (in the absence of benefits). Megasas is
> >>>> affected by this problem as well, virtio not as it calls msix_vector_use
> >>>> during the configuration process the guest driver triggers.
> >>>>
> >>>> Two options:
> >>>>  - I can send my removal patch for msix_vector_use/unuse that I was
> >>>>    only planning for 1.3 so far, and we kill this pitfall earlier.
> >>>>  - We re-add msix_vector_use calls to the affected device models for
> >>>>    1.2 and drop them later again for 1.3 when removing usage tracking.
> >>>> [The third option to keep the usage tracking is a non-option for me. ;)]
> >>>>
> >>>> Michael?
> >>>
> >>> Second option seems more prudent to me. Can you send a patch pls?
> >>
> >> In fact, it's not as easy. ivshmem already tries to restore the usage
> >> flag but fails due to reset handler ordering. I do not see ATM where
> >> there is some "enable device" during re-initialization, at least for
> >> ivshmem. Haven't checked megasas yet.
> >>
> >> I tend to think removing is simpler and less risky. Please have a look
> >> at the patch I sent earlier.
> >>
> >> Jan
> >>
> >>
> > 
> > Actually virtio is the only one that changes vector use
> > so the only one that needs to reset it.
> > 
> > I am thinking we should find a way to move vector use
> > out from core to virtio-pci.
> > But for now something like the below should do the trick.
> > Untested unfortunately, will test virtio Sunday
> > but would appreciate review and testing reports
> > esp with ivshmem etc.
> > 
> > -->
> > 
> > msix: avoid need to use/unuse vectors for most devices
> > 
> > The facility to use/unuse vectors is helpful for virtio
> > but little else since everyone just seems to use
> > vectors in their init function.
> > 
> > Avoid clearing msix vector use info on reset and load.
> > For virtio, clear it explicitly.
> > 
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > ---
> > 
> > diff --git a/hw/msix.c b/hw/msix.c
> > index 800fc32..d040cc2 100644
> > --- a/hw/msix.c
> > +++ b/hw/msix.c
> > @@ -340,6 +340,15 @@ static void msix_free_irq_entries(PCIDevice *dev)
> >      }
> >  }
> >  
> > +static void msix_clear_all_vectors(PCIDevice *dev)
> > +{
> > +    int vector;
> > +
> > +    for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
> > +        msix_clr_pending(dev, vector);
> > +    }
> > +}
> > +
> >  /* Clean up resources for the device. */
> >  void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
> >  {
> > @@ -394,7 +403,7 @@ void msix_load(PCIDevice *dev, QEMUFile *f)
> >          return;
> >      }
> >  
> > -    msix_free_irq_entries(dev);
> > +    msix_clear_all_vectors(dev);
> >      qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
> >      qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
> >      msix_update_function_masked(dev);
> > @@ -440,7 +449,7 @@ void msix_reset(PCIDevice *dev)
> >      if (!msix_present(dev)) {
> >          return;
> >      }
> > -    msix_free_irq_entries(dev);
> > +    msix_clear_all_vectors(dev);
> >      dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
> >  	    ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
> >      memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
> > diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
> > index 125eded..ca0b204 100644
> > --- a/hw/virtio-pci.c
> > +++ b/hw/virtio-pci.c
> > @@ -131,6 +131,7 @@ static int virtio_pci_load_config(void * opaque, QEMUFile *f)
> >      if (ret) {
> >          return ret;
> >      }
> > +    msix_unuse_all_vectors(&proxy->pci_dev);
> >      msix_load(&proxy->pci_dev, f);
> >      if (msix_present(&proxy->pci_dev)) {
> >          qemu_get_be16s(f, &proxy->vdev->config_vector);
> > @@ -246,6 +247,7 @@ void virtio_pci_reset(DeviceState *d)
> >      VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
> >      virtio_pci_stop_ioeventfd(proxy);
> >      virtio_reset(proxy->vdev);
> > +    msix_unuse_all_vectors(&proxy->pci_dev);
> >      proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
> >  }
> >  
> > 
> 
> This just prolongs the life of a wrong API, but I'm fine for 1.2 if it
> helps. However, I will push on removing all this for 1.3.
> 
> Jan
> 

For 1.3 we'll move use tracking into virtio, yes.

Patch

diff --git a/hw/msix.c b/hw/msix.c
index 800fc32..d040cc2 100644
--- a/hw/msix.c
+++ b/hw/msix.c
@@ -340,6 +340,15 @@  static void msix_free_irq_entries(PCIDevice *dev)
     }
 }
 
+static void msix_clear_all_vectors(PCIDevice *dev)
+{
+    int vector;
+
+    for (vector = 0; vector < dev->msix_entries_nr; ++vector) {
+        msix_clr_pending(dev, vector);
+    }
+}
+
 /* Clean up resources for the device. */
 void msix_uninit(PCIDevice *dev, MemoryRegion *table_bar, MemoryRegion *pba_bar)
 {
@@ -394,7 +403,7 @@  void msix_load(PCIDevice *dev, QEMUFile *f)
         return;
     }
 
-    msix_free_irq_entries(dev);
+    msix_clear_all_vectors(dev);
     qemu_get_buffer(f, dev->msix_table, n * PCI_MSIX_ENTRY_SIZE);
     qemu_get_buffer(f, dev->msix_pba, (n + 7) / 8);
     msix_update_function_masked(dev);
@@ -440,7 +449,7 @@  void msix_reset(PCIDevice *dev)
     if (!msix_present(dev)) {
         return;
     }
-    msix_free_irq_entries(dev);
+    msix_clear_all_vectors(dev);
     dev->config[dev->msix_cap + MSIX_CONTROL_OFFSET] &=
 	    ~dev->wmask[dev->msix_cap + MSIX_CONTROL_OFFSET];
     memset(dev->msix_table, 0, dev->msix_entries_nr * PCI_MSIX_ENTRY_SIZE);
diff --git a/hw/virtio-pci.c b/hw/virtio-pci.c
index 125eded..ca0b204 100644
--- a/hw/virtio-pci.c
+++ b/hw/virtio-pci.c
@@ -131,6 +131,7 @@  static int virtio_pci_load_config(void * opaque, QEMUFile *f)
     if (ret) {
         return ret;
     }
+    msix_unuse_all_vectors(&proxy->pci_dev);
     msix_load(&proxy->pci_dev, f);
     if (msix_present(&proxy->pci_dev)) {
         qemu_get_be16s(f, &proxy->vdev->config_vector);
@@ -246,6 +247,7 @@  void virtio_pci_reset(DeviceState *d)
     VirtIOPCIProxy *proxy = container_of(d, VirtIOPCIProxy, pci_dev.qdev);
     virtio_pci_stop_ioeventfd(proxy);
     virtio_reset(proxy->vdev);
+    msix_unuse_all_vectors(&proxy->pci_dev);
     proxy->flags &= ~VIRTIO_PCI_FLAG_BUS_MASTER_BUG;
 }