diff mbox series

[v4] acpi: pcihp: allow repeating hot-unplug requests

Message ID 20230418090449.2155757-1-imammedo@redhat.com
State New
Headers show
Series [v4] acpi: pcihp: allow repeating hot-unplug requests | expand

Commit Message

Igor Mammedov April 18, 2023, 9:04 a.m. UTC
with Q35 using ACPI PCI hotplug by default, user's request to unplug
device is ignored when it's issued before guest OS has been booted.
And any additional attempt to request device hot-unplug afterwards
results in following error:

  "Device XYZ is already in the process of unplug"

arguably it can be considered as a regression introduced by [2],
before which it was possible to issue unplug request multiple
times.

Accept new uplug requests after timeout (1ms). This brings ACPI PCI
hotplug on par with native PCIe unplug behavior [1] and allows user
to repeat unplug requests at propper times.
Set expire timeout to arbitrary 1msec so user won't be able to
flood guest with SCI interrupts by calling device_del in tight loop.

PS:
ACPI spec doesn't mandate what OSPM can do with GPEx.status
bits set before it's booted => it's impl. depended.
Status bits may be retained (I tested with one Windows version)
or cleared (Linux since 2.6 kernel times) during guest's ACPI
subsystem initialization.
Clearing status bits (though not wrong per se) hides the unplug
event from guest, and it's upto user to repeat device_del later
when guest is able to handle unplug requests.

1) 18416c62e3 ("pcie: expire pending delete")
2)
Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
CC: mst@redhat.com
CC: anisinha@redhat.com
CC: jusual@redhat.com
CC: kraxel@redhat.com
---
v4:
 * massage commit message some more (Kashyap Chamarthy <kchamart@redhat.com>)
 * pickup Gerd's ACK
---
 hw/acpi/pcihp.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Ani Sinha April 18, 2023, 11:33 a.m. UTC | #1
On Tue, 18 Apr 2023, Igor Mammedov wrote:

> with Q35 using ACPI PCI hotplug by default, user's request to unplug
> device is ignored when it's issued before guest OS has been booted.
> And any additional attempt to request device hot-unplug afterwards
> results in following error:
>
>   "Device XYZ is already in the process of unplug"
>
> arguably it can be considered as a regression introduced by [2],
> before which it was possible to issue unplug request multiple
> times.
>
> Accept new uplug requests after timeout (1ms). This brings ACPI PCI
> hotplug on par with native PCIe unplug behavior [1] and allows user
> to repeat unplug requests at propper times.
> Set expire timeout to arbitrary 1msec so user won't be able to
> flood guest with SCI interrupts by calling device_del in tight loop.
>
> PS:
> ACPI spec doesn't mandate what OSPM can do with GPEx.status
> bits set before it's booted => it's impl. depended.
> Status bits may be retained (I tested with one Windows version)
> or cleared (Linux since 2.6 kernel times) during guest's ACPI
> subsystem initialization.
> Clearing status bits (though not wrong per se) hides the unplug
> event from guest, and it's upto user to repeat device_del later
> when guest is able to handle unplug requests.
>
> 1) 18416c62e3 ("pcie: expire pending delete")
> 2)
> Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>

Reviewed-by: Ani Sinha <anisinha@redhat.com>

> CC: mst@redhat.com
> CC: anisinha@redhat.com
> CC: jusual@redhat.com
> CC: kraxel@redhat.com
> ---
> v4:
>  * massage commit message some more (Kashyap Chamarthy <kchamart@redhat.com>)
>  * pickup Gerd's ACK
> ---
>  hw/acpi/pcihp.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index dcfb779a7a..cdd6f775a1 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -357,6 +357,16 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
>       * acpi_pcihp_eject_slot() when the operation is completed.
>       */
>      pdev->qdev.pending_deleted_event = true;
> +    /* if unplug was requested before OSPM is initialized,
> +     * linux kernel will clear GPE0.sts[] bits during boot, which effectively
> +     * hides unplug event. And than followup qmp_device_del() calls remain
> +     * blocked by above flag permanently.
> +     * Unblock qmp_device_del() by setting expire limit, so user can
> +     * repeat unplug request later when OSPM has been booted.
> +     */
> +    pdev->qdev.pending_deleted_expires_ms =
> +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); /* 1 msec */
> +
>      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
>      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
>  }
> --
> 2.39.1
>
>
Michael Tokarev April 26, 2023, 4:40 p.m. UTC | #2
18.04.2023 12:04, Igor Mammedov wrote:
> with Q35 using ACPI PCI hotplug by default, user's request to unplug
> device is ignored when it's issued before guest OS has been booted.
> And any additional attempt to request device hot-unplug afterwards
> results in following error:
> 
>    "Device XYZ is already in the process of unplug"
> 
> arguably it can be considered as a regression introduced by [2],
> before which it was possible to issue unplug request multiple
> times.

Stable-8.0 material?

Thanks,

/mjt
Kashyap Chamarthy April 26, 2023, 5:20 p.m. UTC | #3
On Wed, Apr 26, 2023 at 07:40:02PM +0300, Michael Tokarev wrote:
> 18.04.2023 12:04, Igor Mammedov wrote:
> > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > device is ignored when it's issued before guest OS has been booted.
> > And any additional attempt to request device hot-unplug afterwards
> > results in following error:
> > 
> >    "Device XYZ is already in the process of unplug"
> > 
> > arguably it can be considered as a regression introduced by [2],
> > before which it was possible to issue unplug request multiple
> > times.
> 
> Stable-8.0 material?

FWIW, I'd say, yes. This fix is useful for stable releases.  As this
solves a real problem for upper-management tools.

I have tested this fix; and it works.  I'll post my testing notes /
reproducer in a follow-up email.  In short, I followed the
reproducer steps from here[1].

[1] https://gitlab.com/libvirt/libvirt/-/issues/309
Michael S. Tsirkin April 27, 2023, 7:01 a.m. UTC | #4
On Wed, Apr 26, 2023 at 07:20:08PM +0200, Kashyap Chamarthy wrote:
> On Wed, Apr 26, 2023 at 07:40:02PM +0300, Michael Tokarev wrote:
> > 18.04.2023 12:04, Igor Mammedov wrote:
> > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > device is ignored when it's issued before guest OS has been booted.
> > > And any additional attempt to request device hot-unplug afterwards
> > > results in following error:
> > > 
> > >    "Device XYZ is already in the process of unplug"
> > > 
> > > arguably it can be considered as a regression introduced by [2],
> > > before which it was possible to issue unplug request multiple
> > > times.
> > 
> > Stable-8.0 material?
> 
> FWIW, I'd say, yes. This fix is useful for stable releases.  As this
> solves a real problem for upper-management tools.
> 
> I have tested this fix; and it works.  I'll post my testing notes /
> reproducer in a follow-up email.  In short, I followed the
> reproducer steps from here[1].
> 
> [1] https://gitlab.com/libvirt/libvirt/-/issues/309
> 


You can CC stable then.

> -- 
> /kashyap
Michael Tokarev April 27, 2023, 7:04 a.m. UTC | #5
27.04.2023 10:01, Michael S. Tsirkin пишет:
..
> You can CC stable then.

Yes, please do in the future.
I picked up this one already.

Thanks!

/mjt
Kashyap Chamarthy May 4, 2023, 2:33 p.m. UTC | #6
On Wed, Apr 26, 2023 at 07:20:08PM +0200, Kashyap Chamarthy wrote:
> On Wed, Apr 26, 2023 at 07:40:02PM +0300, Michael Tokarev wrote:
> > 18.04.2023 12:04, Igor Mammedov wrote:
> > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > device is ignored when it's issued before guest OS has been booted.
> > > And any additional attempt to request device hot-unplug afterwards
> > > results in following error:
> > > 
> > >    "Device XYZ is already in the process of unplug"
> > > 
> > > arguably it can be considered as a regression introduced by [2],
> > > before which it was possible to issue unplug request multiple
> > > times.
> > 
> > Stable-8.0 material?
> 
> FWIW, I'd say, yes. This fix is useful for stable releases.  As this
> solves a real problem for upper-management tools.
> 
> I have tested this fix; and it works.  I'll post my testing notes /
> reproducer in a follow-up email.  In short, I followed the
> reproducer steps from here[1].

Tested-by: Kashyap Chamarthy <kchamart@redhat.com>    

It solves the device-detach bug noted here[1].  As promised, here are my
reproducer notes (expanded from[1]):

Disk image prep
---------------

(1) Download an Ubuntu "Jammy" guest image from here:
    https://cloud-images.ubuntu.com/jammy/current/jammy-server-cloudimg-amd64.img

(2) Update the above disk image's kernel command-line to have the
    guest-boot slowed down by 100 seconds; use "boot_delay=100".

(3) Have an additional image ("disk2.img") ready for hot-plug/un-plug.

Test
----

(1) Build QEMU with the patch in question:

      $ git describe
      7.2.94v8.0.0-rc4-1-gfa6650df6d7

(2) Use the above QEMU binary to launch the Ubuntu "Jammy" guest:

      $ virsh dumpxml jammy1 | grep emulator
      <emulator>/home/kashyapc/tinker-space/qemu-upstream/build/qemu-system-x86_64</emulator>

(3) Have a split `tmux` ready; start the guest in the first pane, with
    the serial console logs rolling:

      $ virsh start jammy1 --console

(4) Wait until the guest consoles messages start rolling.  Once they do,
    on the other `tmux` pane, issue the below command (it's a live
    attach, followed by a detach):

      $ virsh attach-disk jammy1 ./disk2.img vdb --live --persistent \
          && sleep 1 \
          && virsh detach-disk jammy1 --live ./disk2.img
  
      Disk attached successfully
                                                     
      Disk detached successfully

(5) Enumerate the attached block devices to the guest.  We still see the
    second disk, "disk2.img":

      $> virsh domblklist jammy1
       Target   Source                         
      -------------------------------------------
       vda      /data/images/jammy-ubuntu.qcow2      
       vdb      /data/images/disk2.img

(6) Now detach the disk from the inactive guest XML (that affects
    next boot) by using "--persistent" flag; and enumerate the live
    block devices (we still see the second disk)

      $> virsh detach-disk jammy1 --persistent /data/images/disk2.img
      Disk detached successfully
  
      $> virsh domblklist jammy1
       Target   Source                         
      -------------------------------------------
       vda      /data/images/jammy-ubuntu.qcow2      
       vdb      /data/images/disk2.img
 
(NOTE: We're using two separate calls to `virsh detach-disk`, one with
"--live" and the other with "--persistent" based on upstream libvirt
recommendation in [1].)

(7) Again, re-issue the detach command with just "--live" flag:

      $> virsh detach-disk jammy1 --live /data/images/disk2.img                                       
      Disk detached successfully

(8) Re-enumerate the attached block devices: 

      $> virsh domblklist jammy1                                                                      
       Target   Source
      -------------------------------------------
       vda      /data/images/jammy-ubuntu.qcow2

Now we see the second device is detached "for real".  Overall, we were
able to successfully re-issue `device detach` while the guest is still
booting, and see through the actual detach to its logical conclusion.


[1] https://gitlab.com/libvirt/libvirt/-/issues/309 -- Disk detach is
    unsuccessfull while the guest is still booting
diff mbox series

Patch

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index dcfb779a7a..cdd6f775a1 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -357,6 +357,16 @@  void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
      * acpi_pcihp_eject_slot() when the operation is completed.
      */
     pdev->qdev.pending_deleted_event = true;
+    /* if unplug was requested before OSPM is initialized,
+     * linux kernel will clear GPE0.sts[] bits during boot, which effectively
+     * hides unplug event. And than followup qmp_device_del() calls remain
+     * blocked by above flag permanently.
+     * Unblock qmp_device_del() by setting expire limit, so user can
+     * repeat unplug request later when OSPM has been booted.
+     */
+    pdev->qdev.pending_deleted_expires_ms =
+        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); /* 1 msec */
+
     s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
     acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
 }