diff mbox series

acpi: pcihp: make pending delete expire in 5sec

Message ID 20230403161618.1344414-1-imammedo@redhat.com
State New
Headers show
Series acpi: pcihp: make pending delete expire in 5sec | expand

Commit Message

Igor Mammedov April 3, 2023, 4:16 p.m. UTC
With Q35 using ACPI PCI hotplug by default, a user's request to unplug
a device is ignored when it is issued before the guest OS has booted.
Any additional attempt to request device hot-unplug afterwards
results in the following error:

  "Device XYZ is already in the process of unplug"

Arguably this can be considered a regression introduced by [2],
before which it was possible to issue the unplug request multiple
times.

Allowing the pending delete to expire brings ACPI PCI hotplug on par
with native PCIe unplug behavior [1], which in turn refers
back to ACPI PCI hotplug's ability to repeat unplug requests.

PS:
From the ACPI point of view, an unplug request sets the PCI hotplug
status bit in the GPE0 block. However, depending on the OSPM, status
bits may be retained (Windows) or cleared (Linux) during the guest's
ACPI subsystem initialization, and as a result a Linux guest loses the
plug/unplug event (no SCI is generated) if the plug/unplug happened
before the guest OS initialized GPE register handling.
I couldn't find any restrictions in the ACPI spec wrt the OSPM
clearing GPE status bits.
Hence a fallback approach is to let the user repeat the unplug request
later, once the guest OS has booted.

1) 18416c62e3 ("pcie: expire pending delete")
2)
Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
CC: mst@redhat.com
CC: anisinha@redhat.com
CC: jusual@redhat.com
CC: kraxel@redhat.com
---
 hw/acpi/pcihp.c | 2 ++
 1 file changed, 2 insertions(+)
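
For context on where the "already in the process of unplug" error comes
from: qdev core refuses a repeated device_del while pending_deleted_event
is set, and commit 18416c62e3 ("pcie: expire pending delete") relaxed that
guard with an expiry timestamp; the patch above only sets that timestamp
on the ACPI path as well. A paraphrased sketch of the guard (not the exact
upstream code; the helper name is invented for illustration):

    #include "qemu/osdep.h"
    #include "qemu/timer.h"
    #include "qapi/error.h"
    #include "hw/qdev-core.h"

    /* Sketch only: roughly the check qdev core applies before accepting a
     * device_del. Once pending_deleted_expires_ms is set (as in the patch
     * above) and the deadline has passed, a repeated request is accepted. */
    static bool unplug_request_pending(DeviceState *dev, Error **errp)
    {
        if (dev->pending_deleted_event &&
            (dev->pending_deleted_expires_ms == 0 ||
             dev->pending_deleted_expires_ms >
                 qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL))) {
            error_setg(errp, "Device %s is already in the process of unplug",
                       dev->id);
            return true;
        }
        return false;
    }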

Comments

Michael S. Tsirkin April 3, 2023, 5:23 p.m. UTC | #1
On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:
> with Q35 using ACPI PCI hotplug by default, user's request to unplug
> device is ignored when it's issued before guest OS has been booted.
> And any additional attempt to request device hot-unplug afterwards
> results in following error:
> 
>   "Device XYZ is already in the process of unplug"
> 
> arguably it can be considered as a regression introduced by [2],
> before which it was possible to issue unplug request multiple
> times.
> 
> Allowing pending delete expire brings ACPI PCI hotplug on par
> with native PCIe unplug behavior [1] which in its turn refers
> back to ACPI PCI hotplug ability to repeat unplug requests.
> 
> PS:
> >From ACPI point of view, unplug request sets PCI hotplug status
> bit in GPE0 block. However depending on OSPM, status bits may
> be retained (Windows) or cleared (Linux) during guest's ACPI
> subsystem initialization, and as result Linux guest looses
> plug/unplug event (no SCI generated) if plug/unplug has
> happend before guest OS initialized GPE registers handling.
> I couldn't find any restrictions wrt OPM clearing GPE status
> bits ACPI spec.
> Hence a fallback approach is to let user repeat unplug request
> later at the time when guest OS has booted.
> 
> 1) 18416c62e3 ("pcie: expire pending delete")
> 2)
> Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>

A bit concerned about how this interacts with failover,
and 5sec is a lot of time that I hoped we'd avoid with acpi.
Any better ideas of catching such misbehaving guests?


Also, at this point I do not know why we deny unplug requests while
pending_deleted_event is set in qdev core.
Commit log says:

    Device unplug can be done asynchronously. Thus, sending the second
    device_del before the previous unplug is complete may lead to
    unexpected results. On PCIe devices, this cancels the hot-unplug
    process.

so it's a workaround for an issue in PCIe hotplug (and maybe SHPC
too?). Maybe we should have put that check in pcie/shpc and
left acpi alone?
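
One way to read that suggestion (hedged sketch, not a real patch; the
function below is hypothetical and only illustrates the placement idea):
keep qdev core permissive and do the "still pending" rejection inside the
PCIe unplug-request callback, where a second attention-button press would
otherwise cancel the unplug, leaving the ACPI path free to accept repeats.

    /* Hypothetical sketch: reject the repeated request in the PCIe hotplug
     * handler instead of qdev core, so ACPI hotplug is unaffected.
     * Simplified; not the actual hw/pci/pcie.c code. */
    static void pcie_unplug_request_sketch(DeviceState *dev, Error **errp)
    {
        if (dev->pending_deleted_event) {
            /* a second button press would be treated as "cancel" by the
             * guest, so refuse it here rather than in qdev core */
            error_setg(errp, "Device %s is already in the process of unplug",
                       dev->id);
            return;
        }
        dev->pending_deleted_event = true;
        /* ... press the attention button / notify the guest here ... */
    }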




> ---
> CC: mst@redhat.com
> CC: anisinha@redhat.com
> CC: jusual@redhat.com
> CC: kraxel@redhat.com
> ---
>  hw/acpi/pcihp.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index dcfb779a7a..cd4f9fee0a 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
>       * acpi_pcihp_eject_slot() when the operation is completed.
>       */
>      pdev->qdev.pending_deleted_event = true;
> +    pdev->qdev.pending_deleted_expires_ms =
> +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
>      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
>      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
>  }
> -- 
> 2.39.1
Gerd Hoffmann April 4, 2023, 7:03 a.m. UTC | #2
Hi,

> > Allowing pending delete expire brings ACPI PCI hotplug on par
> > with native PCIe unplug behavior [1] which in its turn refers
> > back to ACPI PCI hotplug ability to repeat unplug requests.

> A bit concerned about how this interacts with failover,
> and 5sec is a lot of time that I hoped we'd avoid with acpi.
> Any better ideas of catching such misbehaving guests?

The 5 sec comes from the PCIe spec: the hot-unplug request can be
canceled within 5 seconds by pressing the button again. The problem here
is that both hotplug and hot-unplug use the same signaling path, so we
really have to wait the 5 seconds to avoid the OS misinterpreting the
button press as a 'cancel' event.

ACPI hotplug doesn't have this problem.  An unplug request is an unplug
request, period.  And it can't be canceled.  So it should be possible to
use a shorter period.  Possibly even no delay at all.

take care,
  Gerd
Ani Sinha April 4, 2023, 7:36 a.m. UTC | #3
On Tue, 4 Apr 2023, Gerd Hoffmann wrote:

>   Hi,
>
> > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > with native PCIe unplug behavior [1] which in its turn refers
> > > back to ACPI PCI hotplug ability to repeat unplug requests.
>
> > A bit concerned about how this interacts with failover,
> > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > Any better ideas of catching such misbehaving guests?
>
> The 5sec are coming from the pcie spec: The hot-unplug request can be
> canceled within 5 seconds by pressing the button again. The problem here
> is that both hotplug and hot-unplug use the same signaling path, so we
> really have to wait the 5 seconds to avoid the OS mis-interpreting the
> button press as 'cancel' event.
>
> ACPI hotplug hasn't this problem.  A unplug request is a unplug request,

For the ACPI case, I think all we want is to make sure that the first
unplug event does not stick forever. A non-zero but small delay would
make sure that the first unplug event gets cleared after that interval,
and subsequent unplug requests will get registered without that error.

> period.  And it can't be canceled.  So it should be possible to use a
> shorter period.  Possibly even no delay at all.
>
> take care,
>   Gerd
>
>
Igor Mammedov April 4, 2023, 8:28 a.m. UTC | #4
On Mon, 3 Apr 2023 13:23:45 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:
> > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > device is ignored when it's issued before guest OS has been booted.
> > And any additional attempt to request device hot-unplug afterwards
> > results in following error:
> > 
> >   "Device XYZ is already in the process of unplug"
> > 
> > arguably it can be considered as a regression introduced by [2],
> > before which it was possible to issue unplug request multiple
> > times.
> > 
> > Allowing pending delete expire brings ACPI PCI hotplug on par
> > with native PCIe unplug behavior [1] which in its turn refers
> > back to ACPI PCI hotplug ability to repeat unplug requests.
> > 
> > PS:  
> > >From ACPI point of view, unplug request sets PCI hotplug status  
> > bit in GPE0 block. However depending on OSPM, status bits may
> > be retained (Windows) or cleared (Linux) during guest's ACPI
> > subsystem initialization, and as result Linux guest looses
> > plug/unplug event (no SCI generated) if plug/unplug has
> > happend before guest OS initialized GPE registers handling.
> > I couldn't find any restrictions wrt OPM clearing GPE status
> > bits ACPI spec.
> > Hence a fallback approach is to let user repeat unplug request
> > later at the time when guest OS has booted.
> > 
> > 1) 18416c62e3 ("pcie: expire pending delete")
> > 2)
> > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> 
> A bit concerned about how this interacts with failover,
> and 5sec is a lot of time that I hoped we'd avoid with acpi.
> Any better ideas of catching such misbehaving guests?

It shouldn't affect failover, pending_delete is not
cleared after all (only device removal should do that).
So all the patch does is allow reissuing the unplug request
in case it was lost; the delay here doesn't mean much
(do you have any preference wrt a specific value)?

As for 'misbehaving' - I tried to find justification
for it in the spec, but I couldn't.
Essentially it's up to the OSPM whether to clear GPE status
bits at startup (Linux has been doing it since forever),
depending on the guest's ability to handle hotplug events
at boot time.

It's more of a user error; ACPI hotplug does imply a booted
guest for it to function properly. So it's fine to
lose the unplug event at boot time. What QEMU does wrong is
preventing follow-up unplug requests.
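
From the user's side the behavior described here would look roughly like
the following HMP sequence (output paraphrased, "net1" is just an example
device id):

    (qemu) device_del net1
        <- issued before the guest OS is up; the GPE event is lost
    (qemu) device_del net1
    Device net1 is already in the process of unplug
        <- without the patch the first request blocks any retry forever
    (qemu) device_del net1
        <- with the patch, once the expiry window has passed the request
           is accepted and the hotplug event is raised again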

> 
> Also at this point I do not know why we deny hotplug
> pending_deleted_event in qdev core.  
> Commit log says:
> 
>     Device unplug can be done asynchronously. Thus, sending the second
>     device_del before the previous unplug is complete may lead to
>     unexpected results. On PCIe devices, this cancels the hot-unplug
>     process.
> 
> so it's a work around for an issue in pcie hotplug (and maybe shpc
> too?). Maybe we should have put that check in pcie/shpc and
> leave acpi along?
> 
> 
> 
> 
> > ---
> > CC: mst@redhat.com
> > CC: anisinha@redhat.com
> > CC: jusual@redhat.com
> > CC: kraxel@redhat.com
> > ---
> >  hw/acpi/pcihp.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > index dcfb779a7a..cd4f9fee0a 100644
> > --- a/hw/acpi/pcihp.c
> > +++ b/hw/acpi/pcihp.c
> > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> >       * acpi_pcihp_eject_slot() when the operation is completed.
> >       */
> >      pdev->qdev.pending_deleted_event = true;
> > +    pdev->qdev.pending_deleted_expires_ms =
> > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> >  }
> > -- 
> > 2.39.1  
>
Igor Mammedov April 4, 2023, 8:30 a.m. UTC | #5
On Tue, 4 Apr 2023 09:03:59 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

>   Hi,
> 
> > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > with native PCIe unplug behavior [1] which in its turn refers
> > > back to ACPI PCI hotplug ability to repeat unplug requests.  
> 
> > A bit concerned about how this interacts with failover,
> > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > Any better ideas of catching such misbehaving guests?  
> 
> The 5sec are coming from the pcie spec: The hot-unplug request can be
> canceled within 5 seconds by pressing the button again. The problem here
> is that both hotplug and hot-unplug use the same signaling path, so we
> really have to wait the 5 seconds to avoid the OS mis-interpreting the
> button press as 'cancel' event.

Any pointer to spec?
Does it apply to SHPC too?
(/me thinking about moving pending_delete check to PCIe only code)

> 
> ACPI hotplug hasn't this problem.  A unplug request is a unplug request,
> period.  And it can't be canceled.  So it should be possible to use a
> shorter period.  Possibly even no delay at all.
> 
> take care,
>   Gerd
>
Gerd Hoffmann April 4, 2023, 10:46 a.m. UTC | #6
On Tue, Apr 04, 2023 at 10:30:55AM +0200, Igor Mammedov wrote:
> On Tue, 4 Apr 2023 09:03:59 +0200
> Gerd Hoffmann <kraxel@redhat.com> wrote:
> 
> >   Hi,
> > 
> > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > back to ACPI PCI hotplug ability to repeat unplug requests.  
> > 
> > > A bit concerned about how this interacts with failover,
> > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > Any better ideas of catching such misbehaving guests?  
> > 
> > The 5sec are coming from the pcie spec: The hot-unplug request can be
> > canceled within 5 seconds by pressing the button again. The problem here
> > is that both hotplug and hot-unplug use the same signaling path, so we
> > really have to wait the 5 seconds to avoid the OS mis-interpreting the
> > button press as 'cancel' event.
> 
> Any pointer to spec?

pcie base spec, section 6.7.1.5. Attention Button

> Does it apply to SHPC too?

Yes (section 2.2.5. Attention Button).

take care,
  Gerd
Michael S. Tsirkin April 4, 2023, 12:40 p.m. UTC | #7
On Tue, Apr 04, 2023 at 01:06:38PM +0530, Ani Sinha wrote:
> 
> 
> On Tue, 4 Apr 2023, Gerd Hoffmann wrote:
> 
> >   Hi,
> >
> > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> >
> > > A bit concerned about how this interacts with failover,
> > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > Any better ideas of catching such misbehaving guests?
> >
> > The 5sec are coming from the pcie spec: The hot-unplug request can be
> > canceled within 5 seconds by pressing the button again. The problem here
> > is that both hotplug and hot-unplug use the same signaling path, so we
> > really have to wait the 5 seconds to avoid the OS mis-interpreting the
> > button press as 'cancel' event.
> >
> > ACPI hotplug hasn't this problem.  A unplug request is a unplug request,
> 
> For ACPI case, I think all we want is to make sure that the first unplug
> event to not stick forever. A non-zero but small delay would make sure
> that the first
> unplug event would get cleared after that interval and subsequent unplug
> events will get registered without that error.
> 
> > period.  And it can't be canceled.  So it should be possible to use a
> > shorter period.  Possibly even no delay at all.
> >
> > take care,
> >   Gerd
> >
> >


But why do we want a delay at all? For ACPI you can resend
the interrupt as many times as you like.
Michael S. Tsirkin April 4, 2023, 12:46 p.m. UTC | #8
On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:
> On Mon, 3 Apr 2023 13:23:45 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:
> > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > device is ignored when it's issued before guest OS has been booted.
> > > And any additional attempt to request device hot-unplug afterwards
> > > results in following error:
> > > 
> > >   "Device XYZ is already in the process of unplug"
> > > 
> > > arguably it can be considered as a regression introduced by [2],
> > > before which it was possible to issue unplug request multiple
> > > times.
> > > 
> > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > with native PCIe unplug behavior [1] which in its turn refers
> > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > 
> > > PS:  
> > > >From ACPI point of view, unplug request sets PCI hotplug status  
> > > bit in GPE0 block. However depending on OSPM, status bits may
> > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > subsystem initialization, and as result Linux guest looses
> > > plug/unplug event (no SCI generated) if plug/unplug has
> > > happend before guest OS initialized GPE registers handling.
> > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > bits ACPI spec.
> > > Hence a fallback approach is to let user repeat unplug request
> > > later at the time when guest OS has booted.
> > > 
> > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > 2)
> > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> > 
> > A bit concerned about how this interacts with failover,
> > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > Any better ideas of catching such misbehaving guests?
> 
> It shouldn't affect affect failover, pending_delete is not
> cleared after all (only device removal should do that).
> So all patch does is allowing to reissue unplug request
> in case it was lost, delay here doesn't mean much
> (do you have any preference wrt specific value)?

I'd prefer immediately.

> As for 'misbehaving' - I tried to find justification
> for it in spec, but I couldn't.
> Essentially it's upto OSPM to clear or not GPE status
> bits at startup (linux was doing it since forever),
> depending on guest's ability to handle hotplug events
> at boot time.
> 
> It's more a user error, ACPI hotplug does imply booted
> guest for it to function properly. So it's fine to
> loose unplug event at boot time. What QEMU does wrong is
> preventing follow up unplug requests.  
> 
> > 
> > Also at this point I do not know why we deny hotplug
> > pending_deleted_event in qdev core.  
> > Commit log says:
> > 
> >     Device unplug can be done asynchronously. Thus, sending the second
> >     device_del before the previous unplug is complete may lead to
> >     unexpected results. On PCIe devices, this cancels the hot-unplug
> >     process.
> > 
> > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > too?). Maybe we should have put that check in pcie/shpc and
> > leave acpi along?
> > 
> > 
> > 
> > 
> > > ---
> > > CC: mst@redhat.com
> > > CC: anisinha@redhat.com
> > > CC: jusual@redhat.com
> > > CC: kraxel@redhat.com
> > > ---
> > >  hw/acpi/pcihp.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > index dcfb779a7a..cd4f9fee0a 100644
> > > --- a/hw/acpi/pcihp.c
> > > +++ b/hw/acpi/pcihp.c
> > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > >       */
> > >      pdev->qdev.pending_deleted_event = true;
> > > +    pdev->qdev.pending_deleted_expires_ms =
> > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > >  }
> > > -- 
> > > 2.39.1  
> >
Igor Mammedov April 4, 2023, 1:56 p.m. UTC | #9
On Tue, 4 Apr 2023 08:40:45 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Apr 04, 2023 at 01:06:38PM +0530, Ani Sinha wrote:
> > 
> > 
> > On Tue, 4 Apr 2023, Gerd Hoffmann wrote:
> >   
> > >   Hi,
> > >  
> > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > back to ACPI PCI hotplug ability to repeat unplug requests.  
> > >  
> > > > A bit concerned about how this interacts with failover,
> > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > Any better ideas of catching such misbehaving guests?  
> > >
> > > The 5sec are coming from the pcie spec: The hot-unplug request can be
> > > canceled within 5 seconds by pressing the button again. The problem here
> > > is that both hotplug and hot-unplug use the same signaling path, so we
> > > really have to wait the 5 seconds to avoid the OS mis-interpreting the
> > > button press as 'cancel' event.
> > >
> > > ACPI hotplug hasn't this problem.  A unplug request is a unplug request,  
> > 
> > For ACPI case, I think all we want is to make sure that the first unplug
> > event to not stick forever. A non-zero but small delay would make sure
> > that the first
> > unplug event would get cleared after that interval and subsequent unplug
> > events will get registered without that error.
> >   
> > > period.  And it can't be canceled.  So it should be possible to use a
> > > shorter period.  Possibly even no delay at all.
> > >
> > > take care,
> > >   Gerd
> > >
> > >  
> 
> 
> But why do we want a delay at all? for acpi you can resend
> the interrupt as many times as you like.

Yep, we can. It makes it possible for a user to cause a limited
"interrupt storm". That also leads to device_del abuse [1].

1) https://www.mail-archive.com/qemu-devel@nongnu.org/msg952738.html
Igor Mammedov April 4, 2023, 2:04 p.m. UTC | #10
On Tue, 4 Apr 2023 08:46:15 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:
> > On Mon, 3 Apr 2023 13:23:45 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:  
> > > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > > device is ignored when it's issued before guest OS has been booted.
> > > > And any additional attempt to request device hot-unplug afterwards
> > > > results in following error:
> > > > 
> > > >   "Device XYZ is already in the process of unplug"
> > > > 
> > > > arguably it can be considered as a regression introduced by [2],
> > > > before which it was possible to issue unplug request multiple
> > > > times.
> > > > 
> > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > 
> > > > PS:    
> > > > >From ACPI point of view, unplug request sets PCI hotplug status    
> > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > subsystem initialization, and as result Linux guest looses
> > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > happend before guest OS initialized GPE registers handling.
> > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > bits ACPI spec.
> > > > Hence a fallback approach is to let user repeat unplug request
> > > > later at the time when guest OS has booted.
> > > > 
> > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > 2)
> > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>    
> > > 
> > > A bit concerned about how this interacts with failover,
> > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > Any better ideas of catching such misbehaving guests?  
> > 
> > It shouldn't affect affect failover, pending_delete is not
> > cleared after all (only device removal should do that).
> > So all patch does is allowing to reissue unplug request
> > in case it was lost, delay here doesn't mean much
> > (do you have any preference wrt specific value)?  
> 
> I'd prefer immediately.

OK, let's use 1ms then; I'd rather reuse the preexisting
pending_deleted_expires_ms machinery instead of
special-casing an immediate repeat.
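
For reference, the v2 being agreed on here would presumably just change
the constant in the hunk above (sketch only, assuming the same placement
in acpi_pcihp_device_unplug_request_cb):

        pdev->qdev.pending_deleted_event = true;
        /* sketch of the discussed v2: expire almost immediately so a
         * request lost before the guest booted can simply be reissued */
        pdev->qdev.pending_deleted_expires_ms =
            qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 1; /* 1 ms */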

> 
> > As for 'misbehaving' - I tried to find justification
> > for it in spec, but I couldn't.
> > Essentially it's upto OSPM to clear or not GPE status
> > bits at startup (linux was doing it since forever),
> > depending on guest's ability to handle hotplug events
> > at boot time.
> > 
> > It's more a user error, ACPI hotplug does imply booted
> > guest for it to function properly. So it's fine to
> > loose unplug event at boot time. What QEMU does wrong is
> > preventing follow up unplug requests.  
> >   
> > > 
> > > Also at this point I do not know why we deny hotplug
> > > pending_deleted_event in qdev core.  
> > > Commit log says:
> > > 
> > >     Device unplug can be done asynchronously. Thus, sending the second
> > >     device_del before the previous unplug is complete may lead to
> > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > >     process.
> > > 
> > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > too?). Maybe we should have put that check in pcie/shpc and
> > > leave acpi along?
> > > 
> > > 
> > > 
> > >   
> > > > ---
> > > > CC: mst@redhat.com
> > > > CC: anisinha@redhat.com
> > > > CC: jusual@redhat.com
> > > > CC: kraxel@redhat.com
> > > > ---
> > > >  hw/acpi/pcihp.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > --- a/hw/acpi/pcihp.c
> > > > +++ b/hw/acpi/pcihp.c
> > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > >       */
> > > >      pdev->qdev.pending_deleted_event = true;
> > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > >  }
> > > > -- 
> > > > 2.39.1    
> > >   
>
Ani Sinha April 4, 2023, 2:10 p.m. UTC | #11
On Tue, Apr 4, 2023 at 7:34 PM Igor Mammedov <imammedo@redhat.com> wrote:

> On Tue, 4 Apr 2023 08:46:15 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>
> > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:
> > > On Mon, 3 Apr 2023 13:23:45 -0400
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >
> > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:
> > > > > with Q35 using ACPI PCI hotplug by default, user's request to
> unplug
> > > > > device is ignored when it's issued before guest OS has been booted.
> > > > > And any additional attempt to request device hot-unplug afterwards
> > > > > results in following error:
> > > > >
> > > > >   "Device XYZ is already in the process of unplug"
> > > > >
> > > > > arguably it can be considered as a regression introduced by [2],
> > > > > before which it was possible to issue unplug request multiple
> > > > > times.
> > > > >
> > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > >
> > > > > PS:
> > > > > >From ACPI point of view, unplug request sets PCI hotplug status
>
> > > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > > subsystem initialization, and as result Linux guest looses
> > > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > > happend before guest OS initialized GPE registers handling.
> > > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > > bits ACPI spec.
> > > > > Hence a fallback approach is to let user repeat unplug request
> > > > > later at the time when guest OS has booted.
> > > > >
> > > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > > 2)
> > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > >
> > > > A bit concerned about how this interacts with failover,
> > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > Any better ideas of catching such misbehaving guests?
> > >
> > > It shouldn't affect affect failover, pending_delete is not
> > > cleared after all (only device removal should do that).
> > > So all patch does is allowing to reissue unplug request
> > > in case it was lost, delay here doesn't mean much
> > > (do you have any preference wrt specific value)?
> >
> > I'd prefer immediately.
>
> ok, lets use 1ms then, I'd rather reuse the preexisting
> pending_deleted_expires_ms machinery instead of
> special-casing immediate repeat.
>

Sounds good to me.


>
> >
> > > As for 'misbehaving' - I tried to find justification
> > > for it in spec, but I couldn't.
> > > Essentially it's upto OSPM to clear or not GPE status
> > > bits at startup (linux was doing it since forever),
> > > depending on guest's ability to handle hotplug events
> > > at boot time.
> > >
> > > It's more a user error, ACPI hotplug does imply booted
> > > guest for it to function properly. So it's fine to
> > > loose unplug event at boot time. What QEMU does wrong is
> > > preventing follow up unplug requests.
> > >
> > > >
> > > > Also at this point I do not know why we deny hotplug
> > > > pending_deleted_event in qdev core.
> > > > Commit log says:
> > > >
> > > >     Device unplug can be done asynchronously. Thus, sending the
> second
> > > >     device_del before the previous unplug is complete may lead to
> > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > > >     process.
> > > >
> > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > > too?). Maybe we should have put that check in pcie/shpc and
> > > > leave acpi along?
> > > >
> > > >
> > > >
> > > >
> > > > > ---
> > > > > CC: mst@redhat.com
> > > > > CC: anisinha@redhat.com
> > > > > CC: jusual@redhat.com
> > > > > CC: kraxel@redhat.com
> > > > > ---
> > > > >  hw/acpi/pcihp.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > >
> > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > > --- a/hw/acpi/pcihp.c
> > > > > +++ b/hw/acpi/pcihp.c
> > > > > @@ -357,6 +357,8 @@ void
> acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > > >       */
> > > > >      pdev->qdev.pending_deleted_event = true;
> > > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > > >  }
> > > > > --
> > > > > 2.39.1
> > > >
> >
>
>
Ani Sinha April 4, 2023, 2:11 p.m. UTC | #12
On Mon, Apr 3, 2023 at 9:46 PM Igor Mammedov <imammedo@redhat.com> wrote:

> with Q35 using ACPI PCI hotplug by default, user's request to unplug
> device is ignored when it's issued before guest OS has been booted.
> And any additional attempt to request device hot-unplug afterwards
> results in following error:
>
>   "Device XYZ is already in the process of unplug"
>
> arguably it can be considered as a regression introduced by [2],
> before which it was possible to issue unplug request multiple
> times.
>
> Allowing pending delete expire brings ACPI PCI hotplug on par
> with native PCIe unplug behavior [1] which in its turn refers
> back to ACPI PCI hotplug ability to repeat unplug requests.
>
> PS:
> From ACPI point of view, unplug request sets PCI hotplug status
> bit in GPE0 block. However depending on OSPM, status bits may
> be retained (Windows) or cleared (Linux) during guest's ACPI
> subsystem initialization, and as result Linux guest looses
> plug/unplug event (no SCI generated) if plug/unplug has
> happend before guest OS initialized GPE registers handling.
> I couldn't find any restrictions wrt OPM

                                                         ^^^^
When you churn out a v2, can you fix this typo?


> clearing GPE status
> bits ACPI spec.
> Hence a fallback approach is to let user repeat unplug request
> later at the time when guest OS has booted.
>
> 1) 18416c62e3 ("pcie: expire pending delete")
> 2)
> Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> CC: mst@redhat.com
> CC: anisinha@redhat.com
> CC: jusual@redhat.com
> CC: kraxel@redhat.com
> ---
>  hw/acpi/pcihp.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> index dcfb779a7a..cd4f9fee0a 100644
> --- a/hw/acpi/pcihp.c
> +++ b/hw/acpi/pcihp.c
> @@ -357,6 +357,8 @@ void
> acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
>       * acpi_pcihp_eject_slot() when the operation is completed.
>       */
>      pdev->qdev.pending_deleted_event = true;
> +    pdev->qdev.pending_deleted_expires_ms =
> +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
>      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
>      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
>  }
> --
> 2.39.1
>
>
Michael S. Tsirkin April 4, 2023, 2:40 p.m. UTC | #13
On Tue, Apr 04, 2023 at 07:40:40PM +0530, Ani Sinha wrote:
> 
> 
> On Tue, Apr 4, 2023 at 7:34 PM Igor Mammedov <imammedo@redhat.com> wrote:
> 
>     On Tue, 4 Apr 2023 08:46:15 -0400
>     "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
>     > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:
>     > > On Mon, 3 Apr 2023 13:23:45 -0400
>     > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
>     > >   
>     > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote: 
>     > > > > with Q35 using ACPI PCI hotplug by default, user's request to
>     unplug
>     > > > > device is ignored when it's issued before guest OS has been booted.
>     > > > > And any additional attempt to request device hot-unplug afterwards
>     > > > > results in following error:
>     > > > >
>     > > > >   "Device XYZ is already in the process of unplug"
>     > > > >
>     > > > > arguably it can be considered as a regression introduced by [2],
>     > > > > before which it was possible to issue unplug request multiple
>     > > > > times.
>     > > > >
>     > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
>     > > > > with native PCIe unplug behavior [1] which in its turn refers
>     > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
>     > > > >
>     > > > > PS:   
>     > > > > >From ACPI point of view, unplug request sets PCI hotplug status   
>     > > > > bit in GPE0 block. However depending on OSPM, status bits may
>     > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
>     > > > > subsystem initialization, and as result Linux guest looses
>     > > > > plug/unplug event (no SCI generated) if plug/unplug has
>     > > > > happend before guest OS initialized GPE registers handling.
>     > > > > I couldn't find any restrictions wrt OPM clearing GPE status
>     > > > > bits ACPI spec.
>     > > > > Hence a fallback approach is to let user repeat unplug request
>     > > > > later at the time when guest OS has booted.
>     > > > >
>     > > > > 1) 18416c62e3 ("pcie: expire pending delete")
>     > > > > 2)
>     > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
>     > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>   
>     > > >
>     > > > A bit concerned about how this interacts with failover,
>     > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
>     > > > Any better ideas of catching such misbehaving guests? 
>     > >
>     > > It shouldn't affect affect failover, pending_delete is not
>     > > cleared after all (only device removal should do that).
>     > > So all patch does is allowing to reissue unplug request
>     > > in case it was lost, delay here doesn't mean much
>     > > (do you have any preference wrt specific value)? 
>     >
>     > I'd prefer immediately.
> 
>     ok, lets use 1ms then, I'd rather reuse the preexisting
>     pending_deleted_expires_ms machinery instead of
>     special-casing immediate repeat.
> 
> 
> Sounds good to me.
>  

OK but please add a comment explaining what's going on.


> 
>     >
>     > > As for 'misbehaving' - I tried to find justification
>     > > for it in spec, but I couldn't.
>     > > Essentially it's upto OSPM to clear or not GPE status
>     > > bits at startup (linux was doing it since forever),
>     > > depending on guest's ability to handle hotplug events
>     > > at boot time.
>     > >
>     > > It's more a user error, ACPI hotplug does imply booted
>     > > guest for it to function properly. So it's fine to
>     > > loose unplug event at boot time. What QEMU does wrong is
>     > > preventing follow up unplug requests. 
>     > >   
>     > > >
>     > > > Also at this point I do not know why we deny hotplug
>     > > > pending_deleted_event in qdev core. 
>     > > > Commit log says:
>     > > >
>     > > >     Device unplug can be done asynchronously. Thus, sending the
>     second
>     > > >     device_del before the previous unplug is complete may lead to
>     > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
>     > > >     process.
>     > > >
>     > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
>     > > > too?). Maybe we should have put that check in pcie/shpc and
>     > > > leave acpi along?
>     > > >
>     > > >
>     > > >
>     > > >   
>     > > > > ---
>     > > > > CC: mst@redhat.com
>     > > > > CC: anisinha@redhat.com
>     > > > > CC: jusual@redhat.com
>     > > > > CC: kraxel@redhat.com
>     > > > > ---
>     > > > >  hw/acpi/pcihp.c | 2 ++
>     > > > >  1 file changed, 2 insertions(+)
>     > > > >
>     > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
>     > > > > index dcfb779a7a..cd4f9fee0a 100644
>     > > > > --- a/hw/acpi/pcihp.c
>     > > > > +++ b/hw/acpi/pcihp.c
>     > > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb
>     (HotplugHandler *hotplug_dev,
>     > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
>     > > > >       */
>     > > > >      pdev->qdev.pending_deleted_event = true;
>     > > > > +    pdev->qdev.pending_deleted_expires_ms =
>     > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
>     > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
>     > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
>     > > > >  }
>     > > > > --
>     > > > > 2.39.1   
>     > > >   
>     >
> 
>
Michael S. Tsirkin April 4, 2023, 2:42 p.m. UTC | #14
On Tue, Apr 04, 2023 at 04:04:35PM +0200, Igor Mammedov wrote:
> On Tue, 4 Apr 2023 08:46:15 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:
> > > On Mon, 3 Apr 2023 13:23:45 -0400
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >   
> > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:  
> > > > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > > > device is ignored when it's issued before guest OS has been booted.
> > > > > And any additional attempt to request device hot-unplug afterwards
> > > > > results in following error:
> > > > > 
> > > > >   "Device XYZ is already in the process of unplug"
> > > > > 
> > > > > arguably it can be considered as a regression introduced by [2],
> > > > > before which it was possible to issue unplug request multiple
> > > > > times.
> > > > > 
> > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > > 
> > > > > PS:    
> > > > > >From ACPI point of view, unplug request sets PCI hotplug status    
> > > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > > subsystem initialization, and as result Linux guest looses
> > > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > > happend before guest OS initialized GPE registers handling.
> > > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > > bits ACPI spec.
> > > > > Hence a fallback approach is to let user repeat unplug request
> > > > > later at the time when guest OS has booted.
> > > > > 
> > > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > > 2)
> > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>    
> > > > 
> > > > A bit concerned about how this interacts with failover,
> > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > Any better ideas of catching such misbehaving guests?  
> > > 
> > > It shouldn't affect affect failover, pending_delete is not
> > > cleared after all (only device removal should do that).
> > > So all patch does is allowing to reissue unplug request
> > > in case it was lost, delay here doesn't mean much
> > > (do you have any preference wrt specific value)?  
> > 
> > I'd prefer immediately.
> 
> ok, lets use 1ms then, I'd rather reuse the preexisting
> pending_deleted_expires_ms machinery instead of
> special-casing immediate repeat.

And just to make sure, are you working on fixing this in Linux
at least? Because the workaround is OK, but it still causes
latency.

> > 
> > > As for 'misbehaving' - I tried to find justification
> > > for it in spec, but I couldn't.
> > > Essentially it's upto OSPM to clear or not GPE status
> > > bits at startup (linux was doing it since forever),
> > > depending on guest's ability to handle hotplug events
> > > at boot time.
> > > 
> > > It's more a user error, ACPI hotplug does imply booted
> > > guest for it to function properly. So it's fine to
> > > loose unplug event at boot time. What QEMU does wrong is
> > > preventing follow up unplug requests.  
> > >   
> > > > 
> > > > Also at this point I do not know why we deny hotplug
> > > > pending_deleted_event in qdev core.  
> > > > Commit log says:
> > > > 
> > > >     Device unplug can be done asynchronously. Thus, sending the second
> > > >     device_del before the previous unplug is complete may lead to
> > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > > >     process.
> > > > 
> > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > > too?). Maybe we should have put that check in pcie/shpc and
> > > > leave acpi along?
> > > > 
> > > > 
> > > > 
> > > >   
> > > > > ---
> > > > > CC: mst@redhat.com
> > > > > CC: anisinha@redhat.com
> > > > > CC: jusual@redhat.com
> > > > > CC: kraxel@redhat.com
> > > > > ---
> > > > >  hw/acpi/pcihp.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > > --- a/hw/acpi/pcihp.c
> > > > > +++ b/hw/acpi/pcihp.c
> > > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > > >       */
> > > > >      pdev->qdev.pending_deleted_event = true;
> > > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > > >  }
> > > > > -- 
> > > > > 2.39.1    
> > > >   
> >
Igor Mammedov April 5, 2023, 7:30 a.m. UTC | #15
On Tue, 4 Apr 2023 10:42:04 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Tue, Apr 04, 2023 at 04:04:35PM +0200, Igor Mammedov wrote:
> > On Tue, 4 Apr 2023 08:46:15 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:  
> > > > On Mon, 3 Apr 2023 13:23:45 -0400
> > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >     
> > > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:    
> > > > > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > > > > device is ignored when it's issued before guest OS has been booted.
> > > > > > And any additional attempt to request device hot-unplug afterwards
> > > > > > results in following error:
> > > > > > 
> > > > > >   "Device XYZ is already in the process of unplug"
> > > > > > 
> > > > > > arguably it can be considered as a regression introduced by [2],
> > > > > > before which it was possible to issue unplug request multiple
> > > > > > times.
> > > > > > 
> > > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > > > 
> > > > > > PS:      
> > > > > > >From ACPI point of view, unplug request sets PCI hotplug status      
> > > > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > > > subsystem initialization, and as result Linux guest looses
> > > > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > > > happend before guest OS initialized GPE registers handling.
> > > > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > > > bits ACPI spec.
> > > > > > Hence a fallback approach is to let user repeat unplug request
> > > > > > later at the time when guest OS has booted.
> > > > > > 
> > > > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > > > 2)
> > > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>      
> > > > > 
> > > > > A bit concerned about how this interacts with failover,
> > > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > > Any better ideas of catching such misbehaving guests?    
> > > > 
> > > > It shouldn't affect affect failover, pending_delete is not
> > > > cleared after all (only device removal should do that).
> > > > So all patch does is allowing to reissue unplug request
> > > > in case it was lost, delay here doesn't mean much
> > > > (do you have any preference wrt specific value)?    
> > > 
> > > I'd prefer immediately.  
> > 
> > ok, lets use 1ms then, I'd rather reuse the preexisting
> > pending_deleted_expires_ms machinery instead of
> > special-casing immediate repeat.  
> 
> And just to make sure, are you working on fixing this in Linux
> at least? Because the work around is ok but it still causes
> latency.


Fixing what, the clearing of GPE status bits during ACPI subsystem
initialization?

Well, at this point I'm not seeing a good justification for removing
the GPE clearing (the spec does not mandate either behavior).
(There is no harm in trying to send a patch, though even if the idea
is accepted it won't help any current or older distribution; history
shows this has been the behavior since the 2.6 kernels.)

As for the workaround: it's not a workaround, but expected behavior.
ACPI hotplug expects a functioning OSPM on the guest side to work
properly. It's the user's mistake to ask for unplug before that, and
the user should repeat the request once the guest has booted. What is
broken on the QEMU side is that 'repeat' part (as noted in the commit
message).

PS:
See the commit message: Windows is not affected as it doesn't clear
GPE status bits during ACPI initialization (at least the one version
I've tested with, and I won't bet on this with other versions or on it
staying this way).

> 
> > >   
> > > > As for 'misbehaving' - I tried to find justification
> > > > for it in spec, but I couldn't.
> > > > Essentially it's upto OSPM to clear or not GPE status
> > > > bits at startup (linux was doing it since forever),
> > > > depending on guest's ability to handle hotplug events
> > > > at boot time.
> > > > 
> > > > It's more a user error, ACPI hotplug does imply booted
> > > > guest for it to function properly. So it's fine to
> > > > loose unplug event at boot time. What QEMU does wrong is
> > > > preventing follow up unplug requests.  
> > > >     
> > > > > 
> > > > > Also at this point I do not know why we deny hotplug
> > > > > pending_deleted_event in qdev core.  
> > > > > Commit log says:
> > > > > 
> > > > >     Device unplug can be done asynchronously. Thus, sending the second
> > > > >     device_del before the previous unplug is complete may lead to
> > > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > > > >     process.
> > > > > 
> > > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > > > too?). Maybe we should have put that check in pcie/shpc and
> > > > > leave acpi along?
> > > > > 
> > > > > 
> > > > > 
> > > > >     
> > > > > > ---
> > > > > > CC: mst@redhat.com
> > > > > > CC: anisinha@redhat.com
> > > > > > CC: jusual@redhat.com
> > > > > > CC: kraxel@redhat.com
> > > > > > ---
> > > > > >  hw/acpi/pcihp.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > > 
> > > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > > > --- a/hw/acpi/pcihp.c
> > > > > > +++ b/hw/acpi/pcihp.c
> > > > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > > > >       */
> > > > > >      pdev->qdev.pending_deleted_event = true;
> > > > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > > > >  }
> > > > > > -- 
> > > > > > 2.39.1      
> > > > >     
> > >   
>
Michael S. Tsirkin April 5, 2023, 8:32 a.m. UTC | #16
On Wed, Apr 05, 2023 at 09:30:20AM +0200, Igor Mammedov wrote:
> On Tue, 4 Apr 2023 10:42:04 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Tue, Apr 04, 2023 at 04:04:35PM +0200, Igor Mammedov wrote:
> > > On Tue, 4 Apr 2023 08:46:15 -0400
> > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > >   
> > > > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:  
> > > > > On Mon, 3 Apr 2023 13:23:45 -0400
> > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > >     
> > > > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:    
> > > > > > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > > > > > device is ignored when it's issued before guest OS has been booted.
> > > > > > > And any additional attempt to request device hot-unplug afterwards
> > > > > > > results in following error:
> > > > > > > 
> > > > > > >   "Device XYZ is already in the process of unplug"
> > > > > > > 
> > > > > > > arguably it can be considered as a regression introduced by [2],
> > > > > > > before which it was possible to issue unplug request multiple
> > > > > > > times.
> > > > > > > 
> > > > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > > > > 
> > > > > > > PS:      
> > > > > > > >From ACPI point of view, unplug request sets PCI hotplug status      
> > > > > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > > > > subsystem initialization, and as result Linux guest looses
> > > > > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > > > > happend before guest OS initialized GPE registers handling.
> > > > > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > > > > bits ACPI spec.
> > > > > > > Hence a fallback approach is to let user repeat unplug request
> > > > > > > later at the time when guest OS has booted.
> > > > > > > 
> > > > > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > > > > 2)
> > > > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>      
> > > > > > 
> > > > > > A bit concerned about how this interacts with failover,
> > > > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > > > Any better ideas of catching such misbehaving guests?    
> > > > > 
> > > > > It shouldn't affect affect failover, pending_delete is not
> > > > > cleared after all (only device removal should do that).
> > > > > So all patch does is allowing to reissue unplug request
> > > > > in case it was lost, delay here doesn't mean much
> > > > > (do you have any preference wrt specific value)?    
> > > > 
> > > > I'd prefer immediately.  
> > > 
> > > ok, lets use 1ms then, I'd rather reuse the preexisting
> > > pending_deleted_expires_ms machinery instead of
> > > special-casing immediate repeat.  
> > 
> > And just to make sure, are you working on fixing this in Linux
> > at least? Because the work around is ok but it still causes
> > latency.
> 
> 
> Fixing what, clearing GPE status bits during ACPI subsystem
> initialization?
> 
> Well at this point I'm not seeing a good justification for
> removing GPE clearing (spec does not mandate that).
> (but there is no harm in trying to send a patch, though
> even if idea is accepted it won't do a dime for all current
> and older distributions history show it was the thing even
> since 2.6 kernels).
> 
> As for workaround, well it's not a workaround, but expected
> behavior. 
> ACPI hotplug expects functioning OSPM on guest side to work
> properly. It's user's mistake to ask for unplug before that 
> and user shall repeat request once guest is booted. What is
> broken on QEMU side is that 'repeat' thingy (as it's noted
> in commit message).

I don't see how you can claim it's the user's mistake.  All the user
wants is for the device to be removed; how that happens is our problem.
The guest can reboot at any time and there's no indication to the user
that the guest has booted; blaming users won't help if we do not have a
fix for them.


> PS:
> See commit message, Windows is not affected as it doesn't
> clear GPE status bits during ACPI initialization
> (at least the one version I've tested with, and I won't bet
> on this with other versions or staying this way)

So I am saying Linux should match Windows. Clearing GPE status bits
is a bad idea, as you then miss events.

> > 
> > > >   
> > > > > As for 'misbehaving' - I tried to find justification
> > > > > for it in spec, but I couldn't.
> > > > > Essentially it's upto OSPM to clear or not GPE status
> > > > > bits at startup (linux was doing it since forever),
> > > > > depending on guest's ability to handle hotplug events
> > > > > at boot time.
> > > > > 
> > > > > It's more a user error, ACPI hotplug does imply booted
> > > > > guest for it to function properly. So it's fine to
> > > > > loose unplug event at boot time. What QEMU does wrong is
> > > > > preventing follow up unplug requests.  
> > > > >     
> > > > > > 
> > > > > > Also at this point I do not know why we deny hotplug
> > > > > > pending_deleted_event in qdev core.  
> > > > > > Commit log says:
> > > > > > 
> > > > > >     Device unplug can be done asynchronously. Thus, sending the second
> > > > > >     device_del before the previous unplug is complete may lead to
> > > > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > > > > >     process.
> > > > > > 
> > > > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > > > > too?). Maybe we should have put that check in pcie/shpc and
> > > > > > leave acpi along?
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > >     
> > > > > > > ---
> > > > > > > CC: mst@redhat.com
> > > > > > > CC: anisinha@redhat.com
> > > > > > > CC: jusual@redhat.com
> > > > > > > CC: kraxel@redhat.com
> > > > > > > ---
> > > > > > >  hw/acpi/pcihp.c | 2 ++
> > > > > > >  1 file changed, 2 insertions(+)
> > > > > > > 
> > > > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > > > > --- a/hw/acpi/pcihp.c
> > > > > > > +++ b/hw/acpi/pcihp.c
> > > > > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > > > > >       */
> > > > > > >      pdev->qdev.pending_deleted_event = true;
> > > > > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > > > > >  }
> > > > > > > -- 
> > > > > > > 2.39.1      
> > > > > >     
> > > >   
> >
Igor Mammedov April 5, 2023, 9:24 a.m. UTC | #17
On Wed, 5 Apr 2023 04:32:16 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Apr 05, 2023 at 09:30:20AM +0200, Igor Mammedov wrote:
> > On Tue, 4 Apr 2023 10:42:04 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Tue, Apr 04, 2023 at 04:04:35PM +0200, Igor Mammedov wrote:  
> > > > On Tue, 4 Apr 2023 08:46:15 -0400
> > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >     
> > > > > On Tue, Apr 04, 2023 at 10:28:07AM +0200, Igor Mammedov wrote:    
> > > > > > On Mon, 3 Apr 2023 13:23:45 -0400
> > > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > > >       
> > > > > > > On Mon, Apr 03, 2023 at 06:16:18PM +0200, Igor Mammedov wrote:      
> > > > > > > > with Q35 using ACPI PCI hotplug by default, user's request to unplug
> > > > > > > > device is ignored when it's issued before guest OS has been booted.
> > > > > > > > And any additional attempt to request device hot-unplug afterwards
> > > > > > > > results in following error:
> > > > > > > > 
> > > > > > > >   "Device XYZ is already in the process of unplug"
> > > > > > > > 
> > > > > > > > arguably it can be considered as a regression introduced by [2],
> > > > > > > > before which it was possible to issue unplug request multiple
> > > > > > > > times.
> > > > > > > > 
> > > > > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > > > > back to ACPI PCI hotplug ability to repeat unplug requests.
> > > > > > > > 
> > > > > > > > PS:        
> > > > > > > > >From ACPI point of view, unplug request sets PCI hotplug status        
> > > > > > > > bit in GPE0 block. However depending on OSPM, status bits may
> > > > > > > > be retained (Windows) or cleared (Linux) during guest's ACPI
> > > > > > > > subsystem initialization, and as result Linux guest looses
> > > > > > > > plug/unplug event (no SCI generated) if plug/unplug has
> > > > > > > > happend before guest OS initialized GPE registers handling.
> > > > > > > > I couldn't find any restrictions wrt OPM clearing GPE status
> > > > > > > > bits ACPI spec.
> > > > > > > > Hence a fallback approach is to let user repeat unplug request
> > > > > > > > later at the time when guest OS has booted.
> > > > > > > > 
> > > > > > > > 1) 18416c62e3 ("pcie: expire pending delete")
> > > > > > > > 2)
> > > > > > > > Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
> > > > > > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>        
> > > > > > > 
> > > > > > > A bit concerned about how this interacts with failover,
> > > > > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > > > > Any better ideas of catching such misbehaving guests?      
> > > > > > 
> > > > > > It shouldn't affect affect failover, pending_delete is not
> > > > > > cleared after all (only device removal should do that).
> > > > > > So all patch does is allowing to reissue unplug request
> > > > > > in case it was lost, delay here doesn't mean much
> > > > > > (do you have any preference wrt specific value)?      
> > > > > 
> > > > > I'd prefer immediately.    
> > > > 
> > > > ok, lets use 1ms then, I'd rather reuse the preexisting
> > > > pending_deleted_expires_ms machinery instead of
> > > > special-casing immediate repeat.    
> > > 
> > > And just to make sure, are you working on fixing this in Linux
> > > at least? Because the work around is ok but it still causes
> > > latency.  
> > 
> > 
> > Fixing what, clearing GPE status bits during ACPI subsystem
> > initialization?
> > 
> > Well at this point I'm not seeing a good justification for
> > removing GPE clearing (spec does not mandate that).
> > (but there is no harm in trying to send a patch, though
> > even if the idea is accepted it won't help current and older
> > distributions; history shows it has been this way since the
> > 2.6 kernels).
> > 
> > As for workaround, well it's not a workaround, but expected
> > behavior. 
> > ACPI hotplug expects functioning OSPM on guest side to work
> > properly. It's user's mistake to ask for unplug before that 
> > and user shall repeat request once guest is booted. What is
> > broken on QEMU side is that 'repeat' thingy (as it's noted
> > in commit message).  
> 
> I don't see how you can claim it's the user's mistake.  All the user wants
> is for the device to be removed; how that happens is our problem.  The
> guest can reboot at any time and there's no indication to the user that
> the guest has booted; blaming users won't help if we do not have a fix
> for them.
ACPI hot-unplug requires a cooperating guest; it doesn't matter what the
user wants if the guest is not able or not willing to unplug the device.

The user, however, can watch for the DEVICE_DELETED event and repeat the
request in the hope that the guest will cooperate.

With CPUs/DIMMs the guest is able to give the user a more exact status
of an unplug request (using the ACPI _OST method); I don't recall
if the PCI subsystem is capable of using _OST as well.
But then again, that too requires a functioning OSPM.

As for reboot, the pcihp machinery does remove devices with pending
unplug requests at hw reset time (though that's a QEMU-specific
ACPI PCI hotplug quirk, not something that happens on bare metal).

So it's up to the management tools to decide whether to repeat the
unplug attempt or to abandon it and report an error.
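
For illustration only, a possible QMP exchange (the device id, path and
timestamp below are made up): management issues device_del, gets no
DEVICE_DELETED event because the guest is not up yet, and simply repeats
the request later; with the pending delete allowed to expire, the repeat
is accepted instead of failing with "already in the process of unplug":

  -> { "execute": "device_del", "arguments": { "id": "hostnic0" } }
  <- { "return": {} }
     ... guest not booted yet, unplug event is lost, no DEVICE_DELETED ...
  -> { "execute": "device_del", "arguments": { "id": "hostnic0" } }
  <- { "return": {} }
  <- { "event": "DEVICE_DELETED",
       "data": { "device": "hostnic0",
                 "path": "/machine/peripheral/hostnic0" },
       "timestamp": { "seconds": 1680700000, "microseconds": 0 } }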

> > PS:
> > See commit message, Windows is not affected as it doesn't
> > clear GPE status bits during ACPI initialization
> > (at least the one version I've tested with, and I won't bet
> > on this with other versions or staying this way)  
> 
> So I am saying linux should match windows. Clearing GPE
> is a bad idea as you then miss events.

I'd say it depends on if guest OS is able to handle hot[un]plug
at boot time when it enables GPE handlers (or any other time).
(My point of view here, it's a guest OS policy and management
layer should know what installed guest is capable of and what
quirks to use with it)

I'll try to send a kernel patch to remove GPEx.status clearing,
though it might be more complex than it seems,
hence I'm quite sceptical about it.

> > >   
> > > > >     
> > > > > > As for 'misbehaving' - I tried to find justification
> > > > > > for it in spec, but I couldn't.
> > > > > > Essentially it's upto OSPM to clear or not GPE status
> > > > > > bits at startup (linux was doing it since forever),
> > > > > > depending on guest's ability to handle hotplug events
> > > > > > at boot time.
> > > > > > 
> > > > > > It's more a user error, ACPI hotplug does imply booted
> > > > > > guest for it to function properly. So it's fine to
> > > > > > loose unplug event at boot time. What QEMU does wrong is
> > > > > > preventing follow up unplug requests.  
> > > > > >       
> > > > > > > 
> > > > > > > Also at this point I do not know why we deny hotplug
> > > > > > > pending_deleted_event in qdev core.  
> > > > > > > Commit log says:
> > > > > > > 
> > > > > > >     Device unplug can be done asynchronously. Thus, sending the second
> > > > > > >     device_del before the previous unplug is complete may lead to
> > > > > > >     unexpected results. On PCIe devices, this cancels the hot-unplug
> > > > > > >     process.
> > > > > > > 
> > > > > > > so it's a work around for an issue in pcie hotplug (and maybe shpc
> > > > > > > too?). Maybe we should have put that check in pcie/shpc and
> > > > > > > leave acpi along?
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > >       
> > > > > > > > ---
> > > > > > > > CC: mst@redhat.com
> > > > > > > > CC: anisinha@redhat.com
> > > > > > > > CC: jusual@redhat.com
> > > > > > > > CC: kraxel@redhat.com
> > > > > > > > ---
> > > > > > > >  hw/acpi/pcihp.c | 2 ++
> > > > > > > >  1 file changed, 2 insertions(+)
> > > > > > > > 
> > > > > > > > diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
> > > > > > > > index dcfb779a7a..cd4f9fee0a 100644
> > > > > > > > --- a/hw/acpi/pcihp.c
> > > > > > > > +++ b/hw/acpi/pcihp.c
> > > > > > > > @@ -357,6 +357,8 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
> > > > > > > >       * acpi_pcihp_eject_slot() when the operation is completed.
> > > > > > > >       */
> > > > > > > >      pdev->qdev.pending_deleted_event = true;
> > > > > > > > +    pdev->qdev.pending_deleted_expires_ms =
> > > > > > > > +        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
> > > > > > > >      s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
> > > > > > > >      acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
> > > > > > > >  }
> > > > > > > > -- 
> > > > > > > > 2.39.1        
> > > > > > >       
> > > > >     
> > >   
>
Michael S. Tsirkin April 5, 2023, 9:59 a.m. UTC | #18
On Wed, Apr 05, 2023 at 11:24:16AM +0200, Igor Mammedov wrote:
> > > PS:
> > > See commit message, Windows is not affected as it doesn't
> > > clear GPE status bits during ACPI initialization
> > > (at least the one version I've tested with, and I won't bet
> > > on this with other versions or staying this way)  
> > 
> > So I am saying linux should match windows. Clearing GPE
> > is a bad idea as you then miss events.
> 
> I'd say it depends on if guest OS is able to handle hot[un]plug
> at boot time when it enables GPE handlers (or any other time).
> (My point of view here, it's a guest OS policy and management
> layer should know what installed guest is capable of and what
> quirks to use with it)
> 
> I'll try to send a kernel patch to remove GPEx.status clearing,
> though it might be more complex than it seems,
> hence I'm quite sceptical about it.

In the world of ACPI, windows is basically the gold standard,
whatever it does linux has to do ;)
Igor Mammedov April 5, 2023, 12:03 p.m. UTC | #19
On Wed, 5 Apr 2023 05:59:06 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Apr 05, 2023 at 11:24:16AM +0200, Igor Mammedov wrote:
> > > > PS:
> > > > See commit message, Windows is not affected as it doesn't
> > > > clear GPE status bits during ACPI initialization
> > > > (at least the one version I've tested with, and I won't bet
> > > > on this with other versions or staying this way)    
> > > 
> > > So I am saying linux should match windows. Clearing GPE
> > > is a bad idea as you then miss events.  
> > 
> > I'd say it depends on if guest OS is able to handle hot[un]plug
> > at boot time when it enables GPE handlers (or any other time).
> > (My point of view here, it's a guest OS policy and management
> > layer should know what installed guest is capable of and what
> > quirks to use with it)
> > 
> > I'll try to send a kernel patch to remove GPEx.status clearing,
> > though it might be more complex than it seems,
> > hence I'm quite sceptical about it.  
> 
> In the world of ACPI, windows is basically the gold standard,
> whatever it does linux has to do ;)
I'd say other way around (with their limited acpi interpreter,
it's getting better though),
While linux basically is acpica reference code.
Michael S. Tsirkin April 5, 2023, 12:27 p.m. UTC | #20
On Wed, Apr 05, 2023 at 02:03:32PM +0200, Igor Mammedov wrote:
> On Wed, 5 Apr 2023 05:59:06 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Apr 05, 2023 at 11:24:16AM +0200, Igor Mammedov wrote:
> > > > > PS:
> > > > > See commit message, Windows is not affected as it doesn't
> > > > > clear GPE status bits during ACPI initialization
> > > > > (at least the one version I've tested with, and I won't bet
> > > > > on this with other versions or staying this way)    
> > > > 
> > > > So I am saying linux should match windows. Clearing GPE
> > > > is a bad idea as you then miss events.  
> > > 
> > > I'd say it depends on if guest OS is able to handle hot[un]plug
> > > at boot time when it enables GPE handlers (or any other time).
> > > (My point of view here, it's a guest OS policy and management
> > > layer should know what installed guest is capable of and what
> > > quirks to use with it)
> > > 
> > > I'll try to send a kernel patch to remove GPEx.status clearing,
> > > though it might be more complex than it seems,
> > > hence I'm quite sceptical about it.  
> > 
> > In the world of ACPI, windows is basically the gold standard,
> > whatever it does linux has to do ;)
> I'd say other way around (with their limited acpi interpreter,
> it's getting better though),
> While linux basically is acpica reference code.

For a spec-compliant ACPI like ours, maybe; but on real hardware
it is like this because BIOS vendors test their ACPI with Windows only.
Ani Sinha April 5, 2023, 12:32 p.m. UTC | #21
> On 05-Apr-2023, at 5:57 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
> On Wed, Apr 05, 2023 at 02:03:32PM +0200, Igor Mammedov wrote:
>> On Wed, 5 Apr 2023 05:59:06 -0400
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> 
>>> On Wed, Apr 05, 2023 at 11:24:16AM +0200, Igor Mammedov wrote:
>>>>>> PS:
>>>>>> See commit message, Windows is not affected as it doesn't
>>>>>> clear GPE status bits during ACPI initialization
>>>>>> (at least the one version I've tested with, and I won't bet
>>>>>> on this with other versions or staying this way)    
>>>>> 
>>>>> So I am saying linux should match windows. Clearing GPE
>>>>> is a bad idea as you then miss events.  
>>>> 
>>>> I'd say it depends on if guest OS is able to handle hot[un]plug
>>>> at boot time when it enables GPE handlers (or any other time).
>>>> (My point of view here, it's a guest OS policy and management
>>>> layer should know what installed guest is capable of and what
>>>> quirks to use with it)
>>>> 
>>>> I'll try to send a kernel patch to remove GPEx.status clearing,
>>>> though it might be more complex than it seems,
>>>> hence I'm quite sceptical about it.  
>>> 
>>> In the world of ACPI, windows is basically the gold standard,
>>> whatever it does linux has to do ;)
>> I'd say other way around (with their limited acpi interpreter,
>> it's getting better though),
>> While linux basically is acpica reference code.
> 
> For a spec-compliant ACPI like ours, maybe; but on real hardware
> it is like this because BIOS vendors test their ACPI with Windows only.

I thought they used bios bits :-)
Igor Mammedov April 6, 2023, 11:46 a.m. UTC | #22
On Tue, 4 Apr 2023 12:46:45 +0200
Gerd Hoffmann <kraxel@redhat.com> wrote:

> On Tue, Apr 04, 2023 at 10:30:55AM +0200, Igor Mammedov wrote:
> > On Tue, 4 Apr 2023 09:03:59 +0200
> > Gerd Hoffmann <kraxel@redhat.com> wrote:
> >   
> > >   Hi,
> > >   
> > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > back to ACPI PCI hotplug ability to repeat unplug requests.    
> > >   
> > > > A bit concerned about how this interacts with failover,
> > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > Any better ideas of catching such misbehaving guests?    
> > > 
> > > The 5sec are coming from the pcie spec: The hot-unplug request can be
> > > canceled within 5 seconds by pressing the button again. The problem here
> > > is that both hotplug and hot-unplug use the same signaling path, so we
> > > really have to wait the 5 seconds to avoid the OS mis-interpreting the
> > > button press as 'cancel' event.  
> > 
> > Any pointer to spec?  
> 
> pcie base spec, section 6.7.1.5. Attention Button
> 
> > Does it apply to SHPC too?  
> 
> Yes (section 2.2.5. Attention Button).

shouldn't we set pending_deleted_expires_ms to 5sec for SHPC
as we do with PCIe?
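
For illustration, a sketch of what that could look like, mirroring the
two lines the pcihp patch adds; the callback name, its shape and the
includes are assumptions, not the actual hw/pci/shpc.c code:

  /* Sketch only: apply the same expiry in a hypothetical SHPC
   * unplug-request callback; not copied from hw/pci/shpc.c. */
  #include "qemu/osdep.h"
  #include "qemu/timer.h"
  #include "hw/qdev-core.h"

  static void shpc_unplug_request_sketch(DeviceState *qdev)
  {
      qdev->pending_deleted_event = true;
      /* match the PCIe attention-button rule: a repeated request is
       * only accepted again once the 5 second window has expired */
      qdev->pending_deleted_expires_ms =
          qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
  }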

> 
> take care,
>   Gerd
>
Gerd Hoffmann April 6, 2023, 12:01 p.m. UTC | #23
On Thu, Apr 06, 2023 at 01:46:48PM +0200, Igor Mammedov wrote:
> On Tue, 4 Apr 2023 12:46:45 +0200
> Gerd Hoffmann <kraxel@redhat.com> wrote:
> 
> > On Tue, Apr 04, 2023 at 10:30:55AM +0200, Igor Mammedov wrote:
> > > On Tue, 4 Apr 2023 09:03:59 +0200
> > > Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >   
> > > >   Hi,
> > > >   
> > > > > > Allowing pending delete expire brings ACPI PCI hotplug on par
> > > > > > with native PCIe unplug behavior [1] which in its turn refers
> > > > > > back to ACPI PCI hotplug ability to repeat unplug requests.    
> > > >   
> > > > > A bit concerned about how this interacts with failover,
> > > > > and 5sec is a lot of time that I hoped we'd avoid with acpi.
> > > > > Any better ideas of catching such misbehaving guests?    
> > > > 
> > > > The 5sec are coming from the pcie spec: The hot-unplug request can be
> > > > canceled within 5 seconds by pressing the button again. The problem here
> > > > is that both hotplug and hot-unplug use the same signaling path, so we
> > > > really have to wait the 5 seconds to avoid the OS mis-interpreting the
> > > > button press as 'cancel' event.  
> > > 
> > > Any pointer to spec?  
> > 
> > pcie base spec, section 6.7.1.5. Attention Button
> > 
> > > Does it apply to SHPC too?  
> > 
> > Yes (section 2.2.5. Attention Button).
> 
> shouldn't we set pending_deleted_expires_ms to 5sec for SHPC
> as we do with PCIe?

I suspect it is not *that* simple.

For pcie there is one more detail.  The code also looks at the
indicator LED.  When it is set to 'blink' (used by the guest to signal
that a hotplug is in progress), the code will block unplug too, i.e. the
rules are (IIRC, didn't check the code):

  (1) if less than 5 secs passed, reject, else ...
  (2) if the indicator blinks, reject, else ...
  (3) allow request (and send virtual attention button press to guest).

SHPC should probably do that too.  Sending an attention button press
while the guest clearly signals that a hotplug is in progress has a high
chance of being interpreted as a cancel.
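
A minimal C sketch of that decision order, just to make the rules above
concrete; the struct, field and helper names are invented for
illustration and do not correspond to the actual QEMU pcie code:

  #include <stdbool.h>
  #include <stdint.h>

  /* illustrative per-slot state, names made up */
  typedef struct {
      bool    unplug_pending;      /* attention button already pressed    */
      int64_t unplug_expires_ms;   /* end of the 5 second cancel window   */
      bool    indicator_blinking;  /* guest signals operation in progress */
  } SlotSketch;

  /* returns true if a new unplug request may be forwarded to the guest,
   * i.e. a virtual attention button press may be sent */
  static bool slot_may_request_unplug(SlotSketch *slot, int64_t now_ms)
  {
      /* (1) inside the 5 second window a second button press would be
       *     read by the OS as "cancel", so reject the repeat for now */
      if (slot->unplug_pending && now_ms < slot->unplug_expires_ms) {
          return false;
      }
      /* (2) guest signals an operation in progress: reject */
      if (slot->indicator_blinking) {
          return false;
      }
      /* (3) allow; press the button and re-arm the 5 second window */
      slot->unplug_pending = true;
      slot->unplug_expires_ms = now_ms + 5000;
      return true;
  }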

take care,
  Gerd
diff mbox series

Patch

diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index dcfb779a7a..cd4f9fee0a 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -357,6 +357,8 @@  void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
      * acpi_pcihp_eject_slot() when the operation is completed.
      */
     pdev->qdev.pending_deleted_event = true;
+    pdev->qdev.pending_deleted_expires_ms =
+        qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + 5000; /* 5 secs */
     s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
     acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
 }
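
For context, the field set above is what the device_del path consults
when an unplug request is repeated.  Roughly (a paraphrase of the guard
in qdev-monitor's qdev_unplug(), not the exact QEMU source), the check
behaves like this:

  #include <stdbool.h>
  #include <stdint.h>

  /* paraphrased sketch; the real fields live on DeviceState in QEMU */
  typedef struct {
      bool    pending_deleted_event;      /* unplug request in flight */
      int64_t pending_deleted_expires_ms; /* 0 = never expires        */
  } DeviceSketch;

  static bool unplug_still_pending(const DeviceSketch *dev, int64_t now_ms)
  {
      if (!dev->pending_deleted_event) {
          return false;                   /* nothing in flight, allow */
      }
      if (dev->pending_deleted_expires_ms == 0) {
          return true;                    /* no expiry set: always reject */
      }
      /* with this patch the repeat is rejected only inside the window */
      return dev->pending_deleted_expires_ms > now_ms;
  }

So a device_del issued before the guest has booted would produce the
"already in the process of unplug" error only while the window set above
is still open (5 seconds here, or 1ms if the shorter value discussed in
the thread is used), after which the user can simply repeat the request.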