diff mbox series

[v3,1/2] s390x/pci: Introduce unplug requests and split unplug handler

Message ID 20190121134249.16615-2-david@redhat.com
State New
Headers show
Series s390x/pci: hotplug handler fixes and reworks | expand

Commit Message

David Hildenbrand Jan. 21, 2019, 1:42 p.m. UTC
PCI on s390x is really weird and how it was modeled in QEMU might not have
been the right choice. Anyhow, right now it is the case that:
- Hotplugging a PCI device will silently create a zPCI device
  (if none is provided)
- Hotunplugging a zPCI device will unplug the PCI device (if any)
- Hotunplugging a PCI device will unplug also the zPCI device
As far as I can see, we can no longer change this behavior. But we
should fix it.

Both device types are handled via a single hotplug handler call. This
is problematic for various reasons:
1. Unplugging via the zPCI device allows to unplug PCI bridges as
   checks are not performed - bad.
2. Unplugging via the zPCI device allows to unplug devices that are not
   hot removable. (check performed in qdev_unplug()) - bad.
3. Hotplug handler chains are not possible for the unplug case. In the
   future, the machine might want to override hotplug handlers, to
   process device specific stuff and to then branch off to the actual
   hotplug handler. We need separate hotplug handler calls for both the
   PCI and zPCI device to make this work reliably. All other PCI
   implementations are already prepared to handle this correctly, only
   s390x is missing.

Therefore, introduce the unplug_request handler and properly perform
unplug checks by redirecting to the separate unplug_request handlers.
When finally unplugging, perform two separate hotplug_handler_unplug()
calls, first for the PCI device, followed by the zPCI device. This now
nicely splits unplugging paths for both devices.

The redirect part is a little hairy, as the user is allowed to trigger
unplug either via the PCI or the zPCI device. So redirect always to the
PCI unplug request handler first and remember if that check has been
performed in the zPCI device. Redirect then to the zPCI device unplug
request handler to perform the magic. Remembering that we already
checked the PCI device breaks the redirect loop.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/s390x/s390-pci-bus.c | 166 +++++++++++++++++++++++++++-------------
 hw/s390x/s390-pci-bus.h |   1 +
 2 files changed, 113 insertions(+), 54 deletions(-)

Comments

Cornelia Huck Jan. 23, 2019, 11:03 a.m. UTC | #1
On Mon, 21 Jan 2019 14:42:48 +0100
David Hildenbrand <david@redhat.com> wrote:

> PCI on s390x is really weird and how it was modeled in QEMU might not have
> been the right choice. Anyhow, right now it is the case that:
> - Hotplugging a PCI device will silently create a zPCI device
>   (if none is provided)
> - Hotunplugging a zPCI device will unplug the PCI device (if any)
> - Hotunplugging a PCI device will unplug also the zPCI device
> As far as I can see, we can no longer change this behavior. But we
> should fix it.
> 
> Both device types are handled via a single hotplug handler call. This
> is problematic for various reasons:
> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>    checks are not performed - bad.

Maybe I'm confused here, but how can a zPCI device couple with a bridge?
David Hildenbrand Jan. 23, 2019, 11:08 a.m. UTC | #2
On 23.01.19 12:03, Cornelia Huck wrote:
> On Mon, 21 Jan 2019 14:42:48 +0100
> David Hildenbrand <david@redhat.com> wrote:
> 
>> PCI on s390x is really weird and how it was modeled in QEMU might not have
>> been the right choice. Anyhow, right now it is the case that:
>> - Hotplugging a PCI device will silently create a zPCI device
>>   (if none is provided)
>> - Hotunplugging a zPCI device will unplug the PCI device (if any)
>> - Hotunplugging a PCI device will unplug also the zPCI device
>> As far as I can see, we can no longer change this behavior. But we
>> should fix it.
>>
>> Both device types are handled via a single hotplug handler call. This
>> is problematic for various reasons:
>> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>>    checks are not performed - bad.
> 
> Maybe I'm confused here, but how can a zPCI device couple with a bridge?
> 

I was confused, bridges don't attach to a zPCI device. So this remark is
invalid. Thanks!
Cornelia Huck Jan. 28, 2019, 11:27 a.m. UTC | #3
On Wed, 23 Jan 2019 12:08:37 +0100
David Hildenbrand <david@redhat.com> wrote:

> On 23.01.19 12:03, Cornelia Huck wrote:
> > On Mon, 21 Jan 2019 14:42:48 +0100
> > David Hildenbrand <david@redhat.com> wrote:
> >   
> >> PCI on s390x is really weird and how it was modeled in QEMU might not have
> >> been the right choice. Anyhow, right now it is the case that:
> >> - Hotplugging a PCI device will silently create a zPCI device
> >>   (if none is provided)
> >> - Hotunplugging a zPCI device will unplug the PCI device (if any)
> >> - Hotunplugging a PCI device will unplug also the zPCI device
> >> As far as I can see, we can no longer change this behavior. But we
> >> should fix it.
> >>
> >> Both device types are handled via a single hotplug handler call. This
> >> is problematic for various reasons:
> >> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
> >>    checks are not performed - bad.  
> > 
> > Maybe I'm confused here, but how can a zPCI device couple with a bridge?
> >   
> 
> I was confused, bridges don't attach to a zPCI device. So this remark is
> invalid. Thanks!
> 

Ok, I can remove point 1 when applying.
Pierre Morel Jan. 29, 2019, 1:31 p.m. UTC | #4
On 21/01/2019 14:42, David Hildenbrand wrote:
> PCI on s390x is really weird and how it was modeled in QEMU might not have
> been the right choice. Anyhow, right now it is the case that:
> - Hotplugging a PCI device will silently create a zPCI device
>    (if none is provided)
> - Hotunplugging a zPCI device will unplug the PCI device (if any)
> - Hotunplugging a PCI device will unplug also the zPCI device
> As far as I can see, we can no longer change this behavior. But we
> should fix it.

hum, is it really a problem per se?

> 
> Both device types are handled via a single hotplug handler call. This
> is problematic for various reasons:
> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>     checks are not performed - bad.

bad

> 2. Unplugging via the zPCI device allows to unplug devices that are not
>     hot removable. (check performed in qdev_unplug()) - bad.

bad

> 3. Hotplug handler chains are not possible for the unplug case. In the
>     future, the machine might want to override hotplug handlers, to
>     process device specific stuff and to then branch off to the actual
>     hotplug handler. We need separate hotplug handler calls for both the
>     PCI and zPCI device to make this work reliably. All other PCI
>     implementations are already prepared to handle this correctly, only
>     s390x is missing.

ok

> 
> Therefore, introduce the unplug_request handler and properly perform
> unplug checks by redirecting to the separate unplug_request handlers.
> When finally unplugging, perform two separate hotplug_handler_unplug()
> calls, first for the PCI device, followed by the zPCI device. This now
> nicely splits unplugging paths for both devices.

hum, PCI device handle the backend, host side, while zPCI handle the 
front end, guest side.
So unplugging PCI first will deny the guest any possibility to smoothly 
relinquish a device.


Is it possible the other way around?

Regards,
Pierre
David Hildenbrand Jan. 29, 2019, 3:14 p.m. UTC | #5
On 29.01.19 14:31, Pierre Morel wrote:
> On 21/01/2019 14:42, David Hildenbrand wrote:
>> PCI on s390x is really weird and how it was modeled in QEMU might not have
>> been the right choice. Anyhow, right now it is the case that:
>> - Hotplugging a PCI device will silently create a zPCI device
>>    (if none is provided)
>> - Hotunplugging a zPCI device will unplug the PCI device (if any)
>> - Hotunplugging a PCI device will unplug also the zPCI device
>> As far as I can see, we can no longer change this behavior. But we
>> should fix it.
> 
> hum, is it really a problem per se?

Yes, because this is not the way it is usually done in QEMU :/

> 
>>
>> Both device types are handled via a single hotplug handler call. This
>> is problematic for various reasons:
>> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>>     checks are not performed - bad.
> 
> bad
> 
>> 2. Unplugging via the zPCI device allows to unplug devices that are not
>>     hot removable. (check performed in qdev_unplug()) - bad.
> 
> bad
> 
>> 3. Hotplug handler chains are not possible for the unplug case. In the
>>     future, the machine might want to override hotplug handlers, to
>>     process device specific stuff and to then branch off to the actual
>>     hotplug handler. We need separate hotplug handler calls for both the
>>     PCI and zPCI device to make this work reliably. All other PCI
>>     implementations are already prepared to handle this correctly, only
>>     s390x is missing.
> 
> ok
> 
>>
>> Therefore, introduce the unplug_request handler and properly perform
>> unplug checks by redirecting to the separate unplug_request handlers.
>> When finally unplugging, perform two separate hotplug_handler_unplug()
>> calls, first for the PCI device, followed by the zPCI device. This now
>> nicely splits unplugging paths for both devices.
> 
> hum, PCI device handle the backend, host side, while zPCI handle the 
> front end, guest side.
> So unplugging PCI first will deny the guest any possibility to smoothly 
> relinquish a device.
> 
> 
> Is it possible the other way around?

Maybe, but it does not really matter. We unplug both devices
synchronously, without the guest recognizing the order. We always have
the unplug request first that notifies the guest. When we get an ACK
from the guest, we can unpplug both devices in any order.

(and if we want to change the order, we should do it in a separate
patch, this patch does not change the order, just refactors the code)

Thanks!

> 
> Regards,
> Pierre
>
Pierre Morel Jan. 29, 2019, 4:54 p.m. UTC | #6
On 29/01/2019 16:14, David Hildenbrand wrote:
> On 29.01.19 14:31, Pierre Morel wrote:
>> On 21/01/2019 14:42, David Hildenbrand wrote:
>>> PCI on s390x is really weird and how it was modeled in QEMU might not have
>>> been the right choice. Anyhow, right now it is the case that:
>>> - Hotplugging a PCI device will silently create a zPCI device
>>>     (if none is provided)
>>> - Hotunplugging a zPCI device will unplug the PCI device (if any)
>>> - Hotunplugging a PCI device will unplug also the zPCI device
>>> As far as I can see, we can no longer change this behavior. But we
>>> should fix it.


>> So unplugging PCI first will deny the guest any possibility to smoothly
>> relinquish a device.
>>
>>
>> Is it possible the other way around?
> 
> Maybe, but it does not really matter. We unplug both devices
> synchronously, without the guest recognizing the order. We always have
> the unplug request first that notifies the guest. When we get an ACK
> from the guest, we can unpplug both devices in any order.
> 
> (and if we want to change the order, we should do it in a separate
> patch, this patch does not change the order, just refactors the code)


If it is done atomically, then I have no objection.

Regards,
Pierre
David Hildenbrand Jan. 29, 2019, 8:27 p.m. UTC | #7
On 29.01.19 17:54, Pierre Morel wrote:
> On 29/01/2019 16:14, David Hildenbrand wrote:
>> On 29.01.19 14:31, Pierre Morel wrote:
>>> On 21/01/2019 14:42, David Hildenbrand wrote:
>>>> PCI on s390x is really weird and how it was modeled in QEMU might not have
>>>> been the right choice. Anyhow, right now it is the case that:
>>>> - Hotplugging a PCI device will silently create a zPCI device
>>>>     (if none is provided)
>>>> - Hotunplugging a zPCI device will unplug the PCI device (if any)
>>>> - Hotunplugging a PCI device will unplug also the zPCI device
>>>> As far as I can see, we can no longer change this behavior. But we
>>>> should fix it.
> 
> 
>>> So unplugging PCI first will deny the guest any possibility to smoothly
>>> relinquish a device.
>>>
>>>
>>> Is it possible the other way around?
>>
>> Maybe, but it does not really matter. We unplug both devices
>> synchronously, without the guest recognizing the order. We always have
>> the unplug request first that notifies the guest. When we get an ACK
>> from the guest, we can unpplug both devices in any order.
>>
>> (and if we want to change the order, we should do it in a separate
>> patch, this patch does not change the order, just refactors the code)
> 
> 
> If it is done atomically, then I have no objection.

Yes, this is atomically. It is two separate steps, but only logically.
The guest cannot observe it.

Thanks!

> 
> Regards,
> Pierre
> 
> 
>
Collin Walling Jan. 30, 2019, 7:52 p.m. UTC | #8
On 1/21/19 8:42 AM, David Hildenbrand wrote:
> PCI on s390x is really weird and how it was modeled in QEMU might not have
> been the right choice. Anyhow, right now it is the case that:
> - Hotplugging a PCI device will silently create a zPCI device
>   (if none is provided)
> - Hotunplugging a zPCI device will unplug the PCI device (if any)
> - Hotunplugging a PCI device will unplug also the zPCI device
> As far as I can see, we can no longer change this behavior. But we
> should fix it.
> 
> Both device types are handled via a single hotplug handler call. This
> is problematic for various reasons:
> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>    checks are not performed - bad.

Very bad...

Solving this issue is outside the scope of this patchset, correct? Or
am I missing something (nothing stood out to me as I reviewed the code)

> 2. Unplugging via the zPCI device allows to unplug devices that are not
>    hot removable. (check performed in qdev_unplug()) - bad.

VERY bad.

> 3. Hotplug handler chains are not possible for the unplug case. In the
>    future, the machine might want to override hotplug handlers, to
>    process device specific stuff and to then branch off to the actual
>    hotplug handler. We need separate hotplug handler calls for both the
>    PCI and zPCI device to make this work reliably. All other PCI
>    implementations are already prepared to handle this correctly, only
>    s390x is missing.
> 
> Therefore, introduce the unplug_request handler and properly perform
> unplug checks by redirecting to the separate unplug_request handlers.
> When finally unplugging, perform two separate hotplug_handler_unplug()
> calls, first for the PCI device, followed by the zPCI device. This now
> nicely splits unplugging paths for both devices.
> 
> The redirect part is a little hairy, as the user is allowed to trigger
> unplug either via the PCI or the zPCI device. So redirect always to the
> PCI unplug request handler first and remember if that check has been
> performed in the zPCI device. Redirect then to the zPCI device unplug
> request handler to perform the magic. Remembering that we already
> checked the PCI device breaks the redirect loop.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/s390x/s390-pci-bus.c | 166 +++++++++++++++++++++++++++-------------
>  hw/s390x/s390-pci-bus.h |   1 +
>  2 files changed, 113 insertions(+), 54 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index f017c1ded0..bc17a8cf65 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -148,6 +148,22 @@ out:
>      psccb->header.response_code = cpu_to_be16(rc);
>  }
>  
> +static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev)
> +{
> +    HotplugHandler *hotplug_ctrl;
> +
> +    /* Unplug the PCI device */
> +    if (pbdev->pdev) {
> +        hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev->pdev));
> +        hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev->pdev),
> +                               &error_abort);
> +    }
> +
> +    /* Unplug the zPCI device */
> +    hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev));
> +    hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev), &error_abort);
> +}

Please help me understand this just a little bit better. Most of my 
understanding of the s390 PCI architecture stems from the lower-level 
instruction / DMA stuff and some of the pass-through stuff. I'm just a 
little shaky on the PCI topology with respect to how it is portrayed in 
the QEMU code.

My confusion: when we unplug a zPCI device, we are unplugging a *bus* 
device? Won't that cause an issue if other PCI devices are sharing that
bus? Or am I misunderstanding the QEMU code here and a PCIBusDevice is 
more-or-less a path from the bridge to the PCI device (and not the 
entire bus itself)? Or...?

Sorry for the technical question here that I should probably already 
know the answer to. Hopefully someone can help accelerate my
understanding.

> +
>  void s390_pci_sclp_deconfigure(SCCB *sccb)
>  {
>      IoaCfgSccb *psccb = (IoaCfgSccb *)sccb;
> @@ -179,7 +195,7 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>          rc = SCLP_RC_NORMAL_COMPLETION;
>  
>          if (pbdev->release_timer) {
> -            qdev_unplug(DEVICE(pbdev->pdev), NULL);
> +            s390_pci_perform_unplug(pbdev);
>          }
>      }
>  out:
> @@ -217,6 +233,24 @@ S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
>      return NULL;
>  }
>  
> +static S390PCIBusDevice *s390_pci_find_dev_by_pci(S390pciState *s,
> +                                                  PCIDevice *pci_dev)
> +{
> +    S390PCIBusDevice *pbdev;
> +
> +    if (!pci_dev) {
> +        return NULL;
> +    }
> +
> +    QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
> +        if (pbdev->pdev == pci_dev) {
> +            return pbdev;
> +        }
> +    }
> +
> +    return NULL;
> +}

Maybe call this "s390_pci_find_*bus*_dev_by_pci" to make it a little 
more clear that this finds and returns a bus device? Better yet, maybe
"s390_pci_find_bus" to be similar to the generic "pci_get_bus" function.

> +
>  S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx)
>  {
>      return g_hash_table_lookup(s->zpci_table, &idx);
> @@ -939,77 +973,100 @@ static void s390_pcihost_timer_cb(void *opaque)
>      pbdev->state = ZPCI_FS_STANDBY;
>      s390_pci_generate_plug_event(HP_EVENT_CONFIGURED_TO_STBRES,
>                                   pbdev->fh, pbdev->fid);
> -    qdev_unplug(DEVICE(pbdev), NULL);
> +    s390_pci_perform_unplug(pbdev);
>  }
>  
>  static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                  Error **errp)
>  {
>      S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
> -    PCIDevice *pci_dev = NULL;
> -    PCIBus *bus;
> -    int32_t devfn;
>      S390PCIBusDevice *pbdev = NULL;
>  
> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
> +        PCIDevice *pci_dev = PCI_DEVICE(dev);
> +        PCIBus *bus;
> +        int32_t devfn;
> +
> +        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
> +        g_assert(pbdev);
> +
> +        s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
> +                                     pbdev->fh, pbdev->fid);
> +        bus = pci_get_bus(pci_dev);
> +        devfn = pci_dev->devfn;
> +        object_unparent(OBJECT(pci_dev));
> +
> +        s390_pci_msix_free(pbdev);
> +        s390_pci_iommu_free(s, bus, devfn);
> +        pbdev->pdev = NULL;
> +        pbdev->state = ZPCI_FS_RESERVED;
> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {

"S390_PCI_DEVICE" is synonymous with "zPCI device", right? Just making 
sure...

> +        pbdev = S390_PCI_DEVICE(dev);
> +
> +        if (pbdev->release_timer) {
> +            timer_del(pbdev->release_timer);
> +            timer_free(pbdev->release_timer);
> +            pbdev->release_timer = NULL;
> +        }
> +        pbdev->fid = 0;
> +        QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
> +        g_hash_table_remove(s->zpci_table, &pbdev->idx);
> +        object_unparent(OBJECT(pbdev));
> +    }
> +}
> +
> +static void s390_pcihost_unplug_request(HotplugHandler *hotplug_dev,
> +                                        DeviceState *dev,
> +                                        Error **errp)
> +{
> +    S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
> +    S390PCIBusDevice *pbdev;
> +
>      if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>          error_setg(errp, "PCI bridge hot unplug currently not supported");
> -        return;
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
> -        pci_dev = PCI_DEVICE(dev);
> -
> -        QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
> -            if (pbdev->pdev == pci_dev) {
> -                break;
> -            }
> -        }
> -        assert(pbdev != NULL);
> +        /*
> +         * Redirect the unplug request to the zPCI device and remember that
> +         * we've checked the PCI device already (to prevent endless recursion).
> +         */
> +        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
> +        g_assert(pbdev);
> +        pbdev->pci_unplug_request_processed = true;
> +        qdev_unplug(DEVICE(pbdev), errp);
>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
>          pbdev = S390_PCI_DEVICE(dev);
> -        pci_dev = pbdev->pdev;
> -    } else {
> -        g_assert_not_reached();
> -    }
>  
> -    switch (pbdev->state) {
> -    case ZPCI_FS_RESERVED:
> -        goto out;
> -    case ZPCI_FS_STANDBY:
> -        break;
> -    default:
> -        if (pbdev->release_timer) {
> +        /*
> +         * If unplug was initially requested for the zPCI device, we
> +         * first have to redirect to the PCI device, which will in return
> +         * redirect back to us after performing its checks (if the request
> +         * is not blocked, e.g. because it's a PCI bridge).
> +         */
> +        if (pbdev->pdev && !pbdev->pci_unplug_request_processed) {
> +            qdev_unplug(DEVICE(pbdev->pdev), errp);
>              return;
>          }
> -        s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
> -                                     pbdev->fh, pbdev->fid);
> -        pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
> -                                            s390_pcihost_timer_cb,
> -                                            pbdev);
> -        timer_mod(pbdev->release_timer,
> -                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
> -        return;
> -    }
> +        pbdev->pci_unplug_request_processed = false;
>  
> -    if (pbdev->release_timer) {
> -        timer_del(pbdev->release_timer);
> -        timer_free(pbdev->release_timer);
> -        pbdev->release_timer = NULL;
> +        switch (pbdev->state) {
> +        case ZPCI_FS_STANDBY:
> +        case ZPCI_FS_RESERVED:
> +            s390_pci_perform_unplug(pbdev);
> +            break;
> +        default:
> +            if (pbdev->release_timer) {
> +                return;
> +            }
> +            s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
> +                                         pbdev->fh, pbdev->fid);
> +            pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
> +                                                s390_pcihost_timer_cb, pbdev);
> +            timer_mod(pbdev->release_timer,
> +                    qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
> +        }
> +    } else {
> +        g_assert_not_reached();
>      }
> -
> -    s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
> -                                 pbdev->fh, pbdev->fid);
> -    bus = pci_get_bus(pci_dev);
> -    devfn = pci_dev->devfn;
> -    object_unparent(OBJECT(pci_dev));
> -    fmb_timer_free(pbdev);
> -    s390_pci_msix_free(pbdev);
> -    s390_pci_iommu_free(s, bus, devfn);
> -    pbdev->pdev = NULL;
> -    pbdev->state = ZPCI_FS_RESERVED;
> -out:
> -    pbdev->fid = 0;
> -    QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
> -    g_hash_table_remove(s->zpci_table, &pbdev->idx);
> -    object_unparent(OBJECT(pbdev));
>  }

So from what I understand: _unplug_request will deconfigure the device 
(if !standby and !reserved) and _unplug will remove the device from the 
guest?

Looks good.

>  
>  static void s390_pci_enumerate_bridge(PCIBus *bus, PCIDevice *pdev,
> @@ -1059,6 +1116,7 @@ static void s390_pcihost_class_init(ObjectClass *klass, void *data)
>      dc->realize = s390_pcihost_realize;
>      hc->pre_plug = s390_pcihost_pre_plug;
>      hc->plug = s390_pcihost_plug;
> +    hc->unplug_request = s390_pcihost_unplug_request;
>      hc->unplug = s390_pcihost_unplug;
>      msi_nonbroken = true;
>  }
> diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
> index dadad1f758..b1a6bb8296 100644
> --- a/hw/s390x/s390-pci-bus.h
> +++ b/hw/s390x/s390-pci-bus.h
> @@ -336,6 +336,7 @@ struct S390PCIBusDevice {
>      IndAddr *summary_ind;
>      IndAddr *indicator;
>      QEMUTimer *release_timer;
> +    bool pci_unplug_request_processed;
>      QTAILQ_ENTRY(S390PCIBusDevice) link;
>  };
>  
> 

Patch looks good so far. I only have a few questions above.
David Hildenbrand Jan. 31, 2019, 9:31 a.m. UTC | #9
On 30.01.19 20:52, Collin Walling wrote:
> On 1/21/19 8:42 AM, David Hildenbrand wrote:
>> PCI on s390x is really weird and how it was modeled in QEMU might not have
>> been the right choice. Anyhow, right now it is the case that:
>> - Hotplugging a PCI device will silently create a zPCI device
>>   (if none is provided)
>> - Hotunplugging a zPCI device will unplug the PCI device (if any)
>> - Hotunplugging a PCI device will unplug also the zPCI device
>> As far as I can see, we can no longer change this behavior. But we
>> should fix it.
>>
>> Both device types are handled via a single hotplug handler call. This
>> is problematic for various reasons:
>> 1. Unplugging via the zPCI device allows to unplug PCI bridges as
>>    checks are not performed - bad.
> 
> Very bad...
> 
> Solving this issue is outside the scope of this patchset, correct? Or
> am I missing something (nothing stood out to me as I reviewed the code)

As Conny mentioned, 1. does not hold (it resulted from a lack of
understanding on my side). So 1. is not possible.

> 
>> 2. Unplugging via the zPCI device allows to unplug devices that are not
>>    hot removable. (check performed in qdev_unplug()) - bad.
> 
> VERY bad.

This one however holds.

> 
>> 3. Hotplug handler chains are not possible for the unplug case. In the
>>    future, the machine might want to override hotplug handlers, to
>>    process device specific stuff and to then branch off to the actual
>>    hotplug handler. We need separate hotplug handler calls for both the
>>    PCI and zPCI device to make this work reliably. All other PCI
>>    implementations are already prepared to handle this correctly, only
>>    s390x is missing.
>>
>> Therefore, introduce the unplug_request handler and properly perform
>> unplug checks by redirecting to the separate unplug_request handlers.
>> When finally unplugging, perform two separate hotplug_handler_unplug()
>> calls, first for the PCI device, followed by the zPCI device. This now
>> nicely splits unplugging paths for both devices.
>>
>> The redirect part is a little hairy, as the user is allowed to trigger
>> unplug either via the PCI or the zPCI device. So redirect always to the
>> PCI unplug request handler first and remember if that check has been
>> performed in the zPCI device. Redirect then to the zPCI device unplug
>> request handler to perform the magic. Remembering that we already
>> checked the PCI device breaks the redirect loop.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  hw/s390x/s390-pci-bus.c | 166 +++++++++++++++++++++++++++-------------
>>  hw/s390x/s390-pci-bus.h |   1 +
>>  2 files changed, 113 insertions(+), 54 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index f017c1ded0..bc17a8cf65 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -148,6 +148,22 @@ out:
>>      psccb->header.response_code = cpu_to_be16(rc);
>>  }
>>  
>> +static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev)
>> +{
>> +    HotplugHandler *hotplug_ctrl;
>> +
>> +    /* Unplug the PCI device */
>> +    if (pbdev->pdev) {
>> +        hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev->pdev));
>> +        hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev->pdev),
>> +                               &error_abort);
>> +    }
>> +
>> +    /* Unplug the zPCI device */
>> +    hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev));
>> +    hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev), &error_abort);
>> +}
> 
> Please help me understand this just a little bit better. Most of my 
> understanding of the s390 PCI architecture stems from the lower-level 
> instruction / DMA stuff and some of the pass-through stuff. I'm just a 
> little shaky on the PCI topology with respect to how it is portrayed in 
> the QEMU code.
> 
> My confusion: when we unplug a zPCI device, we are unplugging a *bus* 
> device? Won't that cause an issue if other PCI devices are sharing that
> bus? Or am I misunderstanding the QEMU code here and a PCIBusDevice is 
> more-or-less a path from the bridge to the PCI device (and not the 
> entire bus itself)? Or...?

Both zPCI as well as PCI devices are actually bus devices ("devices
attached to a bus" - TYPE_DEVICE with dc->bus_type configured). (And the
registered bus hotplug handler handles hotplug/unplug on that bus).

There should not be any problem as long as we are not unplugging a
bridge, which has it's own bus with potentially other devices
registered. But we already shield unplugging bridges, so that is not an
issue.

S390PCIBusDevice (a.k.a. zPCI devices) are just a mean to attach further
architecture information to a PCI devices. These devices are all
attached to the TYPE_S390_PCI_BUS (and there is only one for the whole
system, part of the PCI host bridge)!

In contrast, PCI devices are attached to whatever bus is defined (it
could be the bus of the host bridge or the bus of a created bridge).

So unplugging an ordinary PCI device / S390PCIBusDevice does not affect
other devices, as long as we don't unplug bridges with devices attached.
But we already disallow that.

(I *think* we could even allow unplugging of PCI bridges if there are no
other devices attached, but that is future work)


> 
> Sorry for the technical question here that I should probably already 
> know the answer to. Hopefully someone can help accelerate my
> understanding.
> 
>> +
>>  void s390_pci_sclp_deconfigure(SCCB *sccb)
>>  {
>>      IoaCfgSccb *psccb = (IoaCfgSccb *)sccb;
>> @@ -179,7 +195,7 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>>          rc = SCLP_RC_NORMAL_COMPLETION;
>>  
>>          if (pbdev->release_timer) {
>> -            qdev_unplug(DEVICE(pbdev->pdev), NULL);
>> +            s390_pci_perform_unplug(pbdev);
>>          }
>>      }
>>  out:
>> @@ -217,6 +233,24 @@ S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
>>      return NULL;
>>  }
>>  
>> +static S390PCIBusDevice *s390_pci_find_dev_by_pci(S390pciState *s,
>> +                                                  PCIDevice *pci_dev)
>> +{
>> +    S390PCIBusDevice *pbdev;
>> +
>> +    if (!pci_dev) {
>> +        return NULL;
>> +    }
>> +
>> +    QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
>> +        if (pbdev->pdev == pci_dev) {
>> +            return pbdev;
>> +        }
>> +    }
>> +
>> +    return NULL;
>> +}
> 
> Maybe call this "s390_pci_find_*bus*_dev_by_pci" to make it a little 
> more clear that this finds and returns a bus device? Better yet, maybe
> "s390_pci_find_bus" to be similar to the generic "pci_get_bus" function.
> 

I chose this naming because it matches the other functions that already
exist

s390_pci_find_next_avail_dev()
s390_pci_find_dev_by_fid()
s390_pci_find_dev_by_uid()
...

>> +
>>  S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx)
>>  {
>>      return g_hash_table_lookup(s->zpci_table, &idx);
>> @@ -939,77 +973,100 @@ static void s390_pcihost_timer_cb(void *opaque)
>>      pbdev->state = ZPCI_FS_STANDBY;
>>      s390_pci_generate_plug_event(HP_EVENT_CONFIGURED_TO_STBRES,
>>                                   pbdev->fh, pbdev->fid);
>> -    qdev_unplug(DEVICE(pbdev), NULL);
>> +    s390_pci_perform_unplug(pbdev);
>>  }
>>  
>>  static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>                                  Error **errp)
>>  {
>>      S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>> -    PCIDevice *pci_dev = NULL;
>> -    PCIBus *bus;
>> -    int32_t devfn;
>>      S390PCIBusDevice *pbdev = NULL;
>>  
>> +    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
>> +        PCIDevice *pci_dev = PCI_DEVICE(dev);
>> +        PCIBus *bus;
>> +        int32_t devfn;
>> +
>> +        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
>> +        g_assert(pbdev);
>> +
>> +        s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
>> +                                     pbdev->fh, pbdev->fid);
>> +        bus = pci_get_bus(pci_dev);
>> +        devfn = pci_dev->devfn;
>> +        object_unparent(OBJECT(pci_dev));
>> +
>> +        s390_pci_msix_free(pbdev);
>> +        s390_pci_iommu_free(s, bus, devfn);
>> +        pbdev->pdev = NULL;
>> +        pbdev->state = ZPCI_FS_RESERVED;
>> +    } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
> 
> "S390_PCI_DEVICE" is synonymous with "zPCI device", right? Just making 
> sure...

Yes, exactly.

> 
>> +        pbdev = S390_PCI_DEVICE(dev);
>> +
>> +        if (pbdev->release_timer) {
>> +            timer_del(pbdev->release_timer);
>> +            timer_free(pbdev->release_timer);
>> +            pbdev->release_timer = NULL;
>> +        }
>> +        pbdev->fid = 0;
>> +        QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
>> +        g_hash_table_remove(s->zpci_table, &pbdev->idx);
>> +        object_unparent(OBJECT(pbdev));
>> +    }
>> +}
>> +
>> +static void s390_pcihost_unplug_request(HotplugHandler *hotplug_dev,
>> +                                        DeviceState *dev,
>> +                                        Error **errp)
>> +{
>> +    S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>> +    S390PCIBusDevice *pbdev;
>> +
>>      if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>>          error_setg(errp, "PCI bridge hot unplug currently not supported");
>> -        return;
>>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
>> -        pci_dev = PCI_DEVICE(dev);
>> -
>> -        QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
>> -            if (pbdev->pdev == pci_dev) {
>> -                break;
>> -            }
>> -        }
>> -        assert(pbdev != NULL);
>> +        /*
>> +         * Redirect the unplug request to the zPCI device and remember that
>> +         * we've checked the PCI device already (to prevent endless recursion).
>> +         */
>> +        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
>> +        g_assert(pbdev);
>> +        pbdev->pci_unplug_request_processed = true;
>> +        qdev_unplug(DEVICE(pbdev), errp);
>>      } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
>>          pbdev = S390_PCI_DEVICE(dev);
>> -        pci_dev = pbdev->pdev;
>> -    } else {
>> -        g_assert_not_reached();
>> -    }
>>  
>> -    switch (pbdev->state) {
>> -    case ZPCI_FS_RESERVED:
>> -        goto out;
>> -    case ZPCI_FS_STANDBY:
>> -        break;
>> -    default:
>> -        if (pbdev->release_timer) {
>> +        /*
>> +         * If unplug was initially requested for the zPCI device, we
>> +         * first have to redirect to the PCI device, which will in return
>> +         * redirect back to us after performing its checks (if the request
>> +         * is not blocked, e.g. because it's a PCI bridge).
>> +         */
>> +        if (pbdev->pdev && !pbdev->pci_unplug_request_processed) {
>> +            qdev_unplug(DEVICE(pbdev->pdev), errp);
>>              return;
>>          }
>> -        s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
>> -                                     pbdev->fh, pbdev->fid);
>> -        pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
>> -                                            s390_pcihost_timer_cb,
>> -                                            pbdev);
>> -        timer_mod(pbdev->release_timer,
>> -                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
>> -        return;
>> -    }
>> +        pbdev->pci_unplug_request_processed = false;
>>  
>> -    if (pbdev->release_timer) {
>> -        timer_del(pbdev->release_timer);
>> -        timer_free(pbdev->release_timer);
>> -        pbdev->release_timer = NULL;
>> +        switch (pbdev->state) {
>> +        case ZPCI_FS_STANDBY:
>> +        case ZPCI_FS_RESERVED:
>> +            s390_pci_perform_unplug(pbdev);
>> +            break;
>> +        default:
>> +            if (pbdev->release_timer) {
>> +                return;
>> +            }
>> +            s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
>> +                                         pbdev->fh, pbdev->fid);
>> +            pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
>> +                                                s390_pcihost_timer_cb, pbdev);
>> +            timer_mod(pbdev->release_timer,
>> +                    qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
>> +        }
>> +    } else {
>> +        g_assert_not_reached();
>>      }
>> -
>> -    s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
>> -                                 pbdev->fh, pbdev->fid);
>> -    bus = pci_get_bus(pci_dev);
>> -    devfn = pci_dev->devfn;
>> -    object_unparent(OBJECT(pci_dev));
>> -    fmb_timer_free(pbdev);
>> -    s390_pci_msix_free(pbdev);
>> -    s390_pci_iommu_free(s, bus, devfn);
>> -    pbdev->pdev = NULL;
>> -    pbdev->state = ZPCI_FS_RESERVED;
>> -out:
>> -    pbdev->fid = 0;
>> -    QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
>> -    g_hash_table_remove(s->zpci_table, &pbdev->idx);
>> -    object_unparent(OBJECT(pbdev));
>>  }
> 
> So from what I understand: _unplug_request will deconfigure the device 
> (if !standby and !reserved) and _unplug will remove the device from the 
> guest?

You can think of unplug_request() as something like

"If the guest does not use this device, perform the unplug. Otherwise,
ask the guest ("request") to release the device so we can unplug it".

> 
> Looks good.
> 
>>  
>>  static void s390_pci_enumerate_bridge(PCIBus *bus, PCIDevice *pdev,
>> @@ -1059,6 +1116,7 @@ static void s390_pcihost_class_init(ObjectClass *klass, void *data)
>>      dc->realize = s390_pcihost_realize;
>>      hc->pre_plug = s390_pcihost_pre_plug;
>>      hc->plug = s390_pcihost_plug;
>> +    hc->unplug_request = s390_pcihost_unplug_request;
>>      hc->unplug = s390_pcihost_unplug;
>>      msi_nonbroken = true;
>>  }
>> diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
>> index dadad1f758..b1a6bb8296 100644
>> --- a/hw/s390x/s390-pci-bus.h
>> +++ b/hw/s390x/s390-pci-bus.h
>> @@ -336,6 +336,7 @@ struct S390PCIBusDevice {
>>      IndAddr *summary_ind;
>>      IndAddr *indicator;
>>      QEMUTimer *release_timer;
>> +    bool pci_unplug_request_processed;
>>      QTAILQ_ENTRY(S390PCIBusDevice) link;
>>  };
>>  
>>
> 
> Patch looks good so far. I only have a few questions above.
> 

Sure, thanks for having a look! BTW, I resend all s390x/pci patches as a
single series yesterday to give a better overview of what I am changing.

Thanks!
diff mbox series

Patch

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index f017c1ded0..bc17a8cf65 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -148,6 +148,22 @@  out:
     psccb->header.response_code = cpu_to_be16(rc);
 }
 
+static void s390_pci_perform_unplug(S390PCIBusDevice *pbdev)
+{
+    HotplugHandler *hotplug_ctrl;
+
+    /* Unplug the PCI device */
+    if (pbdev->pdev) {
+        hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev->pdev));
+        hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev->pdev),
+                               &error_abort);
+    }
+
+    /* Unplug the zPCI device */
+    hotplug_ctrl = qdev_get_hotplug_handler(DEVICE(pbdev));
+    hotplug_handler_unplug(hotplug_ctrl, DEVICE(pbdev), &error_abort);
+}
+
 void s390_pci_sclp_deconfigure(SCCB *sccb)
 {
     IoaCfgSccb *psccb = (IoaCfgSccb *)sccb;
@@ -179,7 +195,7 @@  void s390_pci_sclp_deconfigure(SCCB *sccb)
         rc = SCLP_RC_NORMAL_COMPLETION;
 
         if (pbdev->release_timer) {
-            qdev_unplug(DEVICE(pbdev->pdev), NULL);
+            s390_pci_perform_unplug(pbdev);
         }
     }
 out:
@@ -217,6 +233,24 @@  S390PCIBusDevice *s390_pci_find_dev_by_target(S390pciState *s,
     return NULL;
 }
 
+static S390PCIBusDevice *s390_pci_find_dev_by_pci(S390pciState *s,
+                                                  PCIDevice *pci_dev)
+{
+    S390PCIBusDevice *pbdev;
+
+    if (!pci_dev) {
+        return NULL;
+    }
+
+    QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
+        if (pbdev->pdev == pci_dev) {
+            return pbdev;
+        }
+    }
+
+    return NULL;
+}
+
 S390PCIBusDevice *s390_pci_find_dev_by_idx(S390pciState *s, uint32_t idx)
 {
     return g_hash_table_lookup(s->zpci_table, &idx);
@@ -939,77 +973,100 @@  static void s390_pcihost_timer_cb(void *opaque)
     pbdev->state = ZPCI_FS_STANDBY;
     s390_pci_generate_plug_event(HP_EVENT_CONFIGURED_TO_STBRES,
                                  pbdev->fh, pbdev->fid);
-    qdev_unplug(DEVICE(pbdev), NULL);
+    s390_pci_perform_unplug(pbdev);
 }
 
 static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
                                 Error **errp)
 {
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
-    PCIDevice *pci_dev = NULL;
-    PCIBus *bus;
-    int32_t devfn;
     S390PCIBusDevice *pbdev = NULL;
 
+    if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
+        PCIDevice *pci_dev = PCI_DEVICE(dev);
+        PCIBus *bus;
+        int32_t devfn;
+
+        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
+        g_assert(pbdev);
+
+        s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
+                                     pbdev->fh, pbdev->fid);
+        bus = pci_get_bus(pci_dev);
+        devfn = pci_dev->devfn;
+        object_unparent(OBJECT(pci_dev));
+
+        s390_pci_msix_free(pbdev);
+        s390_pci_iommu_free(s, bus, devfn);
+        pbdev->pdev = NULL;
+        pbdev->state = ZPCI_FS_RESERVED;
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
+        pbdev = S390_PCI_DEVICE(dev);
+
+        if (pbdev->release_timer) {
+            timer_del(pbdev->release_timer);
+            timer_free(pbdev->release_timer);
+            pbdev->release_timer = NULL;
+        }
+        pbdev->fid = 0;
+        QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
+        g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        object_unparent(OBJECT(pbdev));
+    }
+}
+
+static void s390_pcihost_unplug_request(HotplugHandler *hotplug_dev,
+                                        DeviceState *dev,
+                                        Error **errp)
+{
+    S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
+    S390PCIBusDevice *pbdev;
+
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
         error_setg(errp, "PCI bridge hot unplug currently not supported");
-        return;
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) {
-        pci_dev = PCI_DEVICE(dev);
-
-        QTAILQ_FOREACH(pbdev, &s->zpci_devs, link) {
-            if (pbdev->pdev == pci_dev) {
-                break;
-            }
-        }
-        assert(pbdev != NULL);
+        /*
+         * Redirect the unplug request to the zPCI device and remember that
+         * we've checked the PCI device already (to prevent endless recursion).
+         */
+        pbdev = s390_pci_find_dev_by_pci(s, PCI_DEVICE(dev));
+        g_assert(pbdev);
+        pbdev->pci_unplug_request_processed = true;
+        qdev_unplug(DEVICE(pbdev), errp);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
         pbdev = S390_PCI_DEVICE(dev);
-        pci_dev = pbdev->pdev;
-    } else {
-        g_assert_not_reached();
-    }
 
-    switch (pbdev->state) {
-    case ZPCI_FS_RESERVED:
-        goto out;
-    case ZPCI_FS_STANDBY:
-        break;
-    default:
-        if (pbdev->release_timer) {
+        /*
+         * If unplug was initially requested for the zPCI device, we
+         * first have to redirect to the PCI device, which will in return
+         * redirect back to us after performing its checks (if the request
+         * is not blocked, e.g. because it's a PCI bridge).
+         */
+        if (pbdev->pdev && !pbdev->pci_unplug_request_processed) {
+            qdev_unplug(DEVICE(pbdev->pdev), errp);
             return;
         }
-        s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
-                                     pbdev->fh, pbdev->fid);
-        pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
-                                            s390_pcihost_timer_cb,
-                                            pbdev);
-        timer_mod(pbdev->release_timer,
-                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
-        return;
-    }
+        pbdev->pci_unplug_request_processed = false;
 
-    if (pbdev->release_timer) {
-        timer_del(pbdev->release_timer);
-        timer_free(pbdev->release_timer);
-        pbdev->release_timer = NULL;
+        switch (pbdev->state) {
+        case ZPCI_FS_STANDBY:
+        case ZPCI_FS_RESERVED:
+            s390_pci_perform_unplug(pbdev);
+            break;
+        default:
+            if (pbdev->release_timer) {
+                return;
+            }
+            s390_pci_generate_plug_event(HP_EVENT_DECONFIGURE_REQUEST,
+                                         pbdev->fh, pbdev->fid);
+            pbdev->release_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL,
+                                                s390_pcihost_timer_cb, pbdev);
+            timer_mod(pbdev->release_timer,
+                    qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + HOT_UNPLUG_TIMEOUT);
+        }
+    } else {
+        g_assert_not_reached();
     }
-
-    s390_pci_generate_plug_event(HP_EVENT_STANDBY_TO_RESERVED,
-                                 pbdev->fh, pbdev->fid);
-    bus = pci_get_bus(pci_dev);
-    devfn = pci_dev->devfn;
-    object_unparent(OBJECT(pci_dev));
-    fmb_timer_free(pbdev);
-    s390_pci_msix_free(pbdev);
-    s390_pci_iommu_free(s, bus, devfn);
-    pbdev->pdev = NULL;
-    pbdev->state = ZPCI_FS_RESERVED;
-out:
-    pbdev->fid = 0;
-    QTAILQ_REMOVE(&s->zpci_devs, pbdev, link);
-    g_hash_table_remove(s->zpci_table, &pbdev->idx);
-    object_unparent(OBJECT(pbdev));
 }
 
 static void s390_pci_enumerate_bridge(PCIBus *bus, PCIDevice *pdev,
@@ -1059,6 +1116,7 @@  static void s390_pcihost_class_init(ObjectClass *klass, void *data)
     dc->realize = s390_pcihost_realize;
     hc->pre_plug = s390_pcihost_pre_plug;
     hc->plug = s390_pcihost_plug;
+    hc->unplug_request = s390_pcihost_unplug_request;
     hc->unplug = s390_pcihost_unplug;
     msi_nonbroken = true;
 }
diff --git a/hw/s390x/s390-pci-bus.h b/hw/s390x/s390-pci-bus.h
index dadad1f758..b1a6bb8296 100644
--- a/hw/s390x/s390-pci-bus.h
+++ b/hw/s390x/s390-pci-bus.h
@@ -336,6 +336,7 @@  struct S390PCIBusDevice {
     IndAddr *summary_ind;
     IndAddr *indicator;
     QEMUTimer *release_timer;
+    bool pci_unplug_request_processed;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };