mbox series

[v3,0/7] hw/pci: SR-IOV related fixes and improvements

Message ID 20240212-reuse-v3-0-8017b689ce7f@daynix.com
Headers show
Series hw/pci: SR-IOV related fixes and improvements | expand

Message

Akihiko Odaki Feb. 12, 2024, 10:20 a.m. UTC
I submitted a RFC series[1] to add support for SR-IOV emulation to
virtio-net-pci. During the development of the series, I fixed some
trivial bugs and made improvements that I think are independently
useful. This series extracts those fixes and improvements from the RFC
series. Below is an explanation of the patches:

Patch 1 adds a function to check if ROM BAR is explicitly enabled. It
is used in the RFC series to report an error if the user requests to
enable ROM BAR for SR-IOV VF. Patch 2 and 3 use it for vfio to remove
hacky device option dictionary inspection.

Patch 4 adds SR-IOV NumVFs validation to fix potential buffer overflow.

Patch 5 changes to realize SR-IOV VFs when the PF is being realized to
validate VF configuration.

Patch 6 fixes memory leak that occurs if a SR-IOV VF fails to realize.

[1]: https://patchew.org/QEMU/20231210-sriov-v2-0-b959e8a6dfaf@daynix.com/

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
Changes in v3:
- Extracted patch "hw/pci: Use -1 as a default value for rombar" from
  patch "hw/pci: Determine if rombar is explicitly enabled"
  (Philippe Mathieu-Daudé)
- Added an audit result of PCIDevice::rom_bar to the message of patch
  "hw/pci: Use -1 as a default value for rombar"
  (Philippe Mathieu-Daudé)
- Link to v2: https://lore.kernel.org/r/20240210-reuse-v2-0-24ba2a502692@daynix.com

Changes in v2:
- Reset after enabling a function so that NVMe VF state gets updated.
- Link to v1: https://lore.kernel.org/r/20240203-reuse-v1-0-5be8c5ce6338@daynix.com

---
Akihiko Odaki (7):
      hw/pci: Use -1 as a default value for rombar
      hw/pci: Determine if rombar is explicitly enabled
      vfio: Avoid inspecting option QDict for rombar
      hw/qdev: Remove opts member
      pcie_sriov: Validate NumVFs
      pcie_sriov: Reuse SR-IOV VF device instances
      pcie_sriov: Release VFs failed to realize

 docs/pcie_sriov.txt         |   8 ++--
 include/hw/pci/pci.h        |   2 +-
 include/hw/pci/pci_device.h |   7 ++-
 include/hw/pci/pcie_sriov.h |   6 +--
 include/hw/qdev-core.h      |   4 --
 hw/core/qdev.c              |   1 -
 hw/net/igb.c                |  13 ++++--
 hw/nvme/ctrl.c              |  24 ++++++----
 hw/pci/pci.c                |  20 +++++----
 hw/pci/pci_host.c           |   4 +-
 hw/pci/pcie.c               |   4 +-
 hw/pci/pcie_sriov.c         | 105 +++++++++++++++++++++-----------------------
 hw/vfio/pci.c               |   3 +-
 system/qdev-monitor.c       |  12 ++---
 14 files changed, 116 insertions(+), 97 deletions(-)
---
base-commit: 4a4efae44f19528589204581e9e2fab69c5d39aa
change-id: 20240129-reuse-faae22b11934

Best regards,

Comments

Akihiko Odaki Feb. 13, 2024, 12:26 p.m. UTC | #1
On 2024/02/13 17:51, Minwoo Im wrote:
>> -----Original Message-----
>> From: qemu-block-bounces+minwoo.im.dev=gmail.com@nongnu.org <qemu-block-
>> bounces+minwoo.im.dev=gmail.com@nongnu.org> On Behalf Of Akihiko Odaki
>> Sent: Monday, February 12, 2024 7:21 PM
>> To: Philippe Mathieu-Daudé <philmd@linaro.org>; Michael S. Tsirkin
>> <mst@redhat.com>; Marcel Apfelbaum <marcel.apfelbaum@gmail.com>; Alex
>> Williamson <alex.williamson@redhat.com>; Cédric Le Goater <clg@redhat.com>;
>> Paolo Bonzini <pbonzini@redhat.com>; Daniel P. Berrangé <berrange@redhat.com>;
>> Eduardo Habkost <eduardo@habkost.net>; Sriram Yagnaraman
>> <sriram.yagnaraman@est.tech>; Jason Wang <jasowang@redhat.com>; Keith Busch
>> <kbusch@kernel.org>; Klaus Jensen <its@irrelevant.dk>
>> Cc: qemu-devel@nongnu.org; qemu-block@nongnu.org; Akihiko Odaki
>> <akihiko.odaki@daynix.com>
>> Subject: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances
>>
>> Disable SR-IOV VF devices by reusing code to power down PCI devices
>> instead of removing them when the guest requests to disable VFs. This
>> allows to realize devices and report VF realization errors at PF
>> realization time.
>>
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> 
> Hello Akihiko,
> 
> I think this patch fixes the issue reported in [1].  The latest master branch
> also causes an object-related assertion error when we enable VF(s) and disable
> through sysfs over and over again (at least twice).  But this issue is also
> fixed with your patch.
> 
> **
> ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == NULL)
> Bail out! ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == NULL)

I'll note that in the next version.

> 
> Klaus,
> 
> If this patchset is applied, I think [1] can be dropped.  What do you think?

[1] should be kept. This patchset fixes use-after-free but double free 
[1] fixes still occurs.

Regards,
Akihiko Odaki

> 
> Thanks,
> 
> [1] https://lore.kernel.org/qemu-devel/20240109022953epcms2p54550dcfc9f831a515206513ae98e7511@epcms2p5/
>
Akihiko Odaki Feb. 13, 2024, 1:48 p.m. UTC | #2
On 2024/02/13 17:51, Minwoo Im wrote:
>> -----Original Message-----
>> From: qemu-block-bounces+minwoo.im.dev=gmail.com@nongnu.org <qemu-block-
>> bounces+minwoo.im.dev=gmail.com@nongnu.org> On Behalf Of Akihiko Odaki
>> Sent: Monday, February 12, 2024 7:21 PM
>> To: Philippe Mathieu-Daudé <philmd@linaro.org>; Michael S. Tsirkin
>> <mst@redhat.com>; Marcel Apfelbaum <marcel.apfelbaum@gmail.com>; Alex
>> Williamson <alex.williamson@redhat.com>; Cédric Le Goater <clg@redhat.com>;
>> Paolo Bonzini <pbonzini@redhat.com>; Daniel P. Berrangé <berrange@redhat.com>;
>> Eduardo Habkost <eduardo@habkost.net>; Sriram Yagnaraman
>> <sriram.yagnaraman@est.tech>; Jason Wang <jasowang@redhat.com>; Keith Busch
>> <kbusch@kernel.org>; Klaus Jensen <its@irrelevant.dk>
>> Cc: qemu-devel@nongnu.org; qemu-block@nongnu.org; Akihiko Odaki
>> <akihiko.odaki@daynix.com>
>> Subject: [PATCH v3 6/7] pcie_sriov: Reuse SR-IOV VF device instances
>>
>> Disable SR-IOV VF devices by reusing code to power down PCI devices
>> instead of removing them when the guest requests to disable VFs. This
>> allows to realize devices and report VF realization errors at PF
>> realization time.
>>
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> 
> Hello Akihiko,
> 
> I think this patch fixes the issue reported in [1].  The latest master branch
> also causes an object-related assertion error when we enable VF(s) and disable
> through sysfs over and over again (at least twice).  But this issue is also
> fixed with your patch.
> 
> **
> ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == NULL)
> Bail out! ERROR:../qom/object.c:753:object_finalize: assertion failed: (obj->parent == NULL)

I looked into this and found it's not fixed with my patch though the 
symptom may be gone.

The problem is that object_ref() is not called when copying subsys. An 
obvious fix is to add an object_ref() call, but I think it's too hacky 
and error-prone. It's better to enumerate nvme_props and call 
object_property_get_qobject() and object_property_set_qobject() for each 
property.

Regards,
Akihiko Odaki

> 
> Klaus,
> 
> If this patchset is applied, I think [1] can be dropped.  What do you think?
> 
> Thanks,
> 
> [1] https://lore.kernel.org/qemu-devel/20240109022953epcms2p54550dcfc9f831a515206513ae98e7511@epcms2p5/
>