Message ID | 1450176195-9066-2-git-send-email-caoj.fnst@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
This got lost over the Christmas break, sorry. Cc'ing Marcel for additional PCI expertise. Cao jin <caoj.fnst@cn.fujitsu.com> writes: > msi_init() is a supporting function in PCI device initialization, > in order to convert .init() to .realize(), it should be modified first. "Supporting function" doesn't imply "should use Error to report errors". HACKING explains: Use the simplest suitable method to communicate success / failure to callers. Stick to common methods: non-negative on success / -1 on error, non-negative / -errno, non-null / null, or Error objects. Example: when a function returns a non-null pointer on success, and it can fail only in one way (as far as the caller is concerned), returning null on failure is just fine, and certainly simpler and a lot easier on the eyes than propagating an Error object through an Error ** parameter. Example: when a function's callers need to report details on failure only the function really knows, use Error **, and set suitable errors. Do not report an error to the user when you're also returning an error for somebody else to handle. Leave the reporting to the place that consumes the error returned. As we'll see in your patch of msi.c, msi_init() can fail in multiple ways, and uses -errno to comunicate them. That can be okay even in realize(). It also reports to the user. That's what makes it unsuitable for realize(). > Also modify the callers Actually, you're *fixing* callers! But the bugs aren't 100% clear, yet, see below for details. Once we know what the bugs are, we'll want to reword the commit message to describe the bugs and their impact. I recommend to skip ahead to msi.c, then come back to the device models. > Bonus: add more comment for msi_init(). > Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > --- > hw/audio/intel-hda.c | 10 ++++- > hw/ide/ich.c | 2 +- > hw/net/vmxnet3.c | 13 +++--- > hw/pci-bridge/ioh3420.c | 7 +++- > hw/pci-bridge/pci_bridge_dev.c | 8 +++- > hw/pci-bridge/xio3130_downstream.c | 8 +++- > hw/pci-bridge/xio3130_upstream.c | 8 +++- > hw/pci/msi.c | 18 +++++++-- > hw/scsi/megasas.c | 15 +++++-- > hw/scsi/vmw_pvscsi.c | 13 ++++-- > hw/usb/hcd-xhci.c | 81 +++++++++++++++++++++----------------- > hw/vfio/pci.c | 20 +++++----- > include/hw/pci/msi.h | 4 +- > 13 files changed, 135 insertions(+), 72 deletions(-) > > diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c > index 433463e..0d770131 100644 > --- a/hw/audio/intel-hda.c > +++ b/hw/audio/intel-hda.c > @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) > { > IntelHDAState *d = INTEL_HDA(pci); > uint8_t *conf = d->pci.config; > + int ret; > > d->name = object_get_typename(OBJECT(d)); > > @@ -1142,11 +1143,18 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) > "intel-hda", 0x4000); > pci_register_bar(&d->pci, 0, 0, &d->mmio); > if (d->msi) { > - msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false); > + ret = msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, > + false, errp); > + if (ret < 0) { > + goto cleanup_on_msi_fail; > + } > } > > hda_codec_bus_init(DEVICE(pci), &d->codecs, sizeof(d->codecs), > intel_hda_response, intel_hda_xfer); > + > +cleanup_on_msi_fail: > + object_unref(OBJECT(&d->mmio)); > } > This is a bug fix. Two bugs, actually. Bug#1: we ignore msi_init() failure. If it fails because the board doesn't support MSI, the failure is ignored silently. If it fails because the PCI config space offset is already occupied, we report the error to the user, but still ignore it. The latter feels like a programming error to me. Your patch fixes realize() to fail when msi_init() fails. Bug#2: we report errors with error_report_err() instead through errp. This is because we use pci_add_capability() instead of pci_add_capability2(). I don't have the time to review the other devices you patch. Both bugs should be easy to spot in the patch: if the value of msi_init() is ignored before the patch, the device got bug#1, and if the patched device uses realize(), it got bug#2. The patch could be split up into parts that fix just one thing. Not sure that's worth the bother. > static void intel_hda_exit(PCIDevice *pci) > diff --git a/hw/ide/ich.c b/hw/ide/ich.c > index 16925fa..94b1809 100644 > --- a/hw/ide/ich.c > +++ b/hw/ide/ich.c > @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp) > /* Although the AHCI 1.3 specification states that the first capability > * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9 > * AHCI device puts the MSI capability first, pointing to 0x80. */ > - msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false); > + msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp); > } > > static void pci_ich9_uninit(PCIDevice *dev) > diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c > index 37373e5..c373e77 100644 > --- a/hw/net/vmxnet3.c > +++ b/hw/net/vmxnet3.c > @@ -2135,13 +2135,13 @@ vmxnet3_cleanup_msix(VMXNET3State *s) > #define VMXNET3_PER_VECTOR_MASK (false) > > static bool > -vmxnet3_init_msi(VMXNET3State *s) > +vmxnet3_init_msi(VMXNET3State *s, Error **errp) > { > PCIDevice *d = PCI_DEVICE(s); > int res; > > res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, > - VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); > + VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, errp); > if (0 > res) { > VMW_WRPRN("Failed to initialize MSI, error %d", res); > s->msi_used = false; > @@ -2204,6 +2204,11 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) > > VMW_CBPRN("Starting init..."); > > + if (!vmxnet3_init_msi(s, errp)) { > + VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); > + return; > + } > + > memory_region_init_io(&s->bar0, OBJECT(s), &b0_ops, s, > "vmxnet3-b0", VMXNET3_PT_REG_SIZE); > pci_register_bar(pci_dev, VMXNET3_BAR0_IDX, > @@ -2228,10 +2233,6 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) > VMW_WRPRN("Failed to initialize MSI-X, configuration is inconsistent."); > } > > - if (!vmxnet3_init_msi(s)) { > - VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); > - } > - > vmxnet3_net_init(s); > > register_savevm(dev, "vmxnet3-msix", -1, 1, > diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c > index eead195..5df9e83 100644 > --- a/hw/pci-bridge/ioh3420.c > +++ b/hw/pci-bridge/ioh3420.c > @@ -96,6 +96,7 @@ static int ioh3420_initfn(PCIDevice *d) > PCIEPort *p = PCIE_PORT(d); > PCIESlot *s = PCIE_SLOT(d); > int rc; > + Error *local_err = NULL; > > pci_bridge_initfn(d, TYPE_PCIE_BUS); > pcie_port_init_reg(d); > @@ -105,12 +106,16 @@ static int ioh3420_initfn(PCIDevice *d) > if (rc < 0) { > goto err_bridge; > } > + > rc = msi_init(d, IOH_EP_MSI_OFFSET, IOH_EP_MSI_NR_VECTOR, > IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, > - IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); > + IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, > + &local_err); > if (rc < 0) { > + error_report_err(local_err); > goto err_bridge; > } > + > rc = pcie_cap_init(d, IOH_EP_EXP_OFFSET, PCI_EXP_TYPE_ROOT_PORT, p->port); > if (rc < 0) { > goto err_msi; > diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c > index bc3e1b7..bafaeeb 100644 > --- a/hw/pci-bridge/pci_bridge_dev.c > +++ b/hw/pci-bridge/pci_bridge_dev.c > @@ -51,6 +51,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) > PCIBridge *br = PCI_BRIDGE(dev); > PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev); > int err; > + Error *local_err = NULL; > > pci_bridge_initfn(dev, TYPE_PCI_BUS); > > @@ -66,17 +67,21 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) > /* MSI is not applicable without SHPC */ > bridge_dev->flags &= ~(1 << PCI_BRIDGE_DEV_F_MSI_REQ); > } > + > err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0); > if (err) { > goto slotid_error; > } > + > if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) && > msi_supported) { > - err = msi_init(dev, 0, 1, true, true); > + err = msi_init(dev, 0, 1, true, true, &local_err); > if (err < 0) { > + error_report_err(local_err); > goto msi_error; > } > } > + > if (shpc_present(dev)) { > /* TODO: spec recommends using 64 bit prefetcheable BAR. > * Check whether that works well. */ > @@ -84,6 +89,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) > PCI_BASE_ADDRESS_MEM_TYPE_64, &bridge_dev->bar); > } > return 0; > + > msi_error: > slotid_cap_cleanup(dev); > slotid_error: > diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c > index b4dd25f..3fc7455 100644 > --- a/hw/pci-bridge/xio3130_downstream.c > +++ b/hw/pci-bridge/xio3130_downstream.c > @@ -59,21 +59,26 @@ static int xio3130_downstream_initfn(PCIDevice *d) > PCIEPort *p = PCIE_PORT(d); > PCIESlot *s = PCIE_SLOT(d); > int rc; > + Error *local_err = NULL; > > pci_bridge_initfn(d, TYPE_PCIE_BUS); > pcie_port_init_reg(d); > > rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, > XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, > - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); > + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, > + &local_err); > if (rc < 0) { > + error_report_err(local_err); > goto err_bridge; > } > + > rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, > XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); > if (rc < 0) { > goto err_bridge; > } > + > rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM, > p->port); > if (rc < 0) { > @@ -103,6 +108,7 @@ err_msi: > msi_uninit(d); > err_bridge: > pci_bridge_exitfn(d); > + > return rc; > } > > diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c > index 434c8fd..882271c 100644 > --- a/hw/pci-bridge/xio3130_upstream.c > +++ b/hw/pci-bridge/xio3130_upstream.c > @@ -55,26 +55,32 @@ static int xio3130_upstream_initfn(PCIDevice *d) > { > PCIEPort *p = PCIE_PORT(d); > int rc; > + Error *local_err = NULL; > > pci_bridge_initfn(d, TYPE_PCIE_BUS); > pcie_port_init_reg(d); > > rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, > XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, > - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); > + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, > + &local_err); > if (rc < 0) { > + error_report_err(local_err); > goto err_bridge; > } > + > rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, > XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); > if (rc < 0) { > goto err_bridge; > } > + > rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM, > p->port); > if (rc < 0) { > goto err_msi; > } > + > pcie_cap_flr_init(d); > pcie_cap_deverr_init(d); > rc = pcie_aer_init(d, XIO3130_AER_OFFSET); > diff --git a/hw/pci/msi.c b/hw/pci/msi.c > index c1dd531..ad6ed09 100644 > --- a/hw/pci/msi.c > +++ b/hw/pci/msi.c > @@ -150,15 +150,23 @@ bool msi_enabled(const PCIDevice *dev) > PCI_MSI_FLAGS_ENABLE); > } > > -int msi_init(struct PCIDevice *dev, uint8_t offset, > - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask) > +/* > + * @nr_vectors: Multiple Message Capable field of Message Control register > + * @msi64bit: support 64-bit message address or not > + * @msi_per_vector_mask: support per-vector masking or not Missing: @offset, @errp. git-grep @errp shows the various ways we document this parameter elsewhere. We suck at consistency. Pick one you like. > + * > + * return: MSI capability offset in config space "... on success, -errno on error" plus an explanation of what the error codes mean. If nobody uses the error codes, we could simply return -1 and call it a day. > + */ The comment follows GTK-Doc conventions, but not quite. GTK-Doc comments are designed for easy extraction of reference documentation. We don't do that. We use them in a few places anyway, such as object.h. The majority of the code doesn't use them. Their use is up to the maintainer. I detest them, and keep them out of the subsystems I maintain. PCI maintainer's call. Prior discussion: http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg03368.html Here's how I'd write this one: /* * Make PCI device @dev MSI-capable. * Non-zero @offset puts capability MSI at that offset in PCI config * space. * @nr_vectors is the number of MSI vectors (1, 2, 4, 8, 16 or 32). * If @msi64bit, make the device capable of sending a 64-bit message * address. * If @msi_per_vector_mask, make the device support per-vector masking. * @errp is for returning errors. * Return the offset of capability MSI in config space on success, * set @errp and return -errno on error. * FIXME explain the error codes, or dumb down return value to -1 */ Except I'm not sure the function should fail in the first place! See below. > +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, > + bool msi64bit, bool msi_per_vector_mask, Error **errp) > { > unsigned int vectors_order; > - uint16_t flags; > + uint16_t flags; /* Message Control register value */ > uint8_t cap_size; > int config_offset; > > if (!msi_supported) { > + error_setg(errp, "MSI is not supported by interrupt controller"); > return -ENOTSUP; First failure mode: board does not support MSI (-ENOTSUP). Question to the PCI guys: why is this even an error? A device with capability MSI should work just fine in such a board. > } > > @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, > } > > cap_size = msi_cap_sizeof(flags); > - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size); > + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, > + cap_size, errp); pci_add_capability() is a wrapper around pci_add_capability2() that additionally reports errors with error_report_err(). This is what makes it unsuitable for realize(). > if (config_offset < 0) { > return config_offset; Inherits failing modes from pci_add_capability2(). Two: out of PCI config space (-ENOSPC), and capability overlap (-EINVAL). Question for the PCI guys: how can these happen? When they happen, is it a programming error? > } > @@ -205,6 +214,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, > pci_set_long(dev->wmask + msi_mask_off(dev, msi64bit), > 0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors)); > } > + > return config_offset; > } > > diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c > index d7dc667..ba25b3a 100644 > --- a/hw/scsi/megasas.c > +++ b/hw/scsi/megasas.c > @@ -2339,6 +2339,17 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) > /* Interrupt pin 1 */ > pci_conf[PCI_INTERRUPT_PIN] = 0x01; > > + if (megasas_use_msi(s)) { > + int ret; > + > + ret = msi_init(dev, 0x50, 1, true, false, errp); > + if (ret > 0) { > + s->flags &= ~MEGASAS_MASK_USE_MSI; > + } else { > + return; > + } > + } > + > memory_region_init_io(&s->mmio_io, OBJECT(s), &megasas_mmio_ops, s, > "megasas-mmio", 0x4000); > memory_region_init_io(&s->port_io, OBJECT(s), &megasas_port_ops, s, > @@ -2346,10 +2357,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) > memory_region_init_io(&s->queue_io, OBJECT(s), &megasas_queue_ops, s, > "megasas-queue", 0x40000); > > - if (megasas_use_msi(s) && > - msi_init(dev, 0x50, 1, true, false)) { > - s->flags &= ~MEGASAS_MASK_USE_MSI; > - } > if (megasas_use_msix(s) && > msix_init(dev, 15, &s->mmio_io, b->mmio_bar, 0x2000, > &s->mmio_io, b->mmio_bar, 0x3800, 0x68)) { > diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c > index 9c71f31..073b956 100644 > --- a/hw/scsi/vmw_pvscsi.c > +++ b/hw/scsi/vmw_pvscsi.c > @@ -1014,13 +1014,13 @@ pvscsi_io_read(void *opaque, hwaddr addr, unsigned size) > > > static bool > -pvscsi_init_msi(PVSCSIState *s) > +pvscsi_init_msi(PVSCSIState *s, Error **errp) > { > int res; > PCIDevice *d = PCI_DEVICE(s); > > res = msi_init(d, PVSCSI_MSI_OFFSET, PVSCSI_MSIX_NUM_VECTORS, > - PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK); > + PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK, errp); > if (res < 0) { > trace_pvscsi_init_msi_fail(res); > s->msi_used = false; > @@ -1066,6 +1066,7 @@ static int > pvscsi_init(PCIDevice *pci_dev) > { > PVSCSIState *s = PVSCSI(pci_dev); > + Error *local_err = NULL; > > trace_pvscsi_state("init"); > > @@ -1079,12 +1080,16 @@ pvscsi_init(PCIDevice *pci_dev) > /* Interrupt pin A */ > pci_config_set_interrupt_pin(pci_dev->config, 1); > > + pvscsi_init_msi(s, &local_err); > + if (local_err) { > + error_report_err(local_err); > + return -1; > + } > + > memory_region_init_io(&s->io_space, OBJECT(s), &pvscsi_ops, s, > "pvscsi-io", PVSCSI_MEM_SPACE_SIZE); > pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->io_space); > > - pvscsi_init_msi(s); > - > s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s); > if (!s->completion_worker) { > pvscsi_cleanup_msi(s); > diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c > index 268ab36..7cd5f6c 100644 > --- a/hw/usb/hcd-xhci.c > +++ b/hw/usb/hcd-xhci.c > @@ -3572,6 +3572,43 @@ static void usb_xhci_init(XHCIState *xhci) > } > } > > +static void usb_xhci_exit(PCIDevice *dev) > +{ > + int i; > + XHCIState *xhci = XHCI(dev); > + > + trace_usb_xhci_exit(); > + > + for (i = 0; i < xhci->numslots; i++) { > + xhci_disable_slot(xhci, i + 1); > + } > + > + if (xhci->mfwrap_timer) { > + timer_del(xhci->mfwrap_timer); > + timer_free(xhci->mfwrap_timer); > + xhci->mfwrap_timer = NULL; > + } > + > + memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); > + memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); > + memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); > + memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); > + > + for (i = 0; i < xhci->numports; i++) { > + XHCIPort *port = &xhci->ports[i]; > + memory_region_del_subregion(&xhci->mem, &port->mem); > + } > + > + /* destroy msix memory region */ > + if (dev->msix_table && dev->msix_pba > + && dev->msix_entry_used) { > + memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); > + memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); > + } > + > + usb_bus_release(&xhci->bus); > +} > + > static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) > { > int i, ret; > @@ -3643,7 +3680,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) > } > > if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI)) { > - msi_init(dev, 0x70, xhci->numintrs, true, false); > + ret = msi_init(dev, 0x70, xhci->numintrs, true, false, errp); > + if (ret < 0) { > + goto cleanup_on_msi_fail; > + } > } > if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI_X)) { > msix_init(dev, xhci->numintrs, > @@ -3651,43 +3691,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) > &xhci->mem, 0, OFF_MSIX_PBA, > 0x90); > } > -} > - > -static void usb_xhci_exit(PCIDevice *dev) > -{ > - int i; > - XHCIState *xhci = XHCI(dev); > - > - trace_usb_xhci_exit(); > - > - for (i = 0; i < xhci->numslots; i++) { > - xhci_disable_slot(xhci, i + 1); > - } > - > - if (xhci->mfwrap_timer) { > - timer_del(xhci->mfwrap_timer); > - timer_free(xhci->mfwrap_timer); > - xhci->mfwrap_timer = NULL; > - } > > - memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); > - memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); > - memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); > - memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); > - > - for (i = 0; i < xhci->numports; i++) { > - XHCIPort *port = &xhci->ports[i]; > - memory_region_del_subregion(&xhci->mem, &port->mem); > - } > - > - /* destroy msix memory region */ > - if (dev->msix_table && dev->msix_pba > - && dev->msix_entry_used) { > - memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); > - memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); > - } > - > - usb_bus_release(&xhci->bus); > +cleanup_on_msi_fail: > + usb_xhci_exit(dev); > + object_unref(OBJECT(&xhci->mem)); > } > > static int usb_xhci_post_load(void *opaque, int version_id) > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index 1fb868c..633642e 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -1139,7 +1139,7 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev) > } > } > > -static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) > +static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp) > { > uint16_t ctrl; > bool msi_64bit, msi_maskbit; > @@ -1147,6 +1147,7 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) > > if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), > vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { > + error_setg(errp, "Read error!"); > return -errno; > } > ctrl = le16_to_cpu(ctrl); > @@ -1157,12 +1158,11 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) > > trace_vfio_msi_setup(vdev->vbasedev.name, pos); > > - ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit); > + ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit, errp); > if (ret < 0) { > if (ret == -ENOTSUP) { > return 0; > } > - error_report("vfio: msi_init failed"); > return ret; > } > vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0); > @@ -1654,7 +1654,7 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos) > } > } > > -static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) > +static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp) > { > PCIDevice *pdev = &vdev->pdev; > uint8_t cap_id, next, size; > @@ -1679,7 +1679,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) > * will be changed as we unwind the stack. > */ > if (next) { > - ret = vfio_add_std_cap(vdev, next); > + ret = vfio_add_std_cap(vdev, next, errp); > if (ret) { > return ret; > } > @@ -1695,7 +1695,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) > > switch (cap_id) { > case PCI_CAP_ID_MSI: > - ret = vfio_msi_setup(vdev, pos); > + ret = vfio_msi_setup(vdev, pos, errp); > break; > case PCI_CAP_ID_EXP: > vfio_check_pcie_flr(vdev, pos); > @@ -1729,7 +1729,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) > return 0; > } > > -static int vfio_add_capabilities(VFIOPCIDevice *vdev) > +static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp) > { > PCIDevice *pdev = &vdev->pdev; > > @@ -1738,7 +1738,7 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev) > return 0; /* Nothing to add */ > } > > - return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]); > + return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST], errp); > } > > static void vfio_pci_pre_reset(VFIOPCIDevice *vdev) > @@ -2349,6 +2349,7 @@ static int vfio_initfn(PCIDevice *pdev) > struct stat st; > int groupid; > int ret; > + Error *local_err = NULL; > > /* Check that the host device exists */ > snprintf(path, sizeof(path), > @@ -2507,8 +2508,9 @@ static int vfio_initfn(PCIDevice *pdev) > > vfio_map_bars(vdev); > > - ret = vfio_add_capabilities(vdev); > + ret = vfio_add_capabilities(vdev, &local_err); > if (ret) { > + error_report_err(local_err); > goto out_teardown; > } > > diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h > index 50e452b..da1dc1a 100644 > --- a/include/hw/pci/msi.h > +++ b/include/hw/pci/msi.h > @@ -34,8 +34,8 @@ extern bool msi_supported; > void msi_set_message(PCIDevice *dev, MSIMessage msg); > MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector); > bool msi_enabled(const PCIDevice *dev); > -int msi_init(struct PCIDevice *dev, uint8_t offset, > - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask); > +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, > + bool msi64bit, bool msi_per_vector_mask, Error **errp); > void msi_uninit(struct PCIDevice *dev); > void msi_reset(PCIDevice *dev); > void msi_notify(PCIDevice *dev, unsigned int vector);
hi, Markus Thanks for still remembering this patch, and quite a lot response:) I will give a appropriate response after I read & understand them all.(so, not cc other guys here) On 03/02/2016 05:13 PM, Markus Armbruster wrote: > This got lost over the Christmas break, sorry. > > Cc'ing Marcel for additional PCI expertise. > > Cao jin <caoj.fnst@cn.fujitsu.com> writes: > >> msi_init() is a supporting function in PCI device initialization, >> in order to convert .init() to .realize(), it should be modified first. > > "Supporting function" doesn't imply "should use Error to report > errors". HACKING explains: > > Use the simplest suitable method to communicate success / failure to > callers. Stick to common methods: non-negative on success / -1 on > error, non-negative / -errno, non-null / null, or Error objects. > > Example: when a function returns a non-null pointer on success, and it > can fail only in one way (as far as the caller is concerned), returning > null on failure is just fine, and certainly simpler and a lot easier on > the eyes than propagating an Error object through an Error ** parameter. > > Example: when a function's callers need to report details on failure > only the function really knows, use Error **, and set suitable errors. > > Do not report an error to the user when you're also returning an error > for somebody else to handle. Leave the reporting to the place that > consumes the error returned. > > As we'll see in your patch of msi.c, msi_init() can fail in multiple > ways, and uses -errno to comunicate them. That can be okay even in > realize(). It also reports to the user. That's what makes it > unsuitable for realize(). > >> Also modify the callers > > Actually, you're *fixing* callers! But the bugs aren't 100% clear, yet, > see below for details. Once we know what the bugs are, we'll want to > reword the commit message to describe the bugs and their impact. > > I recommend to skip ahead to msi.c, then come back to the device models. > >> Bonus: add more comment for msi_init(). >> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> >> --- >> hw/audio/intel-hda.c | 10 ++++- >> hw/ide/ich.c | 2 +- >> hw/net/vmxnet3.c | 13 +++--- >> hw/pci-bridge/ioh3420.c | 7 +++- >> hw/pci-bridge/pci_bridge_dev.c | 8 +++- >> hw/pci-bridge/xio3130_downstream.c | 8 +++- >> hw/pci-bridge/xio3130_upstream.c | 8 +++- >> hw/pci/msi.c | 18 +++++++-- >> hw/scsi/megasas.c | 15 +++++-- >> hw/scsi/vmw_pvscsi.c | 13 ++++-- >> hw/usb/hcd-xhci.c | 81 +++++++++++++++++++++----------------- >> hw/vfio/pci.c | 20 +++++----- >> include/hw/pci/msi.h | 4 +- >> 13 files changed, 135 insertions(+), 72 deletions(-) >> >> diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c >> index 433463e..0d770131 100644 >> --- a/hw/audio/intel-hda.c >> +++ b/hw/audio/intel-hda.c >> @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) >> { >> IntelHDAState *d = INTEL_HDA(pci); >> uint8_t *conf = d->pci.config; >> + int ret; >> >> d->name = object_get_typename(OBJECT(d)); >> >> @@ -1142,11 +1143,18 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) >> "intel-hda", 0x4000); >> pci_register_bar(&d->pci, 0, 0, &d->mmio); >> if (d->msi) { >> - msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false); >> + ret = msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, >> + false, errp); >> + if (ret < 0) { >> + goto cleanup_on_msi_fail; >> + } >> } >> >> hda_codec_bus_init(DEVICE(pci), &d->codecs, sizeof(d->codecs), >> intel_hda_response, intel_hda_xfer); >> + >> +cleanup_on_msi_fail: >> + object_unref(OBJECT(&d->mmio)); >> } >> > > This is a bug fix. Two bugs, actually. > > Bug#1: we ignore msi_init() failure. If it fails because the board > doesn't support MSI, the failure is ignored silently. If it fails > because the PCI config space offset is already occupied, we report the > error to the user, but still ignore it. The latter feels like a > programming error to me. > > Your patch fixes realize() to fail when msi_init() fails. > > Bug#2: we report errors with error_report_err() instead through errp. > This is because we use pci_add_capability() instead of > pci_add_capability2(). > > I don't have the time to review the other devices you patch. Both bugs > should be easy to spot in the patch: if the value of msi_init() is > ignored before the patch, the device got bug#1, and if the patched > device uses realize(), it got bug#2. > > The patch could be split up into parts that fix just one thing. Not > sure that's worth the bother. > >> static void intel_hda_exit(PCIDevice *pci) >> diff --git a/hw/ide/ich.c b/hw/ide/ich.c >> index 16925fa..94b1809 100644 >> --- a/hw/ide/ich.c >> +++ b/hw/ide/ich.c >> @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp) >> /* Although the AHCI 1.3 specification states that the first capability >> * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9 >> * AHCI device puts the MSI capability first, pointing to 0x80. */ >> - msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false); >> + msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp); >> } >> >> static void pci_ich9_uninit(PCIDevice *dev) >> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c >> index 37373e5..c373e77 100644 >> --- a/hw/net/vmxnet3.c >> +++ b/hw/net/vmxnet3.c >> @@ -2135,13 +2135,13 @@ vmxnet3_cleanup_msix(VMXNET3State *s) >> #define VMXNET3_PER_VECTOR_MASK (false) >> >> static bool >> -vmxnet3_init_msi(VMXNET3State *s) >> +vmxnet3_init_msi(VMXNET3State *s, Error **errp) >> { >> PCIDevice *d = PCI_DEVICE(s); >> int res; >> >> res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, >> - VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); >> + VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, errp); >> if (0 > res) { >> VMW_WRPRN("Failed to initialize MSI, error %d", res); >> s->msi_used = false; >> @@ -2204,6 +2204,11 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) >> >> VMW_CBPRN("Starting init..."); >> >> + if (!vmxnet3_init_msi(s, errp)) { >> + VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); >> + return; >> + } >> + >> memory_region_init_io(&s->bar0, OBJECT(s), &b0_ops, s, >> "vmxnet3-b0", VMXNET3_PT_REG_SIZE); >> pci_register_bar(pci_dev, VMXNET3_BAR0_IDX, >> @@ -2228,10 +2233,6 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) >> VMW_WRPRN("Failed to initialize MSI-X, configuration is inconsistent."); >> } >> >> - if (!vmxnet3_init_msi(s)) { >> - VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); >> - } >> - >> vmxnet3_net_init(s); >> >> register_savevm(dev, "vmxnet3-msix", -1, 1, >> diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c >> index eead195..5df9e83 100644 >> --- a/hw/pci-bridge/ioh3420.c >> +++ b/hw/pci-bridge/ioh3420.c >> @@ -96,6 +96,7 @@ static int ioh3420_initfn(PCIDevice *d) >> PCIEPort *p = PCIE_PORT(d); >> PCIESlot *s = PCIE_SLOT(d); >> int rc; >> + Error *local_err = NULL; >> >> pci_bridge_initfn(d, TYPE_PCIE_BUS); >> pcie_port_init_reg(d); >> @@ -105,12 +106,16 @@ static int ioh3420_initfn(PCIDevice *d) >> if (rc < 0) { >> goto err_bridge; >> } >> + >> rc = msi_init(d, IOH_EP_MSI_OFFSET, IOH_EP_MSI_NR_VECTOR, >> IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, >> - IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); >> + IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, >> + &local_err); >> if (rc < 0) { >> + error_report_err(local_err); >> goto err_bridge; >> } >> + >> rc = pcie_cap_init(d, IOH_EP_EXP_OFFSET, PCI_EXP_TYPE_ROOT_PORT, p->port); >> if (rc < 0) { >> goto err_msi; >> diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c >> index bc3e1b7..bafaeeb 100644 >> --- a/hw/pci-bridge/pci_bridge_dev.c >> +++ b/hw/pci-bridge/pci_bridge_dev.c >> @@ -51,6 +51,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) >> PCIBridge *br = PCI_BRIDGE(dev); >> PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev); >> int err; >> + Error *local_err = NULL; >> >> pci_bridge_initfn(dev, TYPE_PCI_BUS); >> >> @@ -66,17 +67,21 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) >> /* MSI is not applicable without SHPC */ >> bridge_dev->flags &= ~(1 << PCI_BRIDGE_DEV_F_MSI_REQ); >> } >> + >> err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0); >> if (err) { >> goto slotid_error; >> } >> + >> if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) && >> msi_supported) { >> - err = msi_init(dev, 0, 1, true, true); >> + err = msi_init(dev, 0, 1, true, true, &local_err); >> if (err < 0) { >> + error_report_err(local_err); >> goto msi_error; >> } >> } >> + >> if (shpc_present(dev)) { >> /* TODO: spec recommends using 64 bit prefetcheable BAR. >> * Check whether that works well. */ >> @@ -84,6 +89,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) >> PCI_BASE_ADDRESS_MEM_TYPE_64, &bridge_dev->bar); >> } >> return 0; >> + >> msi_error: >> slotid_cap_cleanup(dev); >> slotid_error: >> diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c >> index b4dd25f..3fc7455 100644 >> --- a/hw/pci-bridge/xio3130_downstream.c >> +++ b/hw/pci-bridge/xio3130_downstream.c >> @@ -59,21 +59,26 @@ static int xio3130_downstream_initfn(PCIDevice *d) >> PCIEPort *p = PCIE_PORT(d); >> PCIESlot *s = PCIE_SLOT(d); >> int rc; >> + Error *local_err = NULL; >> >> pci_bridge_initfn(d, TYPE_PCIE_BUS); >> pcie_port_init_reg(d); >> >> rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, >> XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, >> - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); >> + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, >> + &local_err); >> if (rc < 0) { >> + error_report_err(local_err); >> goto err_bridge; >> } >> + >> rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, >> XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); >> if (rc < 0) { >> goto err_bridge; >> } >> + >> rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM, >> p->port); >> if (rc < 0) { >> @@ -103,6 +108,7 @@ err_msi: >> msi_uninit(d); >> err_bridge: >> pci_bridge_exitfn(d); >> + >> return rc; >> } >> >> diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c >> index 434c8fd..882271c 100644 >> --- a/hw/pci-bridge/xio3130_upstream.c >> +++ b/hw/pci-bridge/xio3130_upstream.c >> @@ -55,26 +55,32 @@ static int xio3130_upstream_initfn(PCIDevice *d) >> { >> PCIEPort *p = PCIE_PORT(d); >> int rc; >> + Error *local_err = NULL; >> >> pci_bridge_initfn(d, TYPE_PCIE_BUS); >> pcie_port_init_reg(d); >> >> rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, >> XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, >> - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); >> + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, >> + &local_err); >> if (rc < 0) { >> + error_report_err(local_err); >> goto err_bridge; >> } >> + >> rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, >> XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); >> if (rc < 0) { >> goto err_bridge; >> } >> + >> rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM, >> p->port); >> if (rc < 0) { >> goto err_msi; >> } >> + >> pcie_cap_flr_init(d); >> pcie_cap_deverr_init(d); >> rc = pcie_aer_init(d, XIO3130_AER_OFFSET); >> diff --git a/hw/pci/msi.c b/hw/pci/msi.c >> index c1dd531..ad6ed09 100644 >> --- a/hw/pci/msi.c >> +++ b/hw/pci/msi.c >> @@ -150,15 +150,23 @@ bool msi_enabled(const PCIDevice *dev) >> PCI_MSI_FLAGS_ENABLE); >> } >> >> -int msi_init(struct PCIDevice *dev, uint8_t offset, >> - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask) >> +/* >> + * @nr_vectors: Multiple Message Capable field of Message Control register >> + * @msi64bit: support 64-bit message address or not >> + * @msi_per_vector_mask: support per-vector masking or not > > Missing: @offset, @errp. > > git-grep @errp shows the various ways we document this parameter > elsewhere. We suck at consistency. Pick one you like. > >> + * >> + * return: MSI capability offset in config space > > "... on success, -errno on error" plus an explanation of what the error > codes mean. If nobody uses the error codes, we could simply return -1 > and call it a day. > >> + */ > > The comment follows GTK-Doc conventions, but not quite. > > GTK-Doc comments are designed for easy extraction of reference > documentation. We don't do that. We use them in a few places anyway, > such as object.h. The majority of the code doesn't use them. Their use > is up to the maintainer. I detest them, and keep them out of the > subsystems I maintain. PCI maintainer's call. > > Prior discussion: > http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg03368.html > > Here's how I'd write this one: > > /* > * Make PCI device @dev MSI-capable. > * Non-zero @offset puts capability MSI at that offset in PCI config > * space. > * @nr_vectors is the number of MSI vectors (1, 2, 4, 8, 16 or 32). > * If @msi64bit, make the device capable of sending a 64-bit message > * address. > * If @msi_per_vector_mask, make the device support per-vector masking. > * @errp is for returning errors. > * Return the offset of capability MSI in config space on success, > * set @errp and return -errno on error. > * FIXME explain the error codes, or dumb down return value to -1 > */ > > Except I'm not sure the function should fail in the first place! See > below. > >> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >> + bool msi64bit, bool msi_per_vector_mask, Error **errp) >> { >> unsigned int vectors_order; >> - uint16_t flags; >> + uint16_t flags; /* Message Control register value */ >> uint8_t cap_size; >> int config_offset; >> >> if (!msi_supported) { >> + error_setg(errp, "MSI is not supported by interrupt controller"); >> return -ENOTSUP; > > First failure mode: board does not support MSI (-ENOTSUP). > > Question to the PCI guys: why is this even an error? A device with > capability MSI should work just fine in such a board. > >> } >> >> @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >> } >> >> cap_size = msi_cap_sizeof(flags); >> - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size); >> + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, >> + cap_size, errp); > > pci_add_capability() is a wrapper around pci_add_capability2() that > additionally reports errors with error_report_err(). This is what makes > it unsuitable for realize(). > >> if (config_offset < 0) { >> return config_offset; > > Inherits failing modes from pci_add_capability2(). Two: out of PCI > config space (-ENOSPC), and capability overlap (-EINVAL). > > Question for the PCI guys: how can these happen? When they happen, is > it a programming error? > >> } >> @@ -205,6 +214,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >> pci_set_long(dev->wmask + msi_mask_off(dev, msi64bit), >> 0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors)); >> } >> + >> return config_offset; >> } >> >> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c >> index d7dc667..ba25b3a 100644 >> --- a/hw/scsi/megasas.c >> +++ b/hw/scsi/megasas.c >> @@ -2339,6 +2339,17 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) >> /* Interrupt pin 1 */ >> pci_conf[PCI_INTERRUPT_PIN] = 0x01; >> >> + if (megasas_use_msi(s)) { >> + int ret; >> + >> + ret = msi_init(dev, 0x50, 1, true, false, errp); >> + if (ret > 0) { >> + s->flags &= ~MEGASAS_MASK_USE_MSI; >> + } else { >> + return; >> + } >> + } >> + >> memory_region_init_io(&s->mmio_io, OBJECT(s), &megasas_mmio_ops, s, >> "megasas-mmio", 0x4000); >> memory_region_init_io(&s->port_io, OBJECT(s), &megasas_port_ops, s, >> @@ -2346,10 +2357,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) >> memory_region_init_io(&s->queue_io, OBJECT(s), &megasas_queue_ops, s, >> "megasas-queue", 0x40000); >> >> - if (megasas_use_msi(s) && >> - msi_init(dev, 0x50, 1, true, false)) { >> - s->flags &= ~MEGASAS_MASK_USE_MSI; >> - } >> if (megasas_use_msix(s) && >> msix_init(dev, 15, &s->mmio_io, b->mmio_bar, 0x2000, >> &s->mmio_io, b->mmio_bar, 0x3800, 0x68)) { >> diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c >> index 9c71f31..073b956 100644 >> --- a/hw/scsi/vmw_pvscsi.c >> +++ b/hw/scsi/vmw_pvscsi.c >> @@ -1014,13 +1014,13 @@ pvscsi_io_read(void *opaque, hwaddr addr, unsigned size) >> >> >> static bool >> -pvscsi_init_msi(PVSCSIState *s) >> +pvscsi_init_msi(PVSCSIState *s, Error **errp) >> { >> int res; >> PCIDevice *d = PCI_DEVICE(s); >> >> res = msi_init(d, PVSCSI_MSI_OFFSET, PVSCSI_MSIX_NUM_VECTORS, >> - PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK); >> + PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK, errp); >> if (res < 0) { >> trace_pvscsi_init_msi_fail(res); >> s->msi_used = false; >> @@ -1066,6 +1066,7 @@ static int >> pvscsi_init(PCIDevice *pci_dev) >> { >> PVSCSIState *s = PVSCSI(pci_dev); >> + Error *local_err = NULL; >> >> trace_pvscsi_state("init"); >> >> @@ -1079,12 +1080,16 @@ pvscsi_init(PCIDevice *pci_dev) >> /* Interrupt pin A */ >> pci_config_set_interrupt_pin(pci_dev->config, 1); >> >> + pvscsi_init_msi(s, &local_err); >> + if (local_err) { >> + error_report_err(local_err); >> + return -1; >> + } >> + >> memory_region_init_io(&s->io_space, OBJECT(s), &pvscsi_ops, s, >> "pvscsi-io", PVSCSI_MEM_SPACE_SIZE); >> pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->io_space); >> >> - pvscsi_init_msi(s); >> - >> s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s); >> if (!s->completion_worker) { >> pvscsi_cleanup_msi(s); >> diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c >> index 268ab36..7cd5f6c 100644 >> --- a/hw/usb/hcd-xhci.c >> +++ b/hw/usb/hcd-xhci.c >> @@ -3572,6 +3572,43 @@ static void usb_xhci_init(XHCIState *xhci) >> } >> } >> >> +static void usb_xhci_exit(PCIDevice *dev) >> +{ >> + int i; >> + XHCIState *xhci = XHCI(dev); >> + >> + trace_usb_xhci_exit(); >> + >> + for (i = 0; i < xhci->numslots; i++) { >> + xhci_disable_slot(xhci, i + 1); >> + } >> + >> + if (xhci->mfwrap_timer) { >> + timer_del(xhci->mfwrap_timer); >> + timer_free(xhci->mfwrap_timer); >> + xhci->mfwrap_timer = NULL; >> + } >> + >> + memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); >> + memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); >> + memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); >> + memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); >> + >> + for (i = 0; i < xhci->numports; i++) { >> + XHCIPort *port = &xhci->ports[i]; >> + memory_region_del_subregion(&xhci->mem, &port->mem); >> + } >> + >> + /* destroy msix memory region */ >> + if (dev->msix_table && dev->msix_pba >> + && dev->msix_entry_used) { >> + memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); >> + memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); >> + } >> + >> + usb_bus_release(&xhci->bus); >> +} >> + >> static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) >> { >> int i, ret; >> @@ -3643,7 +3680,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) >> } >> >> if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI)) { >> - msi_init(dev, 0x70, xhci->numintrs, true, false); >> + ret = msi_init(dev, 0x70, xhci->numintrs, true, false, errp); >> + if (ret < 0) { >> + goto cleanup_on_msi_fail; >> + } >> } >> if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI_X)) { >> msix_init(dev, xhci->numintrs, >> @@ -3651,43 +3691,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) >> &xhci->mem, 0, OFF_MSIX_PBA, >> 0x90); >> } >> -} >> - >> -static void usb_xhci_exit(PCIDevice *dev) >> -{ >> - int i; >> - XHCIState *xhci = XHCI(dev); >> - >> - trace_usb_xhci_exit(); >> - >> - for (i = 0; i < xhci->numslots; i++) { >> - xhci_disable_slot(xhci, i + 1); >> - } >> - >> - if (xhci->mfwrap_timer) { >> - timer_del(xhci->mfwrap_timer); >> - timer_free(xhci->mfwrap_timer); >> - xhci->mfwrap_timer = NULL; >> - } >> >> - memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); >> - memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); >> - memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); >> - memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); >> - >> - for (i = 0; i < xhci->numports; i++) { >> - XHCIPort *port = &xhci->ports[i]; >> - memory_region_del_subregion(&xhci->mem, &port->mem); >> - } >> - >> - /* destroy msix memory region */ >> - if (dev->msix_table && dev->msix_pba >> - && dev->msix_entry_used) { >> - memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); >> - memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); >> - } >> - >> - usb_bus_release(&xhci->bus); >> +cleanup_on_msi_fail: >> + usb_xhci_exit(dev); >> + object_unref(OBJECT(&xhci->mem)); >> } >> >> static int usb_xhci_post_load(void *opaque, int version_id) >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >> index 1fb868c..633642e 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -1139,7 +1139,7 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev) >> } >> } >> >> -static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) >> +static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp) >> { >> uint16_t ctrl; >> bool msi_64bit, msi_maskbit; >> @@ -1147,6 +1147,7 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) >> >> if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), >> vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { >> + error_setg(errp, "Read error!"); >> return -errno; >> } >> ctrl = le16_to_cpu(ctrl); >> @@ -1157,12 +1158,11 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) >> >> trace_vfio_msi_setup(vdev->vbasedev.name, pos); >> >> - ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit); >> + ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit, errp); >> if (ret < 0) { >> if (ret == -ENOTSUP) { >> return 0; >> } >> - error_report("vfio: msi_init failed"); >> return ret; >> } >> vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0); >> @@ -1654,7 +1654,7 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos) >> } >> } >> >> -static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) >> +static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp) >> { >> PCIDevice *pdev = &vdev->pdev; >> uint8_t cap_id, next, size; >> @@ -1679,7 +1679,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) >> * will be changed as we unwind the stack. >> */ >> if (next) { >> - ret = vfio_add_std_cap(vdev, next); >> + ret = vfio_add_std_cap(vdev, next, errp); >> if (ret) { >> return ret; >> } >> @@ -1695,7 +1695,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) >> >> switch (cap_id) { >> case PCI_CAP_ID_MSI: >> - ret = vfio_msi_setup(vdev, pos); >> + ret = vfio_msi_setup(vdev, pos, errp); >> break; >> case PCI_CAP_ID_EXP: >> vfio_check_pcie_flr(vdev, pos); >> @@ -1729,7 +1729,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) >> return 0; >> } >> >> -static int vfio_add_capabilities(VFIOPCIDevice *vdev) >> +static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp) >> { >> PCIDevice *pdev = &vdev->pdev; >> >> @@ -1738,7 +1738,7 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev) >> return 0; /* Nothing to add */ >> } >> >> - return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]); >> + return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST], errp); >> } >> >> static void vfio_pci_pre_reset(VFIOPCIDevice *vdev) >> @@ -2349,6 +2349,7 @@ static int vfio_initfn(PCIDevice *pdev) >> struct stat st; >> int groupid; >> int ret; >> + Error *local_err = NULL; >> >> /* Check that the host device exists */ >> snprintf(path, sizeof(path), >> @@ -2507,8 +2508,9 @@ static int vfio_initfn(PCIDevice *pdev) >> >> vfio_map_bars(vdev); >> >> - ret = vfio_add_capabilities(vdev); >> + ret = vfio_add_capabilities(vdev, &local_err); >> if (ret) { >> + error_report_err(local_err); >> goto out_teardown; >> } >> >> diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h >> index 50e452b..da1dc1a 100644 >> --- a/include/hw/pci/msi.h >> +++ b/include/hw/pci/msi.h >> @@ -34,8 +34,8 @@ extern bool msi_supported; >> void msi_set_message(PCIDevice *dev, MSIMessage msg); >> MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector); >> bool msi_enabled(const PCIDevice *dev); >> -int msi_init(struct PCIDevice *dev, uint8_t offset, >> - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask); >> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >> + bool msi64bit, bool msi_per_vector_mask, Error **errp); >> void msi_uninit(struct PCIDevice *dev); >> void msi_reset(PCIDevice *dev); >> void msi_notify(PCIDevice *dev, unsigned int vector); > > > . >
On 03/02/2016 11:13 AM, Markus Armbruster wrote: > This got lost over the Christmas break, sorry. > > Cc'ing Marcel for additional PCI expertise. > > Cao jin <caoj.fnst@cn.fujitsu.com> writes: > >> msi_init() is a supporting function in PCI device initialization, >> in order to convert .init() to .realize(), it should be modified first. > > "Supporting function" doesn't imply "should use Error to report > errors". HACKING explains: > > Use the simplest suitable method to communicate success / failure to > callers. Stick to common methods: non-negative on success / -1 on > error, non-negative / -errno, non-null / null, or Error objects. > > Example: when a function returns a non-null pointer on success, and it > can fail only in one way (as far as the caller is concerned), returning > null on failure is just fine, and certainly simpler and a lot easier on > the eyes than propagating an Error object through an Error ** parameter. > > Example: when a function's callers need to report details on failure > only the function really knows, use Error **, and set suitable errors. > > Do not report an error to the user when you're also returning an error > for somebody else to handle. Leave the reporting to the place that > consumes the error returned. > > As we'll see in your patch of msi.c, msi_init() can fail in multiple > ways, and uses -errno to comunicate them. That can be okay even in > realize(). It also reports to the user. That's what makes it > unsuitable for realize(). > >> Also modify the callers > > Actually, you're *fixing* callers! But the bugs aren't 100% clear, yet, > see below for details. Once we know what the bugs are, we'll want to > reword the commit message to describe the bugs and their impact. > > I recommend to skip ahead to msi.c, then come back to the device models. > >> Bonus: add more comment for msi_init(). >> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> >> --- >> hw/audio/intel-hda.c | 10 ++++- >> hw/ide/ich.c | 2 +- >> hw/net/vmxnet3.c | 13 +++--- >> hw/pci-bridge/ioh3420.c | 7 +++- >> hw/pci-bridge/pci_bridge_dev.c | 8 +++- >> hw/pci-bridge/xio3130_downstream.c | 8 +++- >> hw/pci-bridge/xio3130_upstream.c | 8 +++- >> hw/pci/msi.c | 18 +++++++-- >> hw/scsi/megasas.c | 15 +++++-- >> hw/scsi/vmw_pvscsi.c | 13 ++++-- >> hw/usb/hcd-xhci.c | 81 +++++++++++++++++++++----------------- >> hw/vfio/pci.c | 20 +++++----- >> include/hw/pci/msi.h | 4 +- >> 13 files changed, 135 insertions(+), 72 deletions(-) >> [...] > Except I'm not sure the function should fail in the first place! See > below. > >> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >> + bool msi64bit, bool msi_per_vector_mask, Error **errp) >> { >> unsigned int vectors_order; >> - uint16_t flags; >> + uint16_t flags; /* Message Control register value */ >> uint8_t cap_size; >> int config_offset; >> >> if (!msi_supported) { >> + error_setg(errp, "MSI is not supported by interrupt controller"); >> return -ENOTSUP; > > First failure mode: board does not support MSI (-ENOTSUP). > > Question to the PCI guys: why is this even an error? A device with > capability MSI should work just fine in such a board. Hi Markus, Adding Jan Kiszka, maybe he can help. That's a fair question. Is there any history for this decision? The board not supporting MSI has nothing to do with the capability being there. The HW should not change because the board doe not support it. The capability should be present but not active. > >> } >> >> @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >> } >> >> cap_size = msi_cap_sizeof(flags); >> - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size); >> + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, >> + cap_size, errp); > > pci_add_capability() is a wrapper around pci_add_capability2() that > additionally reports errors with error_report_err(). This is what makes > it unsuitable for realize(). > >> if (config_offset < 0) { >> return config_offset; > > Inherits failing modes from pci_add_capability2(). Two: out of PCI > config space (-ENOSPC), and capability overlap (-EINVAL). > > Question for the PCI guys: how can these happen? When they happen, is > it a programming error? out of PCI config space: a device emulation error, not enough room for all its capabilities - it seems to be a programming error. capability overlap: is for device assignment. This checks for a real HW that is broke. - not a programming error. > >> } >> @@ -205,6 +214,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >> pci_set_long(dev->wmask + msi_mask_off(dev, msi64bit), >> 0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors)); >> } >> + >> return config_offset; >> } >> Thanks, Marcel [...]
On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: > >>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, > >>+ bool msi64bit, bool msi_per_vector_mask, Error **errp) > >> { > >> unsigned int vectors_order; > >>- uint16_t flags; > >>+ uint16_t flags; /* Message Control register value */ > >> uint8_t cap_size; > >> int config_offset; > >> > >> if (!msi_supported) { > >>+ error_setg(errp, "MSI is not supported by interrupt controller"); > >> return -ENOTSUP; > > > >First failure mode: board does not support MSI (-ENOTSUP). > > > >Question to the PCI guys: why is this even an error? A device with > >capability MSI should work just fine in such a board. > > Hi Markus, > > Adding Jan Kiszka, maybe he can help. > > That's a fair question. Is there any history for this decision? > The board not supporting MSI has nothing to do with the capability being there. > The HW should not change because the board doe not support it. > > The capability should be present but not active. Digging in git log will tell you why we have the msi_supported flag: commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") This is a safety measure to avoid breaking platforms which should support MSI-X but currently lack this in the interrupt controller emulation. in other words, on some platforms we must hide msi support from devices because otherwise guests will try to use it, and our emulation is incomplete. And the conclusion from that is that for msi_init to fail silently is at the moment the right thing to do. The only other reason for it to fail is pci config space corruption, this probably never happens in practice.
On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote: > On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: >>>> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >>>> + bool msi64bit, bool msi_per_vector_mask, Error **errp) >>>> { >>>> unsigned int vectors_order; >>>> - uint16_t flags; >>>> + uint16_t flags; /* Message Control register value */ >>>> uint8_t cap_size; >>>> int config_offset; >>>> >>>> if (!msi_supported) { >>>> + error_setg(errp, "MSI is not supported by interrupt controller"); >>>> return -ENOTSUP; >>> >>> First failure mode: board does not support MSI (-ENOTSUP). >>> >>> Question to the PCI guys: why is this even an error? A device with >>> capability MSI should work just fine in such a board. >> >> Hi Markus, >> >> Adding Jan Kiszka, maybe he can help. >> >> That's a fair question. Is there any history for this decision? >> The board not supporting MSI has nothing to do with the capability being there. >> The HW should not change because the board doe not support it. >> >> The capability should be present but not active. > > Digging in git log will tell you why we have the msi_supported flag: > > commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") > > This is a safety measure to avoid breaking platforms which should support > MSI-X but currently lack this in the interrupt controller emulation. > > in other words, on some platforms we must hide msi support from devices > because otherwise guests will try to use it, and our emulation is > incomplete. OK, thanks. So the flag should be "msi_broken" or "msi_present_but_not_implemented" and not "msi_supported" that leads (at least me) to the assumption that some platform *does not support msi* rather than it supports it, but we don't emulate it. > > And the conclusion from that is that for msi_init to fail silently is > at the moment the right thing to do. But this is not the only thing we do, we are modifying the PCI devices. We could fail starting the VM if a device supporting MSI is added on a platform with broken msi, but this will prevent us to use assigned devices. Emulated devices should be created with a specific "msi=off" flag. Thanks, Marcel > > The only other reason for it to fail is pci config space corruption, > this probably never happens in practice. >
On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote: > On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote: > >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: > >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, > >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp) > >>>> { > >>>> unsigned int vectors_order; > >>>>- uint16_t flags; > >>>>+ uint16_t flags; /* Message Control register value */ > >>>> uint8_t cap_size; > >>>> int config_offset; > >>>> > >>>> if (!msi_supported) { > >>>>+ error_setg(errp, "MSI is not supported by interrupt controller"); > >>>> return -ENOTSUP; > >>> > >>>First failure mode: board does not support MSI (-ENOTSUP). > >>> > >>>Question to the PCI guys: why is this even an error? A device with > >>>capability MSI should work just fine in such a board. > >> > >>Hi Markus, > >> > >>Adding Jan Kiszka, maybe he can help. > >> > >>That's a fair question. Is there any history for this decision? > >>The board not supporting MSI has nothing to do with the capability being there. > >>The HW should not change because the board doe not support it. > >> > >>The capability should be present but not active. > > > >Digging in git log will tell you why we have the msi_supported flag: > > > >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") > > > > This is a safety measure to avoid breaking platforms which should support > > MSI-X but currently lack this in the interrupt controller emulation. > > > >in other words, on some platforms we must hide msi support from devices > >because otherwise guests will try to use it, and our emulation is > >incomplete. > > > OK, thanks. So the flag should be "msi_broken" or "msi_present_but_not_implemented" and not > "msi_supported" that leads (at least me) to the assumption that some platform *does not support msi* > rather than it supports it, but we don't emulate it. > > > > > >And the conclusion from that is that for msi_init to fail silently is > >at the moment the right thing to do. > > But this is not the only thing we do, we are modifying the PCI devices. We could fail starting the VM > if a device supporting MSI is added on a platform with broken msi, but this will prevent us to use > assigned devices. Emulated devices should be created with a specific "msi=off" flag. > > Thanks, > Marcel That will just break a bunch of valid configurations, for no real benefit to users. > > > >The only other reason for it to fail is pci config space corruption, > >this probably never happens in practice. > >
Marcel Apfelbaum <marcel@redhat.com> writes: > On 03/02/2016 11:13 AM, Markus Armbruster wrote: >> This got lost over the Christmas break, sorry. >> >> Cc'ing Marcel for additional PCI expertise. >> >> Cao jin <caoj.fnst@cn.fujitsu.com> writes: >> >>> msi_init() is a supporting function in PCI device initialization, >>> in order to convert .init() to .realize(), it should be modified first. >> >> "Supporting function" doesn't imply "should use Error to report >> errors". HACKING explains: >> >> Use the simplest suitable method to communicate success / failure to >> callers. Stick to common methods: non-negative on success / -1 on >> error, non-negative / -errno, non-null / null, or Error objects. >> >> Example: when a function returns a non-null pointer on success, and it >> can fail only in one way (as far as the caller is concerned), returning >> null on failure is just fine, and certainly simpler and a lot easier on >> the eyes than propagating an Error object through an Error ** parameter. >> >> Example: when a function's callers need to report details on failure >> only the function really knows, use Error **, and set suitable errors. >> >> Do not report an error to the user when you're also returning an error >> for somebody else to handle. Leave the reporting to the place that >> consumes the error returned. >> >> As we'll see in your patch of msi.c, msi_init() can fail in multiple >> ways, and uses -errno to comunicate them. That can be okay even in >> realize(). It also reports to the user. That's what makes it >> unsuitable for realize(). >> >>> Also modify the callers >> >> Actually, you're *fixing* callers! But the bugs aren't 100% clear, yet, >> see below for details. Once we know what the bugs are, we'll want to >> reword the commit message to describe the bugs and their impact. >> >> I recommend to skip ahead to msi.c, then come back to the device models. >> >>> Bonus: add more comment for msi_init(). >>> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> >>> --- >>> hw/audio/intel-hda.c | 10 ++++- >>> hw/ide/ich.c | 2 +- >>> hw/net/vmxnet3.c | 13 +++--- >>> hw/pci-bridge/ioh3420.c | 7 +++- >>> hw/pci-bridge/pci_bridge_dev.c | 8 +++- >>> hw/pci-bridge/xio3130_downstream.c | 8 +++- >>> hw/pci-bridge/xio3130_upstream.c | 8 +++- >>> hw/pci/msi.c | 18 +++++++-- >>> hw/scsi/megasas.c | 15 +++++-- >>> hw/scsi/vmw_pvscsi.c | 13 ++++-- >>> hw/usb/hcd-xhci.c | 81 +++++++++++++++++++++----------------- >>> hw/vfio/pci.c | 20 +++++----- >>> include/hw/pci/msi.h | 4 +- >>> 13 files changed, 135 insertions(+), 72 deletions(-) >>> > > [...] > >> Except I'm not sure the function should fail in the first place! See >> below. >> >>> +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >>> + bool msi64bit, bool msi_per_vector_mask, Error **errp) >>> { >>> unsigned int vectors_order; >>> - uint16_t flags; >>> + uint16_t flags; /* Message Control register value */ >>> uint8_t cap_size; >>> int config_offset; >>> >>> if (!msi_supported) { >>> + error_setg(errp, "MSI is not supported by interrupt controller"); >>> return -ENOTSUP; >> >> First failure mode: board does not support MSI (-ENOTSUP). >> >> Question to the PCI guys: why is this even an error? A device with >> capability MSI should work just fine in such a board. > > Hi Markus, > > Adding Jan Kiszka, maybe he can help. > > That's a fair question. Is there any history for this decision? > The board not supporting MSI has nothing to do with the capability being there. > The HW should not change because the board doe not support it. > > The capability should be present but not active. > >> >>> } >>> >>> @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, >>> } >>> >>> cap_size = msi_cap_sizeof(flags); >>> - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size); >>> + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, >>> + cap_size, errp); >> >> pci_add_capability() is a wrapper around pci_add_capability2() that >> additionally reports errors with error_report_err(). This is what makes >> it unsuitable for realize(). >> >>> if (config_offset < 0) { >>> return config_offset; >> >> Inherits failing modes from pci_add_capability2(). Two: out of PCI >> config space (-ENOSPC), and capability overlap (-EINVAL). >> >> Question for the PCI guys: how can these happen? When they happen, is >> it a programming error? > > out of PCI config space: a device emulation error, not enough room > for all its capabilities - it seems to be a programming error. Programming error should be an assertion failure, not falling to a variant of the device the user didn't order and that might not even exist in the real world. > capability overlap: is for device assignment. This checks for a real HW > that is broke. - not a programming error. Okay. Thanks! [...]
"Michael S. Tsirkin" <mst@redhat.com> writes: > On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote: >> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote: >> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: >> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >> >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp) >> >>>> { >> >>>> unsigned int vectors_order; >> >>>>- uint16_t flags; >> >>>>+ uint16_t flags; /* Message Control register value */ >> >>>> uint8_t cap_size; >> >>>> int config_offset; >> >>>> >> >>>> if (!msi_supported) { >> >>>>+ error_setg(errp, "MSI is not supported by interrupt controller"); >> >>>> return -ENOTSUP; >> >>> >> >>>First failure mode: board does not support MSI (-ENOTSUP). >> >>> >> >>>Question to the PCI guys: why is this even an error? A device with >> >>>capability MSI should work just fine in such a board. >> >> >> >>Hi Markus, >> >> >> >>Adding Jan Kiszka, maybe he can help. >> >> >> >>That's a fair question. Is there any history for this decision? >> >>The board not supporting MSI has nothing to do with the capability being there. >> >>The HW should not change because the board doe not support it. >> >> >> >>The capability should be present but not active. >> > >> >Digging in git log will tell you why we have the msi_supported flag: >> > >> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") >> > >> > This is a safety measure to avoid breaking platforms which should support >> > MSI-X but currently lack this in the interrupt controller emulation. >> > >> >in other words, on some platforms we must hide msi support from devices >> >because otherwise guests will try to use it, and our emulation is >> >incomplete. >> >> >> OK, thanks. So the flag should be "msi_broken" or "msi_present_but_not_implemented" and not >> "msi_supported" that leads (at least me) to the assumption that some platform *does not support msi* >> rather than it supports it, but we don't emulate it. I agree the name is badly misleading for this role. Now let me see how this contraption actually works. msi_supported is global, initialized to false, and becomes globally true when 1. certain MSI-capable interrupt controllers realize: "apic", "kvm-apic" if kvm_has_gsi_routing(), "xen-apic", "arm-gicv2m", "openpic" models 1 and 2, "kvm-openpic" models 1 and 2 2. "s390-pcihost" class-initializes 3. machine "spapr-machine" initializes Issues: * "Global is problematic. What if a board has more than one interrupt controller? What if one of them sets msi_supported, but the other one is of the kind Michael described, i.e. guests know it has MSI, but our emulation doesn't actually work? * "Initialize to false" is problematic. We don't clear msi_supported when we have a broken interrupt controler, we set it when we have a working one. The consequence is that boards with non-MSI interrupt controllers are treated just like boards with broken interrupt controllers. Here's how msi_supported is documented: /* Flag for interrupt controller to declare MSI/MSI-X support */ bool msi_supported; This is matches how the code works. However, it contradicts the commit message Michael quoted. The most plausible explanation is that the commit is flawed. * Class-initialize (2.) looks wrong to me. msi_supported becomes true when QOM type "s390-pcihost" is created, regardless of whether instances get created, let alone used. * I'm not sure about 3., but the spapr guys can worry about that. >> >And the conclusion from that is that for msi_init to fail silently is >> >at the moment the right thing to do. >> >> But this is not the only thing we do, we are modifying the PCI devices. We could fail starting the VM >> if a device supporting MSI is added on a platform with broken msi, but this will prevent us to use >> assigned devices. Emulated devices should be created with a specific "msi=off" flag. >> >> Thanks, >> Marcel > > That will just break a bunch of valid configurations, for no real > benefit to users. I disagree, strongly. If I ask for msi=on, then QEMU should either give it to me or fail, not silently "correct" my configuration. Of course, if we have msi_supported = false for everybody and its dog whether it's really needed or not, "or fail" will indeed reject "a bunch of valid configurations". We "compensate" by silently messing with the user's configuration. That's shoddy workmanship, sorry. We should instead have msi_supported = false only when it's actually needed, and thus reject exactly the configurations that won't work. >> > >> >The only other reason for it to fail is pci config space corruption, >> >this probably never happens in practice. >> >
On Thu, Mar 03, 2016 at 04:03:16PM +0100, Markus Armbruster wrote: > "Michael S. Tsirkin" <mst@redhat.com> writes: > > > On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote: > >> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote: > >> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: > >> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, > >> >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp) > >> >>>> { > >> >>>> unsigned int vectors_order; > >> >>>>- uint16_t flags; > >> >>>>+ uint16_t flags; /* Message Control register value */ > >> >>>> uint8_t cap_size; > >> >>>> int config_offset; > >> >>>> > >> >>>> if (!msi_supported) { > >> >>>>+ error_setg(errp, "MSI is not supported by interrupt controller"); > >> >>>> return -ENOTSUP; > >> >>> > >> >>>First failure mode: board does not support MSI (-ENOTSUP). > >> >>> > >> >>>Question to the PCI guys: why is this even an error? A device with > >> >>>capability MSI should work just fine in such a board. > >> >> > >> >>Hi Markus, > >> >> > >> >>Adding Jan Kiszka, maybe he can help. > >> >> > >> >>That's a fair question. Is there any history for this decision? > >> >>The board not supporting MSI has nothing to do with the capability being there. > >> >>The HW should not change because the board doe not support it. > >> >> > >> >>The capability should be present but not active. > >> > > >> >Digging in git log will tell you why we have the msi_supported flag: > >> > > >> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") > >> > > >> > This is a safety measure to avoid breaking platforms which should support > >> > MSI-X but currently lack this in the interrupt controller emulation. > >> > > >> >in other words, on some platforms we must hide msi support from devices > >> >because otherwise guests will try to use it, and our emulation is > >> >incomplete. > >> > >> > >> OK, thanks. So the flag should be "msi_broken" or "msi_present_but_not_implemented" and not > >> "msi_supported" that leads (at least me) to the assumption that some platform *does not support msi* > >> rather than it supports it, but we don't emulate it. > > I agree the name is badly misleading for this role. > > Now let me see how this contraption actually works. msi_supported is > global, initialized to false, and becomes globally true when > > 1. certain MSI-capable interrupt controllers realize: "apic", > "kvm-apic" if kvm_has_gsi_routing(), "xen-apic", "arm-gicv2m", > "openpic" models 1 and 2, "kvm-openpic" models 1 and 2 > > 2. "s390-pcihost" class-initializes > > 3. machine "spapr-machine" initializes > > Issues: > > * "Global is problematic. What if a board has more than one interrupt > controller? What if one of them sets msi_supported, but the other one > is of the kind Michael described, i.e. guests know it has MSI, but our > emulation doesn't actually work? > > * "Initialize to false" is problematic. We don't clear msi_supported > when we have a broken interrupt controler, we set it when we have a > working one. The consequence is that boards with non-MSI interrupt > controllers are treated just like boards with broken interrupt > controllers. > > Here's how msi_supported is documented: > > /* Flag for interrupt controller to declare MSI/MSI-X support */ > bool msi_supported; > > This is matches how the code works. However, it contradicts the > commit message Michael quoted. The most plausible explanation is that > the commit is flawed. > > * Class-initialize (2.) looks wrong to me. msi_supported becomes true > when QOM type "s390-pcihost" is created, regardless of whether > instances get created, let alone used. > > * I'm not sure about 3., but the spapr guys can worry about that. > > >> >And the conclusion from that is that for msi_init to fail silently is > >> >at the moment the right thing to do. > >> > >> But this is not the only thing we do, we are modifying the PCI devices. We could fail starting the VM > >> if a device supporting MSI is added on a platform with broken msi, but this will prevent us to use > >> assigned devices. Emulated devices should be created with a specific "msi=off" flag. > >> > >> Thanks, > >> Marcel > > > > That will just break a bunch of valid configurations, for no real > > benefit to users. > > I disagree, strongly. > > If I ask for msi=on, then QEMU should either give it to me or fail, not > silently "correct" my configuration. > > Of course, if we have msi_supported = false for everybody and its dog > whether it's really needed or not, "or fail" will indeed reject "a bunch > of valid configurations". We "compensate" by silently messing with the > user's configuration. That's shoddy workmanship, sorry. > > We should instead have msi_supported = false only when it's actually > needed, and thus reject exactly the configurations that won't work. Most people don't care one way or the other. We started without msi. We later added msi for some boards only. Asking everyone to add msi=off for all other boards at that point would have been silly. > >> > > >> >The only other reason for it to fail is pci config space corruption, > >> >this probably never happens in practice. > >> >
"Michael S. Tsirkin" <mst@redhat.com> writes: > On Thu, Mar 03, 2016 at 04:03:16PM +0100, Markus Armbruster wrote: >> "Michael S. Tsirkin" <mst@redhat.com> writes: >> >> > On Thu, Mar 03, 2016 at 01:19:09PM +0200, Marcel Apfelbaum wrote: >> >> On 03/03/2016 12:45 PM, Michael S. Tsirkin wrote: >> >> >On Thu, Mar 03, 2016 at 12:12:27PM +0200, Marcel Apfelbaum wrote: >> >> >>>>+int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, >> >> >>>>+ bool msi64bit, bool msi_per_vector_mask, Error **errp) >> >> >>>> { >> >> >>>> unsigned int vectors_order; >> >> >>>>- uint16_t flags; >> >> >>>>+ uint16_t flags; /* Message Control register value */ >> >> >>>> uint8_t cap_size; >> >> >>>> int config_offset; >> >> >>>> >> >> >>>> if (!msi_supported) { >> >> >>>>+ error_setg(errp, "MSI is not supported by interrupt controller"); >> >> >>>> return -ENOTSUP; >> >> >>> >> >> >>>First failure mode: board does not support MSI (-ENOTSUP). >> >> >>> >> >> >>>Question to the PCI guys: why is this even an error? A device with >> >> >>>capability MSI should work just fine in such a board. >> >> >> >> >> >>Hi Markus, >> >> >> >> >> >>Adding Jan Kiszka, maybe he can help. >> >> >> >> >> >>That's a fair question. Is there any history for this decision? >> >> >>The board not supporting MSI has nothing to do with the capability being there. >> >> >>The HW should not change because the board doe not support it. >> >> >> >> >> >>The capability should be present but not active. >> >> > >> >> >Digging in git log will tell you why we have the msi_supported flag: >> >> > >> >> >commit 02eb84d0ec97f183ac23ee939403a139e8849b1d ("qemu/pci: MSI-X support functions") >> >> > >> >> > This is a safety measure to avoid breaking platforms which should support >> >> > MSI-X but currently lack this in the interrupt controller emulation. >> >> > >> >> >in other words, on some platforms we must hide msi support from devices >> >> >because otherwise guests will try to use it, and our emulation is >> >> >incomplete. >> >> >> >> >> >> OK, thanks. So the flag should be "msi_broken" or "msi_present_but_not_implemented" and not >> >> "msi_supported" that leads (at least me) to the assumption that some platform *does not support msi* >> >> rather than it supports it, but we don't emulate it. >> >> I agree the name is badly misleading for this role. >> >> Now let me see how this contraption actually works. msi_supported is >> global, initialized to false, and becomes globally true when >> >> 1. certain MSI-capable interrupt controllers realize: "apic", >> "kvm-apic" if kvm_has_gsi_routing(), "xen-apic", "arm-gicv2m", >> "openpic" models 1 and 2, "kvm-openpic" models 1 and 2 >> >> 2. "s390-pcihost" class-initializes >> >> 3. machine "spapr-machine" initializes >> >> Issues: >> >> * "Global is problematic. What if a board has more than one interrupt >> controller? What if one of them sets msi_supported, but the other one >> is of the kind Michael described, i.e. guests know it has MSI, but our >> emulation doesn't actually work? >> >> * "Initialize to false" is problematic. We don't clear msi_supported >> when we have a broken interrupt controler, we set it when we have a >> working one. The consequence is that boards with non-MSI interrupt >> controllers are treated just like boards with broken interrupt >> controllers. >> >> Here's how msi_supported is documented: >> >> /* Flag for interrupt controller to declare MSI/MSI-X support */ >> bool msi_supported; >> >> This is matches how the code works. However, it contradicts the >> commit message Michael quoted. The most plausible explanation is that >> the commit is flawed. >> >> * Class-initialize (2.) looks wrong to me. msi_supported becomes true >> when QOM type "s390-pcihost" is created, regardless of whether >> instances get created, let alone used. >> >> * I'm not sure about 3., but the spapr guys can worry about that. >> >> >> >And the conclusion from that is that for msi_init to fail silently is >> >> >at the moment the right thing to do. >> >> >> >> But this is not the only thing we do, we are modifying the PCI devices. We could fail starting the VM >> >> if a device supporting MSI is added on a platform with broken msi, but this will prevent us to use >> >> assigned devices. Emulated devices should be created with a specific "msi=off" flag. >> >> >> >> Thanks, >> >> Marcel >> > >> > That will just break a bunch of valid configurations, for no real >> > benefit to users. >> >> I disagree, strongly. >> >> If I ask for msi=on, then QEMU should either give it to me or fail, not >> silently "correct" my configuration. >> >> Of course, if we have msi_supported = false for everybody and its dog >> whether it's really needed or not, "or fail" will indeed reject "a bunch >> of valid configurations". We "compensate" by silently messing with the >> user's configuration. That's shoddy workmanship, sorry. >> >> We should instead have msi_supported = false only when it's actually >> needed, and thus reject exactly the configurations that won't work. > > Most people don't care one way or the other. We started without msi. > We later added msi for some boards only. Asking everyone to add msi=off > for all other boards at that point would have been silly. I'm afraid you're missing my point entirely. Yes, we started without MSI. We then added MSI support to some devices and to some boards. Exactly the same history as for physical hardware. Plugging an MSI-capable device into an MSI-incapable board works just fine, both for physical and for virtual hardware. What doesn't work is plugging an MSI-capable device into an MSI-capable board with *broken* MSI support. As a convenience feature, we summarily and silently remove a device's MSI capability when we detect such a broken board. At least that's what the commit message you quoted claims. In reality, we remove it not just for broken boards, but even for MSI-incapable boards. I take issue with "summarily and silently", and "even for MSI-incapable boards". When multiple variants of a device exist, and the user didn't ask for a specific one, then picking the one that works best with the board is just fine. It's absolutely not fine when the user did ask for a specific one. When I ask for msi=on, I mean it. If it can't work with this board, tell me. But silently ignoring my orders is a bug. It's fine to have emulations of MSI-capable boards where MSI doesn't yet work. Even if that means we have to reject MSI-capable devices. It's absolutely not fine to reject them for MSI-incapable boards, where they'd work just fine. Furthermore, I doubt the wisdom of creating virtual devices that don't exist physically just to have something that works in our broken boards. If the physical device exists in MSI-capable and MSI-incapable variants, emulating both is fine. But if it only ever existed MSI-capable, having the PCI core(!) drop the MSI capability is a questionable idea. I suspect that the need for this dubious hack would be much smaller if we didn't foolishly treat every MSI-incapable board as broken MSI-capable board. If you conclude that cleaning up this carelessly made mess is not worth the bother now, then at least explain the mess in the code, please. It's not obvious, and figuring out what's going on and why it is the way it is has taken me several hours, and Marcel's help. [...]
On 03/02/2016 05:13 PM, Markus Armbruster wrote: > This got lost over the Christmas break, sorry. > > Cc'ing Marcel for additional PCI expertise. > > Cao jin <caoj.fnst@cn.fujitsu.com> writes: > >> msi_init() is a supporting function in PCI device initialization, >> in order to convert .init() to .realize(), it should be modified first. > > "Supporting function" doesn't imply "should use Error to report > errors". HACKING explains: > > Use the simplest suitable method to communicate success / failure to > callers. Stick to common methods: non-negative on success / -1 on > error, non-negative / -errno, non-null / null, or Error objects. > > Example: when a function returns a non-null pointer on success, and it > can fail only in one way (as far as the caller is concerned), returning > null on failure is just fine, and certainly simpler and a lot easier on > the eyes than propagating an Error object through an Error ** parameter. > > Example: when a function's callers need to report details on failure > only the function really knows, use Error **, and set suitable errors. > > Do not report an error to the user when you're also returning an error > for somebody else to handle. Leave the reporting to the place that > consumes the error returned. > Really appreciate your review, I just finished reading all the comments and discussion. Seems pci_add_capability2()(commit cd9aa33e introduced) doesn`t follow the new error reporting rule(report error while also return error). So I am thinking, could we revert commit cd9aa33e, let pci_add_capability() return error code and assert when out of pci space, and let caller(only assigned device, others could ignore the error) handle the error code(new a error object, propagate it) Hope to hear PCI Maintainer`s advice(So I don`t cc other in this round)
Cao jin <caoj.fnst@cn.fujitsu.com> writes: > On 03/02/2016 05:13 PM, Markus Armbruster wrote: >> This got lost over the Christmas break, sorry. >> >> Cc'ing Marcel for additional PCI expertise. >> >> Cao jin <caoj.fnst@cn.fujitsu.com> writes: >> >>> msi_init() is a supporting function in PCI device initialization, >>> in order to convert .init() to .realize(), it should be modified first. >> >> "Supporting function" doesn't imply "should use Error to report >> errors". HACKING explains: >> >> Use the simplest suitable method to communicate success / failure to >> callers. Stick to common methods: non-negative on success / -1 on >> error, non-negative / -errno, non-null / null, or Error objects. >> >> Example: when a function returns a non-null pointer on success, and it >> can fail only in one way (as far as the caller is concerned), returning >> null on failure is just fine, and certainly simpler and a lot easier on >> the eyes than propagating an Error object through an Error ** parameter. >> >> Example: when a function's callers need to report details on failure >> only the function really knows, use Error **, and set suitable errors. >> >> Do not report an error to the user when you're also returning an error >> for somebody else to handle. Leave the reporting to the place that >> consumes the error returned. >> > > Really appreciate your review, I just finished reading all the > comments and discussion. > > Seems pci_add_capability2()(commit cd9aa33e introduced) doesn`t follow > the new error reporting rule(report error while also return error). Misunderstanding? "Report an error to the user" means error_report() and such. error_setg() doesn't report to the user, it returns an error object to the caller. > So I am thinking, could we revert commit cd9aa33e, let > pci_add_capability() return error code and assert when out of pci > space, and let caller(only assigned device, others could ignore the > error) handle the error code(new a error object, propagate it) > > Hope to hear PCI Maintainer`s advice(So I don`t cc other in this round)
On 03/23/2016 04:12 PM, Markus Armbruster wrote: > Cao jin <caoj.fnst@cn.fujitsu.com> writes: > >> >> Really appreciate your review, I just finished reading all the >> comments and discussion. >> >> Seems pci_add_capability2()(commit cd9aa33e introduced) doesn`t follow >> the new error reporting rule(report error while also return error). > > Misunderstanding? > > "Report an error to the user" means error_report() and such. > error_setg() doesn't report to the user, it returns an error object to > the caller. > ah...thanks for correcting me. >> So I am thinking, could we revert commit cd9aa33e, let >> pci_add_capability() return error code and assert when out of pci >> space, and let caller(only assigned device, others could ignore the >> error) handle the error code(new a error object, propagate it) >> >> Hope to hear PCI Maintainer`s advice(So I don`t cc other in this round) >
diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c index 433463e..0d770131 100644 --- a/hw/audio/intel-hda.c +++ b/hw/audio/intel-hda.c @@ -1130,6 +1130,7 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) { IntelHDAState *d = INTEL_HDA(pci); uint8_t *conf = d->pci.config; + int ret; d->name = object_get_typename(OBJECT(d)); @@ -1142,11 +1143,18 @@ static void intel_hda_realize(PCIDevice *pci, Error **errp) "intel-hda", 0x4000); pci_register_bar(&d->pci, 0, 0, &d->mmio); if (d->msi) { - msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, false); + ret = msi_init(&d->pci, d->old_msi_addr ? 0x50 : 0x60, 1, true, + false, errp); + if (ret < 0) { + goto cleanup_on_msi_fail; + } } hda_codec_bus_init(DEVICE(pci), &d->codecs, sizeof(d->codecs), intel_hda_response, intel_hda_xfer); + +cleanup_on_msi_fail: + object_unref(OBJECT(&d->mmio)); } static void intel_hda_exit(PCIDevice *pci) diff --git a/hw/ide/ich.c b/hw/ide/ich.c index 16925fa..94b1809 100644 --- a/hw/ide/ich.c +++ b/hw/ide/ich.c @@ -145,7 +145,7 @@ static void pci_ich9_ahci_realize(PCIDevice *dev, Error **errp) /* Although the AHCI 1.3 specification states that the first capability * should be PMCAP, the Intel ICH9 data sheet specifies that the ICH9 * AHCI device puts the MSI capability first, pointing to 0x80. */ - msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false); + msi_init(dev, ICH9_MSI_CAP_OFFSET, 1, true, false, errp); } static void pci_ich9_uninit(PCIDevice *dev) diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c index 37373e5..c373e77 100644 --- a/hw/net/vmxnet3.c +++ b/hw/net/vmxnet3.c @@ -2135,13 +2135,13 @@ vmxnet3_cleanup_msix(VMXNET3State *s) #define VMXNET3_PER_VECTOR_MASK (false) static bool -vmxnet3_init_msi(VMXNET3State *s) +vmxnet3_init_msi(VMXNET3State *s, Error **errp) { PCIDevice *d = PCI_DEVICE(s); int res; res = msi_init(d, VMXNET3_MSI_OFFSET, VMXNET3_MAX_NMSIX_INTRS, - VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK); + VMXNET3_USE_64BIT, VMXNET3_PER_VECTOR_MASK, errp); if (0 > res) { VMW_WRPRN("Failed to initialize MSI, error %d", res); s->msi_used = false; @@ -2204,6 +2204,11 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) VMW_CBPRN("Starting init..."); + if (!vmxnet3_init_msi(s, errp)) { + VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); + return; + } + memory_region_init_io(&s->bar0, OBJECT(s), &b0_ops, s, "vmxnet3-b0", VMXNET3_PT_REG_SIZE); pci_register_bar(pci_dev, VMXNET3_BAR0_IDX, @@ -2228,10 +2233,6 @@ static void vmxnet3_pci_realize(PCIDevice *pci_dev, Error **errp) VMW_WRPRN("Failed to initialize MSI-X, configuration is inconsistent."); } - if (!vmxnet3_init_msi(s)) { - VMW_WRPRN("Failed to initialize MSI, configuration is inconsistent."); - } - vmxnet3_net_init(s); register_savevm(dev, "vmxnet3-msix", -1, 1, diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c index eead195..5df9e83 100644 --- a/hw/pci-bridge/ioh3420.c +++ b/hw/pci-bridge/ioh3420.c @@ -96,6 +96,7 @@ static int ioh3420_initfn(PCIDevice *d) PCIEPort *p = PCIE_PORT(d); PCIESlot *s = PCIE_SLOT(d); int rc; + Error *local_err = NULL; pci_bridge_initfn(d, TYPE_PCIE_BUS); pcie_port_init_reg(d); @@ -105,12 +106,16 @@ static int ioh3420_initfn(PCIDevice *d) if (rc < 0) { goto err_bridge; } + rc = msi_init(d, IOH_EP_MSI_OFFSET, IOH_EP_MSI_NR_VECTOR, IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, - IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); + IOH_EP_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, + &local_err); if (rc < 0) { + error_report_err(local_err); goto err_bridge; } + rc = pcie_cap_init(d, IOH_EP_EXP_OFFSET, PCI_EXP_TYPE_ROOT_PORT, p->port); if (rc < 0) { goto err_msi; diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c index bc3e1b7..bafaeeb 100644 --- a/hw/pci-bridge/pci_bridge_dev.c +++ b/hw/pci-bridge/pci_bridge_dev.c @@ -51,6 +51,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) PCIBridge *br = PCI_BRIDGE(dev); PCIBridgeDev *bridge_dev = PCI_BRIDGE_DEV(dev); int err; + Error *local_err = NULL; pci_bridge_initfn(dev, TYPE_PCI_BUS); @@ -66,17 +67,21 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) /* MSI is not applicable without SHPC */ bridge_dev->flags &= ~(1 << PCI_BRIDGE_DEV_F_MSI_REQ); } + err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0); if (err) { goto slotid_error; } + if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) && msi_supported) { - err = msi_init(dev, 0, 1, true, true); + err = msi_init(dev, 0, 1, true, true, &local_err); if (err < 0) { + error_report_err(local_err); goto msi_error; } } + if (shpc_present(dev)) { /* TODO: spec recommends using 64 bit prefetcheable BAR. * Check whether that works well. */ @@ -84,6 +89,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev) PCI_BASE_ADDRESS_MEM_TYPE_64, &bridge_dev->bar); } return 0; + msi_error: slotid_cap_cleanup(dev); slotid_error: diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c index b4dd25f..3fc7455 100644 --- a/hw/pci-bridge/xio3130_downstream.c +++ b/hw/pci-bridge/xio3130_downstream.c @@ -59,21 +59,26 @@ static int xio3130_downstream_initfn(PCIDevice *d) PCIEPort *p = PCIE_PORT(d); PCIESlot *s = PCIE_SLOT(d); int rc; + Error *local_err = NULL; pci_bridge_initfn(d, TYPE_PCIE_BUS); pcie_port_init_reg(d); rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, + &local_err); if (rc < 0) { + error_report_err(local_err); goto err_bridge; } + rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); if (rc < 0) { goto err_bridge; } + rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_DOWNSTREAM, p->port); if (rc < 0) { @@ -103,6 +108,7 @@ err_msi: msi_uninit(d); err_bridge: pci_bridge_exitfn(d); + return rc; } diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c index 434c8fd..882271c 100644 --- a/hw/pci-bridge/xio3130_upstream.c +++ b/hw/pci-bridge/xio3130_upstream.c @@ -55,26 +55,32 @@ static int xio3130_upstream_initfn(PCIDevice *d) { PCIEPort *p = PCIE_PORT(d); int rc; + Error *local_err = NULL; pci_bridge_initfn(d, TYPE_PCIE_BUS); pcie_port_init_reg(d); rc = msi_init(d, XIO3130_MSI_OFFSET, XIO3130_MSI_NR_VECTOR, XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT, - XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT); + XIO3130_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT, + &local_err); if (rc < 0) { + error_report_err(local_err); goto err_bridge; } + rc = pci_bridge_ssvid_init(d, XIO3130_SSVID_OFFSET, XIO3130_SSVID_SVID, XIO3130_SSVID_SSID); if (rc < 0) { goto err_bridge; } + rc = pcie_cap_init(d, XIO3130_EXP_OFFSET, PCI_EXP_TYPE_UPSTREAM, p->port); if (rc < 0) { goto err_msi; } + pcie_cap_flr_init(d); pcie_cap_deverr_init(d); rc = pcie_aer_init(d, XIO3130_AER_OFFSET); diff --git a/hw/pci/msi.c b/hw/pci/msi.c index c1dd531..ad6ed09 100644 --- a/hw/pci/msi.c +++ b/hw/pci/msi.c @@ -150,15 +150,23 @@ bool msi_enabled(const PCIDevice *dev) PCI_MSI_FLAGS_ENABLE); } -int msi_init(struct PCIDevice *dev, uint8_t offset, - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask) +/* + * @nr_vectors: Multiple Message Capable field of Message Control register + * @msi64bit: support 64-bit message address or not + * @msi_per_vector_mask: support per-vector masking or not + * + * return: MSI capability offset in config space + */ +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, + bool msi64bit, bool msi_per_vector_mask, Error **errp) { unsigned int vectors_order; - uint16_t flags; + uint16_t flags; /* Message Control register value */ uint8_t cap_size; int config_offset; if (!msi_supported) { + error_setg(errp, "MSI is not supported by interrupt controller"); return -ENOTSUP; } @@ -182,7 +190,8 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, } cap_size = msi_cap_sizeof(flags); - config_offset = pci_add_capability(dev, PCI_CAP_ID_MSI, offset, cap_size); + config_offset = pci_add_capability2(dev, PCI_CAP_ID_MSI, offset, + cap_size, errp); if (config_offset < 0) { return config_offset; } @@ -205,6 +214,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset, pci_set_long(dev->wmask + msi_mask_off(dev, msi64bit), 0xffffffff >> (PCI_MSI_VECTORS_MAX - nr_vectors)); } + return config_offset; } diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c index d7dc667..ba25b3a 100644 --- a/hw/scsi/megasas.c +++ b/hw/scsi/megasas.c @@ -2339,6 +2339,17 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) /* Interrupt pin 1 */ pci_conf[PCI_INTERRUPT_PIN] = 0x01; + if (megasas_use_msi(s)) { + int ret; + + ret = msi_init(dev, 0x50, 1, true, false, errp); + if (ret > 0) { + s->flags &= ~MEGASAS_MASK_USE_MSI; + } else { + return; + } + } + memory_region_init_io(&s->mmio_io, OBJECT(s), &megasas_mmio_ops, s, "megasas-mmio", 0x4000); memory_region_init_io(&s->port_io, OBJECT(s), &megasas_port_ops, s, @@ -2346,10 +2357,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error **errp) memory_region_init_io(&s->queue_io, OBJECT(s), &megasas_queue_ops, s, "megasas-queue", 0x40000); - if (megasas_use_msi(s) && - msi_init(dev, 0x50, 1, true, false)) { - s->flags &= ~MEGASAS_MASK_USE_MSI; - } if (megasas_use_msix(s) && msix_init(dev, 15, &s->mmio_io, b->mmio_bar, 0x2000, &s->mmio_io, b->mmio_bar, 0x3800, 0x68)) { diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c index 9c71f31..073b956 100644 --- a/hw/scsi/vmw_pvscsi.c +++ b/hw/scsi/vmw_pvscsi.c @@ -1014,13 +1014,13 @@ pvscsi_io_read(void *opaque, hwaddr addr, unsigned size) static bool -pvscsi_init_msi(PVSCSIState *s) +pvscsi_init_msi(PVSCSIState *s, Error **errp) { int res; PCIDevice *d = PCI_DEVICE(s); res = msi_init(d, PVSCSI_MSI_OFFSET, PVSCSI_MSIX_NUM_VECTORS, - PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK); + PVSCSI_USE_64BIT, PVSCSI_PER_VECTOR_MASK, errp); if (res < 0) { trace_pvscsi_init_msi_fail(res); s->msi_used = false; @@ -1066,6 +1066,7 @@ static int pvscsi_init(PCIDevice *pci_dev) { PVSCSIState *s = PVSCSI(pci_dev); + Error *local_err = NULL; trace_pvscsi_state("init"); @@ -1079,12 +1080,16 @@ pvscsi_init(PCIDevice *pci_dev) /* Interrupt pin A */ pci_config_set_interrupt_pin(pci_dev->config, 1); + pvscsi_init_msi(s, &local_err); + if (local_err) { + error_report_err(local_err); + return -1; + } + memory_region_init_io(&s->io_space, OBJECT(s), &pvscsi_ops, s, "pvscsi-io", PVSCSI_MEM_SPACE_SIZE); pci_register_bar(pci_dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY, &s->io_space); - pvscsi_init_msi(s); - s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s); if (!s->completion_worker) { pvscsi_cleanup_msi(s); diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c index 268ab36..7cd5f6c 100644 --- a/hw/usb/hcd-xhci.c +++ b/hw/usb/hcd-xhci.c @@ -3572,6 +3572,43 @@ static void usb_xhci_init(XHCIState *xhci) } } +static void usb_xhci_exit(PCIDevice *dev) +{ + int i; + XHCIState *xhci = XHCI(dev); + + trace_usb_xhci_exit(); + + for (i = 0; i < xhci->numslots; i++) { + xhci_disable_slot(xhci, i + 1); + } + + if (xhci->mfwrap_timer) { + timer_del(xhci->mfwrap_timer); + timer_free(xhci->mfwrap_timer); + xhci->mfwrap_timer = NULL; + } + + memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); + memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); + memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); + memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); + + for (i = 0; i < xhci->numports; i++) { + XHCIPort *port = &xhci->ports[i]; + memory_region_del_subregion(&xhci->mem, &port->mem); + } + + /* destroy msix memory region */ + if (dev->msix_table && dev->msix_pba + && dev->msix_entry_used) { + memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); + memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); + } + + usb_bus_release(&xhci->bus); +} + static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) { int i, ret; @@ -3643,7 +3680,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) } if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI)) { - msi_init(dev, 0x70, xhci->numintrs, true, false); + ret = msi_init(dev, 0x70, xhci->numintrs, true, false, errp); + if (ret < 0) { + goto cleanup_on_msi_fail; + } } if (xhci_get_flag(xhci, XHCI_FLAG_USE_MSI_X)) { msix_init(dev, xhci->numintrs, @@ -3651,43 +3691,10 @@ static void usb_xhci_realize(struct PCIDevice *dev, Error **errp) &xhci->mem, 0, OFF_MSIX_PBA, 0x90); } -} - -static void usb_xhci_exit(PCIDevice *dev) -{ - int i; - XHCIState *xhci = XHCI(dev); - - trace_usb_xhci_exit(); - - for (i = 0; i < xhci->numslots; i++) { - xhci_disable_slot(xhci, i + 1); - } - - if (xhci->mfwrap_timer) { - timer_del(xhci->mfwrap_timer); - timer_free(xhci->mfwrap_timer); - xhci->mfwrap_timer = NULL; - } - memory_region_del_subregion(&xhci->mem, &xhci->mem_cap); - memory_region_del_subregion(&xhci->mem, &xhci->mem_oper); - memory_region_del_subregion(&xhci->mem, &xhci->mem_runtime); - memory_region_del_subregion(&xhci->mem, &xhci->mem_doorbell); - - for (i = 0; i < xhci->numports; i++) { - XHCIPort *port = &xhci->ports[i]; - memory_region_del_subregion(&xhci->mem, &port->mem); - } - - /* destroy msix memory region */ - if (dev->msix_table && dev->msix_pba - && dev->msix_entry_used) { - memory_region_del_subregion(&xhci->mem, &dev->msix_table_mmio); - memory_region_del_subregion(&xhci->mem, &dev->msix_pba_mmio); - } - - usb_bus_release(&xhci->bus); +cleanup_on_msi_fail: + usb_xhci_exit(dev); + object_unref(OBJECT(&xhci->mem)); } static int usb_xhci_post_load(void *opaque, int version_id) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 1fb868c..633642e 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -1139,7 +1139,7 @@ static void vfio_disable_interrupts(VFIOPCIDevice *vdev) } } -static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) +static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp) { uint16_t ctrl; bool msi_64bit, msi_maskbit; @@ -1147,6 +1147,7 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { + error_setg(errp, "Read error!"); return -errno; } ctrl = le16_to_cpu(ctrl); @@ -1157,12 +1158,11 @@ static int vfio_msi_setup(VFIOPCIDevice *vdev, int pos) trace_vfio_msi_setup(vdev->vbasedev.name, pos); - ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit); + ret = msi_init(&vdev->pdev, pos, entries, msi_64bit, msi_maskbit, errp); if (ret < 0) { if (ret == -ENOTSUP) { return 0; } - error_report("vfio: msi_init failed"); return ret; } vdev->msi_cap_size = 0xa + (msi_maskbit ? 0xa : 0) + (msi_64bit ? 0x4 : 0); @@ -1654,7 +1654,7 @@ static void vfio_check_af_flr(VFIOPCIDevice *vdev, uint8_t pos) } } -static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) +static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp) { PCIDevice *pdev = &vdev->pdev; uint8_t cap_id, next, size; @@ -1679,7 +1679,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) * will be changed as we unwind the stack. */ if (next) { - ret = vfio_add_std_cap(vdev, next); + ret = vfio_add_std_cap(vdev, next, errp); if (ret) { return ret; } @@ -1695,7 +1695,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) switch (cap_id) { case PCI_CAP_ID_MSI: - ret = vfio_msi_setup(vdev, pos); + ret = vfio_msi_setup(vdev, pos, errp); break; case PCI_CAP_ID_EXP: vfio_check_pcie_flr(vdev, pos); @@ -1729,7 +1729,7 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos) return 0; } -static int vfio_add_capabilities(VFIOPCIDevice *vdev) +static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp) { PCIDevice *pdev = &vdev->pdev; @@ -1738,7 +1738,7 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev) return 0; /* Nothing to add */ } - return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]); + return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST], errp); } static void vfio_pci_pre_reset(VFIOPCIDevice *vdev) @@ -2349,6 +2349,7 @@ static int vfio_initfn(PCIDevice *pdev) struct stat st; int groupid; int ret; + Error *local_err = NULL; /* Check that the host device exists */ snprintf(path, sizeof(path), @@ -2507,8 +2508,9 @@ static int vfio_initfn(PCIDevice *pdev) vfio_map_bars(vdev); - ret = vfio_add_capabilities(vdev); + ret = vfio_add_capabilities(vdev, &local_err); if (ret) { + error_report_err(local_err); goto out_teardown; } diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h index 50e452b..da1dc1a 100644 --- a/include/hw/pci/msi.h +++ b/include/hw/pci/msi.h @@ -34,8 +34,8 @@ extern bool msi_supported; void msi_set_message(PCIDevice *dev, MSIMessage msg); MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector); bool msi_enabled(const PCIDevice *dev); -int msi_init(struct PCIDevice *dev, uint8_t offset, - unsigned int nr_vectors, bool msi64bit, bool msi_per_vector_mask); +int msi_init(struct PCIDevice *dev, uint8_t offset, unsigned int nr_vectors, + bool msi64bit, bool msi_per_vector_mask, Error **errp); void msi_uninit(struct PCIDevice *dev); void msi_reset(PCIDevice *dev); void msi_notify(PCIDevice *dev, unsigned int vector);
msi_init() is a supporting function in PCI device initialization, in order to convert .init() to .realize(), it should be modified first. Also modify the callers Bonus: add more comment for msi_init(). Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> --- hw/audio/intel-hda.c | 10 ++++- hw/ide/ich.c | 2 +- hw/net/vmxnet3.c | 13 +++--- hw/pci-bridge/ioh3420.c | 7 +++- hw/pci-bridge/pci_bridge_dev.c | 8 +++- hw/pci-bridge/xio3130_downstream.c | 8 +++- hw/pci-bridge/xio3130_upstream.c | 8 +++- hw/pci/msi.c | 18 +++++++-- hw/scsi/megasas.c | 15 +++++-- hw/scsi/vmw_pvscsi.c | 13 ++++-- hw/usb/hcd-xhci.c | 81 +++++++++++++++++++++----------------- hw/vfio/pci.c | 20 +++++----- include/hw/pci/msi.h | 4 +- 13 files changed, 135 insertions(+), 72 deletions(-)