Patchwork [RESEND,v2,2/2] seabios q35: Add new PCI slot to irq routing function

login
register
mail settings
Submitter Alex Williamson
Date Feb. 15, 2013, 9:11 p.m.
Message ID <20130215211141.5990.92169.stgit@bling.home>
Download mbox | patch
Permalink /patch/220872/
State New
Headers show

Comments

Alex Williamson - Feb. 15, 2013, 9:11 p.m.
q35/ich9 doesn't use the same interrupt mapping function as
i440fx/piix.  PIRQA:D and PIRQE:H are programmed identically, but we
start at index 0, not index -1.  Slots 25 through 31 are also
programmed independently.

When running qemu w/o this patch, a device at address 0:6.0 will have
its PCI interrupt line register programmed with irq 10 (as seen by
info pci), but it actually uses irq 11 (as reported the guest).  Half
of the interrupt lines are misprogrammedi like this.  Functionally, a
fully emulated qemu guest doesn't care much, but when we try to use
device assignment, we really need to know the correct irqs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
 src/pciinit.c |   33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)
Kevin O'Connor - Feb. 21, 2013, 4:29 a.m.
On Fri, Feb 15, 2013 at 02:11:41PM -0700, Alex Williamson wrote:
> q35/ich9 doesn't use the same interrupt mapping function as
> i440fx/piix.  PIRQA:D and PIRQE:H are programmed identically, but we
> start at index 0, not index -1.  Slots 25 through 31 are also
> programmed independently.
> 
> When running qemu w/o this patch, a device at address 0:6.0 will have
> its PCI interrupt line register programmed with irq 10 (as seen by
> info pci), but it actually uses irq 11 (as reported the guest).  Half
> of the interrupt lines are misprogrammedi like this.  Functionally, a
> fully emulated qemu guest doesn't care much, but when we try to use
> device assignment, we really need to know the correct irqs.

Thanks.  Before committing this, I have a couple of questions.

[...]
> --- a/src/pciinit.c
> +++ b/src/pciinit.c
> @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = {
>      10, 10, 11, 11
>  };
>  
> +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin);

This looks like it will cause a null pointer dereference if (for what
ever odd reason) neither the piix nor mch chipset is detected.

[...]
> +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin)
> +{
> +    int irq, slot, pin_addend = 0;
> +
> +    while (pci->parent != NULL) {
> +        pin_addend += pci_bdf_to_dev(pci->bdf);
> +        pci = pci->parent;
> +    }
> +    slot = pci_bdf_to_dev(pci->bdf);
> +
> +    switch (slot) {
> +    /* Slots 0-24 rotate slot:pin mapping similar to piix above, but
> +       with a different starting index - see q35-acpi-dsdt.dsl */ 
> +    case 0 ... 24:
> +        irq = pci_irqs[(pin - 1 + pin_addend + slot) & 3];
> +        break;
> +    /* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */
> +    case 25 ... 31:
> +        irq = pci_irqs[(pin - 1 + pin_addend) & 3];
> +        break;
> +    }

I'm not sure I understand this.  Since the pin to irq mapping for each
pci slot is motherboard dependent (and not chipset dependent), why
would the qemu "motherboard" use such a weird mapping?  Why wouldn't
we just have qemu do something more consistent?

-Kevin
Alex Williamson - Feb. 21, 2013, 4:50 a.m.
On Wed, 2013-02-20 at 23:29 -0500, Kevin O'Connor wrote:
> On Fri, Feb 15, 2013 at 02:11:41PM -0700, Alex Williamson wrote:
> > q35/ich9 doesn't use the same interrupt mapping function as
> > i440fx/piix.  PIRQA:D and PIRQE:H are programmed identically, but we
> > start at index 0, not index -1.  Slots 25 through 31 are also
> > programmed independently.
> > 
> > When running qemu w/o this patch, a device at address 0:6.0 will have
> > its PCI interrupt line register programmed with irq 10 (as seen by
> > info pci), but it actually uses irq 11 (as reported the guest).  Half
> > of the interrupt lines are misprogrammedi like this.  Functionally, a
> > fully emulated qemu guest doesn't care much, but when we try to use
> > device assignment, we really need to know the correct irqs.
> 
> Thanks.  Before committing this, I have a couple of questions.
> 
> [...]
> > --- a/src/pciinit.c
> > +++ b/src/pciinit.c
> > @@ -91,8 +91,10 @@ const u8 pci_irqs[4] = {
> >      10, 10, 11, 11
> >  };
> >  
> > +static int (*pci_slot_get_irq)(struct pci_device *pci, int pin);
> 
> This looks like it will cause a null pointer dereference if (for what
> ever odd reason) neither the piix nor mch chipset is detected.

That's correct.  We could initialize it to a dummy function that just
does a dprintf.  I can send a follow-up patch if that sounds ok.

> [...]
> > +static int mch_pci_slot_get_irq(struct pci_device *pci, int pin)
> > +{
> > +    int irq, slot, pin_addend = 0;
> > +
> > +    while (pci->parent != NULL) {
> > +        pin_addend += pci_bdf_to_dev(pci->bdf);
> > +        pci = pci->parent;
> > +    }
> > +    slot = pci_bdf_to_dev(pci->bdf);
> > +
> > +    switch (slot) {
> > +    /* Slots 0-24 rotate slot:pin mapping similar to piix above, but
> > +       with a different starting index - see q35-acpi-dsdt.dsl */ 
> > +    case 0 ... 24:
> > +        irq = pci_irqs[(pin - 1 + pin_addend + slot) & 3];
> > +        break;
> > +    /* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */
> > +    case 25 ... 31:
> > +        irq = pci_irqs[(pin - 1 + pin_addend) & 3];
> > +        break;
> > +    }
> 
> I'm not sure I understand this.  Since the pin to irq mapping for each
> pci slot is motherboard dependent (and not chipset dependent), why
> would the qemu "motherboard" use such a weird mapping?  Why wouldn't
> we just have qemu do something more consistent?

PCI of course has 4 interrupt lines.  440fx only supports 4 interrupt
lines, so to attempt to evenly distribute device interrupts over those 4
lines, we shift the connections on each slot.  Q35 supports 8 interrupt
lines.  My take on this distribution is that we're reserving 4 of them
for root complex devices (everything directly attached to the host
bridge) and use the other 4 for add-in devices (everything behind PCIe
root ports or the PCIe-to-PCI bridge).  Besides load balancing, an
obvious benefit for a vendor to do this kind of layout is that it
reduces the chances of sharing interrupts between onboard devices and
plugin devices.  I'd guess that this layout likely came from looking at
real hardware and trying to do something similar.  Thanks,

Alex

Patch

diff --git a/src/pciinit.c b/src/pciinit.c
index 05c0fe8..4cdc777 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -91,8 +91,10 @@  const u8 pci_irqs[4] = {
     10, 10, 11, 11
 };
 
+static int (*pci_slot_get_irq)(struct pci_device *pci, int pin);
+
 // Return the global irq number corresponding to a host bus device irq pin.
-static int pci_slot_get_irq(struct pci_device *pci, int pin)
+static int piix_pci_slot_get_irq(struct pci_device *pci, int pin)
 {
     int slot_addend = 0;
 
@@ -104,6 +106,31 @@  static int pci_slot_get_irq(struct pci_device *pci, int pin)
     return pci_irqs[(pin - 1 + slot_addend) & 3];
 }
 
+static int mch_pci_slot_get_irq(struct pci_device *pci, int pin)
+{
+    int irq, slot, pin_addend = 0;
+
+    while (pci->parent != NULL) {
+        pin_addend += pci_bdf_to_dev(pci->bdf);
+        pci = pci->parent;
+    }
+    slot = pci_bdf_to_dev(pci->bdf);
+
+    switch (slot) {
+    /* Slots 0-24 rotate slot:pin mapping similar to piix above, but
+       with a different starting index - see q35-acpi-dsdt.dsl */ 
+    case 0 ... 24:
+        irq = pci_irqs[(pin - 1 + pin_addend + slot) & 3];
+        break;
+    /* Slots 25-31 all use LNKA mapping (or LNKE, but A:D = E:H) */
+    case 25 ... 31:
+        irq = pci_irqs[(pin - 1 + pin_addend) & 3];
+        break;
+    }
+
+    return irq;
+}
+
 /* PIIX3/PIIX4 PCI to ISA bridge */
 static void piix_isa_bridge_setup(struct pci_device *pci, void *arg)
 {
@@ -292,6 +319,8 @@  void i440fx_mem_addr_setup(struct pci_device *dev, void *arg)
         pcimem_start = 0x80000000;
     else if (RamSize <= 0xc0000000)
         pcimem_start = 0xc0000000;
+
+    pci_slot_get_irq = piix_pci_slot_get_irq;
 }
 
 void mch_mem_addr_setup(struct pci_device *dev, void *arg)
@@ -310,6 +339,8 @@  void mch_mem_addr_setup(struct pci_device *dev, void *arg)
 
     /* setup pci i/o window (above mmconfig) */
     pcimem_start = addr + size;
+
+    pci_slot_get_irq = mch_pci_slot_get_irq;
 }
 
 static const struct pci_device_id pci_platform_tbl[] = {