[v3] pci : Add pba_offset PCI quirk for Chelsio T5 devices

Message ID 1435698210-15999-1-git-send-email-glaupre@chelsio.com
State New

Commit Message

Gabriel Laupre June 30, 2015, 9:03 p.m. UTC
Fix pba_offset initialization value for Chelsio T5 Virtual Function
device. The T5 hardware has a bug in it where it reports a Pending Interrupt
Bit Array Offset of 0x8000 for its SR-IOV Virtual Functions instead
of the 0x1000 that the hardware actually uses internally. As the hardware
doesn't return the correct pba_offset value, add a quirk to instead
return a hardcoded value of 0x1000 when a Chelsio T5 VF device is
detected.

This bug has been fixed in T6, Chelsio's next chip series, but there are
no plans to respin the T5 ASIC for it. The bug is simply documented in the
T5 Errata.

v3: Check the MSI-X data against the size of the specified BAR and apply a
  quirk if the device is a Chelsio T5 Virtual Function; otherwise raise a
  config error.

v2: Replace the PCI_DEVICE_ID_CHELSIO_T5_SERIES_VF macro definition with
  Chelsio's T5 VF device identifier schema of 0x58xx

Signed-off-by: Gabriel Laupre <glaupre@chelsio.com>
---
 hw/vfio/pci.c            | 25 +++++++++++++++++++++++++
 include/hw/pci/pci_ids.h |  2 ++
 2 files changed, 27 insertions(+)

Comments

Alex Williamson June 30, 2015, 9:35 p.m. UTC | #1
On Tue, 2015-06-30 at 14:03 -0700, Gabriel Laupre wrote:
> Fix pba_offset initialization value for Chelsio T5 Virtual Function
> device. The T5 hardware has a bug in it where it reports a Pending Interrupt
> Bit Array Offset of 0x8000 for its SR-IOV Virtual Functions instead
> of the 0x1000 that the hardware actually uses internally. As the hardware
> doesn't return the correct pba_offset value, add a quirk to instead
> return a hardcoded value of 0x1000 when a Chelsio T5 VF device is
> detected.
> 
> This bug has been fixed in the Chelsio's next chip series T6 but there are
> no plans to respin the T5 ASIC for this bug. It is just documented in the
> T5 Errata and left it at that.
> 
> v3: Test the correctness of MSIX data compare to the specified BAR and apply a
>   quirk if it comes from a Chelsio T5 Virtual Function, otherwise raise a
>   config error.
> 
> v2: Replace and PCI_DEVICE_ID_CHELSIO_T5_SERIES_VF macro definition with
>   the Chelsio's T5 VF devices identifier schema of 0x58xx
> 
> Signed-off-by: Gabriel Laupre <glaupre@chelsio.com>
> ---
>  hw/vfio/pci.c            | 25 +++++++++++++++++++++++++
>  include/hw/pci/pci_ids.h |  2 ++
>  2 files changed, 27 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index e0e339a..797fedb 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2252,6 +2252,31 @@ static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
>      vdev->msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
>      vdev->msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
>  
> +    /* Test the size of the pba variables and catch if they extend outside of
> +     * the specified BAR. If it is the case, we have a broken configuration or
> +     * we need to apply a hardware specific quirk. */

Please don't introduce new comment styles, this file is very consistent
in using the following format for multi-line comments:

/*
 * Line 1
 * Line 2
 */

> +    if (vdev->msix->table_offset >=
> +        vdev->bars[vdev->msix->table_bar].region.size ||
> +        vdev->msix->pba_offset >=
> +        vdev->bars[vdev->msix->pba_bar].region.size) {

We're testing both the vector table and PBA offsets relative to the BAR
size, so the comment is slightly off in mentioning only the PBA.

> +
> +        PCIDevice *pdev = &vdev->pdev;
> +        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
> +        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
> +
> +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
> +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
> +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
> +         * here. */

Comment style...

Notice that in my example I included that the VF BAR was only 8K, which
I think is a useful data point.

> +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
> +            vdev->msix->pba_offset = 0x1000;
> +        } else {
> +            error_report("vfio: Hardware reports invalid configuration, "
> +            "MSIX data outside of specified BAR");

Please look at how the rest of the file wraps long lines.  The standard I
try to use is to align continuation lines just to the right of the
open paren of the function call, ex:

foo("blah blah... "
    "blah blah");

When that's not possible, additional lines should at least be indented
deeper than the first line.  Always try to be consistent with the file
you're contributing to.

> +            return -EINVAL;
> +        }
> +    }
> +
>      trace_vfio_early_setup_msix(vdev->vbasedev.name, pos,
>                                  vdev->msix->table_bar,
>                                  vdev->msix->table_offset,
> diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> index 49c062b..d98e6c9 100644
> --- a/include/hw/pci/pci_ids.h
> +++ b/include/hw/pci/pci_ids.h
> @@ -114,6 +114,8 @@
>  #define PCI_VENDOR_ID_ENSONIQ            0x1274
>  #define PCI_DEVICE_ID_ENSONIQ_ES1370     0x5000
>  
> +#define PCI_VENDOR_ID_CHELSIO            0x1425
> +
>  #define PCI_VENDOR_ID_FREESCALE          0x1957
>  #define PCI_DEVICE_ID_MPC8533E           0x0030
>
Bandan Das June 30, 2015, 9:58 p.m. UTC | #2
Gabriel Laupre <glaupre@chelsio.com> writes:
...
> +    /* Test the size of the pba variables and catch if they extend outside of
> +     * the specified BAR. If it is the case, we have a broken configuration or
> +     * we need to apply a hardware specific quirk. */
> +    if (vdev->msix->table_offset >=
> +        vdev->bars[vdev->msix->table_bar].region.size ||
> +        vdev->msix->pba_offset >=
> +        vdev->bars[vdev->msix->pba_bar].region.size) {
> +
> +        PCIDevice *pdev = &vdev->pdev;
> +        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
> +        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
> +
> +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
> +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
> +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
> +         * here. */
> +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
> +            vdev->msix->pba_offset = 0x1000;

For the rare case where table_offset is wrong for the device being checked for
above and pba_offset is actually correct, shouldn't we fail ?

> +        } else {
> +            error_report("vfio: Hardware reports invalid configuration, "
> +            "MSIX data outside of specified BAR");

Since we are printing anyway, and we have already made the check above, why
not print exactly what's wrong instead of "MSIX data" ?

> +            return -EINVAL;
> +        }
> +    }
> +
>      trace_vfio_early_setup_msix(vdev->vbasedev.name, pos,
>                                  vdev->msix->table_bar,
>                                  vdev->msix->table_offset,
> diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> index 49c062b..d98e6c9 100644
> --- a/include/hw/pci/pci_ids.h
> +++ b/include/hw/pci/pci_ids.h
> @@ -114,6 +114,8 @@
>  #define PCI_VENDOR_ID_ENSONIQ            0x1274
>  #define PCI_DEVICE_ID_ENSONIQ_ES1370     0x5000
>  
> +#define PCI_VENDOR_ID_CHELSIO            0x1425
> +
>  #define PCI_VENDOR_ID_FREESCALE          0x1957
>  #define PCI_DEVICE_ID_MPC8533E           0x0030
Alex Williamson June 30, 2015, 10:28 p.m. UTC | #3
On Tue, 2015-06-30 at 17:58 -0400, Bandan Das wrote:
> Gabriel Laupre <glaupre@chelsio.com> writes:
> ...
> > +    /* Test the size of the pba variables and catch if they extend outside of
> > +     * the specified BAR. If it is the case, we have a broken configuration or
> > +     * we need to apply a hardware specific quirk. */
> > +    if (vdev->msix->table_offset >=
> > +        vdev->bars[vdev->msix->table_bar].region.size ||
> > +        vdev->msix->pba_offset >=
> > +        vdev->bars[vdev->msix->pba_bar].region.size) {
> > +
> > +        PCIDevice *pdev = &vdev->pdev;
> > +        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
> > +        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
> > +
> > +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
> > +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
> > +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
> > +         * here. */
> > +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
> > +            vdev->msix->pba_offset = 0x1000;
> 
> For the rare case where table_offset is wrong for the device being checked for
> above and pba_offset is actually correct, shouldn't we fail ?
> 
> > +        } else {
> > +            error_report("vfio: Hardware reports invalid configuration, "
> > +            "MSIX data outside of specified BAR");
> 
> Since we are printing anyway, and we have already made the check above, why
> not print exactly what's wrong instead of "MSIX data" ?

Probably diminishing returns to get too specific, we just need to know
that it's a hardware bug.  If we want the test to be more thorough, it
should be extracted from msix_init() so we're not duplicating code.
Thanks,

Alex

> > +            return -EINVAL;
> > +        }
> > +    }
> > +
> >      trace_vfio_early_setup_msix(vdev->vbasedev.name, pos,
> >                                  vdev->msix->table_bar,
> >                                  vdev->msix->table_offset,
> > diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
> > index 49c062b..d98e6c9 100644
> > --- a/include/hw/pci/pci_ids.h
> > +++ b/include/hw/pci/pci_ids.h
> > @@ -114,6 +114,8 @@
> >  #define PCI_VENDOR_ID_ENSONIQ            0x1274
> >  #define PCI_DEVICE_ID_ENSONIQ_ES1370     0x5000
> >  
> > +#define PCI_VENDOR_ID_CHELSIO            0x1425
> > +
> >  #define PCI_VENDOR_ID_FREESCALE          0x1957
> >  #define PCI_DEVICE_ID_MPC8533E           0x0030
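
To illustrate the more thorough test Alex mentions above, here is a minimal standalone sketch of a range check that could in principle be shared rather than duplicated. It is not the actual msix_init() code: the helper name msix_layout_fits() and its parameters are hypothetical. It assumes 16-byte vector table entries and one pending bit per vector packed into 64-bit words, as MSI-X defines.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16u /* bytes per MSI-X vector table entry */

/*
 * Hypothetical helper: returns true if both the vector table and the PBA
 * fit entirely inside their respective BARs.
 */
static bool msix_layout_fits(uint32_t entries,
                             uint64_t table_offset, uint64_t table_bar_size,
                             uint64_t pba_offset, uint64_t pba_bar_size)
{
    uint64_t table_size = (uint64_t)entries * MSIX_ENTRY_SIZE;
    /* One pending bit per vector, rounded up to whole 64-bit words. */
    uint64_t pba_size = ((uint64_t)entries + 63) / 64 * 8;

    /* Compare against the remaining space to avoid offset + size overflow. */
    if (table_offset > table_bar_size ||
        table_size > table_bar_size - table_offset) {
        return false;
    }
    if (pba_offset > pba_bar_size ||
        pba_size > pba_bar_size - pba_offset) {
        return false;
    }
    return true;
}
```

With an 8K BAR, a PBA offset of 0x1000 passes this check and the 0x8000 reported by the T5 VF fails it.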
Casey Leedom June 30, 2015, 10:59 p.m. UTC | #4
Or you end up formalizing the concept of PCI Quirks as the kernel does rather than scattering special exception code through the source ...

Casey
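
A rough sketch of what such a formalized quirk table might look like, assuming a simple lookup keyed on vendor ID and a masked device ID. Every name here (struct msix_info, struct msix_quirk, msix_apply_quirks) is hypothetical and does not correspond to existing QEMU or kernel code; only the Chelsio vendor ID, the 0x58xx match and the 0x1000 fixup come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_ID_CHELSIO 0x1425u

/* Hypothetical per-device MSI-X state, reduced to what the quirk touches. */
struct msix_info {
    uint32_t pba_offset;
};

typedef void (*quirk_fn)(struct msix_info *msix);

/* One table row: which devices match, and what to fix up for them. */
struct msix_quirk {
    uint16_t vendor;
    uint16_t device;      /* compared after applying device_mask */
    uint16_t device_mask;
    quirk_fn fixup;
};

/* T5 VFs report a PBA offset of 0x8000; the hardware really uses 0x1000. */
static void chelsio_t5_vf_pba_fixup(struct msix_info *msix)
{
    msix->pba_offset = 0x1000;
}

static const struct msix_quirk msix_quirks[] = {
    { VENDOR_ID_CHELSIO, 0x5800, 0xff00, chelsio_t5_vf_pba_fixup },
};

/* Run the first matching fixup; return 1 if one applied, 0 otherwise. */
static int msix_apply_quirks(uint16_t vendor, uint16_t device,
                             struct msix_info *msix)
{
    size_t i;

    for (i = 0; i < sizeof(msix_quirks) / sizeof(msix_quirks[0]); i++) {
        const struct msix_quirk *q = &msix_quirks[i];

        if (vendor == q->vendor && (device & q->device_mask) == q->device) {
            q->fixup(msix);
            return 1;
        }
    }
    return 0;
}
```

The point of the table is that a new hardware exception becomes one more row and one small fixup function, instead of another vendor/device test inside the setup path.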
Gabriel Laupre July 1, 2015, 1:28 a.m. UTC | #5
@ Bandan
...
> > +
> > +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
> > +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
> > +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
> > +         * here. */
> > +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
> > +            vdev->msix->pba_offset = 0x1000;

> For the rare case where table_offset is wrong for the device being checked for above and pba_offset is actually correct, shouldn't we fail ?

I don't know whether it is worth doing all the tests here, because msix_init() already checks all the sizes. I would prefer to keep this test as is to keep the quirk simple, i.e. just test the device first; if a size other than the pba_offset is wrong, the sanity check in msix_init() will catch the error.

@ Alex
I corrected what you pointed out. I will send the patch v4 in a minute.

Thank you

Gabriel
Bandan Das July 1, 2015, 1:47 a.m. UTC | #6
Gabriel Laupre <glaupre@chelsio.com> writes:

> @ Bandan
> ...
>> > +
>> > +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
>> > +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
>> > +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
>> > +         * here. */
>> > +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
>> > +            vdev->msix->pba_offset = 0x1000;
>
>> For the rare case where table_offset is wrong for the device being
>> checked for above and pba_offset is actually correct, shouldn't we
>> fail ?
>
> I don't know if it is relevant to do all the tests here because in the
> function msix_init() all size are checked. I would prefer keeping this
> test as this to simplify the quirk, i.e. just testing the device
> first, and if another size than the pba_offset is wrong, then the
> sanity check in the function msix_init() will catch the error.

Ok, here's the excerpt:

+    /* Test the size of the pba variables and catch if they extend outside of
+     * the specified BAR. If it is the case, we have a broken configuration or
+     * we need to apply a hardware specific quirk. */
+    if (vdev->msix->table_offset >=
+        vdev->bars[vdev->msix->table_bar].region.size ||
+        vdev->msix->pba_offset >=
+        vdev->bars[vdev->msix->pba_bar].region.size) {
+
+        PCIDevice *pdev = &vdev->pdev;
+        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
+        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
+
+        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
+         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
+         * the VF PBA offset. The correct value is 0x1000, so we hard code that
+         * here. */
+        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
+            vdev->msix->pba_offset = 0x1000;
+        } else {
+            error_report("vfio: Hardware reports invalid configuration, "
+            "MSIX data outside of specified BAR");
+            return -EINVAL;
+        }
+    }


What you are suggesting is:
If table_offset is not as expected, then check if it's a chelsio device.
If it's not, then print a message. On the other hand, if it's a chelsio
device, then let msix_init() catch the error. Why ? And if we are sure
that msix_init will error out, what's the purpose of the table_offset
check ?

> @ Alex I corrected what you pointed out. I will send the patch v4 in a
> minute.
>
> Thanks you
>
> Gabriel
Gabriel Laupre July 1, 2015, 1:53 a.m. UTC | #7
Right,

I may have sent the patch a bit too soon; I need to take care of that.

-----Original Message-----
From: Bandan Das [mailto:bsd@redhat.com] 
Sent: Tuesday, June 30, 2015 6:48 PM
To: Gabriel Laupre
Cc: jb-gnumlists@wisemo.com; Casey Leedom; mst@redhat.com; qemu-devel@nongnu.org; Anish Bhatt; Michael Boksanyi; alex.williamson@redhat.com; bsd@makefile.in
Subject: Re: [Qemu-devel] [PATCH v3] pci : Add pba_offset PCI quirk for Chelsio T5 devices

Gabriel Laupre <glaupre@chelsio.com> writes:

> @ Bandan
> ...
>> > +
>> > +        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
>> > +         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
>> > +         * the VF PBA offset. The correct value is 0x1000, so we hard code that
>> > +         * here. */
>> > +        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
>> > +            vdev->msix->pba_offset = 0x1000;
>
>> For the rare case where table_offset is wrong for the device being
>> checked for above and pba_offset is actually correct, shouldn't we
>> fail ?
>
> I don't know if it is relevant to do all the tests here because in the 
> function msix_init() all size are checked. I would prefer keeping this 
> test as this to simplify the quirk, i.e. just testing the device 
> first, and if another size than the pba_offset is wrong, then the 
> sanity check in the function msix_init() will catch the error.

Ok, here's the excerpt:

+    /* Test the size of the pba variables and catch if they extend outside of
+     * the specified BAR. If it is the case, we have a broken configuration or
+     * we need to apply a hardware specific quirk. */
+    if (vdev->msix->table_offset >=
+        vdev->bars[vdev->msix->table_bar].region.size ||
+        vdev->msix->pba_offset >=
+        vdev->bars[vdev->msix->pba_bar].region.size) {
+
+        PCIDevice *pdev = &vdev->pdev;
+        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
+        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
+
+        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
+         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
+         * the VF PBA offset. The correct value is 0x1000, so we hard code that
+         * here. */
+        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
+            vdev->msix->pba_offset = 0x1000;
+        } else {
+            error_report("vfio: Hardware reports invalid configuration, "
+            "MSIX data outside of specified BAR");
+            return -EINVAL;
+        }
+    }


What you are suggesting is:
If table_offset is not as expected, then check if it's a chelsio device.
If it's not, then print a message. On the other hand, if it's a chelsio device, then let msix_init() catch the error. Why ? And if we are sure that msix_init will error out, what's the purpose of the table_offset check ?

> @ Alex I corrected what you pointed out. I will send the patch v4 in a 
> minute.
>
> Thanks you
>
> Gabriel
Gabriel Laupre July 1, 2015, 2:13 a.m. UTC | #8
> What you are suggesting is:
> If table_offset is not as expected, then check if it's a chelsio device.
> If it's not, then print a message. On the other hand, if it's a chelsio device, then let msix_init() catch the error. Why ? And if we are sure that msix_init will error out, what's the purpose of the table_offset check ?

The first test of table_offset is indeed useless for the quirk as we know that the hardware problem comes from the pba_offset. We shouldn't mix the different potential errors in the same test, I agree.


Gabriel Laupre July 1, 2015, 6:10 p.m. UTC | #9
> What you are suggesting is:
> If table_offset is not as expected, then check if it's a chelsio device.
> If it's not, then print a message. On the other hand, if it's a chelsio device, then let msix_init() catch the error. Why ? And if we are sure that msix_init will error out, what's the purpose of the table_offset check ?

The test only needs to check the pba_offset; reducing it as follows is enough:

...
    /*   
     * Test the size of the pba_offset variable and catch if it extends outside
     * of the specified BAR. If it is the case, we need to apply a hardware
     * specific quirk if the device is known or we have a broken configuration.
     */
    if (vdev->msix->pba_offset >=
        vdev->bars[vdev->msix->pba_bar].region.size) {

        PCIDevice *pdev = &vdev->pdev;
        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);

        /*
         * Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
         * the VF PBA offset while the BAR itself is only 8k. The correct value
         * is 0x1000, so we hard code that here.
         */
        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
            vdev->msix->pba_offset = 0x1000;
        } else {
            error_report("vfio: Hardware reports invalid configuration, "
                         "MSIX data outside of specified BAR");
            return -EINVAL;
        }
    }  
...

The hardware problem is related only to the pba_offset, and the purpose of the quirk is to correct that known hardware error. The table_offset has never been observed to be wrong, so the msix_init() sanity check should take care of the "rare" potential error you mentioned.

This time I'll wait for ACKs from your side before submitting a new version :)

Gabriel
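
For anyone who wants to exercise the reduced logic outside of QEMU, here is a standalone sketch of the same decision. It is only a model, not the patch itself: the function name check_pba_offset() and its flat parameter list are hypothetical stand-ins for the vdev fields, and the macro is redefined locally so the snippet compiles on its own.

```c
#include <errno.h>
#include <stdint.h>

#define PCI_VENDOR_ID_CHELSIO 0x1425

/*
 * Hypothetical model of the reduced check. Returns 0 if the PBA offset is
 * usable (possibly after the quirk rewrote it), -EINVAL otherwise.
 */
static int check_pba_offset(uint16_t vendor, uint16_t device,
                            uint64_t pba_bar_size, uint32_t *pba_offset)
{
    /* Offset inside the BAR: nothing to fix. */
    if (*pba_offset < pba_bar_size) {
        return 0;
    }

    /* Chelsio T5 VFs (device IDs 0x58xx) use 0x1000 internally. */
    if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
        *pba_offset = 0x1000;
        return 0;
    }

    /* Unknown device with the PBA outside of its BAR. */
    return -EINVAL;
}
```

With an 8K BAR (0x2000), a T5 VF reporting 0x8000 ends up with 0x1000, while any other device reporting the same offset gets -EINVAL and its offset is left untouched.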

Alex Williamson July 1, 2015, 6:18 p.m. UTC | #10
On Wed, 2015-07-01 at 18:10 +0000, Gabriel Laupre wrote:
> > What you are suggesting is:
> > If table_offset is not as expected, then check if it's a chelsio device.
> > If it's not, then print a message. On the other hand, if it's a chelsio device, then let msix_init() catch the error. Why ? And if we are sure that msix_init will error out, what's the purpose of the table_offset check ?
> 
> The test needs only to check the pba_offset and reduced as following is enough
> 
> ...
>     /*   
>      * Test the size of the pba_offset variable and catch if it extends outside
>      * of the specified BAR. If it is the case, we need to apply a hardware
>      * specific quirk if the device is known or we have a broken configuration.
>      */
>     if (vdev->msix->pba_offset >=
>         vdev->bars[vdev->msix->pba_bar].region.size) {
> 
>         PCIDevice *pdev = &vdev->pdev;
>         uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
>         uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
> 
>         /*
>          * Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
>          * adapters. The T5 hardware returns an incorrect value of 0x8000 for
>          * the VF PBA offset while the BAR itself is only 8k. The correct value
>          * is 0x1000, so we hard code that here.
>          */
>         if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
>             vdev->msix->pba_offset = 0x1000;
>         } else {
>             error_report("vfio: Hardware reports invalid configuration, "
>                          "MSIX data outside of specified BAR");
>             return -EINVAL;
>         }
>     }  
> ...
> 
> As the hardware problem is only related with the pba_offset and the purpose of the quirk is to correct the known hardware error. The table_offset has never been seen as wrong. Therefore the msix_init() sanity check should take care of a "rare" potential error as you mentioned. 
> 
> This time I'll wait for ACKs from your side before submitting a new version :)

I would s/data/PBA/ in the error_report text for this version.  Thanks,

Alex

Bandan Das July 1, 2015, 6:27 p.m. UTC | #11
Alex Williamson <alex.williamson@redhat.com> writes:
...
>>          */
>>         if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
>>             vdev->msix->pba_offset = 0x1000;
>>         } else {
>>             error_report("vfio: Hardware reports invalid configuration, "
>>                          "MSIX data outside of specified BAR");
>>             return -EINVAL;
>>         }
>>     }  
>> ...
>> 
>> As the hardware problem is only related with the pba_offset and the purpose of the quirk is to correct the known hardware error. The table_offset has never been seen as wrong. Therefore the msix_init() sanity check should take care of a "rare" potential error as you mentioned. 
>> 
>> This time I'll wait for ACKs from your side before submitting a new version :)
>
> I would s/data/PBA/ in the error_report text for this version.  Thanks,

Looks good to me, Gabriel + what Alex mentioned above.

Thanks,
Bandan
Patch

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index e0e339a..797fedb 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2252,6 +2252,31 @@  static int vfio_early_setup_msix(VFIOPCIDevice *vdev)
     vdev->msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
     vdev->msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
 
+    /* Test the size of the pba variables and catch if they extend outside of
+     * the specified BAR. If it is the case, we have a broken configuration or
+     * we need to apply a hardware specific quirk. */
+    if (vdev->msix->table_offset >=
+        vdev->bars[vdev->msix->table_bar].region.size ||
+        vdev->msix->pba_offset >=
+        vdev->bars[vdev->msix->pba_bar].region.size) {
+
+        PCIDevice *pdev = &vdev->pdev;
+        uint16_t vendor = pci_get_word(pdev->config + PCI_VENDOR_ID);
+        uint16_t device = pci_get_word(pdev->config + PCI_DEVICE_ID);
+
+        /* Chelsio T5 Virtual Function devices are encoded as 0x58xx for T5
+         * adapters. The T5 hardware returns an incorrect value of 0x8000 for
+         * the VF PBA offset. The correct value is 0x1000, so we hard code that
+         * here. */
+        if (vendor == PCI_VENDOR_ID_CHELSIO && (device & 0xff00) == 0x5800) {
+            vdev->msix->pba_offset = 0x1000;
+        } else {
+            error_report("vfio: Hardware reports invalid configuration, "
+            "MSIX data outside of specified BAR");
+            return -EINVAL;
+        }
+    }
+
     trace_vfio_early_setup_msix(vdev->vbasedev.name, pos,
                                 vdev->msix->table_bar,
                                 vdev->msix->table_offset,
diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h
index 49c062b..d98e6c9 100644
--- a/include/hw/pci/pci_ids.h
+++ b/include/hw/pci/pci_ids.h
@@ -114,6 +114,8 @@ 
 #define PCI_VENDOR_ID_ENSONIQ            0x1274
 #define PCI_DEVICE_ID_ENSONIQ_ES1370     0x5000
 
+#define PCI_VENDOR_ID_CHELSIO            0x1425
+
 #define PCI_VENDOR_ID_FREESCALE          0x1957
 #define PCI_DEVICE_ID_MPC8533E           0x0030