Patchwork [Xen-devel,v4,18/22] xen/pci: Pay attention to PCI_MSIX_TABLE_OFFSET

login
register
mail settings
Submitter Bjorn Helgaas
Date April 25, 2013, 4:42 p.m.
Message ID <20130425164238.GA3052@google.com>
Download mbox | patch
Permalink /patch/239566/
State Not Applicable
Headers show

Comments

Bjorn Helgaas - April 25, 2013, 4:42 p.m.
On Thu, Apr 25, 2013 at 10:40:57AM +0100, Jan Beulich wrote:
> >>> On 24.04.13 at 18:34, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > On Mon, Apr 22, 2013 at 5:12 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> >> The MSI-X Table structure may be at a non-zero offset into the
> >> device BAR, and we should account for that.
> 
> NAK: The base is just being use to pass to the hypervisor, which
> then takes care to add the offset.

Thanks for noticing this.  I updated the patch to just add a comment to
avert future confusion, and refreshed the subsequent xen patches as below.

> Recent hypervisors will actually
> only consume this to issue a warning if not matching what gets
> read from the corresponding BAR. Earlier hypervisors used this
> instead of reading the BAR.

Note that pci_resource_start() gives you a CPU address, and what you
read from the BAR is a PCI bus address.  These are not in the same
address space and can't be directly compared.  I assume the
hypervisors take that into account and do the appropriate
conversions?

Bjorn


commit 34f394aad6706a42fb69922ed184c918ec9f9f81
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Apr 18 15:02:34 2013 -0600

    xen/pci: Comment on unusual PCI_MSIX_TABLE usage
    
    To find the MSI-X Table structure, you look at the BAR specified
    by PCI_MSIX_FLAGS_BIRMASK, then apply the PCI_MSIX_TABLE_OFFSET to
    the address from the BAR.  So most readers of PCI_MSIX_TABLE use
    PCI_MSIX_TABLE_OFFSET, but in xen's case, the hypervisor takes
    care of applying the offset, so we don't need to do it here.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    CC: Jan Beulich <JBeulich@suse.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Beulich - April 26, 2013, 7:16 a.m.
>>> On 25.04.13 at 18:42, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, Apr 25, 2013 at 10:40:57AM +0100, Jan Beulich wrote:
>> Recent hypervisors will actually
>> only consume this to issue a warning if not matching what gets
>> read from the corresponding BAR. Earlier hypervisors used this
>> instead of reading the BAR.
> 
> Note that pci_resource_start() gives you a CPU address, and what you
> read from the BAR is a PCI bus address.  These are not in the same
> address space and can't be directly compared.  I assume the
> hypervisors take that into account and do the appropriate
> conversions?

I suppose Xen has never been run on a system where the two
would differ, and it's quite likely that it would break on such
systems. Question is - are there any x86-based systems where
this is the case?

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas - April 26, 2013, 2:50 p.m.
On Fri, Apr 26, 2013 at 1:16 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 25.04.13 at 18:42, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Thu, Apr 25, 2013 at 10:40:57AM +0100, Jan Beulich wrote:
>>> Recent hypervisors will actually
>>> only consume this to issue a warning if not matching what gets
>>> read from the corresponding BAR. Earlier hypervisors used this
>>> instead of reading the BAR.
>>
>> Note that pci_resource_start() gives you a CPU address, and what you
>> read from the BAR is a PCI bus address.  These are not in the same
>> address space and can't be directly compared.  I assume the
>> hypervisors take that into account and do the appropriate
>> conversions?
>
> I suppose Xen has never been run on a system where the two
> would differ, and it's quite likely that it would break on such
> systems. Question is - are there any x86-based systems where
> this is the case?

I haven't personally seen one, but it looks like they're coming, based
on this patch we merged last year:

commit b4873931cc8c934a9893d5962bde97aca23be983
Author: Mike Yoknis <mike.yoknis@hp.com>
Date:   Wed Nov 7 15:52:20 2012 -0700

    x86/PCI: Allow x86 platforms to use translation offsets

    The memory range descriptors in the _CRS control method contain an address
    translation offset for host bridges.  This value is used to translate
    addresses across the bridge.  The support to use _TRA values is present for
    other architectures but not for X86 platforms.

    For existing X86 platforms the _TRA value is zero.  Non-zero _TRA values
    are expected on future X86 platforms.  This change will register that value
    with the resource.

    Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 94e7662..96e44fc 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -301,6 +301,7 @@  static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 					      &table_offset);
 			bir = (u8)(table_offset & PCI_MSIX_FLAGS_BIRMASK);
 
+			/* Hypervisor takes care of PCI_MSIX_TABLE_OFFSET */
 			map_irq.table_base = pci_resource_start(dev, bir);
 			map_irq.entry_nr = msidesc->msi_attrib.entry_nr;
 		}

commit 311d82d74afa19cb0ea03c516e290457f9aab63d
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Apr 18 12:40:33 2013 -0600

    xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK
    
    PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the
    Table Offset register, not the flags ("Message Control" per spec)
    register.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    CC: Jan Beulich <JBeulich@suse.com>

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index 96e44fc..bec03d4 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -299,7 +299,7 @@  static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 
 			pci_read_config_dword(dev, pos + PCI_MSIX_TABLE,
 					      &table_offset);
-			bir = (u8)(table_offset & PCI_MSIX_FLAGS_BIRMASK);
+			bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);
 
 			/* Hypervisor takes care of PCI_MSIX_TABLE_OFFSET */
 			map_irq.table_base = pci_resource_start(dev, bir);

commit 4a5b938fe2ed3e5f5c51f4906c0d1c1486f37e49
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Thu Apr 18 15:10:58 2013 -0600

    xen/pci: Used cached MSI-X capability offset
    
    We now cache the MSI-X capability offset in the struct pci_dev, so no
    need to find the capability again.
    
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
    CC: Jan Beulich <JBeulich@suse.com>

diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
index bec03d4..a151b02 100644
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -295,8 +295,7 @@  static int xen_initdom_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 			int pos;
 			u32 table_offset, bir;
 
-			pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
-
+			pos = dev->msix_cap;
 			pci_read_config_dword(dev, pos + PCI_MSIX_TABLE,
 					      &table_offset);
 			bir = (u8)(table_offset & PCI_MSIX_TABLE_BIR);