diff mbox

[RFC,v5,6/7] PCI: Add a new bit to pci_bus_flags to indicate interrupt remapping

Message ID 1459864004-2869-1-git-send-email-xyjxie@linux.vnet.ibm.com (mailing list archive)
State Not Applicable
Headers show

Commit Message

Yongji Xie April 5, 2016, 1:46 p.m. UTC
I'm trying to find a proper way to indicate
the capability of interrupt remapping on PPC64
because we need this to determine whether it is
safe to mmap MSI-X table in VFIO driver.

There is a existing flag for this in the IOMMU
space:

enum iommu_cap {
	IOMMU_CAP_CACHE_COHERENCY,
--->    IOMMU_CAP_INTR_REMAP,
	IOMMU_CAP_NOEXEC,
};

But it seems to be not a good idea to use
bus_set_iommu() to add this flag on PPC64 which
never set/use iommu_ops.

Eric also posted a patchset [1] to abstract
this capability on MSI controller side for ARM.
But I found we also never use msi_domain_info
on PPC64. We have to rework the MSI code on
PPC64 to support msi_domain_ops if we want to
use msi_domain_info.

Then I noticed that interrupt remapping is
abstracted on PCI host bridge side on PPC64.
So I'm thinking whether we could add a new
bit to pci_bus_flags to indicate this
capability, like this patch does.

Compared with IOMMU_CAP_INTR_REMAP, this
flag is more common. Any arch can set/use this
easily. And it can provide a better granularity
(pci_bus_type -> pci_bus).

[1] http://www.spinics.net/lists/kvm/msg130262.html

Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/pci-ioda.c |    8 ++++++++
 include/linux/pci.h                       |    1 +
 2 files changed, 9 insertions(+)

Comments

Gavin Shan April 6, 2016, 12:11 a.m. UTC | #1
On Tue, Apr 05, 2016 at 09:46:43PM +0800, Yongji Xie wrote:
>I'm trying to find a proper way to indicate
>the capability of interrupt remapping on PPC64
>because we need this to determine whether it is
>safe to mmap MSI-X table in VFIO driver.
>
>There is a existing flag for this in the IOMMU
>space:
>
>enum iommu_cap {
>	IOMMU_CAP_CACHE_COHERENCY,
>--->    IOMMU_CAP_INTR_REMAP,
>	IOMMU_CAP_NOEXEC,
>};
>
>But it seems to be not a good idea to use
>bus_set_iommu() to add this flag on PPC64 which
>never set/use iommu_ops.
>
>Eric also posted a patchset [1] to abstract
>this capability on MSI controller side for ARM.
>But I found we also never use msi_domain_info
>on PPC64. We have to rework the MSI code on
>PPC64 to support msi_domain_ops if we want to
>use msi_domain_info.
>
>Then I noticed that interrupt remapping is
>abstracted on PCI host bridge side on PPC64.
>So I'm thinking whether we could add a new
>bit to pci_bus_flags to indicate this
>capability, like this patch does.
>
>Compared with IOMMU_CAP_INTR_REMAP, this
>flag is more common. Any arch can set/use this
>easily. And it can provide a better granularity
>(pci_bus_type -> pci_bus).
>
>[1] http://www.spinics.net/lists/kvm/msg130262.html
>
>Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>---
> arch/powerpc/platforms/powernv/pci-ioda.c |    8 ++++++++
> include/linux/pci.h                       |    1 +
> 2 files changed, 9 insertions(+)
>
>diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>index f90dc04..9557638 100644
>--- a/arch/powerpc/platforms/powernv/pci-ioda.c
>+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>@@ -3080,6 +3080,12 @@ static void pnv_pci_ioda_fixup(void)
> 	pnv_npu_ioda_fixup();
> }
>
>+int pnv_pci_ioda_root_bridge_prepare(struct pci_host_bridge *bridge)
>+{
>+	bridge->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
>+	return 0;
>+}
>+

A irelevent question: Here we set the flag unconditionally. Can the
MSIx table be exposed to user space unconditionally? Do we have issues
if the MSIx related BAR isn't page aligned or overlapped with others?

> /*
>  * Returns the alignment for I/O or memory windows for P2P
>  * bridges. That actually depends on how PEs are segmented.
>@@ -3364,6 +3370,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
> 	 */
> 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
>
>+	ppc_md.pcibios_root_bridge_prepare = pnv_pci_ioda_root_bridge_prepare;
>+
> 	if (phb->type == PNV_PHB_NPU)
> 		hose->controller_ops = pnv_npu_ioda_controller_ops;
> 	else
>diff --git a/include/linux/pci.h b/include/linux/pci.h
>index 27df4a6..741dcaf 100644
>--- a/include/linux/pci.h
>+++ b/include/linux/pci.h
>@@ -193,6 +193,7 @@ typedef unsigned short __bitwise pci_bus_flags_t;
> enum pci_bus_flags {
> 	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
> 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
>+	PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 3,

Yongji, I guess the corresponding value to PCI_BUS_FLAGS_MSI_REMAP
would be 4 other than 3. Otherwise, two of the flags cannot exist
at the same time.

Thanks,
Gavin

> };
>
> /* These values come from the PCI Express Spec */
>-- 
>1.7.9.5
>
Yongji Xie April 6, 2016, 3:18 a.m. UTC | #2
On 2016/4/6 8:11, Gavin Shan wrote:
> On Tue, Apr 05, 2016 at 09:46:43PM +0800, Yongji Xie wrote:
>> I'm trying to find a proper way to indicate
>> the capability of interrupt remapping on PPC64
>> because we need this to determine whether it is
>> safe to mmap MSI-X table in VFIO driver.
>>
>> There is a existing flag for this in the IOMMU
>> space:
>>
>> enum iommu_cap {
>> 	IOMMU_CAP_CACHE_COHERENCY,
>> --->    IOMMU_CAP_INTR_REMAP,
>> 	IOMMU_CAP_NOEXEC,
>> };
>>
>> But it seems to be not a good idea to use
>> bus_set_iommu() to add this flag on PPC64 which
>> never set/use iommu_ops.
>>
>> Eric also posted a patchset [1] to abstract
>> this capability on MSI controller side for ARM.
>> But I found we also never use msi_domain_info
>> on PPC64. We have to rework the MSI code on
>> PPC64 to support msi_domain_ops if we want to
>> use msi_domain_info.
>>
>> Then I noticed that interrupt remapping is
>> abstracted on PCI host bridge side on PPC64.
>> So I'm thinking whether we could add a new
>> bit to pci_bus_flags to indicate this
>> capability, like this patch does.
>>
>> Compared with IOMMU_CAP_INTR_REMAP, this
>> flag is more common. Any arch can set/use this
>> easily. And it can provide a better granularity
>> (pci_bus_type -> pci_bus).
>>
>> [1] http://www.spinics.net/lists/kvm/msg130262.html
>>
>> Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/platforms/powernv/pci-ioda.c |    8 ++++++++
>> include/linux/pci.h                       |    1 +
>> 2 files changed, 9 insertions(+)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index f90dc04..9557638 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -3080,6 +3080,12 @@ static void pnv_pci_ioda_fixup(void)
>> 	pnv_npu_ioda_fixup();
>> }
>>
>> +int pnv_pci_ioda_root_bridge_prepare(struct pci_host_bridge *bridge)
>> +{
>> +	bridge->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
>> +	return 0;
>> +}
>> +
> A irelevent question: Here we set the flag unconditionally. Can the
> MSIx table be exposed to user space unconditionally? Do we have issues
> if the MSIx related BAR isn't page aligned or overlapped with others?

This flag is just used to indicate the capability of MSI
remapping which means PCI devices on this bus
can only shoot the MSIs assigned for them. It does
not mean MSIx table can be exposed to user space
unconditionally.

>> /*
>>   * Returns the alignment for I/O or memory windows for P2P
>>   * bridges. That actually depends on how PEs are segmented.
>> @@ -3364,6 +3370,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
>> 	 */
>> 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
>>
>> +	ppc_md.pcibios_root_bridge_prepare = pnv_pci_ioda_root_bridge_prepare;
>> +
>> 	if (phb->type == PNV_PHB_NPU)
>> 		hose->controller_ops = pnv_npu_ioda_controller_ops;
>> 	else
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 27df4a6..741dcaf 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -193,6 +193,7 @@ typedef unsigned short __bitwise pci_bus_flags_t;
>> enum pci_bus_flags {
>> 	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
>> 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
>> +	PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 3,
> Yongji, I guess the corresponding value to PCI_BUS_FLAGS_MSI_REMAP
> would be 4 other than 3. Otherwise, two of the flags cannot exist
> at the same time.
>
> Thanks,
> Gavin

Yes, you are right. It should be 4 (1 << 2).

Thanks,
Yongji

>> };
>>
>> /* These values come from the PCI Express Spec */
>> -- 
>> 1.7.9.5
>>
diff mbox

Patch

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f90dc04..9557638 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3080,6 +3080,12 @@  static void pnv_pci_ioda_fixup(void)
 	pnv_npu_ioda_fixup();
 }
 
+int pnv_pci_ioda_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+	bridge->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+	return 0;
+}
+
 /*
  * Returns the alignment for I/O or memory windows for P2P
  * bridges. That actually depends on how PEs are segmented.
@@ -3364,6 +3370,8 @@  static void __init pnv_pci_init_ioda_phb(struct device_node *np,
 	 */
 	ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;
 
+	ppc_md.pcibios_root_bridge_prepare = pnv_pci_ioda_root_bridge_prepare;
+
 	if (phb->type == PNV_PHB_NPU)
 		hose->controller_ops = pnv_npu_ioda_controller_ops;
 	else
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 27df4a6..741dcaf 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -193,6 +193,7 @@  typedef unsigned short __bitwise pci_bus_flags_t;
 enum pci_bus_flags {
 	PCI_BUS_FLAGS_NO_MSI   = (__force pci_bus_flags_t) 1,
 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+	PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 3,
 };
 
 /* These values come from the PCI Express Spec */