diff mbox series

[V10,4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices

Message ID 20211009104938.48225-5-liudongdong3@huawei.com
State New
Headers show
Series PCI: Enable 10-Bit tag support for PCIe devices | expand

Commit Message

Dongdong Liu Oct. 9, 2021, 10:49 a.m. UTC
PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
when the driver does not bind the device. The typical use case is for
p2pdma when the peer device does not support 10-Bit Tag Completer.
Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
Completer capability. The typical use case is for host memory targeted
by DMA Requests. The 10bit_tag file content indicate current status of
10-Bit Tag Requester Enable.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 18 +++++-
 drivers/pci/pci-sysfs.c                 | 78 +++++++++++++++++++++++++
 drivers/pci/pci.h                       |  2 +
 drivers/pci/probe.c                     | 14 +++++
 4 files changed, 111 insertions(+), 1 deletion(-)

Comments

Bjorn Helgaas Oct. 27, 2021, 10:28 p.m. UTC | #1
On Sat, Oct 09, 2021 at 06:49:34PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
> 
>   If an Endpoint supports sending Requests to other Endpoints (as
>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>   Requests to another given Endpoint unless an implementation-specific
>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>   capability.
> 
> Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
> when the driver does not bind the device. The typical use case is for
> p2pdma when the peer device does not support 10-Bit Tag Completer.
> Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
> Completer capability. The typical use case is for host memory targeted
> by DMA Requests. The 10bit_tag file content indicate current status of
> 10-Bit Tag Requester Enable.

Don't we have a hole here?  We're adding knobs to control 10-Bit Tag
usage, but don't we have basically the same issues with Extended
(8-bit) Tags?

I wonder if we should be adding a more general "tags" file that can
manage both 8-bit and 10-bit tag usage.

> +static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0644,
> +							   pci_10bit_tag_show,
> +							   pci_10bit_tag_store);

I think this should use DEVICE_ATTR().  Or even better, if the name
doesn't start with a digit, DEVICE_ATTR_RW().
Dongdong Liu Oct. 28, 2021, 7:44 a.m. UTC | #2
Hi Bjorn

Many thanks for your review.
On 2021/10/28 6:28, Bjorn Helgaas wrote:
> On Sat, Oct 09, 2021 at 06:49:34PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
>>
>>   If an Endpoint supports sending Requests to other Endpoints (as
>>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>>   Requests to another given Endpoint unless an implementation-specific
>>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>>   capability.
>>
>> Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
>> when the driver does not bind the device. The typical use case is for
>> p2pdma when the peer device does not support 10-Bit Tag Completer.
>> Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
>> Completer capability. The typical use case is for host memory targeted
>> by DMA Requests. The 10bit_tag file content indicate current status of
>> 10-Bit Tag Requester Enable.
>
> Don't we have a hole here?  We're adding knobs to control 10-Bit Tag
> usage, but don't we have basically the same issues with Extended
> (8-bit) Tags?

All PCIe completers are required to support 8-bit tags
from the "[PATCH] PCI: enable extended tags support for PCIe endpoints"
(https://patchwork.kernel.org/project/linux-arm-msm/patch/1474769434-5756-1-git-send-email-okaya@codeaurora.org/).

I ask hardware colleagues, also says all PCIe devices should support
8-bit tags completer default, so seems no need to do this for 8-bit tags.
>
> I wonder if we should be adding a more general "tags" file that can
> manage both 8-bit and 10-bit tag usage.
>
>> +static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0644,
>> +							   pci_10bit_tag_show,
>> +							   pci_10bit_tag_store);
>
> I think this should use DEVICE_ATTR().
Yes, will do.

Thanks,
Dongdong
> Or even better, if the name doesn't start with a digit, DEVICE_ATTR_RW().
> .
>
Bjorn Helgaas Oct. 28, 2021, 5:24 p.m. UTC | #3
On Thu, Oct 28, 2021 at 03:44:49PM +0800, Dongdong Liu wrote:
> On 2021/10/28 6:28, Bjorn Helgaas wrote:
> > On Sat, Oct 09, 2021 at 06:49:34PM +0800, Dongdong Liu wrote:
> > > PCIe spec 5.0 r1.0 section 2.2.6.2 says:
> > > 
> > >   If an Endpoint supports sending Requests to other Endpoints (as
> > >   opposed to host memory), the Endpoint must not send 10-Bit Tag
> > >   Requests to another given Endpoint unless an implementation-specific
> > >   mechanism determines that the Endpoint supports 10-Bit Tag Completer
> > >   capability.
> > > 
> > > Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
> > > when the driver does not bind the device. The typical use case is for
> > > p2pdma when the peer device does not support 10-Bit Tag Completer.
> > > Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
> > > Completer capability. The typical use case is for host memory targeted
> > > by DMA Requests. The 10bit_tag file content indicate current status of
> > > 10-Bit Tag Requester Enable.
> > 
> > Don't we have a hole here?  We're adding knobs to control 10-Bit Tag
> > usage, but don't we have basically the same issues with Extended
> > (8-bit) Tags?
> 
> All PCIe completers are required to support 8-bit tags
> from the "[PATCH] PCI: enable extended tags support for PCIe endpoints"
> (https://patchwork.kernel.org/project/linux-arm-msm/patch/1474769434-5756-1-git-send-email-okaya@codeaurora.org/).
> 
> I ask hardware colleagues, also says all PCIe devices should support
> 8-bit tags completer default, so seems no need to do this for 8-bit tags.

Oh, right, I forgot that, thanks for the reminder!  Let's add a
comment in pci_configure_extended_tags() to that effect so I'll
remember next time.

I think the appropriate reference is PCIe r5.0, sec 2.2.6.2, which
says "Receivers/Completers must handle 8-bit Tag values correctly
regardless of the setting of their Extended Tag Field Enable bit (see
Section 7.5.3.4)."

The Tag field was 8 bits all the way from PCIe r1.0, but until r2.1 it
said that by default, only the lower 5 bits are used.

The text about all Completers explicitly being required to support
8-bit Tags wasn't added until PCIe r3.0, which might explain some
confusion and the presence of the Extended Tag Field Enable bit.

At the same time, can you fold pci_configure_10bit_tags() directly
into pci_configure_extended_tags()?  It's pretty small and I think it
will be easier if it's all in one place.

> > I wonder if we should be adding a more general "tags" file that can
> > manage both 8-bit and 10-bit tag usage.

I'm still thinking that maybe a generic name (without "10") would be
better, even though we don't need it to manage 8-bit tags.  It's
conceivable that there could be even more tag bits in the future, and
it would be nice if we didn't have to add yet another file.

Bjorn
Dongdong Liu Oct. 29, 2021, 7:16 a.m. UTC | #4
On 2021/10/29 1:24, Bjorn Helgaas wrote:
> On Thu, Oct 28, 2021 at 03:44:49PM +0800, Dongdong Liu wrote:
>> On 2021/10/28 6:28, Bjorn Helgaas wrote:
>>> On Sat, Oct 09, 2021 at 06:49:34PM +0800, Dongdong Liu wrote:
>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
>>>>
>>>>   If an Endpoint supports sending Requests to other Endpoints (as
>>>>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>>>>   Requests to another given Endpoint unless an implementation-specific
>>>>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>>>>   capability.
>>>>
>>>> Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
>>>> when the driver does not bind the device. The typical use case is for
>>>> p2pdma when the peer device does not support 10-Bit Tag Completer.
>>>> Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
>>>> Completer capability. The typical use case is for host memory targeted
>>>> by DMA Requests. The 10bit_tag file content indicate current status of
>>>> 10-Bit Tag Requester Enable.
>>>
>>> Don't we have a hole here?  We're adding knobs to control 10-Bit Tag
>>> usage, but don't we have basically the same issues with Extended
>>> (8-bit) Tags?
>>
>> All PCIe completers are required to support 8-bit tags
>> from the "[PATCH] PCI: enable extended tags support for PCIe endpoints"
>> (https://patchwork.kernel.org/project/linux-arm-msm/patch/1474769434-5756-1-git-send-email-okaya@codeaurora.org/).
>>
>> I ask hardware colleagues, also says all PCIe devices should support
>> 8-bit tags completer default, so seems no need to do this for 8-bit tags.
>
> Oh, right, I forgot that, thanks for the reminder!  Let's add a
> comment in pci_configure_extended_tags() to that effect so I'll
> remember next time.
Ok, Will do.
>
> I think the appropriate reference is PCIe r5.0, sec 2.2.6.2, which
> says "Receivers/Completers must handle 8-bit Tag values correctly
> regardless of the setting of their Extended Tag Field Enable bit (see
> Section 7.5.3.4)."
>
> The Tag field was 8 bits all the way from PCIe r1.0, but until r2.1 it
> said that by default, only the lower 5 bits are used.
>
> The text about all Completers explicitly being required to support
> 8-bit Tags wasn't added until PCIe r3.0, which might explain some
> confusion and the presence of the Extended Tag Field Enable bit.
>
Thanks for the clarification.
> At the same time, can you fold pci_configure_10bit_tags() directly
> into pci_configure_extended_tags()?  It's pretty small and I think it
> will be easier if it's all in one place.
OK, will do.
>
>>> I wonder if we should be adding a more general "tags" file that can
>>> manage both 8-bit and 10-bit tag usage.
>
> I'm still thinking that maybe a generic name (without "10") would be
> better, even though we don't need it to manage 8-bit tags.  It's
> conceivable that there could be even more tag bits in the future, and
> it would be nice if we didn't have to add yet another file.
Looks good, will do.

Thanks,
Dongdong.
>
> Bjorn
> .
>
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index d4ae03296861..0c26346d1069 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -156,7 +156,7 @@  Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -424,3 +424,19 @@  Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../10bit_tag
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		The file will be visible when the device supports 10-Bit Tag
+		Requester. The file is readable, the value indicate current
+		status of 10-Bit Tag Requester Enable.
+		1 - enabled, 0 - disabled.
+
+		The file is also writable, write 0 to disable 10-Bit Tag
+		Requester when the driver does not bind the device. The typical
+		use case is for p2pdma when the peer device does not support
+		10-Bit Tag Completer. Write 1 to enable 10-Bit Tag Requester
+		when RC supports 10-Bit Tag Completer capability. The typical
+		use case is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7fb5cd17cc98..f571e4a0eb4c 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,55 @@  static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t pci_10bit_tag_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	bool enable;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (pdev->driver)
+		return -EBUSY;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+		return -EPERM;
+
+	if (enable)
+		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
+				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	else
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
+	return count;
+}
+
+static ssize_t pci_10bit_tag_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	return sysfs_emit(buf, "%u\n",
+			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
+}
+
+static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0644,
+							   pci_10bit_tag_show,
+							   pci_10bit_tag_store);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +684,11 @@  static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_10bit_tag_attrs[] = {
+	&dev_attr_10bit_tag.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1536,24 @@  static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_10bit_tag_attrs_is_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1522,6 +1594,11 @@  static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
+	.attrs = pcie_dev_10bit_tag_attrs,
+	.is_visible = pcie_dev_10bit_tag_attrs_is_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1531,6 +1608,7 @@  static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_10bit_tag_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1cce56c2aea0..f719a41dfc7f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -264,6 +264,8 @@  struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev);
+
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7259ad774ac8..705dd4e85df5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2042,6 +2042,20 @@  static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
+{
+	struct pci_dev *root;
+
+	root = pcie_find_root_port(dev);
+	if (!root)
+		return false;
+
+	if (!(root->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return false;
+
+	return true;
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;