Patchwork [v2] PCI: Document PCIE BUS MPS parameters

login
register
mail settings
Submitter Yijing Wang
Date Jan. 24, 2013, 2:58 a.m.
Message ID <1358996288-31200-1-git-send-email-wangyijing@huawei.com>
Download mbox | patch
Permalink /patch/215105/
State Superseded
Headers show

Comments

Yijing Wang - Jan. 24, 2013, 2:58 a.m.
v0->v1: Update MPS parameters as non-arch and add MRRS
		description into pcie_bus_perf parameter suggested
		by Andrew Murray.
v1->v2: Update some semantic problems and add MPS and MRRS
		explanation suggested by Joe Lawrence and Randy Dunlap.

Document PCIE BUS MPS parameters pcie_bus_tune_off, pcie_bus_safe,
pcie_bus_peer2peer, pcie_bus_perf into Documentation/kernel-parameters.txt.
These parameters were introduced by Jon Mason <mason@myri.com> at
commit 5f39e6705 and commit b03e7495a8.

Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 Documentation/kernel-parameters.txt |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)
Bjorn Helgaas - Jan. 25, 2013, 4:57 a.m.
[+cc Jon, can you make sure this documentation is accurate?]

On Wed, Jan 23, 2013 at 7:58 PM, Yijing Wang <wangyijing@huawei.com> wrote:
> v0->v1: Update MPS parameters as non-arch and add MRRS
>                 description into pcie_bus_perf parameter suggested
>                 by Andrew Murray.
> v1->v2: Update some semantic problems and add MPS and MRRS
>                 explanation suggested by Joe Lawrence and Randy Dunlap.
>
> Document PCIE BUS MPS parameters pcie_bus_tune_off, pcie_bus_safe,
> pcie_bus_peer2peer, pcie_bus_perf into Documentation/kernel-parameters.txt.
> These parameters were introduced by Jon Mason <mason@myri.com> at
> commit 5f39e6705 and commit b03e7495a8.
>
> Signed-off-by: Yijing Wang <wangyijing@huawei.com>
> ---
>  Documentation/kernel-parameters.txt |   16 ++++++++++++++++
>  1 files changed, 16 insertions(+), 0 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index 363e348..0bb279f 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2227,6 +2227,22 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>                                 This sorting is done to get a device
>                                 order compatible with older (<= 2.4) kernels.
>                 nobfsort        Don't sort PCI devices into breadth-first order.
> +               pcie_bus_tune_off       Disable PCIe MPS (Max Payload Size)
> +                               turning and using the BIOS-configured MPS defaults.

s/turning/tuning/
s/using/use/

> +               pcie_bus_safe   Use the smallest common denominator MPS
> +                               of the entire tree below a root complex for every
> +                               device on that fabric. Can avoid inconsistent MPS
> +                               problem caused by hotplug.

I'm not sure about this.  "smallest common denominator" is colloquial
and not exact.  I think you probably mean "the smallest supported MPS
of any device below a root complex."

I'm also not sure how this will avoid MPS problems caused by hotplug.
I think you have to assume that a hot-added device may support only a
128B MPS, which means the only way to guarantee that you can use that
device is to set MPS=128 for any device that will send packets to the
hot-added device.  Or we could reconfigure things at the time of the
hot-add, but I don't think we do that today (except for the hot-added
device itself).

> +               pcie_bus_perf   Configure pcie device MPS to the largest

Capitalize "PCIe" consistently.  Better yet, drop it completely since
it's obvious that we're only talking about PCIe here.

> +                               allowable MPS based on its parent bus. Also set
> +                               MRRS (Max Read Request Size) to the largest supported
> +                               value (no larger than the MPS that the device or bus
> +                               can support) for Max performance.

s/Max/best/

> +               pcie_bus_peer2peer      Make the system-wide MPS the smallest
> +                               possible value (128B). This configuration could prevent
> +                               peer to peer DMA transmission from working by having
> +                               the MPS on one root port different than the MPS on
> +                               another.

I think the idea is that pcie_bus_peer2peer would *allow* peer-to-peer
DMA to work, which is the opposite of what you wrote.  Maybe something
like this would be better:

  Set every device's MPS to 128B, which every device is guaranteed to support.
  This configuration allows peer-to-peer DMA between any pair of devices,
  possibly at the cost of reduced performance.

>                 cbiosize=nn[KMG]        The fixed amount of bus space which is
>                                 reserved for the CardBus bridge's IO window.
>                                 The default value is 256 bytes.
> --
> 1.7.1
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang - Jan. 25, 2013, 9:13 a.m.
Hi Bjorn,
   Thanks for your review and comments! Please refer to inlined comment bellow.

On 2013/1/25 12:57, Bjorn Helgaas wrote:
> [+cc Jon, can you make sure this documentation is accurate?]
> 
>>
>> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
>> index 363e348..0bb279f 100644
>> --- a/Documentation/kernel-parameters.txt
>> +++ b/Documentation/kernel-parameters.txt
>> @@ -2227,6 +2227,22 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>>                                 This sorting is done to get a device
>>                                 order compatible with older (<= 2.4) kernels.
>>                 nobfsort        Don't sort PCI devices into breadth-first order.
>> +               pcie_bus_tune_off       Disable PCIe MPS (Max Payload Size)
>> +                               turning and using the BIOS-configured MPS defaults.
> 
> s/turning/tuning/
> s/using/use/

Will update it.

> 
>> +               pcie_bus_safe   Use the smallest common denominator MPS
>> +                               of the entire tree below a root complex for every
>> +                               device on that fabric. Can avoid inconsistent MPS
>> +                               problem caused by hotplug.
> 
> I'm not sure about this.  "smallest common denominator" is colloquial
> and not exact.  I think you probably mean "the smallest supported MPS
> of any device below a root complex."

Yes, will update it.

> 
> I'm also not sure how this will avoid MPS problems caused by hotplug.
> I think you have to assume that a hot-added device may support only a
> 128B MPS, which means the only way to guarantee that you can use that
> device is to set MPS=128 for any device that will send packets to the
> hot-added device.  Or we could reconfigure things at the time of the
> hot-add, but I don't think we do that today (except for the hot-added
> device itself).

Actually, I'm working a new patch to fix this problem. I will send out the patch as soon.
Because system default use pcie_bus_tune_off, after do pci hotplug,
we never configure newly added pcie device mps. The default mps (128B) of
added pcie device maybe smaller than upstream port mps (maybe 256B default),
so this device can not work normally.
We can separate it in two:
1、Hotplug slot directly connected to root port:
  In this case, we can reconfigure root port and newly added pcie device mps.
  We can configure their mps to largest allowable vaule or conservatively just modify
  added device mps, make it equal to mps value in root port (consider peer to peer DMA).

2、Hotplug slot not connected to root port:
  In this case, we can try to configure newly added device mps equal to upsteam port.
  If newly added device mpss smaller than mps in upstream port. We need to print warning
  info, let user know what to do next step.

> 
>> +               pcie_bus_perf   Configure pcie device MPS to the largest
> 
> Capitalize "PCIe" consistently.  Better yet, drop it completely since
> it's obvious that we're only talking about PCIe here.

Ok, drop it.

> 
>> +                               allowable MPS based on its parent bus. Also set
>> +                               MRRS (Max Read Request Size) to the largest supported
>> +                               value (no larger than the MPS that the device or bus
>> +                               can support) for Max performance.
> 
> s/Max/best/

Will update it.

> 
>> +               pcie_bus_peer2peer      Make the system-wide MPS the smallest
>> +                               possible value (128B). This configuration could prevent
>> +                               peer to peer DMA transmission from working by having
>> +                               the MPS on one root port different than the MPS on
>> +                               another.
> 
> I think the idea is that pcie_bus_peer2peer would *allow* peer-to-peer
> DMA to work, which is the opposite of what you wrote.  Maybe something
> like this would be better:
> 
>   Set every device's MPS to 128B, which every device is guaranteed to support.
>   This configuration allows peer-to-peer DMA between any pair of devices,
>   possibly at the cost of reduced performance.
This description is good.

Very sorry, that's my fault. I will update it.

> 
>>                 cbiosize=nn[KMG]        The fixed amount of bus space which is
>>                                 reserved for the CardBus bridge's IO window.
>>                                 The default value is 256 bytes.
>> --
>> 1.7.1
>>
>>
> 
> .
>

Patch

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 363e348..0bb279f 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2227,6 +2227,22 @@  bytes respectively. Such letter suffixes can also be entirely omitted.
 				This sorting is done to get a device
 				order compatible with older (<= 2.4) kernels.
 		nobfsort	Don't sort PCI devices into breadth-first order.
+		pcie_bus_tune_off	Disable PCIe MPS (Max Payload Size)
+				turning and using the BIOS-configured MPS defaults.
+		pcie_bus_safe	Use the smallest common denominator MPS
+				of the entire tree below a root complex for every
+				device on that fabric. Can avoid inconsistent MPS
+				problem caused by hotplug.
+		pcie_bus_perf	Configure pcie device MPS to the largest
+				allowable MPS based on its parent bus. Also set
+				MRRS (Max Read Request Size) to the largest supported
+				value (no larger than the MPS that the device or bus
+				can support) for Max performance.
+		pcie_bus_peer2peer	Make the system-wide MPS the smallest
+				possible value (128B). This configuration could prevent
+				peer to peer DMA transmission from working by having
+				the MPS on one root port different than the MPS on
+				another.
 		cbiosize=nn[KMG]	The fixed amount of bus space which is
 				reserved for the CardBus bridge's IO window.
 				The default value is 256 bytes.