Patchwork [v2,07/10] ARM: tegra: pcie: Add device tree support

login
register
mail settings
Submitter Thierry Reding
Date June 11, 2012, 3:05 p.m.
Message ID <1339427118-32263-8-git-send-email-thierry.reding@avionic-design.de>
Download mbox | patch
Permalink /patch/164213/
State Not Applicable
Headers show

Comments

Thierry Reding - June 11, 2012, 3:05 p.m.
This commit adds support for instantiating the Tegra PCIe controller
from a device tree.

Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>

---
Changes in v2:
- increase compile coverage by using the IS_ENABLED() macro
- disable node by default
---
 .../devicetree/bindings/pci/tegra-pcie.txt         |   23 ++++
 arch/arm/boot/dts/tegra20.dtsi                     |    9 ++
 arch/arm/mach-tegra/pcie.c                         |  120 +++++++++++++++++++-
 3 files changed, 149 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/pci/tegra-pcie.txt
Stephen Warren - June 11, 2012, 9:33 p.m.
On 06/11/2012 09:05 AM, Thierry Reding wrote:
> This commit adds support for instantiating the Tegra PCIe controller
> from a device tree.

> +++ b/Documentation/devicetree/bindings/pci/tegra-pcie.txt

Can we please name this nvidia,tegra20-pcie.txt to match the naming of
all the other Tegra bindings.

> +Required properties:
> +- compatible: "nvidia,tegra20-pcie"
> +- reg: physical base address and length of the controller's registers

Since there's more than one range now, that should specify how many
entries are required and what they represent.

> +Optional properties:
> +- pex-clk-supply: supply voltage for internal reference clock
> +- vdd-supply: power supply for controller (1.05V)

Those shouldn't be optional. If the board has no regulator, the board's
.dts should provide a fixed always-on regulator that those properties
can refer to, so that the driver can always get() those regulators.

> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi

> +	pci {
...
> +		status = "disable";

That should be "disabled"; sorry for providing a bad example.

> diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c

> +static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct platform_device *pdev)

> +	if (of_find_property(node, "vdd-supply", NULL)) {

As mentioned above, that if statement should be removed, since the
regulators shouldn't be optional.

> +		pcie->vdd_supply = regulator_get(&pdev->dev, "vdd");

Those could be devm_regulator_get(). Then tegra_pcie_remove() wouldn't
have to put() them.

> +	for (i = 0; i < TEGRA_PCIE_MAX_PORTS; i++)
> +		pdata->enable_ports[i] = true;

Shouldn't the DT indicate which ports are used? I assume there's some
reason that the existing driver allows that to be configured, rather
than always enabling all ports. At least, enumeration time wasted on
non-existent ports springs to mind, and perhaps attempting to enable
port 1 when port 0 is x4 and using all the lanes would cause errors in
port 0?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 12, 2012, 6:21 a.m.
* Stephen Warren wrote:
> On 06/11/2012 09:05 AM, Thierry Reding wrote:
> > This commit adds support for instantiating the Tegra PCIe controller
> > from a device tree.
> 
> > +++ b/Documentation/devicetree/bindings/pci/tegra-pcie.txt
> 
> Can we please name this nvidia,tegra20-pcie.txt to match the naming of
> all the other Tegra bindings.

Yes, will do.

> > +Required properties:
> > +- compatible: "nvidia,tegra20-pcie"
> > +- reg: physical base address and length of the controller's registers
> 
> Since there's more than one range now, that should specify how many
> entries are required and what they represent.

Okay.

> > +Optional properties:
> > +- pex-clk-supply: supply voltage for internal reference clock
> > +- vdd-supply: power supply for controller (1.05V)
> 
> Those shouldn't be optional. If the board has no regulator, the board's
> .dts should provide a fixed always-on regulator that those properties
> can refer to, so that the driver can always get() those regulators.

That'll add more dummy regulators and I don't think sprinkling them across
the DTS is going to work very well. Maybe collecting them under a top-level
"regulators" node is a good option. If you have a better alternative, I'm all
open for it.

> > diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> 
> > +	pci {
> ...
> > +		status = "disable";
> 
> That should be "disabled"; sorry for providing a bad example.

Yes.

> > diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
> 
> > +static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct platform_device *pdev)
> 
> > +	if (of_find_property(node, "vdd-supply", NULL)) {
> 
> As mentioned above, that if statement should be removed, since the
> regulators shouldn't be optional.

Okay.

> > +		pcie->vdd_supply = regulator_get(&pdev->dev, "vdd");
> 
> Those could be devm_regulator_get(). Then tegra_pcie_remove() wouldn't
> have to put() them.

Okay.

> > +	for (i = 0; i < TEGRA_PCIE_MAX_PORTS; i++)
> > +		pdata->enable_ports[i] = true;
> 
> Shouldn't the DT indicate which ports are used? I assume there's some
> reason that the existing driver allows that to be configured, rather
> than always enabling all ports. At least, enumeration time wasted on
> non-existent ports springs to mind, and perhaps attempting to enable
> port 1 when port 0 is x4 and using all the lanes would cause errors in
> port 0?

Yes, that's been on my mind as well. I'm not sure about the best binding for
this. Perhaps something like:

	pci {
		enable-ports = <0 1 2>;
	};

Would do?

Thierry
Stephen Warren - June 12, 2012, 3:44 p.m.
On 06/12/2012 12:21 AM, Thierry Reding wrote:
> * Stephen Warren wrote:
>> On 06/11/2012 09:05 AM, Thierry Reding wrote:
>>> This commit adds support for instantiating the Tegra PCIe
>>> controller from a device tree.
>> 
>>> +++ b/Documentation/devicetree/bindings/pci/tegra-pcie.txt
>> 
>> Can we please name this nvidia,tegra20-pcie.txt to match the
>> naming of all the other Tegra bindings.
> 
> Yes, will do.
> 
>>> +Required properties: +- compatible: "nvidia,tegra20-pcie" +-
>>> reg: physical base address and length of the controller's
>>> registers
>> 
>> Since there's more than one range now, that should specify how
>> many entries are required and what they represent.
> 
> Okay.
> 
>>> +Optional properties: +- pex-clk-supply: supply voltage for
>>> internal reference clock +- vdd-supply: power supply for
>>> controller (1.05V)
>> 
>> Those shouldn't be optional. If the board has no regulator, the
>> board's .dts should provide a fixed always-on regulator that
>> those properties can refer to, so that the driver can always
>> get() those regulators.
> 
> That'll add more dummy regulators and I don't think sprinkling them
> across the DTS is going to work very well. Maybe collecting them
> under a top-level "regulators" node is a good option. If you have a
> better alternative, I'm all open for it.
> 
>>> diff --git a/arch/arm/boot/dts/tegra20.dtsi
>>> b/arch/arm/boot/dts/tegra20.dtsi
>> 
>>> +	pci {
>> ...
>>> +		status = "disable";
>> 
>> That should be "disabled"; sorry for providing a bad example.
> 
> Yes.
> 
>>> diff --git a/arch/arm/mach-tegra/pcie.c
>>> b/arch/arm/mach-tegra/pcie.c
>> 
>>> +static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct
>>> platform_device *pdev)
>> 
>>> +	if (of_find_property(node, "vdd-supply", NULL)) {
>> 
>> As mentioned above, that if statement should be removed, since
>> the regulators shouldn't be optional.
> 
> Okay.
> 
>>> +		pcie->vdd_supply = regulator_get(&pdev->dev, "vdd");
>> 
>> Those could be devm_regulator_get(). Then tegra_pcie_remove()
>> wouldn't have to put() them.
> 
> Okay.
> 
>>> +	for (i = 0; i < TEGRA_PCIE_MAX_PORTS; i++) +
>>> pdata->enable_ports[i] = true;
>> 
>> Shouldn't the DT indicate which ports are used? I assume there's
>> some reason that the existing driver allows that to be
>> configured, rather than always enabling all ports. At least,
>> enumeration time wasted on non-existent ports springs to mind,
>> and perhaps attempting to enable port 1 when port 0 is x4 and
>> using all the lanes would cause errors in port 0?
> 
> Yes, that's been on my mind as well. I'm not sure about the best
> binding for this. Perhaps something like:
> 
> pci { enable-ports = <0 1 2>; };
> 
> Would do?

That seems reasonable, although since the property is presumably
something specific to the Tegra PCIe binding, not generic, I think it
should be nvidia,enable-ports.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 12, 2012, 5:20 p.m.
* Stephen Warren wrote:
> On 06/12/2012 12:21 AM, Thierry Reding wrote:
> > * Stephen Warren wrote:
> >> On 06/11/2012 09:05 AM, Thierry Reding wrote:
> >>> This commit adds support for instantiating the Tegra PCIe
> >>> controller from a device tree.
> >> 
> >>> +++ b/Documentation/devicetree/bindings/pci/tegra-pcie.txt
> >> 
> >> Can we please name this nvidia,tegra20-pcie.txt to match the
> >> naming of all the other Tegra bindings.
> > 
> > Yes, will do.
> > 
> >>> +Required properties: +- compatible: "nvidia,tegra20-pcie" +-
> >>> reg: physical base address and length of the controller's
> >>> registers
> >> 
> >> Since there's more than one range now, that should specify how
> >> many entries are required and what they represent.
> > 
> > Okay.
> > 
> >>> +Optional properties: +- pex-clk-supply: supply voltage for
> >>> internal reference clock +- vdd-supply: power supply for
> >>> controller (1.05V)
> >> 
> >> Those shouldn't be optional. If the board has no regulator, the
> >> board's .dts should provide a fixed always-on regulator that
> >> those properties can refer to, so that the driver can always
> >> get() those regulators.
> > 
> > That'll add more dummy regulators and I don't think sprinkling them
> > across the DTS is going to work very well. Maybe collecting them
> > under a top-level "regulators" node is a good option. If you have a
> > better alternative, I'm all open for it.
> > 
> >>> diff --git a/arch/arm/boot/dts/tegra20.dtsi
> >>> b/arch/arm/boot/dts/tegra20.dtsi
> >> 
> >>> +	pci {
> >> ...
> >>> +		status = "disable";
> >> 
> >> That should be "disabled"; sorry for providing a bad example.
> > 
> > Yes.
> > 
> >>> diff --git a/arch/arm/mach-tegra/pcie.c
> >>> b/arch/arm/mach-tegra/pcie.c
> >> 
> >>> +static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct
> >>> platform_device *pdev)
> >> 
> >>> +	if (of_find_property(node, "vdd-supply", NULL)) {
> >> 
> >> As mentioned above, that if statement should be removed, since
> >> the regulators shouldn't be optional.
> > 
> > Okay.
> > 
> >>> +		pcie->vdd_supply = regulator_get(&pdev->dev, "vdd");
> >> 
> >> Those could be devm_regulator_get(). Then tegra_pcie_remove()
> >> wouldn't have to put() them.
> > 
> > Okay.
> > 
> >>> +	for (i = 0; i < TEGRA_PCIE_MAX_PORTS; i++) +
> >>> pdata->enable_ports[i] = true;
> >> 
> >> Shouldn't the DT indicate which ports are used? I assume there's
> >> some reason that the existing driver allows that to be
> >> configured, rather than always enabling all ports. At least,
> >> enumeration time wasted on non-existent ports springs to mind,
> >> and perhaps attempting to enable port 1 when port 0 is x4 and
> >> using all the lanes would cause errors in port 0?
> > 
> > Yes, that's been on my mind as well. I'm not sure about the best
> > binding for this. Perhaps something like:
> > 
> > pci { enable-ports = <0 1 2>; };
> > 
> > Would do?
> 
> That seems reasonable, although since the property is presumably
> something specific to the Tegra PCIe binding, not generic, I think it
> should be nvidia,enable-ports.

I came up with the following alternative:

	pci {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200   /* AFI registers */
		       0x80004000 0x00100000   /* configuration space */
		       0x80104000 0x00100000   /* extended configuration space */
		       0x80400000 0x00010000   /* downstream I/O */
		       0x90000000 0x10000000   /* non-prefetchable memory */
		       0xa0000000 0x10000000>; /* prefetchable memory */
		interrupts = <0 98 0x04   /* controller interrupt */
		              0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
			  0x80004000 0x80004000 0x00100000   /* configuration space */
			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

		#address-cells = <1>;
		#size-cells = <1>;

		port@80000000 {
			reg = <0x80000000 0x00001000>;
			status = "disabled";
		};

		port@80001000 {
			reg = <0x80001000 0x00001000>;
			status = "disabled";
		};
	};

The "ranges" property can probably be cleaned up a bit, but the most
interesting part is the port@ children, which can simply be enabled in board
DTS files by setting the status property to "okay". I find that somewhat more
intuitive to the variant with an "enable-ports" property.

What do you think of this?

Thierry
Stephen Warren - June 12, 2012, 7:46 p.m.
On 06/12/2012 01:10 PM, Mitch Bradley wrote:
> On 6/12/2012 7:20 AM, Thierry Reding wrote:
...
>> I came up with the following alternative:
>>
>> 	pci {
>> 		compatible = "nvidia,tegra20-pcie";
>> 		reg = <0x80003000 0x00000800   /* PADS registers */
>> 		       0x80003800 0x00000200   /* AFI registers */
>> 		       0x80004000 0x00100000   /* configuration space */
>> 		       0x80104000 0x00100000   /* extended configuration space */
>> 		       0x80400000 0x00010000   /* downstream I/O */
>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>> 		interrupts = <0 98 0x04   /* controller interrupt */
>> 		              0 99 0x04>; /* MSI interrupt */
>> 		status = "disabled";
>>
>> 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>
>> 		#address-cells = <1>;
>> 		#size-cells = <1>;
>>
>> 		port@80000000 {
>> 			reg = <0x80000000 0x00001000>;
>> 			status = "disabled";
>> 		};
>>
>> 		port@80001000 {
>> 			reg = <0x80001000 0x00001000>;
>> 			status = "disabled";
>> 		};
>> 	};
>>
>> The "ranges" property can probably be cleaned up a bit, but the most
>> interesting part is the port@ children, which can simply be enabled in board
>> DTS files by setting the status property to "okay". I find that somewhat more
>> intuitive to the variant with an "enable-ports" property.
>>
>> What do you think of this?
> 
> The problem is that children of a PCI-ish bus have specific expectations
> about the parent address format - the standard 3-address-cell PCI
> addressing.  So making a PCI bus node - even if it is PCIe - with 1
> address cell is a problem.

Couldn't you put the #address-cells=<3> underneath each port node, and
put any PCIe devices there rather than under the main PCIe controller
node? I'm not sure how the interrupt mapping table etc. would work in
that case, but that seems to solve the addressing issue.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 12, 2012, 7:52 p.m.
On 6/12/2012 9:46 AM, Stephen Warren wrote:
> On 06/12/2012 01:10 PM, Mitch Bradley wrote:
>> On 6/12/2012 7:20 AM, Thierry Reding wrote:
> ...
>>> I came up with the following alternative:
>>>
>>> 	pci {
>>> 		compatible = "nvidia,tegra20-pcie";
>>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>>> 		       0x80003800 0x00000200   /* AFI registers */
>>> 		       0x80004000 0x00100000   /* configuration space */
>>> 		       0x80104000 0x00100000   /* extended configuration space */
>>> 		       0x80400000 0x00010000   /* downstream I/O */
>>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>>> 		interrupts =<0 98 0x04   /* controller interrupt */
>>> 		              0 99 0x04>; /* MSI interrupt */
>>> 		status = "disabled";
>>>
>>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>
>>> 		#address-cells =<1>;
>>> 		#size-cells =<1>;
>>>
>>> 		port@80000000 {
>>> 			reg =<0x80000000 0x00001000>;
>>> 			status = "disabled";
>>> 		};
>>>
>>> 		port@80001000 {
>>> 			reg =<0x80001000 0x00001000>;
>>> 			status = "disabled";
>>> 		};
>>> 	};
>>>
>>> The "ranges" property can probably be cleaned up a bit, but the most
>>> interesting part is the port@ children, which can simply be enabled in board
>>> DTS files by setting the status property to "okay". I find that somewhat more
>>> intuitive to the variant with an "enable-ports" property.
>>>
>>> What do you think of this?
>>
>> The problem is that children of a PCI-ish bus have specific expectations
>> about the parent address format - the standard 3-address-cell PCI
>> addressing.  So making a PCI bus node - even if it is PCIe - with 1
>> address cell is a problem.
>
> Couldn't you put the #address-cells=<3>  underneath each port node, and
> put any PCIe devices there rather than under the main PCIe controller
> node? I'm not sure how the interrupt mapping table etc. would work in
> that case, but that seems to solve the addressing issue.

Yes, that would work just fine, and in fact would be preferable, if a 
"root port" is essentially an independent PCI bus.  The parent of those 
root ports should not be named "pci", though, as it is not in itself a 
PCI bus.  It could be called "pcis".

>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 12, 2012, 8:15 p.m.
On 06/12/2012 11:20 AM, Thierry Reding wrote:
...
> I came up with the following alternative:
> 
> 	pci {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg = <0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200   /* AFI registers */
> 		       0x80004000 0x00100000   /* configuration space */
> 		       0x80104000 0x00100000   /* extended configuration space */
> 		       0x80400000 0x00010000   /* downstream I/O */
> 		       0x90000000 0x10000000   /* non-prefetchable memory */
> 		       0xa0000000 0x10000000>; /* prefetchable memory */
> 		interrupts = <0 98 0x04   /* controller interrupt */
> 		              0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
> 
> 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> 
> 		#address-cells = <1>;
> 		#size-cells = <1>;
> 
> 		port@80000000 {
> 			reg = <0x80000000 0x00001000>;
> 			status = "disabled";
> 		};
> 
> 		port@80001000 {
> 			reg = <0x80001000 0x00001000>;
> 			status = "disabled";
> 		};
> 	};
> 
> The "ranges" property can probably be cleaned up a bit, but the most
> interesting part is the port@ children, which can simply be enabled in board
> DTS files by setting the status property to "okay". I find that somewhat more
> intuitive to the variant with an "enable-ports" property.
> 
> What do you think of this?

As a general concept, this kind of design seems OK to me.

The "port" child nodes I think should be named "pci@..." given Mitch's
comments, I think.

The port nodes probably need two entries in reg, given the following in
our downstream driver:

>         int rp_offset = 0;
>         int ctrl_offset = AFI_PEX0_CTRL;
...
>         for (port = 0; port < MAX_PCIE_SUPPORTED_PORTS; port++) {
>                 ctrl_offset += (port * 8);
>                 rp_offset = (rp_offset + 0x1000) * port;
>                 if (tegra_pcie.plat_data->port_status[port])
>                         tegra_pcie_add_port(port, rp_offset, ctrl_offset);
>         }

(which actually looks likely to be horribly buggy for port>1 and only
accidentally correct for port==1, but anyway...)

But instead, I'd be tempted to make the top-level node say:

	#address-cells = <1>;
	#size-cells = <0>;

... so that the child nodes' reg is just the port ID. The parent node
can calculate the addresses/offsets of the per-port registers within the
PCIe controller's register space based on the ID using code roughly like
what I quoted above:

	pci@0 {
		reg = <0>;
		status = "disabled";
	};

	pci@1 {
		reg = <0>;
		status = "disabled";
	};

That would save having to put 2 entries in the reg, and perhaps remove
the need for any ranges property.

I think you also need a property to specify the exact port layout; the
Tegra20 controller supports:

1 x4 port
2 x2 ports (you can choose to use only 1 of these I assume)

So just because only 1 of the ports is enabled, doesn't imply it's x4;
it could still be x2.

Tegra30 has more options.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 12, 2012, 9:11 p.m.
On 6/12/2012 10:15 AM, Stephen Warren wrote:
> On 06/12/2012 11:20 AM, Thierry Reding wrote:
> ...
>> I came up with the following alternative:
>>
>> 	pci {
>> 		compatible = "nvidia,tegra20-pcie";
>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>> 		       0x80003800 0x00000200   /* AFI registers */
>> 		       0x80004000 0x00100000   /* configuration space */
>> 		       0x80104000 0x00100000   /* extended configuration space */
>> 		       0x80400000 0x00010000   /* downstream I/O */
>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>> 		interrupts =<0 98 0x04   /* controller interrupt */
>> 		              0 99 0x04>; /* MSI interrupt */
>> 		status = "disabled";
>>
>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>
>> 		#address-cells =<1>;
>> 		#size-cells =<1>;
>>
>> 		port@80000000 {
>> 			reg =<0x80000000 0x00001000>;
>> 			status = "disabled";
>> 		};
>>
>> 		port@80001000 {
>> 			reg =<0x80001000 0x00001000>;
>> 			status = "disabled";
>> 		};
>> 	};
>>
>> The "ranges" property can probably be cleaned up a bit, but the most
>> interesting part is the port@ children, which can simply be enabled in board
>> DTS files by setting the status property to "okay". I find that somewhat more
>> intuitive to the variant with an "enable-ports" property.
>>
>> What do you think of this?
>
> As a general concept, this kind of design seems OK to me.
>
> The "port" child nodes I think should be named "pci@..." given Mitch's
> comments, I think.
>
> The port nodes probably need two entries in reg, given the following in
> our downstream driver:
>
>>          int rp_offset = 0;
>>          int ctrl_offset = AFI_PEX0_CTRL;
> ...
>>          for (port = 0; port<  MAX_PCIE_SUPPORTED_PORTS; port++) {
>>                  ctrl_offset += (port * 8);
>>                  rp_offset = (rp_offset + 0x1000) * port;
>>                  if (tegra_pcie.plat_data->port_status[port])
>>                          tegra_pcie_add_port(port, rp_offset, ctrl_offset);
>>          }
>
> (which actually looks likely to be horribly buggy for port>1 and only
> accidentally correct for port==1, but anyway...)
>
> But instead, I'd be tempted to make the top-level node say:
>
> 	#address-cells =<1>;
> 	#size-cells =<0>;
>
> ... so that the child nodes' reg is just the port ID. The parent node
> can calculate the addresses/offsets of the per-port registers within the
> PCIe controller's register space based on the ID using code roughly like
> what I quoted above:
>
> 	pci@0 {
> 		reg =<0>;
> 		status = "disabled";
> 	};
>
> 	pci@1 {
> 		reg =<0>;
> 		status = "disabled";
> 	};
reg = <1> ?

>
>
> That would save having to put 2 entries in the reg, and perhaps remove
> the need for any ranges property.

ISTM that having two reg entries, specifying the rp and ctrl registers, 
is preferable to having code to calculate the addresses.  That makes the 
code simpler and the device tree more directly descriptive of the 
hardware layout.  The less "magic" (in this case, the register address 
calculation), the better.

>
>
> I think you also need a property to specify the exact port layout; the
> Tegra20 controller supports:
>
> 1 x4 port
> 2 x2 ports (you can choose to use only 1 of these I assume)
>
> So just because only 1 of the ports is enabled, doesn't imply it's x4;
> it could still be x2.
>
> Tegra30 has more options.
> _______________________________________________
> devicetree-discuss mailing list
> devicetree-discuss@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/devicetree-discuss
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 13, 2012, 5:54 a.m.
* Mitch Bradley wrote:
> On 6/12/2012 9:46 AM, Stephen Warren wrote:
> >On 06/12/2012 01:10 PM, Mitch Bradley wrote:
> >>On 6/12/2012 7:20 AM, Thierry Reding wrote:
> >...
> >>>I came up with the following alternative:
> >>>
> >>>	pci {
> >>>		compatible = "nvidia,tegra20-pcie";
> >>>		reg =<0x80003000 0x00000800   /* PADS registers */
> >>>		       0x80003800 0x00000200   /* AFI registers */
> >>>		       0x80004000 0x00100000   /* configuration space */
> >>>		       0x80104000 0x00100000   /* extended configuration space */
> >>>		       0x80400000 0x00010000   /* downstream I/O */
> >>>		       0x90000000 0x10000000   /* non-prefetchable memory */
> >>>		       0xa0000000 0x10000000>; /* prefetchable memory */
> >>>		interrupts =<0 98 0x04   /* controller interrupt */
> >>>		              0 99 0x04>; /* MSI interrupt */
> >>>		status = "disabled";
> >>>
> >>>		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >>>			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >>>			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
> >>>			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >>>			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >>>			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >>>
> >>>		#address-cells =<1>;
> >>>		#size-cells =<1>;
> >>>
> >>>		port@80000000 {
> >>>			reg =<0x80000000 0x00001000>;
> >>>			status = "disabled";
> >>>		};
> >>>
> >>>		port@80001000 {
> >>>			reg =<0x80001000 0x00001000>;
> >>>			status = "disabled";
> >>>		};
> >>>	};
> >>>
> >>>The "ranges" property can probably be cleaned up a bit, but the most
> >>>interesting part is the port@ children, which can simply be enabled in board
> >>>DTS files by setting the status property to "okay". I find that somewhat more
> >>>intuitive to the variant with an "enable-ports" property.
> >>>
> >>>What do you think of this?
> >>
> >>The problem is that children of a PCI-ish bus have specific expectations
> >>about the parent address format - the standard 3-address-cell PCI
> >>addressing.  So making a PCI bus node - even if it is PCIe - with 1
> >>address cell is a problem.
> >
> >Couldn't you put the #address-cells=<3>  underneath each port node, and
> >put any PCIe devices there rather than under the main PCIe controller
> >node? I'm not sure how the interrupt mapping table etc. would work in
> >that case, but that seems to solve the addressing issue.
> 
> Yes, that would work just fine, and in fact would be preferable, if
> a "root port" is essentially an independent PCI bus.  The parent of
> those root ports should not be named "pci", though, as it is not in
> itself a PCI bus.  It could be called "pcis".

Yes, on Tegra each of the ports provides a separate PCI bus. How about naming
the parent "pcie-controller"? IMO that reflects the nature of the device
better, whereas "pcis" would seem more like a "virtual" node for collecting
similar children.

Thierry
Thierry Reding - June 13, 2012, 6:34 a.m.
* Stephen Warren wrote:
> On 06/12/2012 11:20 AM, Thierry Reding wrote:
> ...
> > I came up with the following alternative:
> > 
> > 	pci {
> > 		compatible = "nvidia,tegra20-pcie";
> > 		reg = <0x80003000 0x00000800   /* PADS registers */
> > 		       0x80003800 0x00000200   /* AFI registers */
> > 		       0x80004000 0x00100000   /* configuration space */
> > 		       0x80104000 0x00100000   /* extended configuration space */
> > 		       0x80400000 0x00010000   /* downstream I/O */
> > 		       0x90000000 0x10000000   /* non-prefetchable memory */
> > 		       0xa0000000 0x10000000>; /* prefetchable memory */
> > 		interrupts = <0 98 0x04   /* controller interrupt */
> > 		              0 99 0x04>; /* MSI interrupt */
> > 		status = "disabled";
> > 
> > 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> > 			  0x80004000 0x80004000 0x00100000   /* configuration space */
> > 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
> > 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> > 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> > 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> > 
> > 		#address-cells = <1>;
> > 		#size-cells = <1>;
> > 
> > 		port@80000000 {
> > 			reg = <0x80000000 0x00001000>;
> > 			status = "disabled";
> > 		};
> > 
> > 		port@80001000 {
> > 			reg = <0x80001000 0x00001000>;
> > 			status = "disabled";
> > 		};
> > 	};
> > 
> > The "ranges" property can probably be cleaned up a bit, but the most
> > interesting part is the port@ children, which can simply be enabled in board
> > DTS files by setting the status property to "okay". I find that somewhat more
> > intuitive to the variant with an "enable-ports" property.
> > 
> > What do you think of this?
> 
> As a general concept, this kind of design seems OK to me.
> 
> The "port" child nodes I think should be named "pci@..." given Mitch's
> comments, I think.
> 
> The port nodes probably need two entries in reg, given the following in
> our downstream driver:
> 
> >         int rp_offset = 0;
> >         int ctrl_offset = AFI_PEX0_CTRL;
> ...
> >         for (port = 0; port < MAX_PCIE_SUPPORTED_PORTS; port++) {
> >                 ctrl_offset += (port * 8);
> >                 rp_offset = (rp_offset + 0x1000) * port;
> >                 if (tegra_pcie.plat_data->port_status[port])
> >                         tegra_pcie_add_port(port, rp_offset, ctrl_offset);
> >         }
> 
> (which actually looks likely to be horribly buggy for port>1 and only
> accidentally correct for port==1, but anyway...)

Yeah, I currently have code that gets the port index and the reset register
offset (ctrl_offset above) from the port address. It feels rather hackish,
though.

> But instead, I'd be tempted to make the top-level node say:
> 
> 	#address-cells = <1>;
> 	#size-cells = <0>;
> 
> ... so that the child nodes' reg is just the port ID. The parent node
> can calculate the addresses/offsets of the per-port registers within the
> PCIe controller's register space based on the ID using code roughly like
> what I quoted above:
> 
> 	pci@0 {
> 		reg = <0>;
> 		status = "disabled";
> 	};
> 
> 	pci@1 {
> 		reg = <0>;
> 		status = "disabled";
> 	};
> 
> That would save having to put 2 entries in the reg, and perhaps remove
> the need for any ranges property.
> 
> I think you also need a property to specify the exact port layout; the
> Tegra20 controller supports:
> 
> 1 x4 port
> 2 x2 ports (you can choose to use only 1 of these I assume)
> 
> So just because only 1 of the ports is enabled, doesn't imply it's x4;
> it could still be x2.
> 
> Tegra30 has more options.

Both the upstream and downstream drivers currently hard-code this to dual and
411 (I assume that means 1x4, 2x1?) configurations for Tegra20 and Tegra30
respectively. Unfortunately the register AFI_PCIE_CONFIG isn't documented on
Tegra20 at all and for Tegra30 lists only the valid configurations (see
28.4.1.5 PCIe CONFIG) but not the corresponding encodings.

Maybe a good name for the new property would be "num-lanes". I also wonder if
this property would be better off in the parent node, which would make it
easier to check for valid configurations. Otherwise the code would have to
collect the settings of all the ports and check if the combination is valid.
Then again, having num-lanes in the parent with one cell for each controller
isn't very nice if each controller can be individually disabled.

One other thing: in addition to the device tree binding, this will all have to
be represented in the platform data as well to support the legacy board
definitions currently in the tree. But instead of adding all of these changes
to the patch that converts the code to a driver, I'm thinking it might be
better to split these additions off into one or more separate patches. Do you
have any objections?

Thierry
Thierry Reding - June 13, 2012, 6:45 a.m.
* Mitch Bradley wrote:
> On 6/12/2012 10:15 AM, Stephen Warren wrote:
> >On 06/12/2012 11:20 AM, Thierry Reding wrote:
> >...
> >>I came up with the following alternative:
> >>
> >>	pci {
> >>		compatible = "nvidia,tegra20-pcie";
> >>		reg =<0x80003000 0x00000800   /* PADS registers */
> >>		       0x80003800 0x00000200   /* AFI registers */
> >>		       0x80004000 0x00100000   /* configuration space */
> >>		       0x80104000 0x00100000   /* extended configuration space */
> >>		       0x80400000 0x00010000   /* downstream I/O */
> >>		       0x90000000 0x10000000   /* non-prefetchable memory */
> >>		       0xa0000000 0x10000000>; /* prefetchable memory */
> >>		interrupts =<0 98 0x04   /* controller interrupt */
> >>		              0 99 0x04>; /* MSI interrupt */
> >>		status = "disabled";
> >>
> >>		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >>			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >>			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
> >>			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >>			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >>			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >>
> >>		#address-cells =<1>;
> >>		#size-cells =<1>;
> >>
> >>		port@80000000 {
> >>			reg =<0x80000000 0x00001000>;
> >>			status = "disabled";
> >>		};
> >>
> >>		port@80001000 {
> >>			reg =<0x80001000 0x00001000>;
> >>			status = "disabled";
> >>		};
> >>	};
> >>
> >>The "ranges" property can probably be cleaned up a bit, but the most
> >>interesting part is the port@ children, which can simply be enabled in board
> >>DTS files by setting the status property to "okay". I find that somewhat more
> >>intuitive to the variant with an "enable-ports" property.
> >>
> >>What do you think of this?
> >
> >As a general concept, this kind of design seems OK to me.
> >
> >The "port" child nodes I think should be named "pci@..." given Mitch's
> >comments, I think.
> >
> >The port nodes probably need two entries in reg, given the following in
> >our downstream driver:
> >
> >>         int rp_offset = 0;
> >>         int ctrl_offset = AFI_PEX0_CTRL;
> >...
> >>         for (port = 0; port<  MAX_PCIE_SUPPORTED_PORTS; port++) {
> >>                 ctrl_offset += (port * 8);
> >>                 rp_offset = (rp_offset + 0x1000) * port;
> >>                 if (tegra_pcie.plat_data->port_status[port])
> >>                         tegra_pcie_add_port(port, rp_offset, ctrl_offset);
> >>         }
> >
> >(which actually looks likely to be horribly buggy for port>1 and only
> >accidentally correct for port==1, but anyway...)
> >
> >But instead, I'd be tempted to make the top-level node say:
> >
> >	#address-cells =<1>;
> >	#size-cells =<0>;
> >
> >... so that the child nodes' reg is just the port ID. The parent node
> >can calculate the addresses/offsets of the per-port registers within the
> >PCIe controller's register space based on the ID using code roughly like
> >what I quoted above:
> >
> >	pci@0 {
> >		reg =<0>;
> >		status = "disabled";
> >	};
> >
> >	pci@1 {
> >		reg =<0>;
> >		status = "disabled";
> >	};
> reg = <1> ?
> 
> >
> >
> >That would save having to put 2 entries in the reg, and perhaps remove
> >the need for any ranges property.
> 
> ISTM that having two reg entries, specifying the rp and ctrl
> registers, is preferable to having code to calculate the addresses.
> That makes the code simpler and the device tree more directly
> descriptive of the hardware layout.  The less "magic" (in this case,
> the register address calculation), the better.

The problem with this approach is that since the control registers are
within the AFI register range, both the reg and ranges properties of the
parent would have to include these holes. Furthermore it means that the
controller driver would have to remap the AFI registers in chunks, for the
sole reason of splitting out the controle registers.

Would it be acceptable to make an exception in this case and use the port's
second reg entry as an offset into the AFI register range instead?

Thierry
Mitch Bradley - June 13, 2012, 7:04 a.m.
On 6/12/2012 7:54 PM, Thierry Reding wrote:
> * Mitch Bradley wrote:
>> On 6/12/2012 9:46 AM, Stephen Warren wrote:
>>> On 06/12/2012 01:10 PM, Mitch Bradley wrote:
>>>> On 6/12/2012 7:20 AM, Thierry Reding wrote:
>>> ...
>>>>> I came up with the following alternative:
>>>>>
>>>>> 	pci {
>>>>> 		compatible = "nvidia,tegra20-pcie";
>>>>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>>>>> 		       0x80003800 0x00000200   /* AFI registers */
>>>>> 		       0x80004000 0x00100000   /* configuration space */
>>>>> 		       0x80104000 0x00100000   /* extended configuration space */
>>>>> 		       0x80400000 0x00010000   /* downstream I/O */
>>>>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>>>>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>>>>> 		interrupts =<0 98 0x04   /* controller interrupt */
>>>>> 		              0 99 0x04>; /* MSI interrupt */
>>>>> 		status = "disabled";
>>>>>
>>>>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>>>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>>>>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>>>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>>>
>>>>> 		#address-cells =<1>;
>>>>> 		#size-cells =<1>;
>>>>>
>>>>> 		port@80000000 {
>>>>> 			reg =<0x80000000 0x00001000>;
>>>>> 			status = "disabled";
>>>>> 		};
>>>>>
>>>>> 		port@80001000 {
>>>>> 			reg =<0x80001000 0x00001000>;
>>>>> 			status = "disabled";
>>>>> 		};
>>>>> 	};
>>>>>
>>>>> The "ranges" property can probably be cleaned up a bit, but the most
>>>>> interesting part is the port@ children, which can simply be enabled in board
>>>>> DTS files by setting the status property to "okay". I find that somewhat more
>>>>> intuitive to the variant with an "enable-ports" property.
>>>>>
>>>>> What do you think of this?
>>>>
>>>> The problem is that children of a PCI-ish bus have specific expectations
>>>> about the parent address format - the standard 3-address-cell PCI
>>>> addressing.  So making a PCI bus node - even if it is PCIe - with 1
>>>> address cell is a problem.
>>>
>>> Couldn't you put the #address-cells=<3>   underneath each port node, and
>>> put any PCIe devices there rather than under the main PCIe controller
>>> node? I'm not sure how the interrupt mapping table etc. would work in
>>> that case, but that seems to solve the addressing issue.
>>
>> Yes, that would work just fine, and in fact would be preferable, if
>> a "root port" is essentially an independent PCI bus.  The parent of
>> those root ports should not be named "pci", though, as it is not in
>> itself a PCI bus.  It could be called "pcis".
>
> Yes, on Tegra each of the ports provides a separate PCI bus. How about naming
> the parent "pcie-controller"?

That seems like a fine name.

> IMO that reflects the nature of the device
> better, whereas "pcis" would seem more like a "virtual" node for collecting
> similar children.
>
> Thierry
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 13, 2012, 7:28 a.m.
On 6/12/2012 8:45 PM, Thierry Reding wrote:
> * Mitch Bradley wrote:
>> On 6/12/2012 10:15 AM, Stephen Warren wrote:
>>> On 06/12/2012 11:20 AM, Thierry Reding wrote:
>>> ...
>>>> I came up with the following alternative:
>>>>
>>>> 	pci {
>>>> 		compatible = "nvidia,tegra20-pcie";
>>>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>>>> 		       0x80003800 0x00000200   /* AFI registers */
>>>> 		       0x80004000 0x00100000   /* configuration space */
>>>> 		       0x80104000 0x00100000   /* extended configuration space */
>>>> 		       0x80400000 0x00010000   /* downstream I/O */
>>>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>>>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>>>> 		interrupts =<0 98 0x04   /* controller interrupt */
>>>> 		              0 99 0x04>; /* MSI interrupt */
>>>> 		status = "disabled";
>>>>
>>>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>>>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>>
>>>> 		#address-cells =<1>;
>>>> 		#size-cells =<1>;
>>>>
>>>> 		port@80000000 {
>>>> 			reg =<0x80000000 0x00001000>;
>>>> 			status = "disabled";
>>>> 		};
>>>>
>>>> 		port@80001000 {
>>>> 			reg =<0x80001000 0x00001000>;
>>>> 			status = "disabled";
>>>> 		};
>>>> 	};
>>>>
>>>> The "ranges" property can probably be cleaned up a bit, but the most
>>>> interesting part is the port@ children, which can simply be enabled in board
>>>> DTS files by setting the status property to "okay". I find that somewhat more
>>>> intuitive to the variant with an "enable-ports" property.
>>>>
>>>> What do you think of this?
>>>
>>> As a general concept, this kind of design seems OK to me.
>>>
>>> The "port" child nodes I think should be named "pci@..." given Mitch's
>>> comments, I think.
>>>
>>> The port nodes probably need two entries in reg, given the following in
>>> our downstream driver:
>>>
>>>>          int rp_offset = 0;
>>>>          int ctrl_offset = AFI_PEX0_CTRL;
>>> ...
>>>>          for (port = 0; port<   MAX_PCIE_SUPPORTED_PORTS; port++) {
>>>>                  ctrl_offset += (port * 8);
>>>>                  rp_offset = (rp_offset + 0x1000) * port;
>>>>                  if (tegra_pcie.plat_data->port_status[port])
>>>>                          tegra_pcie_add_port(port, rp_offset, ctrl_offset);
>>>>          }
>>>
>>> (which actually looks likely to be horribly buggy for port>1 and only
>>> accidentally correct for port==1, but anyway...)
>>>
>>> But instead, I'd be tempted to make the top-level node say:
>>>
>>> 	#address-cells =<1>;
>>> 	#size-cells =<0>;
>>>
>>> ... so that the child nodes' reg is just the port ID. The parent node
>>> can calculate the addresses/offsets of the per-port registers within the
>>> PCIe controller's register space based on the ID using code roughly like
>>> what I quoted above:
>>>
>>> 	pci@0 {
>>> 		reg =<0>;
>>> 		status = "disabled";
>>> 	};
>>>
>>> 	pci@1 {
>>> 		reg =<0>;
>>> 		status = "disabled";
>>> 	};
>> reg =<1>  ?
>>
>>>
>>>
>>> That would save having to put 2 entries in the reg, and perhaps remove
>>> the need for any ranges property.
>>
>> ISTM that having two reg entries, specifying the rp and ctrl
>> registers, is preferable to having code to calculate the addresses.
>> That makes the code simpler and the device tree more directly
>> descriptive of the hardware layout.  The less "magic" (in this case,
>> the register address calculation), the better.
>
> The problem with this approach is that since the control registers are
> within the AFI register range, both the reg and ranges properties of the
> parent would have to include these holes. Furthermore it means that the
> controller driver would have to remap the AFI registers in chunks, for the
> sole reason of splitting out the controle registers.
>
> Would it be acceptable to make an exception in this case and use the port's
> second reg entry as an offset into the AFI register range instead?

How about this:

     pcie-controller {
         compatible = "nvidia,tegra20-pcie";
         reg =<0x80003000 0x00000800   /* PADS registers */
               0x80003800 0x00000200>   /* AFI registers */
         interrupts =<0 98 0x04   /* controller interrupt */
                      0 99 0x04>; /* MSI interrupt */
         status = "disabled";

         ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
               0x80004000 0x80004000 0x00100000   /* configuration space */
               0x80104000 0x80104000 0x00100000   /* extended 
configuration space */
               0x80400000 0x80400000 0x00010000   /* downstream I/O */
               0x90000000 0x90000000 0x10000000   /* non-prefetchable 
memory */
               0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

         #address-cells =<1>;
         #size-cells =<1>;

         pci@80000000 {
             reg =<0x80000000 0x00001000>;
             ctrl-offset =<0x0>;
             status = "disabled";
             #address-cells = <3>;  /* Standard for PCI */
             #size-cells =<2>;      /* Standard for PCI */
             ranges =</* Map from standard PCI child address spaces to 
linear spaces above */>;
         };

         pci@80001000 {
             reg =<0x80001000 0x00001000>;
             ctrl-offset =<0x8>;
             status = "disabled";
             #address-cells = <3>;  /* Standard for PCI */
             #size-cells =<2>;      /* Standard for PCI */
             ranges =</* Map from standard PCI child address spaces to 
linear spaces above */>;
         };
     };

Using ctrl-offset instead of reg avoids any violation of the usual reg 
semantics.

Note that the pci subnodes need the full complement of PCI bus node 
properties.

>
>
> Thierry
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 13, 2012, 7:52 a.m.
On Tue, Jun 12, 2012 at 09:28:51PM -1000, Mitch Bradley wrote:
> On 6/12/2012 8:45 PM, Thierry Reding wrote:
> >* Mitch Bradley wrote:
> >>On 6/12/2012 10:15 AM, Stephen Warren wrote:
> >>>On 06/12/2012 11:20 AM, Thierry Reding wrote:
> >>>...
> >>>>I came up with the following alternative:
> >>>>
> >>>>	pci {
> >>>>		compatible = "nvidia,tegra20-pcie";
> >>>>		reg =<0x80003000 0x00000800   /* PADS registers */
> >>>>		       0x80003800 0x00000200   /* AFI registers */
> >>>>		       0x80004000 0x00100000   /* configuration space */
> >>>>		       0x80104000 0x00100000   /* extended configuration space */
> >>>>		       0x80400000 0x00010000   /* downstream I/O */
> >>>>		       0x90000000 0x10000000   /* non-prefetchable memory */
> >>>>		       0xa0000000 0x10000000>; /* prefetchable memory */
> >>>>		interrupts =<0 98 0x04   /* controller interrupt */
> >>>>		              0 99 0x04>; /* MSI interrupt */
> >>>>		status = "disabled";
> >>>>
> >>>>		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >>>>			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >>>>			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
> >>>>			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >>>>			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >>>>			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >>>>
> >>>>		#address-cells =<1>;
> >>>>		#size-cells =<1>;
> >>>>
> >>>>		port@80000000 {
> >>>>			reg =<0x80000000 0x00001000>;
> >>>>			status = "disabled";
> >>>>		};
> >>>>
> >>>>		port@80001000 {
> >>>>			reg =<0x80001000 0x00001000>;
> >>>>			status = "disabled";
> >>>>		};
> >>>>	};
> >>>>
> >>>>The "ranges" property can probably be cleaned up a bit, but the most
> >>>>interesting part is the port@ children, which can simply be enabled in board
> >>>>DTS files by setting the status property to "okay". I find that somewhat more
> >>>>intuitive to the variant with an "enable-ports" property.
> >>>>
> >>>>What do you think of this?
> >>>
> >>>As a general concept, this kind of design seems OK to me.
> >>>
> >>>The "port" child nodes I think should be named "pci@..." given Mitch's
> >>>comments, I think.
> >>>
> >>>The port nodes probably need two entries in reg, given the following in
> >>>our downstream driver:
> >>>
> >>>>         int rp_offset = 0;
> >>>>         int ctrl_offset = AFI_PEX0_CTRL;
> >>>...
> >>>>         for (port = 0; port<   MAX_PCIE_SUPPORTED_PORTS; port++) {
> >>>>                 ctrl_offset += (port * 8);
> >>>>                 rp_offset = (rp_offset + 0x1000) * port;
> >>>>                 if (tegra_pcie.plat_data->port_status[port])
> >>>>                         tegra_pcie_add_port(port, rp_offset, ctrl_offset);
> >>>>         }
> >>>
> >>>(which actually looks likely to be horribly buggy for port>1 and only
> >>>accidentally correct for port==1, but anyway...)
> >>>
> >>>But instead, I'd be tempted to make the top-level node say:
> >>>
> >>>	#address-cells =<1>;
> >>>	#size-cells =<0>;
> >>>
> >>>... so that the child nodes' reg is just the port ID. The parent node
> >>>can calculate the addresses/offsets of the per-port registers within the
> >>>PCIe controller's register space based on the ID using code roughly like
> >>>what I quoted above:
> >>>
> >>>	pci@0 {
> >>>		reg =<0>;
> >>>		status = "disabled";
> >>>	};
> >>>
> >>>	pci@1 {
> >>>		reg =<0>;
> >>>		status = "disabled";
> >>>	};
> >>reg =<1>  ?
> >>
> >>>
> >>>
> >>>That would save having to put 2 entries in the reg, and perhaps remove
> >>>the need for any ranges property.
> >>
> >>ISTM that having two reg entries, specifying the rp and ctrl
> >>registers, is preferable to having code to calculate the addresses.
> >>That makes the code simpler and the device tree more directly
> >>descriptive of the hardware layout.  The less "magic" (in this case,
> >>the register address calculation), the better.
> >
> >The problem with this approach is that since the control registers are
> >within the AFI register range, both the reg and ranges properties of the
> >parent would have to include these holes. Furthermore it means that the
> >controller driver would have to remap the AFI registers in chunks, for the
> >sole reason of splitting out the controle registers.
> >
> >Would it be acceptable to make an exception in this case and use the port's
> >second reg entry as an offset into the AFI register range instead?
> 
> How about this:
> 
>     pcie-controller {
>         compatible = "nvidia,tegra20-pcie";
>         reg =<0x80003000 0x00000800   /* PADS registers */
>               0x80003800 0x00000200>   /* AFI registers */
>         interrupts =<0 98 0x04   /* controller interrupt */
>                      0 99 0x04>; /* MSI interrupt */
>         status = "disabled";
> 
>         ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>               0x80004000 0x80004000 0x00100000   /* configuration space */
>               0x80104000 0x80104000 0x00100000   /* extended
> configuration space */
>               0x80400000 0x80400000 0x00010000   /* downstream I/O */
>               0x90000000 0x90000000 0x10000000   /* non-prefetchable
> memory */
>               0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

I think the configuration spaces and downstream I/O ranges need to be in the
pcie-controller's reg property because they are remapped and used by the
controller driver, not by the individual ports.

That's probably not really necessary but rather a result of how the driver
was written. Perhaps the driver should handle them differently instead,
listing the regions in the ranges property of the parent and listing the
corresponding partitions in the ranges properties of the pci child nodes.

Like in the following, where the ranges property of the ports partition the
ranges passed from the parent evenly:

	pcie-controller {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200>; /* AFI registers */
		interrupts = <0 98 0x04   /* controller interrupt */
			      0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
			  0x80004000 0x80004000 0x00100000   /* configuration space */
			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

		#address-cells = <1>;
		#size-cells = <1>;

		pci@80000000 {
			reg = <0x80000000 0x00001000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80400000 0x80400000 0x00008000   /* I/O */
				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x0>;
			nvidia,num-lanes = <2>;
		};

		pci@80001000 {
			reg = <0x80001000 0x00001000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80408000 0x80408000 0x00008000   /* I/O */
				  0x98000000 0x98000000 0x08000000   /* non-prefetchable memory */
				  0xa8000000 0xa8000000 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x8>;
			nvidia,num-lanes = <2>;
		};
	};

I've also added the new num-lanes property that describes the physical
connections of the port. Also the ctrl-offset and num-lanes property are
pretty specific to the Tegra PCIe controller, so I've prefixed them with
"nvidia,".

> 
>         #address-cells =<1>;
>         #size-cells =<1>;
> 
>         pci@80000000 {
>             reg =<0x80000000 0x00001000>;
>             ctrl-offset =<0x0>;
>             status = "disabled";
>             #address-cells = <3>;  /* Standard for PCI */
>             #size-cells =<2>;      /* Standard for PCI */
>             ranges =</* Map from standard PCI child address spaces
> to linear spaces above */>;
>         };
> 
>         pci@80001000 {
>             reg =<0x80001000 0x00001000>;
>             ctrl-offset =<0x8>;
>             status = "disabled";
>             #address-cells = <3>;  /* Standard for PCI */
>             #size-cells =<2>;      /* Standard for PCI */
>             ranges =</* Map from standard PCI child address spaces
> to linear spaces above */>;
>         };
>     };
> 
> Using ctrl-offset instead of reg avoids any violation of the usual
> reg semantics.

Yes, that sounds better.

> 
> Note that the pci subnodes need the full complement of PCI bus node
> properties.
> 
> >
> >
> >Thierry
>
Mitch Bradley - June 13, 2012, 8:05 a.m.
On 6/12/2012 9:52 PM, Thierry Reding wrote:
> On Tue, Jun 12, 2012 at 09:28:51PM -1000, Mitch Bradley wrote:
>> On 6/12/2012 8:45 PM, Thierry Reding wrote:
>>> * Mitch Bradley wrote:
>>>> On 6/12/2012 10:15 AM, Stephen Warren wrote:
>>>>> On 06/12/2012 11:20 AM, Thierry Reding wrote:
>>>>> ...
>>>>>> I came up with the following alternative:
>>>>>>
>>>>>> 	pci {
>>>>>> 		compatible = "nvidia,tegra20-pcie";
>>>>>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>>>>>> 		       0x80003800 0x00000200   /* AFI registers */
>>>>>> 		       0x80004000 0x00100000   /* configuration space */
>>>>>> 		       0x80104000 0x00100000   /* extended configuration space */
>>>>>> 		       0x80400000 0x00010000   /* downstream I/O */
>>>>>> 		       0x90000000 0x10000000   /* non-prefetchable memory */
>>>>>> 		       0xa0000000 0x10000000>; /* prefetchable memory */
>>>>>> 		interrupts =<0 98 0x04   /* controller interrupt */
>>>>>> 		              0 99 0x04>; /* MSI interrupt */
>>>>>> 		status = "disabled";
>>>>>>
>>>>>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>>>>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>>>>>> 			  0x80104000 0x80104000 0x00100000   /* extended configuration space */
>>>>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>>>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>>>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>>>>
>>>>>> 		#address-cells =<1>;
>>>>>> 		#size-cells =<1>;
>>>>>>
>>>>>> 		port@80000000 {
>>>>>> 			reg =<0x80000000 0x00001000>;
>>>>>> 			status = "disabled";
>>>>>> 		};
>>>>>>
>>>>>> 		port@80001000 {
>>>>>> 			reg =<0x80001000 0x00001000>;
>>>>>> 			status = "disabled";
>>>>>> 		};
>>>>>> 	};
>>>>>>
>>>>>> The "ranges" property can probably be cleaned up a bit, but the most
>>>>>> interesting part is the port@ children, which can simply be enabled in board
>>>>>> DTS files by setting the status property to "okay". I find that somewhat more
>>>>>> intuitive to the variant with an "enable-ports" property.
>>>>>>
>>>>>> What do you think of this?
>>>>>
>>>>> As a general concept, this kind of design seems OK to me.
>>>>>
>>>>> The "port" child nodes I think should be named "pci@..." given Mitch's
>>>>> comments, I think.
>>>>>
>>>>> The port nodes probably need two entries in reg, given the following in
>>>>> our downstream driver:
>>>>>
>>>>>>          int rp_offset = 0;
>>>>>>          int ctrl_offset = AFI_PEX0_CTRL;
>>>>> ...
>>>>>>          for (port = 0; port<    MAX_PCIE_SUPPORTED_PORTS; port++) {
>>>>>>                  ctrl_offset += (port * 8);
>>>>>>                  rp_offset = (rp_offset + 0x1000) * port;
>>>>>>                  if (tegra_pcie.plat_data->port_status[port])
>>>>>>                          tegra_pcie_add_port(port, rp_offset, ctrl_offset);
>>>>>>          }
>>>>>
>>>>> (which actually looks likely to be horribly buggy for port>1 and only
>>>>> accidentally correct for port==1, but anyway...)
>>>>>
>>>>> But instead, I'd be tempted to make the top-level node say:
>>>>>
>>>>> 	#address-cells =<1>;
>>>>> 	#size-cells =<0>;
>>>>>
>>>>> ... so that the child nodes' reg is just the port ID. The parent node
>>>>> can calculate the addresses/offsets of the per-port registers within the
>>>>> PCIe controller's register space based on the ID using code roughly like
>>>>> what I quoted above:
>>>>>
>>>>> 	pci@0 {
>>>>> 		reg =<0>;
>>>>> 		status = "disabled";
>>>>> 	};
>>>>>
>>>>> 	pci@1 {
>>>>> 		reg =<0>;
>>>>> 		status = "disabled";
>>>>> 	};
>>>> reg =<1>   ?
>>>>
>>>>>
>>>>>
>>>>> That would save having to put 2 entries in the reg, and perhaps remove
>>>>> the need for any ranges property.
>>>>
>>>> ISTM that having two reg entries, specifying the rp and ctrl
>>>> registers, is preferable to having code to calculate the addresses.
>>>> That makes the code simpler and the device tree more directly
>>>> descriptive of the hardware layout.  The less "magic" (in this case,
>>>> the register address calculation), the better.
>>>
>>> The problem with this approach is that since the control registers are
>>> within the AFI register range, both the reg and ranges properties of the
>>> parent would have to include these holes. Furthermore it means that the
>>> controller driver would have to remap the AFI registers in chunks, for the
>>> sole reason of splitting out the controle registers.
>>>
>>> Would it be acceptable to make an exception in this case and use the port's
>>> second reg entry as an offset into the AFI register range instead?
>>
>> How about this:
>>
>>      pcie-controller {
>>          compatible = "nvidia,tegra20-pcie";
>>          reg =<0x80003000 0x00000800   /* PADS registers */
>>                0x80003800 0x00000200>    /* AFI registers */
>>          interrupts =<0 98 0x04   /* controller interrupt */
>>                       0 99 0x04>; /* MSI interrupt */
>>          status = "disabled";
>>
>>          ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>                0x80004000 0x80004000 0x00100000   /* configuration space */
>>                0x80104000 0x80104000 0x00100000   /* extended
>> configuration space */
>>                0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>                0x90000000 0x90000000 0x10000000   /* non-prefetchable
>> memory */
>>                0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>
> I think the configuration spaces and downstream I/O ranges need to be in the
> pcie-controller's reg property because they are remapped and used by the
> controller driver, not by the individual ports.
>
> That's probably not really necessary but rather a result of how the driver
> was written. Perhaps the driver should handle them differently instead,
> listing the regions in the ranges property of the parent and listing the
> corresponding partitions in the ranges properties of the pci child nodes.
>
> Like in the following, where the ranges property of the ports partition the
> ranges passed from the parent evenly:
>
> 	pcie-controller {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg =<0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200>; /* AFI registers */
> 		interrupts =<0 98 0x04   /* controller interrupt */
> 			      0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
>
> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
> 			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>
> 		#address-cells =<1>;
> 		#size-cells =<1>;
>
> 		pci@80000000 {
> 			reg =<0x80000000 0x00001000>;
> 			status = "disabled";
>
> 			#address-cells =<3>;
> 			#size-cells =<2>;
>
> 			ranges =<0x80400000 0x80400000 0x00008000   /* I/O */
> 				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
> 				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */

You are on the right track here, but the format of the child-address 
portion of the above ranges property is incorrect.  Since the child 
address space is the PCI address space, the child-address portion needs 
to be 3 cells.  It's not a linear address but rather a triple.  The 
first cell identifies the address type (config, I/O, memory..) and the 
second and third cells are offsets within that subspace.  The second and 
third cells will typically be 0.  The PCI binding has details.

>
>
> 			nvidia,ctrl-offset =<0x0>;
> 			nvidia,num-lanes =<2>;
> 		};
>
> 		pci@80001000 {
> 			reg =<0x80001000 0x00001000>;
> 			status = "disabled";
>
> 			#address-cells =<3>;
> 			#size-cells =<2>;
>
> 			ranges =<0x80408000 0x80408000 0x00008000   /* I/O */
> 				  0x98000000 0x98000000 0x08000000   /* non-prefetchable memory */
> 				  0xa8000000 0xa8000000 0x08000000>; /* prefetchable memory */
>
> 			nvidia,ctrl-offset =<0x8>;
> 			nvidia,num-lanes =<2>;
> 		};
> 	};
>
> I've also added the new num-lanes property that describes the physical
> connections of the port. Also the ctrl-offset and num-lanes property are
> pretty specific to the Tegra PCIe controller, so I've prefixed them with
> "nvidia,".
>
>>
>>          #address-cells =<1>;
>>          #size-cells =<1>;
>>
>>          pci@80000000 {
>>              reg =<0x80000000 0x00001000>;
>>              ctrl-offset =<0x0>;
>>              status = "disabled";
>>              #address-cells =<3>;  /* Standard for PCI */
>>              #size-cells =<2>;      /* Standard for PCI */
>>              ranges =</* Map from standard PCI child address spaces
>> to linear spaces above */>;
>>          };
>>
>>          pci@80001000 {
>>              reg =<0x80001000 0x00001000>;
>>              ctrl-offset =<0x8>;
>>              status = "disabled";
>>              #address-cells =<3>;  /* Standard for PCI */
>>              #size-cells =<2>;      /* Standard for PCI */
>>              ranges =</* Map from standard PCI child address spaces
>> to linear spaces above */>;
>>          };
>>      };
>>
>> Using ctrl-offset instead of reg avoids any violation of the usual
>> reg semantics.
>
> Yes, that sounds better.
>
>>
>> Note that the pci subnodes need the full complement of PCI bus node
>> properties.
>>
>>>
>>>
>>> Thierry
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 13, 2012, 8:19 a.m.
On Tue, Jun 12, 2012 at 10:05:35PM -1000, Mitch Bradley wrote:
> On 6/12/2012 9:52 PM, Thierry Reding wrote:
> >I think the configuration spaces and downstream I/O ranges need to be in the
> >pcie-controller's reg property because they are remapped and used by the
> >controller driver, not by the individual ports.
> >
> >That's probably not really necessary but rather a result of how the driver
> >was written. Perhaps the driver should handle them differently instead,
> >listing the regions in the ranges property of the parent and listing the
> >corresponding partitions in the ranges properties of the pci child nodes.
> >
> >Like in the following, where the ranges property of the ports partition the
> >ranges passed from the parent evenly:
> >
> >	pcie-controller {
> >		compatible = "nvidia,tegra20-pcie";
> >		reg =<0x80003000 0x00000800   /* PADS registers */
> >		       0x80003800 0x00000200>; /* AFI registers */
> >		interrupts =<0 98 0x04   /* controller interrupt */
> >			      0 99 0x04>; /* MSI interrupt */
> >		status = "disabled";
> >
> >		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
> >			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >
> >		#address-cells =<1>;
> >		#size-cells =<1>;
> >
> >		pci@80000000 {
> >			reg =<0x80000000 0x00001000>;
> >			status = "disabled";
> >
> >			#address-cells =<3>;
> >			#size-cells =<2>;
> >
> >			ranges =<0x80400000 0x80400000 0x00008000   /* I/O */
> >				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
> >				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
> 
> You are on the right track here, but the format of the child-address
> portion of the above ranges property is incorrect.  Since the child
> address space is the PCI address space, the child-address portion
> needs to be 3 cells.  It's not a linear address but rather a triple.
> The first cell identifies the address type (config, I/O, memory..)
> and the second and third cells are offsets within that subspace.
> The second and third cells will typically be 0.  The PCI binding has
> details.

Okay, I'll need to read up some more.

Thanks,
Thierry
Mitch Bradley - June 13, 2012, 8:36 a.m.
On 6/12/2012 10:19 PM, Thierry Reding wrote:
> On Tue, Jun 12, 2012 at 10:05:35PM -1000, Mitch Bradley wrote:
>> On 6/12/2012 9:52 PM, Thierry Reding wrote:
>>> I think the configuration spaces and downstream I/O ranges need to be in the
>>> pcie-controller's reg property because they are remapped and used by the
>>> controller driver, not by the individual ports.
>>>
>>> That's probably not really necessary but rather a result of how the driver
>>> was written. Perhaps the driver should handle them differently instead,
>>> listing the regions in the ranges property of the parent and listing the
>>> corresponding partitions in the ranges properties of the pci child nodes.
>>>
>>> Like in the following, where the ranges property of the ports partition the
>>> ranges passed from the parent evenly:
>>>
>>> 	pcie-controller {
>>> 		compatible = "nvidia,tegra20-pcie";
>>> 		reg =<0x80003000 0x00000800   /* PADS registers */
>>> 		       0x80003800 0x00000200>; /* AFI registers */
>>> 		interrupts =<0 98 0x04   /* controller interrupt */
>>> 			      0 99 0x04>; /* MSI interrupt */
>>> 		status = "disabled";
>>>
>>> 		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>> 			  0x80004000 0x80004000 0x00100000   /* configuration space */
>>> 			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>
>>> 		#address-cells =<1>;
>>> 		#size-cells =<1>;
>>>
>>> 		pci@80000000 {
>>> 			reg =<0x80000000 0x00001000>;
>>> 			status = "disabled";
>>>
>>> 			#address-cells =<3>;
>>> 			#size-cells =<2>;
>>>
>>> 			ranges =<0x80400000 0x80400000 0x00008000   /* I/O */
>>> 				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
>>> 				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
>>
>> You are on the right track here, but the format of the child-address
>> portion of the above ranges property is incorrect.  Since the child
>> address space is the PCI address space, the child-address portion
>> needs to be 3 cells.  It's not a linear address but rather a triple.
>> The first cell identifies the address type (config, I/O, memory..)
>> and the second and third cells are offsets within that subspace.
>> The second and third cells will typically be 0.  The PCI binding has
>> details.

Also, the size field in ranges is specified according to the child 
address space, so there must be 2 size cells in the ranges at this 
level.  Each ranges entry at this level is:

<child_address_space, child_address_high, child_address_low, 
parent_address, child_size_high, child_size_low>

The above should be:

			ranges =<0x81000000 0 0  0x80400000  0 0x00008000   /* I/O */
				 0x82000000 0 0  0x90000000  0 0x08000000   /* non-prefetchable memory */
				 0xc2000000 0 0  0xa0000000  0 0x08000000>; /* prefetchable memory */


>>
>
> Okay, I'll need to read up some more.
>
> Thanks,
> Thierry
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 13, 2012, 8:42 a.m.
On Tue, Jun 12, 2012 at 10:36:55PM -1000, Mitch Bradley wrote:
> On 6/12/2012 10:19 PM, Thierry Reding wrote:
> >On Tue, Jun 12, 2012 at 10:05:35PM -1000, Mitch Bradley wrote:
> >>On 6/12/2012 9:52 PM, Thierry Reding wrote:
> >>>I think the configuration spaces and downstream I/O ranges need to be in the
> >>>pcie-controller's reg property because they are remapped and used by the
> >>>controller driver, not by the individual ports.
> >>>
> >>>That's probably not really necessary but rather a result of how the driver
> >>>was written. Perhaps the driver should handle them differently instead,
> >>>listing the regions in the ranges property of the parent and listing the
> >>>corresponding partitions in the ranges properties of the pci child nodes.
> >>>
> >>>Like in the following, where the ranges property of the ports partition the
> >>>ranges passed from the parent evenly:
> >>>
> >>>	pcie-controller {
> >>>		compatible = "nvidia,tegra20-pcie";
> >>>		reg =<0x80003000 0x00000800   /* PADS registers */
> >>>		       0x80003800 0x00000200>; /* AFI registers */
> >>>		interrupts =<0 98 0x04   /* controller interrupt */
> >>>			      0 99 0x04>; /* MSI interrupt */
> >>>		status = "disabled";
> >>>
> >>>		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >>>			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >>>			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
> >>>			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >>>			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >>>			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >>>
> >>>		#address-cells =<1>;
> >>>		#size-cells =<1>;
> >>>
> >>>		pci@80000000 {
> >>>			reg =<0x80000000 0x00001000>;
> >>>			status = "disabled";
> >>>
> >>>			#address-cells =<3>;
> >>>			#size-cells =<2>;
> >>>
> >>>			ranges =<0x80400000 0x80400000 0x00008000   /* I/O */
> >>>				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
> >>>				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
> >>
> >>You are on the right track here, but the format of the child-address
> >>portion of the above ranges property is incorrect.  Since the child
> >>address space is the PCI address space, the child-address portion
> >>needs to be 3 cells.  It's not a linear address but rather a triple.
> >>The first cell identifies the address type (config, I/O, memory..)
> >>and the second and third cells are offsets within that subspace.
> >>The second and third cells will typically be 0.  The PCI binding has
> >>details.
> 
> Also, the size field in ranges is specified according to the child
> address space, so there must be 2 size cells in the ranges at this
> level.  Each ranges entry at this level is:
> 
> <child_address_space, child_address_high, child_address_low,
> parent_address, child_size_high, child_size_low>
> 
> The above should be:
> 
> 			ranges =<0x81000000 0 0  0x80400000  0 0x00008000   /* I/O */
> 				 0x82000000 0 0  0x90000000  0 0x08000000   /* non-prefetchable memory */
> 				 0xc2000000 0 0  0xa0000000  0 0x08000000>; /* prefetchable memory */

Okay, I think I'm beginning to understand. Thanks.

Thierry
Stephen Warren - June 13, 2012, 4:20 p.m.
On 06/13/2012 12:34 AM, Thierry Reding wrote:
> * Stephen Warren wrote:
...
>> I think you also need a property to specify the exact port
>> layout; the Tegra20 controller supports:
>> 
>> 1 x4 port 2 x2 ports (you can choose to use only 1 of these I
>> assume)
>> 
>> So just because only 1 of the ports is enabled, doesn't imply
>> it's x4; it could still be x2.
>> 
>> Tegra30 has more options.
> 
> Both the upstream and downstream drivers currently hard-code this
> to dual and 411 (I assume that means 1x4, 2x1?) configurations for
> Tegra20 and Tegra30 respectively. Unfortunately the register
> AFI_PCIE_CONFIG isn't documented on Tegra20 at all and for Tegra30
> lists only the valid configurations (see 28.4.1.5 PCIe CONFIG) but
> not the corresponding encodings.
> 
> Maybe a good name for the new property would be "num-lanes". I also
> wonder if this property would be better off in the parent node,
> which would make it easier to check for valid configurations.
> Otherwise the code would have to collect the settings of all the
> ports and check if the combination is valid. Then again, having
> num-lanes in the parent with one cell for each controller isn't
> very nice if each controller can be individually disabled.

Indeed, since the register that controls this is in the main register
set rather than the per-port register set, it does seem like a good
idea to put the property in the main node, not the per-port node.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 13, 2012, 5:03 p.m.
On 06/13/2012 12:34 AM, Thierry Reding wrote:
...
> One other thing: in addition to the device tree binding, this will
> all have to be represented in the platform data as well to support
> the legacy board definitions currently in the tree. But instead of
> adding all of these changes to the patch that converts the code to
> a driver, I'm thinking it might be better to split these additions
> off into one or more separate patches. Do you have any objections?

So long as git bisect isn't broken for build- or run-time, that's just
fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 13, 2012, 8:21 p.m.
On Wednesday 13 June 2012, Thierry Reding wrote:
>                 pci@80000000 {
>                         reg = <0x80000000 0x00001000>;
>                         status = "disabled";
> 
>                         #address-cells = <3>;
>                         #size-cells = <2>;
> 
>                         ranges = <0x80400000 0x80400000 0x00008000   /* I/O */
>                                   0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
>                                   0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
> 
>                         nvidia,ctrl-offset = <0x0>;
>                         nvidia,num-lanes = <2>;
>                 };
> 

I believe you will need an "interrupt-map" property here, to map the host
interrupts to the INTA-INTD lines of the attached devices.

I'm not sure whether we want to have a device_type="pciex" property here.
powerpc and sparc seem to use that information, to distinguish a pcie
bus from pci or cardbus.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 14, 2012, 8:37 a.m.
On Wed, Jun 13, 2012 at 08:21:08PM +0000, Arnd Bergmann wrote:
> On Wednesday 13 June 2012, Thierry Reding wrote:
> >                 pci@80000000 {
> >                         reg = <0x80000000 0x00001000>;
> >                         status = "disabled";
> > 
> >                         #address-cells = <3>;
> >                         #size-cells = <2>;
> > 
> >                         ranges = <0x80400000 0x80400000 0x00008000   /* I/O */
> >                                   0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
> >                                   0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
> > 
> >                         nvidia,ctrl-offset = <0x0>;
> >                         nvidia,num-lanes = <2>;
> >                 };
> > 
> 
> I believe you will need an "interrupt-map" property here, to map the host
> interrupts to the INTA-INTD lines of the attached devices.

Legacy interrupts are something I cannot test at all because I have no
hardware that supports them.

> I'm not sure whether we want to have a device_type="pciex" property here.
> powerpc and sparc seem to use that information, to distinguish a pcie
> bus from pci or cardbus.

That'd be rather useless information given that the Tegra is unlikely to
support either PCI or CardBus at some point.

Thierry
Thierry Reding - June 14, 2012, 9:19 a.m.
On Tue, Jun 12, 2012 at 10:36:55PM -1000, Mitch Bradley wrote:
> On 6/12/2012 10:19 PM, Thierry Reding wrote:
> >On Tue, Jun 12, 2012 at 10:05:35PM -1000, Mitch Bradley wrote:
> >>On 6/12/2012 9:52 PM, Thierry Reding wrote:
> >>>I think the configuration spaces and downstream I/O ranges need to be in the
> >>>pcie-controller's reg property because they are remapped and used by the
> >>>controller driver, not by the individual ports.
> >>>
> >>>That's probably not really necessary but rather a result of how the driver
> >>>was written. Perhaps the driver should handle them differently instead,
> >>>listing the regions in the ranges property of the parent and listing the
> >>>corresponding partitions in the ranges properties of the pci child nodes.
> >>>
> >>>Like in the following, where the ranges property of the ports partition the
> >>>ranges passed from the parent evenly:
> >>>
> >>>	pcie-controller {
> >>>		compatible = "nvidia,tegra20-pcie";
> >>>		reg =<0x80003000 0x00000800   /* PADS registers */
> >>>		       0x80003800 0x00000200>; /* AFI registers */
> >>>		interrupts =<0 98 0x04   /* controller interrupt */
> >>>			      0 99 0x04>; /* MSI interrupt */
> >>>		status = "disabled";
> >>>
> >>>		ranges =<0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >>>			  0x80004000 0x80004000 0x00100000   /* configuration space */
> >>>			  0x80104000 0x80100000 0x00100000   /* extended configuration space */
> >>>			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >>>			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >>>			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >>>
> >>>		#address-cells =<1>;
> >>>		#size-cells =<1>;
> >>>
> >>>		pci@80000000 {
> >>>			reg =<0x80000000 0x00001000>;
> >>>			status = "disabled";
> >>>
> >>>			#address-cells =<3>;
> >>>			#size-cells =<2>;
> >>>
> >>>			ranges =<0x80400000 0x80400000 0x00008000   /* I/O */
> >>>				  0x90000000 0x90000000 0x08000000   /* non-prefetchable memory */
> >>>				  0xa0000000 0xa0000000 0x08000000>; /* prefetchable memory */
> >>
> >>You are on the right track here, but the format of the child-address
> >>portion of the above ranges property is incorrect.  Since the child
> >>address space is the PCI address space, the child-address portion
> >>needs to be 3 cells.  It's not a linear address but rather a triple.
> >>The first cell identifies the address type (config, I/O, memory..)
> >>and the second and third cells are offsets within that subspace.
> >>The second and third cells will typically be 0.  The PCI binding has
> >>details.
> 
> Also, the size field in ranges is specified according to the child
> address space, so there must be 2 size cells in the ranges at this
> level.  Each ranges entry at this level is:
> 
> <child_address_space, child_address_high, child_address_low,
> parent_address, child_size_high, child_size_low>
> 
> The above should be:
> 
> 			ranges =<0x81000000 0 0  0x80400000  0 0x00008000   /* I/O */
> 				 0x82000000 0 0  0x90000000  0 0x08000000   /* non-prefetchable memory */
> 				 0xc2000000 0 0  0xa0000000  0 0x08000000>; /* prefetchable memory */
> 
> 

Okay, so the new pcie-controller node looks like this:

	pcie-controller {
		compatible = "nvidia,tegra20-pcie", "simple-bus";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200   /* AFI registers */
		       0x80004000 0x00100000   /* configuration space */
		       0x80104000 0x00100000>; /* extended configuration space */
		interrupts = <0 98 0x04   /* controller interrupt */
		              0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

		#address-cells = <1>;
		#size-cells = <1>;

		pci@80000000 {
			compatible = "nvidia,tegra20-pcie-port";
			reg = <0x80000000 0x00001000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x110>;
			nvidia,num-lanes = <2>;
		};

		pci@80001000 {
			compatible = "nvidia,tegra20-pcie-port";
			reg = <0x80001000 0x00001000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x118>;
			nvidia,num-lanes = <2>;
		};
	};

While looking into some more code, trying to figure out how to hook this all
up with the device tree I ran into a problem. I need to actually create a
'struct device' for each of the ports, so I added the "simple-bus" to the
pcie-controller's "compatible" property. Furthermore, each PCI root port now
becomes a platform_device, which are supported by a new tegra-pcie-port
driver. I'm not sure if "port" is very common in PCI speek, so something like
tegra-pcie-bridge (compatible = "nvidia,tegra20-pcie-bridge") may be more
appropriate?

Using real platform devices is not only more straightforward (each bridge is
after all a separate parent for the PCI endpoints) but it also has the
advantage that the bridges are properly hooked up with the device tree. If I
read the code correctly, passing the 'struct device' to pci_scan_root_bus()
will allow the PCI core code to set up the proper pointers that allow busses
and endpoints below each bridge to be matched to the corresponding DT nodes.

However this causes other problems with the initialization of the PCIe
controller. The translations cannot be setup and the controller shouldn't be
enabled until after all the ports have been probed because it isn't clear
which of them are really active and which are not. For the DT case this can
probably be done by parsing the device tree and collecting the information,
but for the non-DT case this will probably be more difficult.

To solve this for both cases I could probably use device_for_each_child() and
check that all children have been properly probed before setting up the
translations and enabling the controller. That leaves the question of how to
verify that a device has been correctly probed. The easiest would probably be
to check if dev_get_drvdata() != NULL, but I'm still trying to decide whether
that's too ugly or not. I don't know of any other way to determine that
probing was successful. In addition it may still be useful to continue if one
of the devices failed to initialize (the link on only one of two enabled
ports is up).

Thierry
Arnd Bergmann - June 14, 2012, 10:25 a.m.
On Thursday 14 June 2012, Thierry Reding wrote:
> > 
> > I believe you will need an "interrupt-map" property here, to map the host
> > interrupts to the INTA-INTD lines of the attached devices.
>
> Legacy interrupts are something I cannot test at all because I have no
> hardware that supports them.

Hmm, I thought all PCIe hardware has to support them when you do not
enable MSI. What hardware do you have then?

> > I'm not sure whether we want to have a device_type="pciex" property here.
> > powerpc and sparc seem to use that information, to distinguish a pcie
> > bus from pci or cardbus.
> 
> That'd be rather useless information given that the Tegra is unlikely to
> support either PCI or CardBus at some point.

But the generic code does not know that.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 14, 2012, 10:31 a.m.
On Thu, Jun 14, 2012 at 10:25:09AM +0000, Arnd Bergmann wrote:
> On Thursday 14 June 2012, Thierry Reding wrote:
> > > 
> > > I believe you will need an "interrupt-map" property here, to map the host
> > > interrupts to the INTA-INTD lines of the attached devices.
> >
> > Legacy interrupts are something I cannot test at all because I have no
> > hardware that supports them.
> 
> Hmm, I thought all PCIe hardware has to support them when you do not
> enable MSI. What hardware do you have then?

The TEC (Tamonten Evaluation Carrier) has an FPGA which is connected to one
of the Tegra PCIe ports and it only supports MSIs.

> > > I'm not sure whether we want to have a device_type="pciex" property here.
> > > powerpc and sparc seem to use that information, to distinguish a pcie
> > > bus from pci or cardbus.
> > 
> > That'd be rather useless information given that the Tegra is unlikely to
> > support either PCI or CardBus at some point.
> 
> But the generic code does not know that.

True. Still there's not much generic code on ARM, so maybe it'd be better to
add it along with the corresponding code?

Thierry
Arnd Bergmann - June 14, 2012, 11:06 a.m.
On Thursday 14 June 2012, Thierry Reding wrote:
> On Thu, Jun 14, 2012 at 10:25:09AM +0000, Arnd Bergmann wrote:
> > On Thursday 14 June 2012, Thierry Reding wrote:
> > > > 
> > > > I believe you will need an "interrupt-map" property here, to map the host
> > > > interrupts to the INTA-INTD lines of the attached devices.
> > >
> > > Legacy interrupts are something I cannot test at all because I have no
> > > hardware that supports them.
> > 
> > Hmm, I thought all PCIe hardware has to support them when you do not
> > enable MSI. What hardware do you have then?
> 
> The TEC (Tamonten Evaluation Carrier) has an FPGA which is connected to one
> of the Tegra PCIe ports and it only supports MSIs.

Ah, I see. FPGA based devices often have incomplete PCI support so that
explains why you can't test it. The board also appears to have a
PCIe mini port though, so anything that you could plug in there should
also support LSI interrupts.

> > > > I'm not sure whether we want to have a device_type="pciex" property here.
> > > > powerpc and sparc seem to use that information, to distinguish a pcie
> > > > bus from pci or cardbus.
> > > 
> > > That'd be rather useless information given that the Tegra is unlikely to
> > > support either PCI or CardBus at some point.
> > 
> > But the generic code does not know that.
> 
> True. Still there's not much generic code on ARM, so maybe it'd be better to
> add it along with the corresponding code?

I hope we can make the PCI host probing architecture independent one day,
instead of each architecture implementing their own. I'd say better put
that property in there so we don't have to change it in case the future
generic code will need it. It is part of the binding at
http://www.openfirmware.org/1275/bindings/pci/pci-express.txt after all.

Note that we should generally not make /code/ be written with the
anticipation of a future extension, but anything that concerns interfaces
to code outside of the kernel (firmware or user space) is best written
in a conservative way to allow later changes.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 14, 2012, 11:58 a.m.
On Thu, Jun 14, 2012 at 11:06:48AM +0000, Arnd Bergmann wrote:
> On Thursday 14 June 2012, Thierry Reding wrote:
> > On Thu, Jun 14, 2012 at 10:25:09AM +0000, Arnd Bergmann wrote:
> > > On Thursday 14 June 2012, Thierry Reding wrote:
> > > > > 
> > > > > I believe you will need an "interrupt-map" property here, to map the host
> > > > > interrupts to the INTA-INTD lines of the attached devices.
> > > >
> > > > Legacy interrupts are something I cannot test at all because I have no
> > > > hardware that supports them.
> > > 
> > > Hmm, I thought all PCIe hardware has to support them when you do not
> > > enable MSI. What hardware do you have then?
> > 
> > The TEC (Tamonten Evaluation Carrier) has an FPGA which is connected to one
> > of the Tegra PCIe ports and it only supports MSIs.
> 
> Ah, I see. FPGA based devices often have incomplete PCI support so that
> explains why you can't test it. The board also appears to have a
> PCIe mini port though, so anything that you could plug in there should
> also support LSI interrupts.

Yes, I haven't gotten my hands on a suitable module yet, but I'll try. I'm
not very familiar with legacy interrupts, though, and I'll have to read up
on the interrupt map bindings.

> > > > > I'm not sure whether we want to have a device_type="pciex" property here.
> > > > > powerpc and sparc seem to use that information, to distinguish a pcie
> > > > > bus from pci or cardbus.
> > > > 
> > > > That'd be rather useless information given that the Tegra is unlikely to
> > > > support either PCI or CardBus at some point.
> > > 
> > > But the generic code does not know that.
> > 
> > True. Still there's not much generic code on ARM, so maybe it'd be better to
> > add it along with the corresponding code?
> 
> I hope we can make the PCI host probing architecture independent one day,
> instead of each architecture implementing their own. I'd say better put
> that property in there so we don't have to change it in case the future
> generic code will need it. It is part of the binding at
> http://www.openfirmware.org/1275/bindings/pci/pci-express.txt after all.

There seems to be quite a lot in the PCI core already. Unfortunately every
architecture seems to do things differently, so unification is probably going
to be quite difficult.

> Note that we should generally not make /code/ be written with the
> anticipation of a future extension, but anything that concerns interfaces
> to code outside of the kernel (firmware or user space) is best written
> in a conservative way to allow later changes.

Okay, I'll add the property.

Thierry
Stephen Warren - June 14, 2012, 6:30 p.m.
On 06/14/2012 03:19 AM, Thierry Reding wrote:
...
> Okay, so the new pcie-controller node looks like this:
> 
> pcie-controller { compatible = "nvidia,tegra20-pcie",
> "simple-bus"; reg = <0x80003000 0x00000800   /* PADS registers */ 
> 0x80003800 0x00000200   /* AFI registers */ 0x80004000 0x00100000
> /* configuration space */ 0x80104000 0x00100000>; /* extended
> configuration space */ interrupts = <0 98 0x04   /* controller
> interrupt */ 0 99 0x04>; /* MSI interrupt */ status = "disabled";
> 
> ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */ 
> 0x80400000 0x80400000 0x00010000   /* downstream I/O */ 0x90000000
> 0x90000000 0x10000000   /* non-prefetchable memory */ 0xa0000000
> 0xa0000000 0x10000000>; /* prefetchable memory */

Do we actually need to specifically enumerate all the ranges; would
just "ranges;" be simpler?

> #address-cells = <1>; #size-cells = <1>;
> 
> pci@80000000 {

I'm still not convinced that using the address of the port's registers
is the correct way to represent each port. The port index seems much
more useful.

The main reason here is that there are a lot of registers that contain
fields for each port - far more than the combination of this node's
reg and ctrl-offset (which I assume is an address offset for just one
example of this issue) properties can describe. The bit position and
bit stride of these fields isn't necessarily the same in each
register. Do we want a property like ctrl-offset for every single type
of field in every single shared register that describes the location
of the relevant data, or just a single "port ID" bit that can be
applied to anything?

(Perhaps this isn't so obvious looking at the TRM since it doesn't
document all registers, and I'm also looking at the Tegra30
documentation too, which might be more exposed to this - I haven't
correlated all the documentation sources to be sure though)

> compatible = "nvidia,tegra20-pcie-port"; reg = <0x80000000
> 0x00001000>; status = "disabled";
> 
> #address-cells = <3>; #size-cells = <2>;
> 
> ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */ 
> 0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory
> */ 0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory
> */

The values here appear identical for both ports. Surely they should
describe just the parts of the overall address space that have been
assigned/delegated to the individual port/bridge?

Note that initial indications are that the overall controller receives
transactions for those address ranges, and standard PCIe bridge
registers are used to steer chunks of these address ranges into the
individual downstream ports/busses. Hence, standard PCIe documentation
should indicate how to do this.

> nvidia,ctrl-offset = <0x110>; nvidia,num-lanes = <2>; };
> 
> pci@80001000 { compatible = "nvidia,tegra20-pcie-port"; reg =
> <0x80001000 0x00001000>; status = "disabled";
> 
> #address-cells = <3>; #size-cells = <2>;
> 
> ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */ 
> 0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory
> */ 0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory
> */
> 
> nvidia,ctrl-offset = <0x118>; nvidia,num-lanes = <2>; }; };
> 
> While looking into some more code, trying to figure out how to hook
> this all up with the device tree I ran into a problem. I need to
> actually create a 'struct device' for each of the ports, so I added
> the "simple-bus" to the pcie-controller's "compatible" property.
> Furthermore, each PCI root port now becomes a platform_device,
> which are supported by a new tegra-pcie-port driver. I'm not sure
> if "port" is very common in PCI speek, so something like 
> tegra-pcie-bridge (compatible = "nvidia,tegra20-pcie-bridge") may
> be more appropriate?

What is it that drives the need for each port to be a 'struct device'?
The current driver supports 2 host ports, yet there's only a single
struct device for it. Does the DT code assume a 1:1 mapping between
struct device and DT node that represents the child bus? If so,
perhaps it'd be better to rework that code to accept a DT node as a
parameter and call it multiple times, rather than accept a struct
device as a parameter and hence need multiple devices?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 14, 2012, 7:29 p.m.
On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> On 06/14/2012 03:19 AM, Thierry Reding wrote:
> ...
> > Okay, so the new pcie-controller node looks like this:
> > 
> > pcie-controller { compatible = "nvidia,tegra20-pcie",
> > "simple-bus"; reg = <0x80003000 0x00000800   /* PADS registers */ 
> > 0x80003800 0x00000200   /* AFI registers */ 0x80004000 0x00100000
> > /* configuration space */ 0x80104000 0x00100000>; /* extended
> > configuration space */ interrupts = <0 98 0x04   /* controller
> > interrupt */ 0 99 0x04>; /* MSI interrupt */ status = "disabled";
> > 
> > ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */ 
> > 0x80400000 0x80400000 0x00010000   /* downstream I/O */ 0x90000000
> > 0x90000000 0x10000000   /* non-prefetchable memory */ 0xa0000000
> > 0xa0000000 0x10000000>; /* prefetchable memory */
> 
> Do we actually need to specifically enumerate all the ranges; would
> just "ranges;" be simpler?

I think they need to be listed here in order for the child nodes to be able
to map them using their ranges property. It's possible that it'll work by
specifying an empty ranges, but I don't think it'd be correct in the OF
sense.

> > #address-cells = <1>; #size-cells = <1>;
> > 
> > pci@80000000 {
> 
> I'm still not convinced that using the address of the port's registers
> is the correct way to represent each port. The port index seems much
> more useful.
> 
> The main reason here is that there are a lot of registers that contain
> fields for each port - far more than the combination of this node's
> reg and ctrl-offset (which I assume is an address offset for just one
> example of this issue) properties can describe. The bit position and
> bit stride of these fields isn't necessarily the same in each
> register. Do we want a property like ctrl-offset for every single type
> of field in every single shared register that describes the location
> of the relevant data, or just a single "port ID" bit that can be
> applied to anything?
> 
> (Perhaps this isn't so obvious looking at the TRM since it doesn't
> document all registers, and I'm also looking at the Tegra30
> documentation too, which might be more exposed to this - I haven't
> correlated all the documentation sources to be sure though)

I agree that maybe adding properties for each bit position or register offset
may not work out too well. But I think it still makes sense to use the base
address of the port's registers (see below). We could of course add some code
to determine the index from the base address at initialization time and reuse
the index where appropriate.

> > compatible = "nvidia,tegra20-pcie-port"; reg = <0x80000000
> > 0x00001000>; status = "disabled";
> > 
> > #address-cells = <3>; #size-cells = <2>;
> > 
> > ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */ 
> > 0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory
> > */ 0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory
> > */
> 
> The values here appear identical for both ports. Surely they should
> describe just the parts of the overall address space that have been
> assigned/delegated to the individual port/bridge?

They're not identical. Port 0 gets the first half and port 1 gets the second
half of the ranges specified in the parent.

> Note that initial indications are that the overall controller receives
> transactions for those address ranges, and standard PCIe bridge
> registers are used to steer chunks of these address ranges into the
> individual downstream ports/busses. Hence, standard PCIe documentation
> should indicate how to do this.

Yes, I believe that works with the code I have currently, which basically
parses the ranges property of each port and initializes the I/O, memory and
prefetchable regions from those values.

> > nvidia,ctrl-offset = <0x110>; nvidia,num-lanes = <2>; };
> > 
> > pci@80001000 { compatible = "nvidia,tegra20-pcie-port"; reg =
> > <0x80001000 0x00001000>; status = "disabled";
> > 
> > #address-cells = <3>; #size-cells = <2>;
> > 
> > ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */ 
> > 0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory
> > */ 0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory
> > */
> > 
> > nvidia,ctrl-offset = <0x118>; nvidia,num-lanes = <2>; }; };
> > 
> > While looking into some more code, trying to figure out how to hook
> > this all up with the device tree I ran into a problem. I need to
> > actually create a 'struct device' for each of the ports, so I added
> > the "simple-bus" to the pcie-controller's "compatible" property.
> > Furthermore, each PCI root port now becomes a platform_device,
> > which are supported by a new tegra-pcie-port driver. I'm not sure
> > if "port" is very common in PCI speek, so something like 
> > tegra-pcie-bridge (compatible = "nvidia,tegra20-pcie-bridge") may
> > be more appropriate?
> 
> What is it that drives the need for each port to be a 'struct device'?
> The current driver supports 2 host ports, yet there's only a single
> struct device for it. Does the DT code assume a 1:1 mapping between
> struct device and DT node that represents the child bus? If so,
> perhaps it'd be better to rework that code to accept a DT node as a
> parameter and call it multiple times, rather than accept a struct
> device as a parameter and hence need multiple devices?

It's not so much the DT code, but rather the PCI core and ultimately the
device model that requires it. Each port is basically a PCI host bridge that
provides a root PCI bus and the device model is used to represent the
hierachy of the busses. Providing just the DT node isn't going to be enough.

Thierry
Stephen Warren - June 14, 2012, 7:50 p.m.
On 06/14/2012 01:29 PM, Thierry Reding wrote:
> On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
>> On 06/14/2012 03:19 AM, Thierry Reding wrote:
...
>>> #address-cells = <1>; #size-cells = <1>;
>>> 
>>> pci@80000000 {
>> 
>> I'm still not convinced that using the address of the port's
>> registers is the correct way to represent each port. The port
>> index seems much more useful.
>> 
>> The main reason here is that there are a lot of registers that
>> contain fields for each port - far more than the combination of
>> this node's reg and ctrl-offset (which I assume is an address
>> offset for just one example of this issue) properties can
>> describe. The bit position and bit stride of these fields isn't
>> necessarily the same in each register. Do we want a property like
>> ctrl-offset for every single type of field in every single shared
>> register that describes the location of the relevant data, or
>> just a single "port ID" bit that can be applied to anything?
>> 
>> (Perhaps this isn't so obvious looking at the TRM since it
>> doesn't document all registers, and I'm also looking at the
>> Tegra30 documentation too, which might be more exposed to this -
>> I haven't correlated all the documentation sources to be sure
>> though)
> 
> I agree that maybe adding properties for each bit position or
> register offset may not work out too well. But I think it still
> makes sense to use the base address of the port's registers (see
> below). We could of course add some code to determine the index
> from the base address at initialization time and reuse the index
> where appropriate.

To me, working back from address to ID then using the ID to calculate
some other addresses seems far more icky than just calculating all the
addresses based off of one ID. But, I suppose this doesn't make a huge
practical difference.

>>> compatible = "nvidia,tegra20-pcie-port"; reg = <0x80000000 
>>> 0x00001000>; status = "disabled";
>>> 
>>> #address-cells = <3>; #size-cells = <2>;
>>> 
>>> ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */ 
>>> 0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable
>>> memory */ 0xc2000000 0 0 0xa0000000 0 0x08000000>; /*
>>> prefetchable memory */
>> 
>> The values here appear identical for both ports. Surely they
>> should describe just the parts of the overall address space that
>> have been assigned/delegated to the individual port/bridge?
> 
> They're not identical. Port 0 gets the first half and port 1 gets
> the second half of the ranges specified in the parent.

Oh right, I missed some 8s and 0s that looked the same!

>>> While looking into some more code, trying to figure out how to
>>> hook this all up with the device tree I ran into a problem. I
>>> need to actually create a 'struct device' for each of the
>>> ports, so I added the "simple-bus" to the pcie-controller's
>>> "compatible" property. Furthermore, each PCI root port now
>>> becomes a platform_device, which are supported by a new
>>> tegra-pcie-port driver. I'm not sure if "port" is very common
>>> in PCI speek, so something like tegra-pcie-bridge (compatible =
>>> "nvidia,tegra20-pcie-bridge") may be more appropriate?
>> 
>> What is it that drives the need for each port to be a 'struct
>> device'? The current driver supports 2 host ports, yet there's
>> only a single struct device for it. Does the DT code assume a 1:1
>> mapping between struct device and DT node that represents the
>> child bus? If so, perhaps it'd be better to rework that code to
>> accept a DT node as a parameter and call it multiple times,
>> rather than accept a struct device as a parameter and hence need
>> multiple devices?
> 
> It's not so much the DT code, but rather the PCI core and
> ultimately the device model that requires it. Each port is
> basically a PCI host bridge that provides a root PCI bus and the
> device model is used to represent the hierachy of the busses.
> Providing just the DT node isn't going to be enough.

But the existing driver works without /any/ devices, let alone one per
port.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 15, 2012, 6:12 a.m.
On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
> On 06/14/2012 01:29 PM, Thierry Reding wrote:
> > On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> >> On 06/14/2012 03:19 AM, Thierry Reding wrote:
> ...
> >>> #address-cells = <1>; #size-cells = <1>;
> >>> 
> >>> pci@80000000 {
> >> 
> >> I'm still not convinced that using the address of the port's
> >> registers is the correct way to represent each port. The port
> >> index seems much more useful.
> >> 
> >> The main reason here is that there are a lot of registers that
> >> contain fields for each port - far more than the combination of
> >> this node's reg and ctrl-offset (which I assume is an address
> >> offset for just one example of this issue) properties can
> >> describe. The bit position and bit stride of these fields isn't
> >> necessarily the same in each register. Do we want a property like
> >> ctrl-offset for every single type of field in every single shared
> >> register that describes the location of the relevant data, or
> >> just a single "port ID" bit that can be applied to anything?
> >> 
> >> (Perhaps this isn't so obvious looking at the TRM since it
> >> doesn't document all registers, and I'm also looking at the
> >> Tegra30 documentation too, which might be more exposed to this -
> >> I haven't correlated all the documentation sources to be sure
> >> though)
> > 
> > I agree that maybe adding properties for each bit position or
> > register offset may not work out too well. But I think it still
> > makes sense to use the base address of the port's registers (see
> > below). We could of course add some code to determine the index
> > from the base address at initialization time and reuse the index
> > where appropriate.
> 
> To me, working back from address to ID then using the ID to calculate
> some other addresses seems far more icky than just calculating all the
> addresses based off of one ID. But, I suppose this doesn't make a huge
> practical difference.

This really depends on the device vs. no device decision below. If we can
make it work without needing an extra device for it, then using the index
is certainly better. However, if we instantiate devices from the DT, then
we have the address anyway and adding the index as a property would be
redundant and error prone (what happens if somebody sets the index of the
port at address 0x80000000 to 2?).

> >>> compatible = "nvidia,tegra20-pcie-port"; reg = <0x80000000 
> >>> 0x00001000>; status = "disabled";
> >>> 
> >>> #address-cells = <3>; #size-cells = <2>;
> >>> 
> >>> ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */ 
> >>> 0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable
> >>> memory */ 0xc2000000 0 0 0xa0000000 0 0x08000000>; /*
> >>> prefetchable memory */
> >> 
> >> The values here appear identical for both ports. Surely they
> >> should describe just the parts of the overall address space that
> >> have been assigned/delegated to the individual port/bridge?
> > 
> > They're not identical. Port 0 gets the first half and port 1 gets
> > the second half of the ranges specified in the parent.
> 
> Oh right, I missed some 8s and 0s that looked the same!
> 
> >>> While looking into some more code, trying to figure out how to
> >>> hook this all up with the device tree I ran into a problem. I
> >>> need to actually create a 'struct device' for each of the
> >>> ports, so I added the "simple-bus" to the pcie-controller's
> >>> "compatible" property. Furthermore, each PCI root port now
> >>> becomes a platform_device, which are supported by a new
> >>> tegra-pcie-port driver. I'm not sure if "port" is very common
> >>> in PCI speek, so something like tegra-pcie-bridge (compatible =
> >>> "nvidia,tegra20-pcie-bridge") may be more appropriate?
> >> 
> >> What is it that drives the need for each port to be a 'struct
> >> device'? The current driver supports 2 host ports, yet there's
> >> only a single struct device for it. Does the DT code assume a 1:1
> >> mapping between struct device and DT node that represents the
> >> child bus? If so, perhaps it'd be better to rework that code to
> >> accept a DT node as a parameter and call it multiple times,
> >> rather than accept a struct device as a parameter and hence need
> >> multiple devices?
> > 
> > It's not so much the DT code, but rather the PCI core and
> > ultimately the device model that requires it. Each port is
> > basically a PCI host bridge that provides a root PCI bus and the
> > device model is used to represent the hierachy of the busses.
> > Providing just the DT node isn't going to be enough.
> 
> But the existing driver works without /any/ devices, let alone one per
> port.

That doesn't necessarily mean it is correct. The representation of the
device tree (as in the Linux kernel device tree) isn't quite accurate
because you're missing the link of the host bridge to the parent PCIe
controller. It also means that the representation of the kernel device
tree doesn't match the DT representation.

Additionally, if you look at how PCI busses and devices are matched to
their respective DT nodes, the code in drivers/pci/of.c provides a
default implementation of pcibios_get_phb_of_node(), which matches the
struct pci_bus up with the device_node of the parent device.

If we keep the current implementation that passes NULL as parent to the
pci_scan_root_bus() function, then we'll have to provide a custom
implementation of pcibios_get_phb_of_node() which has to go through
similar hoops as the x86 version (see arch/x86/kernel/devicetree.c).
That of course will not contribute to improving the current state of
fragmentation in the PCI subsystem.

Thierry
Thierry Reding - June 19, 2012, 1:30 p.m.
On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
> On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
> > On 06/14/2012 01:29 PM, Thierry Reding wrote:
> > > On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> > >> On 06/14/2012 03:19 AM, Thierry Reding wrote:
> > ...
> > >>> #address-cells = <1>; #size-cells = <1>;
> > >>> 
> > >>> pci@80000000 {
> > >> 
> > >> I'm still not convinced that using the address of the port's
> > >> registers is the correct way to represent each port. The port
> > >> index seems much more useful.
> > >> 
> > >> The main reason here is that there are a lot of registers that
> > >> contain fields for each port - far more than the combination of
> > >> this node's reg and ctrl-offset (which I assume is an address
> > >> offset for just one example of this issue) properties can
> > >> describe. The bit position and bit stride of these fields isn't
> > >> necessarily the same in each register. Do we want a property like
> > >> ctrl-offset for every single type of field in every single shared
> > >> register that describes the location of the relevant data, or
> > >> just a single "port ID" bit that can be applied to anything?
> > >> 
> > >> (Perhaps this isn't so obvious looking at the TRM since it
> > >> doesn't document all registers, and I'm also looking at the
> > >> Tegra30 documentation too, which might be more exposed to this -
> > >> I haven't correlated all the documentation sources to be sure
> > >> though)
> > > 
> > > I agree that maybe adding properties for each bit position or
> > > register offset may not work out too well. But I think it still
> > > makes sense to use the base address of the port's registers (see
> > > below). We could of course add some code to determine the index
> > > from the base address at initialization time and reuse the index
> > > where appropriate.
> > 
> > To me, working back from address to ID then using the ID to calculate
> > some other addresses seems far more icky than just calculating all the
> > addresses based off of one ID. But, I suppose this doesn't make a huge
> > practical difference.
> 
> This really depends on the device vs. no device decision below. If we can
> make it work without needing an extra device for it, then using the index
> is certainly better. However, if we instantiate devices from the DT, then
> we have the address anyway and adding the index as a property would be
> redundant and error prone (what happens if somebody sets the index of the
> port at address 0x80000000 to 2?).

An additional problem with this is that we'd have to add the following
to the pcie-controller node:

	#address-cells = <1>;
	#size-cells = <0>;

This will conflict with the "ranges" property, because suddenly we can
no longer map the regions properly. Maybe Mitch can comment on whether
this is possible or not?

To make it clearer what I'm talking about, here's the DT snippet again
(with the compatible property removed from the pci@ nodes because they
are no longer probed by a driver, the "simple-bus" removed from the
pcie-controller node's compatible property removed and its #address-
and #size-cells properties adjusted as described above).

	pcie-controller {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200   /* AFI registers */
		       0x80004000 0x00100000   /* configuration space */
		       0x80104000 0x00100000>; /* extended configuration space */
		interrupts = <0 98 0x04   /* controller interrupt */
			      0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */

		#address-cells = <1>;
		#size-cells = <0>;

		pci@0 {
			reg = <2>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x110>;
			nvidia,num-lanes = <2>;
		};

		pci@1 {
			reg = <1>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x118>;
			nvidia,num-lanes = <2>;
		};
	};

AIUI none of the ranges properties are valid anymore, because the bus
represented by pcie-controller no longer reflects the truth, namely that
it translates the CPU address space to the PCI address space.

> > >>> compatible = "nvidia,tegra20-pcie-port"; reg = <0x80000000 
> > >>> 0x00001000>; status = "disabled";
> > >>> 
> > >>> #address-cells = <3>; #size-cells = <2>;
> > >>> 
> > >>> ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */ 
> > >>> 0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable
> > >>> memory */ 0xc2000000 0 0 0xa0000000 0 0x08000000>; /*
> > >>> prefetchable memory */
> > >> 
> > >> The values here appear identical for both ports. Surely they
> > >> should describe just the parts of the overall address space that
> > >> have been assigned/delegated to the individual port/bridge?
> > > 
> > > They're not identical. Port 0 gets the first half and port 1 gets
> > > the second half of the ranges specified in the parent.
> > 
> > Oh right, I missed some 8s and 0s that looked the same!
> > 
> > >>> While looking into some more code, trying to figure out how to
> > >>> hook this all up with the device tree I ran into a problem. I
> > >>> need to actually create a 'struct device' for each of the
> > >>> ports, so I added the "simple-bus" to the pcie-controller's
> > >>> "compatible" property. Furthermore, each PCI root port now
> > >>> becomes a platform_device, which are supported by a new
> > >>> tegra-pcie-port driver. I'm not sure if "port" is very common
> > >>> in PCI speek, so something like tegra-pcie-bridge (compatible =
> > >>> "nvidia,tegra20-pcie-bridge") may be more appropriate?
> > >> 
> > >> What is it that drives the need for each port to be a 'struct
> > >> device'? The current driver supports 2 host ports, yet there's
> > >> only a single struct device for it. Does the DT code assume a 1:1
> > >> mapping between struct device and DT node that represents the
> > >> child bus? If so, perhaps it'd be better to rework that code to
> > >> accept a DT node as a parameter and call it multiple times,
> > >> rather than accept a struct device as a parameter and hence need
> > >> multiple devices?
> > > 
> > > It's not so much the DT code, but rather the PCI core and
> > > ultimately the device model that requires it. Each port is
> > > basically a PCI host bridge that provides a root PCI bus and the
> > > device model is used to represent the hierachy of the busses.
> > > Providing just the DT node isn't going to be enough.
> > 
> > But the existing driver works without /any/ devices, let alone one per
> > port.
> 
> That doesn't necessarily mean it is correct. The representation of the
> device tree (as in the Linux kernel device tree) isn't quite accurate
> because you're missing the link of the host bridge to the parent PCIe
> controller. It also means that the representation of the kernel device
> tree doesn't match the DT representation.
> 
> Additionally, if you look at how PCI busses and devices are matched to
> their respective DT nodes, the code in drivers/pci/of.c provides a
> default implementation of pcibios_get_phb_of_node(), which matches the
> struct pci_bus up with the device_node of the parent device.
> 
> If we keep the current implementation that passes NULL as parent to the
> pci_scan_root_bus() function, then we'll have to provide a custom
> implementation of pcibios_get_phb_of_node() which has to go through
> similar hoops as the x86 version (see arch/x86/kernel/devicetree.c).
> That of course will not contribute to improving the current state of
> fragmentation in the PCI subsystem.

I suppose this could work if we passed the 'struct device' of the pcie-
controller node as the parent for all ports instead of requiring a
separate device for each port. That way the pcie-controller would get to
be a parent for two/three bridges on Tegra20/30.

I'll have a go at the implementation, but I'm busy with other stuff and
it will probably be some time before I can get back on it.

Thierry
Stephen Warren - June 19, 2012, 4:40 p.m.
On 06/19/2012 07:30 AM, Thierry Reding wrote:
> On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
>> On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
...
>>> To me, working back from address to ID then using the ID to calculate
>>> some other addresses seems far more icky than just calculating all the
>>> addresses based off of one ID. But, I suppose this doesn't make a huge
>>> practical difference.
>>
>> This really depends on the device vs. no device decision below. If we can
>> make it work without needing an extra device for it, then using the index
>> is certainly better. However, if we instantiate devices from the DT, then
>> we have the address anyway and adding the index as a property would be
>> redundant and error prone (what happens if somebody sets the index of the
>> port at address 0x80000000 to 2?).
> 
> An additional problem with this is that we'd have to add the following
> to the pcie-controller node:
> 
> 	#address-cells = <1>;
> 	#size-cells = <0>;
> 
> This will conflict with the "ranges" property, because suddenly we can
> no longer map the regions properly. Maybe Mitch can comment on whether
> this is possible or not?
> 
> To make it clearer what I'm talking about, here's the DT snippet again
> (with the compatible property removed from the pci@ nodes because they
> are no longer probed by a driver, the "simple-bus" removed from the
> pcie-controller node's compatible property removed and its #address-
> and #size-cells properties adjusted as described above).
> 
> 	pcie-controller {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg = <0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200   /* AFI registers */
> 		       0x80004000 0x00100000   /* configuration space */
> 		       0x80104000 0x00100000>; /* extended configuration space */
> 		interrupts = <0 98 0x04   /* controller interrupt */
> 			      0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
> 
> 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> 
> 		#address-cells = <1>;
> 		#size-cells = <0>;
> 
> 		pci@0 {
> 			reg = <2>;
> 			status = "disabled";
> 
> 			#address-cells = <3>;
> 			#size-cells = <2>;
> 
> 			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
> 				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
> 
> 			nvidia,ctrl-offset = <0x110>;
> 			nvidia,num-lanes = <2>;
> 		};
> 
> 		pci@1 {
> 			reg = <1>;
> 			status = "disabled";
> 
> 			#address-cells = <3>;
> 			#size-cells = <2>;
> 
> 			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
> 				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
> 
> 			nvidia,ctrl-offset = <0x118>;
> 			nvidia,num-lanes = <2>;
> 		};
> 	};
> 
> AIUI none of the ranges properties are valid anymore, because the bus
> represented by pcie-controller no longer reflects the truth, namely that
> it translates the CPU address space to the PCI address space.

Yes, I imagine that's a show-stopper for this approach.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 19, 2012, 9:31 p.m.
On 6/19/2012 3:30 AM, Thierry Reding wrote:
> On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
>> On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
>>> On 06/14/2012 01:29 PM, Thierry Reding wrote:
>>>> On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
>>>>> On 06/14/2012 03:19 AM, Thierry Reding wrote:
>>> ...
>>>>>> #address-cells = <1>; #size-cells = <1>;
>>>>>>
>>>>>> pci@80000000 {
>>>>>
>>>>> I'm still not convinced that using the address of the port's
>>>>> registers is the correct way to represent each port. The port
>>>>> index seems much more useful.
>>>>>
>>>>> The main reason here is that there are a lot of registers that
>>>>> contain fields for each port - far more than the combination of
>>>>> this node's reg and ctrl-offset (which I assume is an address
>>>>> offset for just one example of this issue) properties can
>>>>> describe. The bit position and bit stride of these fields isn't
>>>>> necessarily the same in each register. Do we want a property like
>>>>> ctrl-offset for every single type of field in every single shared
>>>>> register that describes the location of the relevant data, or
>>>>> just a single "port ID" bit that can be applied to anything?
>>>>>
>>>>> (Perhaps this isn't so obvious looking at the TRM since it
>>>>> doesn't document all registers, and I'm also looking at the
>>>>> Tegra30 documentation too, which might be more exposed to this -
>>>>> I haven't correlated all the documentation sources to be sure
>>>>> though)
>>>>
>>>> I agree that maybe adding properties for each bit position or
>>>> register offset may not work out too well. But I think it still
>>>> makes sense to use the base address of the port's registers (see
>>>> below). We could of course add some code to determine the index
>>>> from the base address at initialization time and reuse the index
>>>> where appropriate.
>>>
>>> To me, working back from address to ID then using the ID to calculate
>>> some other addresses seems far more icky than just calculating all the
>>> addresses based off of one ID. But, I suppose this doesn't make a huge
>>> practical difference.
>>
>> This really depends on the device vs. no device decision below. If we can
>> make it work without needing an extra device for it, then using the index
>> is certainly better. However, if we instantiate devices from the DT, then
>> we have the address anyway and adding the index as a property would be
>> redundant and error prone (what happens if somebody sets the index of the
>> port at address 0x80000000 to 2?).
>
> An additional problem with this is that we'd have to add the following
> to the pcie-controller node:
>
> 	#address-cells = <1>;
> 	#size-cells = <0>;
>
> This will conflict with the "ranges" property, because suddenly we can
> no longer map the regions properly. Maybe Mitch can comment on whether
> this is possible or not?
>
> To make it clearer what I'm talking about, here's the DT snippet again
> (with the compatible property removed from the pci@ nodes because they
> are no longer probed by a driver, the "simple-bus" removed from the
> pcie-controller node's compatible property removed and its #address-
> and #size-cells properties adjusted as described above).
>
> 	pcie-controller {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg = <0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200   /* AFI registers */
> 		       0x80004000 0x00100000   /* configuration space */
> 		       0x80104000 0x00100000>; /* extended configuration space */
> 		interrupts = <0 98 0x04   /* controller interrupt */
> 			      0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
>
> 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>
> 		#address-cells = <1>;
> 		#size-cells = <0>;
>
> 		pci@0 {
> 			reg = <2>;
> 			status = "disabled";
>
> 			#address-cells = <3>;
> 			#size-cells = <2>;
>
> 			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
> 				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
>
> 			nvidia,ctrl-offset = <0x110>;
> 			nvidia,num-lanes = <2>;
> 		};
>
> 		pci@1 {
> 			reg = <1>;
> 			status = "disabled";
>
> 			#address-cells = <3>;
> 			#size-cells = <2>;
>
> 			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
> 				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
>
> 			nvidia,ctrl-offset = <0x118>;
> 			nvidia,num-lanes = <2>;
> 		};
> 	};
>
> AIUI none of the ranges properties are valid anymore, because the bus
> represented by pcie-controller no longer reflects the truth, namely that
> it translates the CPU address space to the PCI address space.
>

I think you can use a small-integer port number as an address by 
defining the intermediate address space properly, and using appropriate 
ranges above and below.  Here's a swag at how that would look.

I present three versions, using different choices for the intermediate 
address space encoding.  The intermediate address space may seem 
somewhat artificial, in that it decouples the "linear pass through of 
ranges" between the lower PCI address space and the upper CPU address 
space.  But in another sense, it accurately reflects that fact that the 
bus bridge "slices and dices" that linear address space into 
non-contiguous pieces.

Note that I'm also fixing a problem that I neglected to mention earlier 
- namely the fact that config space is part of the child PCI address 
space so it must be passed through.

Version A - 3 address cells:  In this version, the intermediate address 
space has 3 cells:  port#, address type, offset.  Address type is
   0 : root port
   1 : config space
   2 : extended config space
   3 : I/O
   4 : non-prefetchable memory
   5 : prefetchable memory.

The third cell "offset" is necessary so that the size field has a number 
space that can include it.

	pcie-controller {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200>; /* extended configuration space */
		interrupts = <0 98 0x04   /* controller interrupt */
			      0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
			  0 1 0  0x80004000 0x00080000   /* Port 0 config space */
			  0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
			  0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
			  0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
			  0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */

			  1 0 0  0x80001000 0x00001000   /* Root port 1 */
			  1 1 0  0x80004000 0x00080000   /* Port 1 config space */
			  1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
			  1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
			  1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
			  1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */

		#address-cells = <3>;
		#size-cells = <1>;

		pci@0 {
			reg = <0 0 0 0x1000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
				  0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
				  0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x110>;
			nvidia,num-lanes = <2>;
		};


		pci@1 {
			reg = <1 0 0 0x1000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
				  0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
				  0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x118>;
			nvidia,num-lanes = <2>;
		};

Version B - 2 address cells:  In this first version, the intermediate
address space has 2 cells:  port#, offset.  The address type (I/O, mem,
etc) is the high digit of in the offset:
    Offset 0....... : Root port
    Offset 1....... : config
    Offset 2....... : extended config
    Offset 3....... : I/O
    Offset 4....... : non-prefetchable memory
    Offset 5....... : prefetchable memory.

	pcie-controller {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200>; /* AFI registers */

		interrupts = <0 98 0x04   /* controller interrupt */
			      0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0 0x00000000  0x80000000 0x00001000   /* Root port 0 */
			  0 0x10000000  0x80004000 0x00080000   /* Port 0 config */
			  0 0x20000000  0x80104000 0x00080000	/* Port 0 ext config */
			  0 0x30000000  0x80400000 0x00008000   /* Port 0 downstream I/O */
			  0 0x40000000  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
			  0 0x50000000  0xa0000000 0x08000000   /* Port 0 prefetchable memory */

			  1 0x00000000  0x80001000 0x00001000   /* Root port 1 */
			  1 0x10000000  0x80084000 0x00080000   /* Port 1 config */
			  1 0x20000000  0x80184000 0x00080000	/* Port 1 ext config */
			  1 0x30000000  0x80408000 0x00010000   /* Port 1 downstream I/O */
			  1 0x40000000  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
			  1 0x50000000  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */

		#address-cells = <2>;
		#size-cells = <1>;

		pci@0 {
			reg = <0 0 0x1000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  0 0x10000000  0 0x00080000   /* config */
				  0x90000000 0 0  0 0x20000000  0 0x00080000   /* ext config */
				  0x81000000 0 0  0 0x30000000  0 0x00008000   /* I/O */
				  0x82000000 0 0  0 0x40000000  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  0 0x50000000  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x110>;
			nvidia,num-lanes = <2>;
		};


		pci@1 {
			reg = <1 0 0x1000>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  1 0x10000000  0 0x00080000   /* config */
				  0x90000000 0 0  1 0x20000000  0 0x00080000   /* ext config */
				  0x81000000 0 0  1 0x30000000  0 0x00008000   /* I/O */
				  0x82000000 0 0  1 0x40000000  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  1 0x50000000  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x118>;
			nvidia,num-lanes = <2>;
		};

  
Version C - 2 address cells, 0 size cells (!) : In this version we hide the size component in the intermediate space.  I don't know if that will actually work, but my guess is that it probably would.  The intermediate address space is like version A with the omission of the offset field.  The intermediate address just specifies the port number and the address type.

	pcie-controller {
		compatible = "nvidia,tegra20-pcie";
		reg = <0x80003000 0x00000800   /* PADS registers */
		       0x80003800 0x00000200>; /* AFI registers */
		interrupts = <0 98 0x04   /* controller interrupt */
			      0 99 0x04>; /* MSI interrupt */
		status = "disabled";

		ranges = <0 0  0x80000000   /* Root port 0 */
			  0 1  0x80004000   /* Port 0 config */
			  0 2  0x80104000   /* Port 0 ext config */
			  0 3  0x80400000   /* Port 0 downstream I/O */
			  0 4  0x90000000   /* Port 0 non-prefetchable memory */
			  0 5  0xa0000000   /* Port 0 prefetchable memory */

			  1 0  0x80001000   /* Root port 1 */
			  1 1  0x80084000   /* Port 1 config */
			  1 2  0x80184000   /* Port 1 ext config */
			  1 3  0x80408000   /* Port 1 downstream I/O */
			  1 4  0x98000000   /* Port 1 non-prefetchable memory */
			  1 5  0xa0000000>; /* Port 1 prefetchable memory */

		#address-cells = <2>;
		#size-cells = <0>;

		pci@0 {
			reg = <0 0>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  0 1  0 0x00080000   /* config */
				  0x90000000 0 0  0 2  0 0x00080000   /* ext config */
				  0x81000000 0 0  0 3  0 0x00008000   /* I/O */
				  0x82000000 0 0  0 4  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  0 5  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x110>;
			nvidia,num-lanes = <2>;
		};


		pci@1 {
			reg = <1 0>;
			status = "disabled";

			#address-cells = <3>;
			#size-cells = <2>;

			ranges = <0x80000000 0 0  1 1  0 0x00080000   /* config */
				  0x90000000 0 0  1 2  0 0x00080000   /* ext config */
				  0x81000000 0 0  1 3  0 0x00008000   /* I/O */
				  0x82000000 0 0  1 4  0 0x08000000   /* non-prefetchable memory */
				  0xc2000000 0 0  1 5  0 0x08000000>; /* prefetchable memory */

			nvidia,ctrl-offset = <0x118>;
			nvidia,num-lanes = <2>;
		};

In version C, you might even be able to include the ctrl register offset as a second entry in the reg property.
  

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 20, 2012, 4:32 p.m.
On 06/19/2012 03:31 PM, Mitch Bradley wrote:
> On 6/19/2012 3:30 AM, Thierry Reding wrote:
>> On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
>>> On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
>>>> On 06/14/2012 01:29 PM, Thierry Reding wrote:
>>>>> On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
>>>>>> On 06/14/2012 03:19 AM, Thierry Reding wrote:
>>>> ...
>>>>>>> #address-cells = <1>; #size-cells = <1>;
>>>>>>>
>>>>>>> pci@80000000 {
>>>>>>
>>>>>> I'm still not convinced that using the address of the port's
>>>>>> registers is the correct way to represent each port. The port
>>>>>> index seems much more useful.
>>>>>>
>>>>>> The main reason here is that there are a lot of registers that
>>>>>> contain fields for each port - far more than the combination of
>>>>>> this node's reg and ctrl-offset (which I assume is an address
>>>>>> offset for just one example of this issue) properties can
>>>>>> describe. The bit position and bit stride of these fields isn't
>>>>>> necessarily the same in each register. Do we want a property like
>>>>>> ctrl-offset for every single type of field in every single shared
>>>>>> register that describes the location of the relevant data, or
>>>>>> just a single "port ID" bit that can be applied to anything?
>>>>>>
>>>>>> (Perhaps this isn't so obvious looking at the TRM since it
>>>>>> doesn't document all registers, and I'm also looking at the
>>>>>> Tegra30 documentation too, which might be more exposed to this -
>>>>>> I haven't correlated all the documentation sources to be sure
>>>>>> though)
>>>>>
>>>>> I agree that maybe adding properties for each bit position or
>>>>> register offset may not work out too well. But I think it still
>>>>> makes sense to use the base address of the port's registers (see
>>>>> below). We could of course add some code to determine the index
>>>>> from the base address at initialization time and reuse the index
>>>>> where appropriate.
>>>>
>>>> To me, working back from address to ID then using the ID to calculate
>>>> some other addresses seems far more icky than just calculating all the
>>>> addresses based off of one ID. But, I suppose this doesn't make a huge
>>>> practical difference.
>>>
>>> This really depends on the device vs. no device decision below. If we
>>> can
>>> make it work without needing an extra device for it, then using the
>>> index
>>> is certainly better. However, if we instantiate devices from the DT,
>>> then
>>> we have the address anyway and adding the index as a property would be
>>> redundant and error prone (what happens if somebody sets the index of
>>> the
>>> port at address 0x80000000 to 2?).
>>
>> An additional problem with this is that we'd have to add the following
>> to the pcie-controller node:
>>
>>     #address-cells = <1>;
>>     #size-cells = <0>;
>>
>> This will conflict with the "ranges" property, because suddenly we can
>> no longer map the regions properly. Maybe Mitch can comment on whether
>> this is possible or not?
>>
>> To make it clearer what I'm talking about, here's the DT snippet again
>> (with the compatible property removed from the pci@ nodes because they
>> are no longer probed by a driver, the "simple-bus" removed from the
>> pcie-controller node's compatible property removed and its #address-
>> and #size-cells properties adjusted as described above).
>>
>>     pcie-controller {
>>         compatible = "nvidia,tegra20-pcie";
>>         reg = <0x80003000 0x00000800   /* PADS registers */
>>                0x80003800 0x00000200   /* AFI registers */
>>                0x80004000 0x00100000   /* configuration space */
>>                0x80104000 0x00100000>; /* extended configuration space */
>>         interrupts = <0 98 0x04   /* controller interrupt */
>>                   0 99 0x04>; /* MSI interrupt */
>>         status = "disabled";
>>
>>         ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>               0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>               0x90000000 0x90000000 0x10000000   /* non-prefetchable
>> memory */
>>               0xa0000000 0xa0000000 0x10000000>; /* prefetchable
>> memory */
>>
>>         #address-cells = <1>;
>>         #size-cells = <0>;
>>
>>         pci@0 {
>>             reg = <2>;
>>             status = "disabled";
>>
>>             #address-cells = <3>;
>>             #size-cells = <2>;
>>
>>             ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
>>                   0x82000000 0 0 0x90000000 0 0x08000000   /*
>> non-prefetchable memory */
>>                   0xc2000000 0 0 0xa0000000 0 0x08000000>; /*
>> prefetchable memory */
>>
>>             nvidia,ctrl-offset = <0x110>;
>>             nvidia,num-lanes = <2>;
>>         };
>>
>>         pci@1 {
>>             reg = <1>;
>>             status = "disabled";
>>
>>             #address-cells = <3>;
>>             #size-cells = <2>;
>>
>>             ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
>>                   0x82000000 0 0 0x98000000 0 0x08000000   /*
>> non-prefetchable memory */
>>                   0xc2000000 0 0 0xa8000000 0 0x08000000>; /*
>> prefetchable memory */
>>
>>             nvidia,ctrl-offset = <0x118>;
>>             nvidia,num-lanes = <2>;
>>         };
>>     };
>>
>> AIUI none of the ranges properties are valid anymore, because the bus
>> represented by pcie-controller no longer reflects the truth, namely that
>> it translates the CPU address space to the PCI address space.
>>
> 
> I think you can use a small-integer port number as an address by
> defining the intermediate address space properly, and using appropriate
> ranges above and below.  Here's a swag at how that would look.
> 
> I present three versions, using different choices for the intermediate
> address space encoding.  The intermediate address space may seem
> somewhat artificial, in that it decouples the "linear pass through of
> ranges" between the lower PCI address space and the upper CPU address
> space.  But in another sense, it accurately reflects that fact that the
> bus bridge "slices and dices" that linear address space into
> non-contiguous pieces.
> 
> Note that I'm also fixing a problem that I neglected to mention earlier
> - namely the fact that config space is part of the child PCI address
> space so it must be passed through.

Impressive. I do tend to like these ideas...

> Version A - 3 address cells:  In this version, the intermediate address
> space has 3 cells:  port#, address type, offset.  Address type is

I think I personally tend to prefer this option; I like that everything
is explicit and nicely separated.

>   0 : root port
>   1 : config space
>   2 : extended config space
>   3 : I/O
>   4 : non-prefetchable memory
>   5 : prefetchable memory.
> 
> The third cell "offset" is necessary so that the size field has a number
> space that can include it.

Can you expand on that sentence a bit more; I don't quite understand
that aspect. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 20, 2012, 5:41 p.m.
On 6/20/2012 6:32 AM, Stephen Warren wrote:
> On 06/19/2012 03:31 PM, Mitch Bradley wrote:
> ...
>
>
> The third cell "offset" is necessary so that the size field has a number
> space that can include it.
>
> Can you expand on that sentence a bit more; I don't quite understand
> that aspect. Thanks.

The meaning of "size"(in the context of "reg" which is phys,size) is 
"the device occupies a
contiguous sequence of addresses beginning at phys and continuing to 
phys+size-1". The
implicit addition occurs on the last #size-cells cells of the phys.

Suppose you have 2-cell phys where the values for different devices are 
<0 0>, <0 1>, <0 2>, <1 0>, <1 1>,
<1 2>.  The first number is the port number and the second is address 
space type.  It doesn't make
sense to say that size is 0x8000, because that would imply that the 
device extends from, say,
<1 2> to <1 0x8001>.

I'm not sure this is written down anywhere as explicitly as above, but 
it is certainly implicit
in the way that ranges properties work, and it is certainly part of the 
mental model that
was in my head when I developed the device tree addressing scheme.


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 20, 2012, 5:47 p.m.
On 06/20/2012 11:41 AM, Mitch Bradley wrote:
> On 6/20/2012 6:32 AM, Stephen Warren wrote:
>> On 06/19/2012 03:31 PM, Mitch Bradley wrote:
>> ...
>>
>>
>> The third cell "offset" is necessary so that the size field has a number
>> space that can include it.
>>
>> Can you expand on that sentence a bit more; I don't quite understand
>> that aspect. Thanks.
> 
> The meaning of "size"(in the context of "reg" which is phys,size) is
> "the device occupies a
> contiguous sequence of addresses beginning at phys and continuing to
> phys+size-1". The
> implicit addition occurs on the last #size-cells cells of the phys.

Ah right of course - that makes perfect sense. Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 20, 2012, 7:57 p.m.
On Tuesday 19 June 2012, Mitch Bradley wrote:

> Version A - 3 address cells:  In this version, the intermediate address 
> space has 3 cells:  port#, address type, offset.  Address type is
>    0 : root port
>    1 : config space
>    2 : extended config space
>    3 : I/O
>    4 : non-prefetchable memory
>    5 : prefetchable memory.
> 
> The third cell "offset" is necessary so that the size field has a number 
> space that can include it.

I agree with Stephen that this is a clever way to encode all the address spaces,
very nice!

> Version B - 2 address cells:  In this first version, the intermediate
> address space has 2 cells:  port#, offset.  The address type (I/O, mem,
> etc) is the high digit of in the offset:
>     Offset 0....... : Root port
>     Offset 1....... : config
>     Offset 2....... : extended config
>     Offset 3....... : I/O
>     Offset 4....... : non-prefetchable memory
>     Offset 5....... : prefetchable memory.

This is similar to how the PCI binding works internally, but I find that
a bit confusing as well, so given those two choices, I'd prefer the first
one.
   
> Version C - 2 address cells, 0 size cells (!) : In this version we hide the size component in 
> the intermediate space.  I don't know if that will actually work, but my guess is that it
> probably would.  The intermediate address space is like version A with the omission of the
> offset field.  The intermediate address just specifies the port number and the address type.

My guess is that it doesn't work with the current Linux code that transforms
the addresses. I've just recently seen the code complain about any platform
device whose parent has #size-cells=0.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 20, 2012, 8:19 p.m.
On 6/20/2012 9:57 AM, Arnd Bergmann wrote:
> On Tuesday 19 June 2012, Mitch Bradley wrote:
>
>> Version A - 3 address cells:  In this version, the intermediate address
>> space has 3 cells:  port#, address type, offset.  Address type is
>>     0 : root port
>>     1 : config space
>>     2 : extended config space
>>     3 : I/O
>>     4 : non-prefetchable memory
>>     5 : prefetchable memory.
>>
>> The third cell "offset" is necessary so that the size field has a number
>> space that can include it.
>
> I agree with Stephen that this is a clever way to encode all the address spaces,
> very nice!
>
>> Version B - 2 address cells:  In this first version, the intermediate
>> address space has 2 cells:  port#, offset.  The address type (I/O, mem,
>> etc) is the high digit of in the offset:
>>      Offset 0....... : Root port
>>      Offset 1....... : config
>>      Offset 2....... : extended config
>>      Offset 3....... : I/O
>>      Offset 4....... : non-prefetchable memory
>>      Offset 5....... : prefetchable memory.
>
> This is similar to how the PCI binding works internally, but I find that
> a bit confusing as well, so given those two choices, I'd prefer the first
> one.

Yeah, version A is my preference too.  It's straightforward and clear.  
I did the other two
versions mostly to explore the possibility space.

>
>     
>> Version C - 2 address cells, 0 size cells (!) : In this version we hide the size component in
>> the intermediate space.  I don't know if that will actually work, but my guess is that it
>> probably would.  The intermediate address space is like version A with the omission of the
>> offset field.  The intermediate address just specifies the port number and the address type.
>
> My guess is that it doesn't work with the current Linux code that transforms
> the addresses. I've just recently seen the code complain about any platform
> device whose parent has #size-cells=0.
>
> 	Arnd
>

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 21, 2012, 6:47 a.m.
On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
> On 6/19/2012 3:30 AM, Thierry Reding wrote:
> >On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
> >>On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
> >>>On 06/14/2012 01:29 PM, Thierry Reding wrote:
> >>>>On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> >>>>>On 06/14/2012 03:19 AM, Thierry Reding wrote:
> >>>...
> >>>>>>#address-cells = <1>; #size-cells = <1>;
> >>>>>>
> >>>>>>pci@80000000 {
> >>>>>
> >>>>>I'm still not convinced that using the address of the port's
> >>>>>registers is the correct way to represent each port. The port
> >>>>>index seems much more useful.
> >>>>>
> >>>>>The main reason here is that there are a lot of registers that
> >>>>>contain fields for each port - far more than the combination of
> >>>>>this node's reg and ctrl-offset (which I assume is an address
> >>>>>offset for just one example of this issue) properties can
> >>>>>describe. The bit position and bit stride of these fields isn't
> >>>>>necessarily the same in each register. Do we want a property like
> >>>>>ctrl-offset for every single type of field in every single shared
> >>>>>register that describes the location of the relevant data, or
> >>>>>just a single "port ID" bit that can be applied to anything?
> >>>>>
> >>>>>(Perhaps this isn't so obvious looking at the TRM since it
> >>>>>doesn't document all registers, and I'm also looking at the
> >>>>>Tegra30 documentation too, which might be more exposed to this -
> >>>>>I haven't correlated all the documentation sources to be sure
> >>>>>though)
> >>>>
> >>>>I agree that maybe adding properties for each bit position or
> >>>>register offset may not work out too well. But I think it still
> >>>>makes sense to use the base address of the port's registers (see
> >>>>below). We could of course add some code to determine the index
> >>>>from the base address at initialization time and reuse the index
> >>>>where appropriate.
> >>>
> >>>To me, working back from address to ID then using the ID to calculate
> >>>some other addresses seems far more icky than just calculating all the
> >>>addresses based off of one ID. But, I suppose this doesn't make a huge
> >>>practical difference.
> >>
> >>This really depends on the device vs. no device decision below. If we can
> >>make it work without needing an extra device for it, then using the index
> >>is certainly better. However, if we instantiate devices from the DT, then
> >>we have the address anyway and adding the index as a property would be
> >>redundant and error prone (what happens if somebody sets the index of the
> >>port at address 0x80000000 to 2?).
> >
> >An additional problem with this is that we'd have to add the following
> >to the pcie-controller node:
> >
> >	#address-cells = <1>;
> >	#size-cells = <0>;
> >
> >This will conflict with the "ranges" property, because suddenly we can
> >no longer map the regions properly. Maybe Mitch can comment on whether
> >this is possible or not?
> >
> >To make it clearer what I'm talking about, here's the DT snippet again
> >(with the compatible property removed from the pci@ nodes because they
> >are no longer probed by a driver, the "simple-bus" removed from the
> >pcie-controller node's compatible property removed and its #address-
> >and #size-cells properties adjusted as described above).
> >
> >	pcie-controller {
> >		compatible = "nvidia,tegra20-pcie";
> >		reg = <0x80003000 0x00000800   /* PADS registers */
> >		       0x80003800 0x00000200   /* AFI registers */
> >		       0x80004000 0x00100000   /* configuration space */
> >		       0x80104000 0x00100000>; /* extended configuration space */
> >		interrupts = <0 98 0x04   /* controller interrupt */
> >			      0 99 0x04>; /* MSI interrupt */
> >		status = "disabled";
> >
> >		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >
> >		#address-cells = <1>;
> >		#size-cells = <0>;
> >
> >		pci@0 {
> >			reg = <2>;
> >			status = "disabled";
> >
> >			#address-cells = <3>;
> >			#size-cells = <2>;
> >
> >			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
> >				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
> >				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
> >
> >			nvidia,ctrl-offset = <0x110>;
> >			nvidia,num-lanes = <2>;
> >		};
> >
> >		pci@1 {
> >			reg = <1>;
> >			status = "disabled";
> >
> >			#address-cells = <3>;
> >			#size-cells = <2>;
> >
> >			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
> >				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
> >				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
> >
> >			nvidia,ctrl-offset = <0x118>;
> >			nvidia,num-lanes = <2>;
> >		};
> >	};
> >
> >AIUI none of the ranges properties are valid anymore, because the bus
> >represented by pcie-controller no longer reflects the truth, namely that
> >it translates the CPU address space to the PCI address space.
> >
> 
> I think you can use a small-integer port number as an address by
> defining the intermediate address space properly, and using
> appropriate ranges above and below.  Here's a swag at how that would
> look.
> 
> I present three versions, using different choices for the
> intermediate address space encoding.  The intermediate address space
> may seem somewhat artificial, in that it decouples the "linear pass
> through of ranges" between the lower PCI address space and the upper
> CPU address space.  But in another sense, it accurately reflects
> that fact that the bus bridge "slices and dices" that linear address
> space into non-contiguous pieces.
> 
> Note that I'm also fixing a problem that I neglected to mention
> earlier - namely the fact that config space is part of the child PCI
> address space so it must be passed through.

This doesn't quite match how the Tegra PCIe controller works. Basically,
every access to a device's (extended) configuration space goes through a
single region of memory-mapped registers, independent of which port is
the parent of the device.

Is it okay to pass both configuration spaces via the ranges property but
still list them in the reg property of the controller?

> Version A - 3 address cells:  In this version, the intermediate
> address space has 3 cells:  port#, address type, offset.  Address
> type is
>   0 : root port
>   1 : config space
>   2 : extended config space
>   3 : I/O
>   4 : non-prefetchable memory
>   5 : prefetchable memory.
> 
> The third cell "offset" is necessary so that the size field has a
> number space that can include it.
> 
> 	pcie-controller {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg = <0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200>; /* extended configuration space */
> 		interrupts = <0 98 0x04   /* controller interrupt */
> 			      0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
> 
> 		ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
> 			  0 1 0  0x80004000 0x00080000   /* Port 0 config space */
> 			  0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
> 			  0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
> 			  0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
> 			  0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
> 
> 			  1 0 0  0x80001000 0x00001000   /* Root port 1 */
> 			  1 1 0  0x80004000 0x00080000   /* Port 1 config space */
> 			  1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
> 			  1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
> 			  1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
> 			  1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
> 
> 		#address-cells = <3>;
> 		#size-cells = <1>;
> 
> 		pci@0 {
> 			reg = <0 0 0 0x1000>;
> 			status = "disabled";
> 
> 			#address-cells = <3>;
> 			#size-cells = <2>;
> 
> 			ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
> 				  0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
> 				  0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
> 				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
> 
> 			nvidia,ctrl-offset = <0x110>;
> 			nvidia,num-lanes = <2>;
> 		};
> 
> 
> 		pci@1 {
> 			reg = <1 0 0 0x1000>;
> 			status = "disabled";
> 
> 			#address-cells = <3>;
> 			#size-cells = <2>;
> 
> 			ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
> 				  0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
> 				  0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
> 				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
> 				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
> 
> 			nvidia,ctrl-offset = <0x118>;
> 			nvidia,num-lanes = <2>;
> 		};

Everybody seems to be happy with this approach, so I'll give it a shot.
There is one thing I'm still unsure about, though. What if somebody uses
the above scheme and maps the registers to the wrong port. The same goes
for the nvidia,ctrl-offset property. It needs to match the register
offset because they are directly related. I suppose we could leave that
property away and look up the register via the port index (which, as
Stephen already said, we'll have to do in other places anyway, unless we
list all bit positions in the DT).

Can we safely ignore such issues and assume the device tree to always be
right? Should we just not care if somebody uses it wrongly?

Thierry
Bjorn Helgaas - June 22, 2012, 10:18 a.m.
On Thu, Jun 21, 2012 at 12:47 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:

>> Version A - 3 address cells:  In this version, the intermediate
>> address space has 3 cells:  port#, address type, offset.  Address
>> type is
>>   0 : root port
>>   1 : config space
>>   2 : extended config space
>>   3 : I/O
>>   4 : non-prefetchable memory
>>   5 : prefetchable memory.
>>
>> The third cell "offset" is necessary so that the size field has a
>> number space that can include it.
>>
>>       pcie-controller {
>>               compatible = "nvidia,tegra20-pcie";
>>               reg = <0x80003000 0x00000800   /* PADS registers */
>>                      0x80003800 0x00000200>; /* extended configuration space */
>>               interrupts = <0 98 0x04   /* controller interrupt */
>>                             0 99 0x04>; /* MSI interrupt */
>>               status = "disabled";
>>
>>               ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
>>                         0 1 0  0x80004000 0x00080000   /* Port 0 config space */
>>                         0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
>>                         0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
>>                         0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
>>                         0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
>>
>>                         1 0 0  0x80001000 0x00001000   /* Root port 1 */
>>                         1 1 0  0x80004000 0x00080000   /* Port 1 config space */
>>                         1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
>>                         1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
>>                         1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
>>                         1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
>>
>>               #address-cells = <3>;
>>               #size-cells = <1>;
>>
>>               pci@0 {
>>                       reg = <0 0 0 0x1000>;
>>                       status = "disabled";
>>
>>                       #address-cells = <3>;
>>                       #size-cells = <2>;
>>
>>                       ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
>>                                 0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
>>                                 0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
>>                                 0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
>>                                 0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
>>
>>                       nvidia,ctrl-offset = <0x110>;
>>                       nvidia,num-lanes = <2>;
>>               };
>>
>>
>>               pci@1 {
>>                       reg = <1 0 0 0x1000>;
>>                       status = "disabled";
>>
>>                       #address-cells = <3>;
>>                       #size-cells = <2>;
>>
>>                       ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
>>                                 0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
>>                                 0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
>>                                 0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
>>                                 0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
>>
>>                       nvidia,ctrl-offset = <0x118>;
>>                       nvidia,num-lanes = <2>;
>>               };

I'm not familiar with device tree, so pardon me if these are stupid
questions.  These seem to be describing PCI host bridges.  I assume
some of these ranges describe MMIO, prefetchable MMIO, and I/O port
apertures that the bridge forwards to the PCI bus.

Is there provision for any address offset applied by the bridge when
it forwards downstream?  (Maybe these bridges don't apply any offset?)
 "0xa0000000 0x08000000" appears for both ports 0 and 1 prefetchable
memory; I don't know if that's an error, an indication that each
bridge applies a different offset, or that both bridges forward the
same aperture (I hope not the latter, because Linux can't really deal
with that).

Is the bus number aperture included somewhere?  How do we know what
bus numbers are available for allocation under each bridge?

I see mention of config space.  Is some of that referring to the ECAM
as in 7.2.2 of the PCIe spec v3.0 (what we refer to as MMCONFIG on
x86)?
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 22, 2012, 11 a.m.
On Fri, Jun 22, 2012 at 04:18:05AM -0600, Bjorn Helgaas wrote:
> On Thu, Jun 21, 2012 at 12:47 AM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
> 
> >> Version A - 3 address cells:  In this version, the intermediate
> >> address space has 3 cells:  port#, address type, offset.  Address
> >> type is
> >>   0 : root port
> >>   1 : config space
> >>   2 : extended config space
> >>   3 : I/O
> >>   4 : non-prefetchable memory
> >>   5 : prefetchable memory.
> >>
> >> The third cell "offset" is necessary so that the size field has a
> >> number space that can include it.
> >>
> >>       pcie-controller {
> >>               compatible = "nvidia,tegra20-pcie";
> >>               reg = <0x80003000 0x00000800   /* PADS registers */
> >>                      0x80003800 0x00000200>; /* extended configuration space */
> >>               interrupts = <0 98 0x04   /* controller interrupt */
> >>                             0 99 0x04>; /* MSI interrupt */
> >>               status = "disabled";
> >>
> >>               ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
> >>                         0 1 0  0x80004000 0x00080000   /* Port 0 config space */
> >>                         0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
> >>                         0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
> >>                         0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
> >>                         0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
> >>
> >>                         1 0 0  0x80001000 0x00001000   /* Root port 1 */
> >>                         1 1 0  0x80004000 0x00080000   /* Port 1 config space */
> >>                         1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
> >>                         1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
> >>                         1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
> >>                         1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
> >>
> >>               #address-cells = <3>;
> >>               #size-cells = <1>;
> >>
> >>               pci@0 {
> >>                       reg = <0 0 0 0x1000>;
> >>                       status = "disabled";
> >>
> >>                       #address-cells = <3>;
> >>                       #size-cells = <2>;
> >>
> >>                       ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
> >>                                 0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
> >>                                 0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
> >>                                 0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
> >>                                 0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
> >>
> >>                       nvidia,ctrl-offset = <0x110>;
> >>                       nvidia,num-lanes = <2>;
> >>               };
> >>
> >>
> >>               pci@1 {
> >>                       reg = <1 0 0 0x1000>;
> >>                       status = "disabled";
> >>
> >>                       #address-cells = <3>;
> >>                       #size-cells = <2>;
> >>
> >>                       ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
> >>                                 0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
> >>                                 0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
> >>                                 0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
> >>                                 0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
> >>
> >>                       nvidia,ctrl-offset = <0x118>;
> >>                       nvidia,num-lanes = <2>;
> >>               };
> 
> I'm not familiar with device tree, so pardon me if these are stupid
> questions.  These seem to be describing PCI host bridges.

Yes, correct.

> I assume some of these ranges describe MMIO, prefetchable MMIO, and
> I/O port apertures that the bridge forwards to the PCI bus.

Yes.

> Is there provision for any address offset applied by the bridge when
> it forwards downstream?  (Maybe these bridges don't apply any offset?)
>  "0xa0000000 0x08000000" appears for both ports 0 and 1 prefetchable
> memory; I don't know if that's an error, an indication that each
> bridge applies a different offset, or that both bridges forward the
> same aperture (I hope not the latter, because Linux can't really deal
> with that).

That seems to be a typo. It should have been 0xa8000000 in the second
case. The PCIe controller has registers that are programmed with the
aperture for the configuration and extended configuration spaces, as
well as the downstream I/O, prefetchable and non-prefetchable memory
regions.

The configuration spaces aren't actually forwarded by the bridges, but
are handled only by the controller. The other apertures are programmed
into the bridges using standard PCI registers.

> Is the bus number aperture included somewhere?  How do we know what
> bus numbers are available for allocation under each bridge?

Not yet. I don't think DT imposes a bus number allocation on PCI
bridges. However the matching of DT nodes to PCI bridges is done based
on the bus number. For that you provide a bus-ranges property which
defines the bus aperture of the given PCI bridge. The DT matching code
compares the first cell of this property with the primary bus number of
the bridge.

This is in fact somewhat counter-productive because it requires you to
hard-code the PCI hierarchy within the DT and you loose a lot of the
probing functionality. This is not much of an issue for typical PCI
endpoints (like network cards) but becomes a problem if you have a PCI
endpoint that implements an I2C bus and you want to match I2C slaves on
that bus to device nodes so that the kernel can parse and instantiate
them properly. I guess in the latter cases hard-coding the PCI tree in
DT is not actually a drawback, though.

> I see mention of config space.  Is some of that referring to the ECAM
> as in 7.2.2 of the PCIe spec v3.0 (what we refer to as MMCONFIG on
> x86)?

I only have access to the PCIe 2.0 specification, but it seems to
contain the same 7.2.2 section. In fact it looks as though this
particular controller indeed implements something similar to ECAM. The
address mapping is a little different, though:

	A[(16 + n - 1):16]: bus number
	A[15:11]: device number
	A[10:8]: function number
	A[7:2]: register number
	A[1:0]: unused

That information comes directly from the driver and I have no access to
the corresponding documentation. It seems like the extended
configuration space is accessed using the same mapping but starting at a
different base address.

But now that you mention it, maybe the driver code is buggy here, and
the controller indeed implements ECAM. Furthermore it also seems like
the current code doesn't work for the extended configuration space
because while the correct offset into the extended configuration space
is chosen, the access to registers is done using the same mapping as
above, so instead of the 12 (10) bits required for the register number
only 8 (6) are in fact used.

I wonder whether anyone's actually tested this.

Stephen: can you try to find out whether the Tegra PCIe controller
indeed implements ECAM, or if this scheme is actually just a proprietary
variant?

Thierry
Thierry Reding - June 22, 2012, 11:04 a.m.
On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
> On 6/19/2012 3:30 AM, Thierry Reding wrote:
> >On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
> >>On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
> >>>On 06/14/2012 01:29 PM, Thierry Reding wrote:
> >>>>On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> >>>>>On 06/14/2012 03:19 AM, Thierry Reding wrote:
> >>>...
> >>>>>>#address-cells = <1>; #size-cells = <1>;
> >>>>>>
> >>>>>>pci@80000000 {
> >>>>>
> >>>>>I'm still not convinced that using the address of the port's
> >>>>>registers is the correct way to represent each port. The port
> >>>>>index seems much more useful.
> >>>>>
> >>>>>The main reason here is that there are a lot of registers that
> >>>>>contain fields for each port - far more than the combination of
> >>>>>this node's reg and ctrl-offset (which I assume is an address
> >>>>>offset for just one example of this issue) properties can
> >>>>>describe. The bit position and bit stride of these fields isn't
> >>>>>necessarily the same in each register. Do we want a property like
> >>>>>ctrl-offset for every single type of field in every single shared
> >>>>>register that describes the location of the relevant data, or
> >>>>>just a single "port ID" bit that can be applied to anything?
> >>>>>
> >>>>>(Perhaps this isn't so obvious looking at the TRM since it
> >>>>>doesn't document all registers, and I'm also looking at the
> >>>>>Tegra30 documentation too, which might be more exposed to this -
> >>>>>I haven't correlated all the documentation sources to be sure
> >>>>>though)
> >>>>
> >>>>I agree that maybe adding properties for each bit position or
> >>>>register offset may not work out too well. But I think it still
> >>>>makes sense to use the base address of the port's registers (see
> >>>>below). We could of course add some code to determine the index
> >>>>from the base address at initialization time and reuse the index
> >>>>where appropriate.
> >>>
> >>>To me, working back from address to ID then using the ID to calculate
> >>>some other addresses seems far more icky than just calculating all the
> >>>addresses based off of one ID. But, I suppose this doesn't make a huge
> >>>practical difference.
> >>
> >>This really depends on the device vs. no device decision below. If we can
> >>make it work without needing an extra device for it, then using the index
> >>is certainly better. However, if we instantiate devices from the DT, then
> >>we have the address anyway and adding the index as a property would be
> >>redundant and error prone (what happens if somebody sets the index of the
> >>port at address 0x80000000 to 2?).
> >
> >An additional problem with this is that we'd have to add the following
> >to the pcie-controller node:
> >
> >	#address-cells = <1>;
> >	#size-cells = <0>;
> >
> >This will conflict with the "ranges" property, because suddenly we can
> >no longer map the regions properly. Maybe Mitch can comment on whether
> >this is possible or not?
> >
> >To make it clearer what I'm talking about, here's the DT snippet again
> >(with the compatible property removed from the pci@ nodes because they
> >are no longer probed by a driver, the "simple-bus" removed from the
> >pcie-controller node's compatible property removed and its #address-
> >and #size-cells properties adjusted as described above).
> >
> >	pcie-controller {
> >		compatible = "nvidia,tegra20-pcie";
> >		reg = <0x80003000 0x00000800   /* PADS registers */
> >		       0x80003800 0x00000200   /* AFI registers */
> >		       0x80004000 0x00100000   /* configuration space */
> >		       0x80104000 0x00100000>; /* extended configuration space */
> >		interrupts = <0 98 0x04   /* controller interrupt */
> >			      0 99 0x04>; /* MSI interrupt */
> >		status = "disabled";
> >
> >		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> >			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> >			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> >			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> >
> >		#address-cells = <1>;
> >		#size-cells = <0>;
> >
> >		pci@0 {
> >			reg = <2>;
> >			status = "disabled";
> >
> >			#address-cells = <3>;
> >			#size-cells = <2>;
> >
> >			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
> >				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
> >				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
> >
> >			nvidia,ctrl-offset = <0x110>;
> >			nvidia,num-lanes = <2>;
> >		};
> >
> >		pci@1 {
> >			reg = <1>;
> >			status = "disabled";
> >
> >			#address-cells = <3>;
> >			#size-cells = <2>;
> >
> >			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
> >				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
> >				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
> >
> >			nvidia,ctrl-offset = <0x118>;
> >			nvidia,num-lanes = <2>;
> >		};
> >	};
> >
> >AIUI none of the ranges properties are valid anymore, because the bus
> >represented by pcie-controller no longer reflects the truth, namely that
> >it translates the CPU address space to the PCI address space.
> >
> 
> I think you can use a small-integer port number as an address by
> defining the intermediate address space properly, and using
> appropriate ranges above and below.  Here's a swag at how that would
> look.
> 
> I present three versions, using different choices for the
> intermediate address space encoding.  The intermediate address space
> may seem somewhat artificial, in that it decouples the "linear pass
> through of ranges" between the lower PCI address space and the upper
> CPU address space.  But in another sense, it accurately reflects
> that fact that the bus bridge "slices and dices" that linear address
> space into non-contiguous pieces.
> 
> Note that I'm also fixing a problem that I neglected to mention
> earlier - namely the fact that config space is part of the child PCI
> address space so it must be passed through.
> 
> Version A - 3 address cells:  In this version, the intermediate
> address space has 3 cells:  port#, address type, offset.  Address
> type is
>   0 : root port
>   1 : config space
>   2 : extended config space
>   3 : I/O
>   4 : non-prefetchable memory
>   5 : prefetchable memory.
> 
> The third cell "offset" is necessary so that the size field has a
> number space that can include it.
> 
> 	pcie-controller {
> 		compatible = "nvidia,tegra20-pcie";
> 		reg = <0x80003000 0x00000800   /* PADS registers */
> 		       0x80003800 0x00000200>; /* extended configuration space */
> 		interrupts = <0 98 0x04   /* controller interrupt */
> 			      0 99 0x04>; /* MSI interrupt */
> 		status = "disabled";
> 
> 		ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
> 			  0 1 0  0x80004000 0x00080000   /* Port 0 config space */
> 			  0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
> 			  0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
> 			  0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
> 			  0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
> 
> 			  1 0 0  0x80001000 0x00001000   /* Root port 1 */
> 			  1 1 0  0x80004000 0x00080000   /* Port 1 config space */
> 			  1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
> 			  1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
> 			  1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
> 			  1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
> 
> 		#address-cells = <3>;
> 		#size-cells = <1>;
> 
> 		pci@0 {
> 			reg = <0 0 0 0x1000>;
[...]
> 		};
> 
> 		pci@1 {
> 			reg = <1 0 0 0x1000>;
[...]
> 		};

It seems like this isn't working properly. For some reason both the reg
property of pci@0 and pci@1 are translated to the same parent address
0x80000000. I'll have to investigate where exactly this goes wrong.

Thierry
Bjorn Helgaas - June 22, 2012, 11:46 a.m.
On Fri, Jun 22, 2012 at 5:00 AM, Thierry Reding
<thierry.reding@avionic-design.de> wrote:
> On Fri, Jun 22, 2012 at 04:18:05AM -0600, Bjorn Helgaas wrote:
>> On Thu, Jun 21, 2012 at 12:47 AM, Thierry Reding
>> <thierry.reding@avionic-design.de> wrote:
>> > On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
>>
>> >> Version A - 3 address cells:  In this version, the intermediate
>> >> address space has 3 cells:  port#, address type, offset.  Address
>> >> type is
>> >>   0 : root port
>> >>   1 : config space
>> >>   2 : extended config space
>> >>   3 : I/O
>> >>   4 : non-prefetchable memory
>> >>   5 : prefetchable memory.
>> >>
>> >> The third cell "offset" is necessary so that the size field has a
>> >> number space that can include it.
>> >>
>> >>       pcie-controller {
>> >>               compatible = "nvidia,tegra20-pcie";
>> >>               reg = <0x80003000 0x00000800   /* PADS registers */
>> >>                      0x80003800 0x00000200>; /* extended configuration space */
>> >>               interrupts = <0 98 0x04   /* controller interrupt */
>> >>                             0 99 0x04>; /* MSI interrupt */
>> >>               status = "disabled";
>> >>
>> >>               ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
>> >>                         0 1 0  0x80004000 0x00080000   /* Port 0 config space */
>> >>                         0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
>> >>                         0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
>> >>                         0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
>> >>                         0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
>> >>
>> >>                         1 0 0  0x80001000 0x00001000   /* Root port 1 */
>> >>                         1 1 0  0x80004000 0x00080000   /* Port 1 config space */
>> >>                         1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
>> >>                         1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
>> >>                         1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
>> >>                         1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
>> >>
>> >>               #address-cells = <3>;
>> >>               #size-cells = <1>;
>> >>
>> >>               pci@0 {
>> >>                       reg = <0 0 0 0x1000>;
>> >>                       status = "disabled";
>> >>
>> >>                       #address-cells = <3>;
>> >>                       #size-cells = <2>;
>> >>
>> >>                       ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
>> >>                                 0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
>> >>                                 0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
>> >>                                 0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
>> >>                                 0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
>> >>
>> >>                       nvidia,ctrl-offset = <0x110>;
>> >>                       nvidia,num-lanes = <2>;
>> >>               };
>> >>
>> >>
>> >>               pci@1 {
>> >>                       reg = <1 0 0 0x1000>;
>> >>                       status = "disabled";
>> >>
>> >>                       #address-cells = <3>;
>> >>                       #size-cells = <2>;
>> >>
>> >>                       ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
>> >>                                 0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
>> >>                                 0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
>> >>                                 0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
>> >>                                 0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
>> >>
>> >>                       nvidia,ctrl-offset = <0x118>;
>> >>                       nvidia,num-lanes = <2>;
>> >>               };
>>
>> I'm not familiar with device tree, so pardon me if these are stupid
>> questions.  These seem to be describing PCI host bridges.
>
> Yes, correct.
>
>> I assume some of these ranges describe MMIO, prefetchable MMIO, and
>> I/O port apertures that the bridge forwards to the PCI bus.
>
> Yes.
>
>> Is there provision for any address offset applied by the bridge when
>> it forwards downstream?  (Maybe these bridges don't apply any offset?)
>>  "0xa0000000 0x08000000" appears for both ports 0 and 1 prefetchable
>> memory; I don't know if that's an error, an indication that each
>> bridge applies a different offset, or that both bridges forward the
>> same aperture (I hope not the latter, because Linux can't really deal
>> with that).
>
> That seems to be a typo. It should have been 0xa8000000 in the second
> case. The PCIe controller has registers that are programmed with the
> aperture for the configuration and extended configuration spaces, as
> well as the downstream I/O, prefetchable and non-prefetchable memory
> regions.
>
> The configuration spaces aren't actually forwarded by the bridges, but
> are handled only by the controller.

Right.  The host bridge (controller) would convert memory accesses on
the upstream side into PCI config accesses on its downstream PCI
interface.

> The other apertures are programmed
> into the bridges using standard PCI registers.

If you're talking about programming host bridge apertures, there are
no "standard PCI registers" to do that.  The programming model of the
host bridge itself is not part of the PCI spec.  PCI-to-PCI bridge
apertures *are* specified by the PCI Bridge spec, so those are
standard.

>> Is the bus number aperture included somewhere?  How do we know what
>> bus numbers are available for allocation under each bridge?
>
> Not yet. I don't think DT imposes a bus number allocation on PCI
> bridges. However the matching of DT nodes to PCI bridges is done based
> on the bus number. For that you provide a bus-ranges property which
> defines the bus aperture of the given PCI bridge. The DT matching code
> compares the first cell of this property with the primary bus number of
> the bridge.

I don't fully understand this, but I can tell you that things don't
work very well if we don't know the aperture.  We can make
assumptions, like the root bus is 00, and then enumerate everything
reachable from there.  But then all we know is the largest bus number
actually reachable from bus 00, which is usually smaller then the end
of the aperture.  We don't know how many unused bus numbers there are
in the aperture, so we can't safely allocate any for hot-added
devices.

> This is in fact somewhat counter-productive because it requires you to
> hard-code the PCI hierarchy within the DT and you loose a lot of the
> probing functionality. This is not much of an issue for typical PCI
> endpoints (like network cards) but becomes a problem if you have a PCI
> endpoint that implements an I2C bus and you want to match I2C slaves on
> that bus to device nodes so that the kernel can parse and instantiate
> them properly. I guess in the latter cases hard-coding the PCI tree in
> DT is not actually a drawback, though.
>
>> I see mention of config space.  Is some of that referring to the ECAM
>> as in 7.2.2 of the PCIe spec v3.0 (what we refer to as MMCONFIG on
>> x86)?
>
> I only have access to the PCIe 2.0 specification, but it seems to
> contain the same 7.2.2 section. In fact it looks as though this
> particular controller indeed implements something similar to ECAM. The
> address mapping is a little different, though:
>
>        A[(16 + n - 1):16]: bus number
>        A[15:11]: device number
>        A[10:8]: function number
>        A[7:2]: register number
>        A[1:0]: unused
>
> That information comes directly from the driver and I have no access to
> the corresponding documentation. It seems like the extended
> configuration space is accessed using the same mapping but starting at a
> different base address.
>
> But now that you mention it, maybe the driver code is buggy here, and
> the controller indeed implements ECAM. Furthermore it also seems like
> the current code doesn't work for the extended configuration space
> because while the correct offset into the extended configuration space
> is chosen, the access to registers is done using the same mapping as
> above, so instead of the 12 (10) bits required for the register number
> only 8 (6) are in fact used.
>
> I wonder whether anyone's actually tested this.
>
> Stephen: can you try to find out whether the Tegra PCIe controller
> indeed implements ECAM, or if this scheme is actually just a proprietary
> variant?

It'd be a real shame if they made up a proprietary scheme when ECAM
seems to be required by the spec.  I'm interested because the current
MMCONFIG code seems unnecessarily x86-specific, and I expect every
arch with PCIe would want to use ECAM, so maybe there's some chance
for a generic implementation.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 22, 2012, 12:43 p.m.
On Fri, Jun 22, 2012 at 05:46:52AM -0600, Bjorn Helgaas wrote:
> On Fri, Jun 22, 2012 at 5:00 AM, Thierry Reding
> <thierry.reding@avionic-design.de> wrote:
> > On Fri, Jun 22, 2012 at 04:18:05AM -0600, Bjorn Helgaas wrote:
> >> On Thu, Jun 21, 2012 at 12:47 AM, Thierry Reding
> >> <thierry.reding@avionic-design.de> wrote:
> >> > On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
> >>
> >> >> Version A - 3 address cells:  In this version, the intermediate
> >> >> address space has 3 cells:  port#, address type, offset.  Address
> >> >> type is
> >> >>   0 : root port
> >> >>   1 : config space
> >> >>   2 : extended config space
> >> >>   3 : I/O
> >> >>   4 : non-prefetchable memory
> >> >>   5 : prefetchable memory.
> >> >>
> >> >> The third cell "offset" is necessary so that the size field has a
> >> >> number space that can include it.
> >> >>
> >> >>       pcie-controller {
> >> >>               compatible = "nvidia,tegra20-pcie";
> >> >>               reg = <0x80003000 0x00000800   /* PADS registers */
> >> >>                      0x80003800 0x00000200>; /* extended configuration space */
> >> >>               interrupts = <0 98 0x04   /* controller interrupt */
> >> >>                             0 99 0x04>; /* MSI interrupt */
> >> >>               status = "disabled";
> >> >>
> >> >>               ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
> >> >>                         0 1 0  0x80004000 0x00080000   /* Port 0 config space */
> >> >>                         0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
> >> >>                         0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
> >> >>                         0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
> >> >>                         0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
> >> >>
> >> >>                         1 0 0  0x80001000 0x00001000   /* Root port 1 */
> >> >>                         1 1 0  0x80004000 0x00080000   /* Port 1 config space */
> >> >>                         1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
> >> >>                         1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
> >> >>                         1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
> >> >>                         1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
> >> >>
> >> >>               #address-cells = <3>;
> >> >>               #size-cells = <1>;
> >> >>
> >> >>               pci@0 {
> >> >>                       reg = <0 0 0 0x1000>;
> >> >>                       status = "disabled";
> >> >>
> >> >>                       #address-cells = <3>;
> >> >>                       #size-cells = <2>;
> >> >>
> >> >>                       ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
> >> >>                                 0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
> >> >>                                 0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
> >> >>                                 0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
> >> >>                                 0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
> >> >>
> >> >>                       nvidia,ctrl-offset = <0x110>;
> >> >>                       nvidia,num-lanes = <2>;
> >> >>               };
> >> >>
> >> >>
> >> >>               pci@1 {
> >> >>                       reg = <1 0 0 0x1000>;
> >> >>                       status = "disabled";
> >> >>
> >> >>                       #address-cells = <3>;
> >> >>                       #size-cells = <2>;
> >> >>
> >> >>                       ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
> >> >>                                 0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
> >> >>                                 0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
> >> >>                                 0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
> >> >>                                 0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
> >> >>
> >> >>                       nvidia,ctrl-offset = <0x118>;
> >> >>                       nvidia,num-lanes = <2>;
> >> >>               };
> >>
> >> I'm not familiar with device tree, so pardon me if these are stupid
> >> questions.  These seem to be describing PCI host bridges.
> >
> > Yes, correct.
> >
> >> I assume some of these ranges describe MMIO, prefetchable MMIO, and
> >> I/O port apertures that the bridge forwards to the PCI bus.
> >
> > Yes.
> >
> >> Is there provision for any address offset applied by the bridge when
> >> it forwards downstream?  (Maybe these bridges don't apply any offset?)
> >>  "0xa0000000 0x08000000" appears for both ports 0 and 1 prefetchable
> >> memory; I don't know if that's an error, an indication that each
> >> bridge applies a different offset, or that both bridges forward the
> >> same aperture (I hope not the latter, because Linux can't really deal
> >> with that).
> >
> > That seems to be a typo. It should have been 0xa8000000 in the second
> > case. The PCIe controller has registers that are programmed with the
> > aperture for the configuration and extended configuration spaces, as
> > well as the downstream I/O, prefetchable and non-prefetchable memory
> > regions.
> >
> > The configuration spaces aren't actually forwarded by the bridges, but
> > are handled only by the controller.
> 
> Right.  The host bridge (controller) would convert memory accesses on
> the upstream side into PCI config accesses on its downstream PCI
> interface.
> 
> > The other apertures are programmed
> > into the bridges using standard PCI registers.
> 
> If you're talking about programming host bridge apertures, there are
> no "standard PCI registers" to do that.  The programming model of the
> host bridge itself is not part of the PCI spec.  PCI-to-PCI bridge
> apertures *are* specified by the PCI Bridge spec, so those are
> standard.

Each of the root ports has a PCI-compatible set of configuration
registers, so I guess what is meant is that the apertures a programmed
according to the PCI bridge specification.

> >> Is the bus number aperture included somewhere?  How do we know what
> >> bus numbers are available for allocation under each bridge?
> >
> > Not yet. I don't think DT imposes a bus number allocation on PCI
> > bridges. However the matching of DT nodes to PCI bridges is done based
> > on the bus number. For that you provide a bus-ranges property which
> > defines the bus aperture of the given PCI bridge. The DT matching code
> > compares the first cell of this property with the primary bus number of
> > the bridge.
> 
> I don't fully understand this, but I can tell you that things don't
> work very well if we don't know the aperture.  We can make
> assumptions, like the root bus is 00, and then enumerate everything
> reachable from there.  But then all we know is the largest bus number
> actually reachable from bus 00, which is usually smaller then the end
> of the aperture.  We don't know how many unused bus numbers there are
> in the aperture, so we can't safely allocate any for hot-added
> devices.

I think DT support for PCI is lacking in a lot of areas. PowerPC seems
to be the only architecture actively setting up busses according to the
bus-range property specified in the DT. All other architectures seem to
not pre-allocate bus apertures and go with the default instead. From
what you say that means hot-plugging is out. Although I don't think the
Tegra PCIe controller even supports hot-plugging.

> > This is in fact somewhat counter-productive because it requires you to
> > hard-code the PCI hierarchy within the DT and you loose a lot of the
> > probing functionality. This is not much of an issue for typical PCI
> > endpoints (like network cards) but becomes a problem if you have a PCI
> > endpoint that implements an I2C bus and you want to match I2C slaves on
> > that bus to device nodes so that the kernel can parse and instantiate
> > them properly. I guess in the latter cases hard-coding the PCI tree in
> > DT is not actually a drawback, though.
> >
> >> I see mention of config space.  Is some of that referring to the ECAM
> >> as in 7.2.2 of the PCIe spec v3.0 (what we refer to as MMCONFIG on
> >> x86)?
> >
> > I only have access to the PCIe 2.0 specification, but it seems to
> > contain the same 7.2.2 section. In fact it looks as though this
> > particular controller indeed implements something similar to ECAM. The
> > address mapping is a little different, though:
> >
> >        A[(16 + n - 1):16]: bus number
> >        A[15:11]: device number
> >        A[10:8]: function number
> >        A[7:2]: register number
> >        A[1:0]: unused
> >
> > That information comes directly from the driver and I have no access to
> > the corresponding documentation. It seems like the extended
> > configuration space is accessed using the same mapping but starting at a
> > different base address.
> >
> > But now that you mention it, maybe the driver code is buggy here, and
> > the controller indeed implements ECAM. Furthermore it also seems like
> > the current code doesn't work for the extended configuration space
> > because while the correct offset into the extended configuration space
> > is chosen, the access to registers is done using the same mapping as
> > above, so instead of the 12 (10) bits required for the register number
> > only 8 (6) are in fact used.
> >
> > I wonder whether anyone's actually tested this.
> >
> > Stephen: can you try to find out whether the Tegra PCIe controller
> > indeed implements ECAM, or if this scheme is actually just a proprietary
> > variant?
> 
> It'd be a real shame if they made up a proprietary scheme when ECAM
> seems to be required by the spec.  I'm interested because the current
> MMCONFIG code seems unnecessarily x86-specific, and I expect every
> arch with PCIe would want to use ECAM, so maybe there's some chance
> for a generic implementation.

Since this is pretty well specified by PCIe, a generic implementation
should be possible. We'll have to wait for some better documentation on
Tegra to know whether this is actually ECAM or not.

Thierry
Arnd Bergmann - June 22, 2012, 1:03 p.m.
On Friday 22 June 2012, Thierry Reding wrote:
> On Fri, Jun 22, 2012 at 05:46:52AM -0600, Bjorn Helgaas wrote:
> > On Fri, Jun 22, 2012 at 5:00 AM, Thierry Reding
> > <thierry.reding@avionic-design.de> wrote:
> > >> Is the bus number aperture included somewhere?  How do we know what
> > >> bus numbers are available for allocation under each bridge?
> > >
> > > Not yet. I don't think DT imposes a bus number allocation on PCI
> > > bridges. However the matching of DT nodes to PCI bridges is done based
> > > on the bus number. For that you provide a bus-ranges property which
> > > defines the bus aperture of the given PCI bridge. The DT matching code
> > > compares the first cell of this property with the primary bus number of
> > > the bridge.
> > 
> > I don't fully understand this, but I can tell you that things don't
> > work very well if we don't know the aperture.  We can make
> > assumptions, like the root bus is 00, and then enumerate everything
> > reachable from there.  But then all we know is the largest bus number
> > actually reachable from bus 00, which is usually smaller then the end
> > of the aperture.  We don't know how many unused bus numbers there are
> > in the aperture, so we can't safely allocate any for hot-added
> > devices.

I believe the assumption today is that all bus numbers are ok on each
root port, which is usually the case. There was something about a Power
Mac that faked a single PCI bus number space across both an AGP and
a PCIe domain in pmac_pci_fixup_u4_of_node(), but I think that is the
exception.

Is there any requirement to use a bus aperture on non-PC hardware?
 
> I think DT support for PCI is lacking in a lot of areas. PowerPC seems
> to be the only architecture actively setting up busses according to the
> bus-range property specified in the DT. All other architectures seem to
> not pre-allocate bus apertures and go with the default instead. From
> what you say that means hot-plugging is out. Although I don't think the
> Tegra PCIe controller even supports hot-plugging.

If the default is allowing any bus numbers, and that matches the
hardware capabilities, I think hot-plugging would work.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 22, 2012, 1:22 p.m.
On Fri, Jun 22, 2012 at 01:04:03PM +0200, Thierry Reding wrote:
> On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
> > On 6/19/2012 3:30 AM, Thierry Reding wrote:
> > >On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
> > >>On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
> > >>>On 06/14/2012 01:29 PM, Thierry Reding wrote:
> > >>>>On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
> > >>>>>On 06/14/2012 03:19 AM, Thierry Reding wrote:
> > >>>...
> > >>>>>>#address-cells = <1>; #size-cells = <1>;
> > >>>>>>
> > >>>>>>pci@80000000 {
> > >>>>>
> > >>>>>I'm still not convinced that using the address of the port's
> > >>>>>registers is the correct way to represent each port. The port
> > >>>>>index seems much more useful.
> > >>>>>
> > >>>>>The main reason here is that there are a lot of registers that
> > >>>>>contain fields for each port - far more than the combination of
> > >>>>>this node's reg and ctrl-offset (which I assume is an address
> > >>>>>offset for just one example of this issue) properties can
> > >>>>>describe. The bit position and bit stride of these fields isn't
> > >>>>>necessarily the same in each register. Do we want a property like
> > >>>>>ctrl-offset for every single type of field in every single shared
> > >>>>>register that describes the location of the relevant data, or
> > >>>>>just a single "port ID" bit that can be applied to anything?
> > >>>>>
> > >>>>>(Perhaps this isn't so obvious looking at the TRM since it
> > >>>>>doesn't document all registers, and I'm also looking at the
> > >>>>>Tegra30 documentation too, which might be more exposed to this -
> > >>>>>I haven't correlated all the documentation sources to be sure
> > >>>>>though)
> > >>>>
> > >>>>I agree that maybe adding properties for each bit position or
> > >>>>register offset may not work out too well. But I think it still
> > >>>>makes sense to use the base address of the port's registers (see
> > >>>>below). We could of course add some code to determine the index
> > >>>>from the base address at initialization time and reuse the index
> > >>>>where appropriate.
> > >>>
> > >>>To me, working back from address to ID then using the ID to calculate
> > >>>some other addresses seems far more icky than just calculating all the
> > >>>addresses based off of one ID. But, I suppose this doesn't make a huge
> > >>>practical difference.
> > >>
> > >>This really depends on the device vs. no device decision below. If we can
> > >>make it work without needing an extra device for it, then using the index
> > >>is certainly better. However, if we instantiate devices from the DT, then
> > >>we have the address anyway and adding the index as a property would be
> > >>redundant and error prone (what happens if somebody sets the index of the
> > >>port at address 0x80000000 to 2?).
> > >
> > >An additional problem with this is that we'd have to add the following
> > >to the pcie-controller node:
> > >
> > >	#address-cells = <1>;
> > >	#size-cells = <0>;
> > >
> > >This will conflict with the "ranges" property, because suddenly we can
> > >no longer map the regions properly. Maybe Mitch can comment on whether
> > >this is possible or not?
> > >
> > >To make it clearer what I'm talking about, here's the DT snippet again
> > >(with the compatible property removed from the pci@ nodes because they
> > >are no longer probed by a driver, the "simple-bus" removed from the
> > >pcie-controller node's compatible property removed and its #address-
> > >and #size-cells properties adjusted as described above).
> > >
> > >	pcie-controller {
> > >		compatible = "nvidia,tegra20-pcie";
> > >		reg = <0x80003000 0x00000800   /* PADS registers */
> > >		       0x80003800 0x00000200   /* AFI registers */
> > >		       0x80004000 0x00100000   /* configuration space */
> > >		       0x80104000 0x00100000>; /* extended configuration space */
> > >		interrupts = <0 98 0x04   /* controller interrupt */
> > >			      0 99 0x04>; /* MSI interrupt */
> > >		status = "disabled";
> > >
> > >		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
> > >			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
> > >			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
> > >			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
> > >
> > >		#address-cells = <1>;
> > >		#size-cells = <0>;
> > >
> > >		pci@0 {
> > >			reg = <2>;
> > >			status = "disabled";
> > >
> > >			#address-cells = <3>;
> > >			#size-cells = <2>;
> > >
> > >			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
> > >				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
> > >				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
> > >
> > >			nvidia,ctrl-offset = <0x110>;
> > >			nvidia,num-lanes = <2>;
> > >		};
> > >
> > >		pci@1 {
> > >			reg = <1>;
> > >			status = "disabled";
> > >
> > >			#address-cells = <3>;
> > >			#size-cells = <2>;
> > >
> > >			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
> > >				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
> > >				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
> > >
> > >			nvidia,ctrl-offset = <0x118>;
> > >			nvidia,num-lanes = <2>;
> > >		};
> > >	};
> > >
> > >AIUI none of the ranges properties are valid anymore, because the bus
> > >represented by pcie-controller no longer reflects the truth, namely that
> > >it translates the CPU address space to the PCI address space.
> > >
> > 
> > I think you can use a small-integer port number as an address by
> > defining the intermediate address space properly, and using
> > appropriate ranges above and below.  Here's a swag at how that would
> > look.
> > 
> > I present three versions, using different choices for the
> > intermediate address space encoding.  The intermediate address space
> > may seem somewhat artificial, in that it decouples the "linear pass
> > through of ranges" between the lower PCI address space and the upper
> > CPU address space.  But in another sense, it accurately reflects
> > that fact that the bus bridge "slices and dices" that linear address
> > space into non-contiguous pieces.
> > 
> > Note that I'm also fixing a problem that I neglected to mention
> > earlier - namely the fact that config space is part of the child PCI
> > address space so it must be passed through.
> > 
> > Version A - 3 address cells:  In this version, the intermediate
> > address space has 3 cells:  port#, address type, offset.  Address
> > type is
> >   0 : root port
> >   1 : config space
> >   2 : extended config space
> >   3 : I/O
> >   4 : non-prefetchable memory
> >   5 : prefetchable memory.
> > 
> > The third cell "offset" is necessary so that the size field has a
> > number space that can include it.
> > 
> > 	pcie-controller {
> > 		compatible = "nvidia,tegra20-pcie";
> > 		reg = <0x80003000 0x00000800   /* PADS registers */
> > 		       0x80003800 0x00000200>; /* extended configuration space */
> > 		interrupts = <0 98 0x04   /* controller interrupt */
> > 			      0 99 0x04>; /* MSI interrupt */
> > 		status = "disabled";
> > 
> > 		ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
> > 			  0 1 0  0x80004000 0x00080000   /* Port 0 config space */
> > 			  0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
> > 			  0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
> > 			  0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
> > 			  0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
> > 
> > 			  1 0 0  0x80001000 0x00001000   /* Root port 1 */
> > 			  1 1 0  0x80004000 0x00080000   /* Port 1 config space */
> > 			  1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
> > 			  1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
> > 			  1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
> > 			  1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
> > 
> > 		#address-cells = <3>;
> > 		#size-cells = <1>;
> > 
> > 		pci@0 {
> > 			reg = <0 0 0 0x1000>;
> [...]
> > 		};
> > 
> > 		pci@1 {
> > 			reg = <1 0 0 0x1000>;
> [...]
> > 		};
> 
> It seems like this isn't working properly. For some reason both the reg
> property of pci@0 and pci@1 are translated to the same parent address
> 0x80000000. I'll have to investigate where exactly this goes wrong.

So it turns out that while of_read_number() can actually read any number
of cells, only the two least significant are kept because it returns a
u64. Since of_bus_default_map() uses of_read_number() the addresses
<0 0 0> and <1 0 0> are in fact the same. I'm not sure how best to solve
this. I'm not aware of a 128 bit integer type in the kernel, so I guess
the better alternative would be to fix of_bus_default_map() to cope with
#address-cells > 2 properly.

Thierry
Arnd Bergmann - June 22, 2012, 1:48 p.m.
On Friday 22 June 2012, Thierry Reding wrote:
> > It seems like this isn't working properly. For some reason both the reg
> > property of pci@0 and pci@1 are translated to the same parent address
> > 0x80000000. I'll have to investigate where exactly this goes wrong.
> 
> So it turns out that while of_read_number() can actually read any number
> of cells, only the two least significant are kept because it returns a
> u64. Since of_bus_default_map() uses of_read_number() the addresses
> <0 0 0> and <1 0 0> are in fact the same. I'm not sure how best to solve
> this. I'm not aware of a 128 bit integer type in the kernel, so I guess
> the better alternative would be to fix of_bus_default_map() to cope with
> #address-cells > 2 properly.

of_translate_address should get it right. Which codes uses of_read_number()?
Can it be converted to use of_translate_address()?

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 22, 2012, 2:02 p.m.
On Fri, Jun 22, 2012 at 01:48:39PM +0000, Arnd Bergmann wrote:
> On Friday 22 June 2012, Thierry Reding wrote:
> > > It seems like this isn't working properly. For some reason both the reg
> > > property of pci@0 and pci@1 are translated to the same parent address
> > > 0x80000000. I'll have to investigate where exactly this goes wrong.
> > 
> > So it turns out that while of_read_number() can actually read any number
> > of cells, only the two least significant are kept because it returns a
> > u64. Since of_bus_default_map() uses of_read_number() the addresses
> > <0 0 0> and <1 0 0> are in fact the same. I'm not sure how best to solve
> > this. I'm not aware of a 128 bit integer type in the kernel, so I guess
> > the better alternative would be to fix of_bus_default_map() to cope with
> > #address-cells > 2 properly.
> 
> of_translate_address should get it right. Which codes uses of_read_number()?
> Can it be converted to use of_translate_address()?

Actually this is from of_translate_address(). The calling sequence looks
like this:

	of_address_to_resource()
	  __of_address_to_resource()
	    of_translate_address()
	      __of_translate_address()
	        of_translate_one()
	          of_bus_default_map()
	            of_read_number()

Thierry
Stephen Warren - June 22, 2012, 4:20 p.m.
On 06/21/2012 12:47 AM, Thierry Reding wrote:
...
> Everybody seems to be happy with this approach, so I'll give it a
> shot. There is one thing I'm still unsure about, though. What if
> somebody uses the above scheme and maps the registers to the wrong
> port. The same goes for the nvidia,ctrl-offset property. It needs
> to match the register offset because they are directly related. I
> suppose we could leave that property away and look up the register
> via the port index (which, as Stephen already said, we'll have to
> do in other places anyway, unless we list all bit positions in the
> DT).
> 
> Can we safely ignore such issues and assume the device tree to
> always be right? Should we just not care if somebody uses it
> wrongly?

I think that's pretty much the same thing as plain putting the wrong
reg property into any node - the value is wrong, so it doesn't work.
There's not too much you can do about it.

I'd be happy to remove the nvidia,ctrl-offset property to avoid the
need to specify basically the same information multiple times though;
nothing wrong with making it easier to write the correct DT content.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 22, 2012, 4:40 p.m.
On Friday 22 June 2012, Thierry Reding wrote:
> Actually this is from of_translate_address(). The calling sequence looks
> like this:
> 
>         of_address_to_resource()
>           __of_address_to_resource()
>             of_translate_address()
>               __of_translate_address()
>                 of_translate_one()
>                   of_bus_default_map()
>                     of_read_number()

Ok, I see.

This looks like a problem in of_bus_default_map, which expects to
see only 64 bit addresses at most. of_bus_pci_map by comparison
handles the pci addresses with three cells.

If I read this correctly, this fix is to make sure we compare the
upper cells of the address in of_bus_default_map:

static u64 of_bus_default_map(u32 *addr, const __be32 *range,
                int na, int ns, int pna)
{
        u64 cp, s, da;

        cp = of_read_number(range, na);
        s  = of_read_number(range + na + pna, ns);
        da = of_read_number(addr, na);

        pr_debug("OF: default map, cp=%llx, s=%llx, da=%llx\n",
                 (unsigned long long)cp, (unsigned long long)s,
                 (unsigned long long)da);

        if (da < cp || da >= (cp + s))
                return OF_BAD_ADDR;
        return da - cp;
}

How about adding code like:

	if ((na > 2) && memcmp(range, addr, na * 4) != 0)
		return OF_BAD_ADDR;

This won't handle entries with #size-cells>2 or those that
span a 64-bit boundary, but I think it will work for all
relevant cases.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas - June 22, 2012, 4:49 p.m.
On Fri, Jun 22, 2012 at 7:03 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Friday 22 June 2012, Thierry Reding wrote:
>> On Fri, Jun 22, 2012 at 05:46:52AM -0600, Bjorn Helgaas wrote:
>> > On Fri, Jun 22, 2012 at 5:00 AM, Thierry Reding
>> > <thierry.reding@avionic-design.de> wrote:
>> > >> Is the bus number aperture included somewhere?  How do we know what
>> > >> bus numbers are available for allocation under each bridge?
>> > >
>> > > Not yet. I don't think DT imposes a bus number allocation on PCI
>> > > bridges. However the matching of DT nodes to PCI bridges is done based
>> > > on the bus number. For that you provide a bus-ranges property which
>> > > defines the bus aperture of the given PCI bridge. The DT matching code
>> > > compares the first cell of this property with the primary bus number of
>> > > the bridge.
>> >
>> > I don't fully understand this, but I can tell you that things don't
>> > work very well if we don't know the aperture.  We can make
>> > assumptions, like the root bus is 00, and then enumerate everything
>> > reachable from there.  But then all we know is the largest bus number
>> > actually reachable from bus 00, which is usually smaller then the end
>> > of the aperture.  We don't know how many unused bus numbers there are
>> > in the aperture, so we can't safely allocate any for hot-added
>> > devices.
>
> I believe the assumption today is that all bus numbers are ok on each
> root port, which is usually the case. There was something about a Power
> Mac that faked a single PCI bus number space across both an AGP and
> a PCIe domain in pmac_pci_fixup_u4_of_node(), but I think that is the
> exception.
>
> Is there any requirement to use a bus aperture on non-PC hardware?

The requirement (if there is one) isn't anything related to PC-ness.
I just don't understand how things can actually work if two host
bridges both claim the same bus number.  If we do a config read to
that bus, both bridges should claim it and turn it into config cycles
on their respective root buses, and we should get two responses.  I
would expect the second response to cause an "unexpected response"
machine check or similar.

Beyond that, Yinghai's recent "busn-alloc" work (now in my -next
branch) tracks bus number assignments using a struct resource tree,
and that obviously won't work with overlaps.  Here's one of the
relevant commits:
http://git.kernel.org/?p=linux/kernel/git/helgaas/pci.git;a=commitdiff;h=5cc62c202211096ec26309722ec27455d52c8726

>> I think DT support for PCI is lacking in a lot of areas. PowerPC seems
>> to be the only architecture actively setting up busses according to the
>> bus-range property specified in the DT. All other architectures seem to
>> not pre-allocate bus apertures and go with the default instead. From
>> what you say that means hot-plugging is out. Although I don't think the
>> Tegra PCIe controller even supports hot-plugging.
>
> If the default is allowing any bus numbers, and that matches the
> hardware capabilities, I think hot-plugging would work.
>
>        Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 22, 2012, 4:53 p.m.
On Friday 22 June 2012, Bjorn Helgaas wrote:
> The requirement (if there is one) isn't anything related to PC-ness.
> I just don't understand how things can actually work if two host
> bridges both claim the same bus number.  If we do a config read to
> that bus, both bridges should claim it and turn it into config cycles
> on their respective root buses, and we should get two responses.  I
> would expect the second response to cause an "unexpected response"
> machine check or similar.

But each PCI domain has its own config space, so if you do a config
read for one bus, it's always relative to the domain.

The part that is specific to PCs is that they don't have PCI domains
normally.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 22, 2012, 5 p.m.
On 06/22/2012 05:00 AM, Thierry Reding wrote:
...
> Stephen: can you try to find out whether the Tegra PCIe controller 
> indeed implements ECAM, or if this scheme is actually just a
> proprietary variant?

Sure. I have added this request to the bug I filed requesting more
complete PCIe host documentation.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mitch Bradley - June 22, 2012, 5:09 p.m.
On 6/20/2012 8:47 PM, Thierry Reding wrote:
> On Tue, Jun 19, 2012 at 11:31:39AM -1000, Mitch Bradley wrote:
>> On 6/19/2012 3:30 AM, Thierry Reding wrote:
>>> On Fri, Jun 15, 2012 at 08:12:36AM +0200, Thierry Reding wrote:
>>>> On Thu, Jun 14, 2012 at 01:50:56PM -0600, Stephen Warren wrote:
>>>>> On 06/14/2012 01:29 PM, Thierry Reding wrote:
>>>>>> On Thu, Jun 14, 2012 at 12:30:50PM -0600, Stephen Warren wrote:
>>>>>>> On 06/14/2012 03:19 AM, Thierry Reding wrote:
>>>>> ...
>>>>>>>> #address-cells = <1>; #size-cells = <1>;
>>>>>>>>
>>>>>>>> pci@80000000 {
>>>>>>>
>>>>>>> I'm still not convinced that using the address of the port's
>>>>>>> registers is the correct way to represent each port. The port
>>>>>>> index seems much more useful.
>>>>>>>
>>>>>>> The main reason here is that there are a lot of registers that
>>>>>>> contain fields for each port - far more than the combination of
>>>>>>> this node's reg and ctrl-offset (which I assume is an address
>>>>>>> offset for just one example of this issue) properties can
>>>>>>> describe. The bit position and bit stride of these fields isn't
>>>>>>> necessarily the same in each register. Do we want a property like
>>>>>>> ctrl-offset for every single type of field in every single shared
>>>>>>> register that describes the location of the relevant data, or
>>>>>>> just a single "port ID" bit that can be applied to anything?
>>>>>>>
>>>>>>> (Perhaps this isn't so obvious looking at the TRM since it
>>>>>>> doesn't document all registers, and I'm also looking at the
>>>>>>> Tegra30 documentation too, which might be more exposed to this -
>>>>>>> I haven't correlated all the documentation sources to be sure
>>>>>>> though)
>>>>>>
>>>>>> I agree that maybe adding properties for each bit position or
>>>>>> register offset may not work out too well. But I think it still
>>>>>> makes sense to use the base address of the port's registers (see
>>>>>> below). We could of course add some code to determine the index
>>>>> >from the base address at initialization time and reuse the index
>>>>>> where appropriate.
>>>>>
>>>>> To me, working back from address to ID then using the ID to calculate
>>>>> some other addresses seems far more icky than just calculating all the
>>>>> addresses based off of one ID. But, I suppose this doesn't make a huge
>>>>> practical difference.
>>>>
>>>> This really depends on the device vs. no device decision below. If we can
>>>> make it work without needing an extra device for it, then using the index
>>>> is certainly better. However, if we instantiate devices from the DT, then
>>>> we have the address anyway and adding the index as a property would be
>>>> redundant and error prone (what happens if somebody sets the index of the
>>>> port at address 0x80000000 to 2?).
>>>
>>> An additional problem with this is that we'd have to add the following
>>> to the pcie-controller node:
>>>
>>> 	#address-cells = <1>;
>>> 	#size-cells = <0>;
>>>
>>> This will conflict with the "ranges" property, because suddenly we can
>>> no longer map the regions properly. Maybe Mitch can comment on whether
>>> this is possible or not?
>>>
>>> To make it clearer what I'm talking about, here's the DT snippet again
>>> (with the compatible property removed from the pci@ nodes because they
>>> are no longer probed by a driver, the "simple-bus" removed from the
>>> pcie-controller node's compatible property removed and its #address-
>>> and #size-cells properties adjusted as described above).
>>>
>>> 	pcie-controller {
>>> 		compatible = "nvidia,tegra20-pcie";
>>> 		reg = <0x80003000 0x00000800   /* PADS registers */
>>> 		       0x80003800 0x00000200   /* AFI registers */
>>> 		       0x80004000 0x00100000   /* configuration space */
>>> 		       0x80104000 0x00100000>; /* extended configuration space */
>>> 		interrupts = <0 98 0x04   /* controller interrupt */
>>> 			      0 99 0x04>; /* MSI interrupt */
>>> 		status = "disabled";
>>>
>>> 		ranges = <0x80000000 0x80000000 0x00002000   /* 2 root ports */
>>> 			  0x80400000 0x80400000 0x00010000   /* downstream I/O */
>>> 			  0x90000000 0x90000000 0x10000000   /* non-prefetchable memory */
>>> 			  0xa0000000 0xa0000000 0x10000000>; /* prefetchable memory */
>>>
>>> 		#address-cells = <1>;
>>> 		#size-cells = <0>;
>>>
>>> 		pci@0 {
>>> 			reg = <2>;
>>> 			status = "disabled";
>>>
>>> 			#address-cells = <3>;
>>> 			#size-cells = <2>;
>>>
>>> 			ranges = <0x81000000 0 0 0x80400000 0 0x00008000   /* I/O */
>>> 				  0x82000000 0 0 0x90000000 0 0x08000000   /* non-prefetchable memory */
>>> 				  0xc2000000 0 0 0xa0000000 0 0x08000000>; /* prefetchable memory */
>>>
>>> 			nvidia,ctrl-offset = <0x110>;
>>> 			nvidia,num-lanes = <2>;
>>> 		};
>>>
>>> 		pci@1 {
>>> 			reg = <1>;
>>> 			status = "disabled";
>>>
>>> 			#address-cells = <3>;
>>> 			#size-cells = <2>;
>>>
>>> 			ranges = <0x81000000 0 0 0x80408000 0 0x00008000   /* I/O */
>>> 				  0x82000000 0 0 0x98000000 0 0x08000000   /* non-prefetchable memory */
>>> 				  0xc2000000 0 0 0xa8000000 0 0x08000000>; /* prefetchable memory */
>>>
>>> 			nvidia,ctrl-offset = <0x118>;
>>> 			nvidia,num-lanes = <2>;
>>> 		};
>>> 	};
>>>
>>> AIUI none of the ranges properties are valid anymore, because the bus
>>> represented by pcie-controller no longer reflects the truth, namely that
>>> it translates the CPU address space to the PCI address space.
>>>
>>
>> I think you can use a small-integer port number as an address by
>> defining the intermediate address space properly, and using
>> appropriate ranges above and below.  Here's a swag at how that would
>> look.
>>
>> I present three versions, using different choices for the
>> intermediate address space encoding.  The intermediate address space
>> may seem somewhat artificial, in that it decouples the "linear pass
>> through of ranges" between the lower PCI address space and the upper
>> CPU address space.  But in another sense, it accurately reflects
>> that fact that the bus bridge "slices and dices" that linear address
>> space into non-contiguous pieces.
>>
>> Note that I'm also fixing a problem that I neglected to mention
>> earlier - namely the fact that config space is part of the child PCI
>> address space so it must be passed through.
>
> This doesn't quite match how the Tegra PCIe controller works. Basically,
> every access to a device's (extended) configuration space goes through a
> single region of memory-mapped registers, independent of which port is
> the parent of the device.
>
> Is it okay to pass both configuration spaces via the ranges property but
> still list them in the reg property of the controller?


It's a small hierarchy violation, but I expect that it would work in 
practice.

>
>> Version A - 3 address cells:  In this version, the intermediate
>> address space has 3 cells:  port#, address type, offset.  Address
>> type is
>>    0 : root port
>>    1 : config space
>>    2 : extended config space
>>    3 : I/O
>>    4 : non-prefetchable memory
>>    5 : prefetchable memory.
>>
>> The third cell "offset" is necessary so that the size field has a
>> number space that can include it.
>>
>> 	pcie-controller {
>> 		compatible = "nvidia,tegra20-pcie";
>> 		reg = <0x80003000 0x00000800   /* PADS registers */
>> 		       0x80003800 0x00000200>; /* extended configuration space */
>> 		interrupts = <0 98 0x04   /* controller interrupt */
>> 			      0 99 0x04>; /* MSI interrupt */
>> 		status = "disabled";
>>
>> 		ranges = <0 0 0  0x80000000 0x00001000   /* Root port 0 */
>> 			  0 1 0  0x80004000 0x00080000   /* Port 0 config space */
>> 			  0 2 0  0x80104000 0x00080000   /* Port 0 ext config space *
>> 			  0 3 0  0x80400000 0x00008000   /* Port 0 downstream I/O */
>> 			  0 4 0  0x90000000 0x08000000   /* Port 0 non-prefetchable memory */
>> 			  0 5 0  0xa0000000 0x08000000   /* Port 0 prefetchable memory */
>>
>> 			  1 0 0  0x80001000 0x00001000   /* Root port 1 */
>> 			  1 1 0  0x80004000 0x00080000   /* Port 1 config space */
>> 			  1 2 0  0x80184000 0x00080000   /* Port 1 ext config space */
>> 			  1 3 0  0x80408000 0x00010000   /* Port 1 downstream I/O */
>> 			  1 4 0  0x98000000 0x08000000   /* Port 1 non-prefetchable memory */
>> 			  1 5 0  0xa0000000 0x08000000>; /* Port 1 prefetchable memory */
>>
>> 		#address-cells = <3>;
>> 		#size-cells = <1>;
>>
>> 		pci@0 {
>> 			reg = <0 0 0 0x1000>;
>> 			status = "disabled";
>>
>> 			#address-cells = <3>;
>> 			#size-cells = <2>;
>>
>> 			ranges = <0x80000000 0 0  0 1 0  0 0x00080000   /* config */
>> 				  0x90000000 0 0  0 2 0  0 0x00080000   /* extended config */
>> 				  0x81000000 0 0  0 3 0  0 0x00008000   /* I/O */
>> 				  0x82000000 0 0  0 4 0  0 0x08000000   /* non-prefetchable memory */
>> 				  0xc2000000 0 0  0 5 0  0 0x08000000>; /* prefetchable memory */
>>
>> 			nvidia,ctrl-offset = <0x110>;
>> 			nvidia,num-lanes = <2>;
>> 		};
>>
>>
>> 		pci@1 {
>> 			reg = <1 0 0 0x1000>;
>> 			status = "disabled";
>>
>> 			#address-cells = <3>;
>> 			#size-cells = <2>;
>>
>> 			ranges = <0x80000000 0 0  1 1 0  0 0x00080000   /* config */
>> 				  0x90000000 0 0  1 2 0  0 0x00080000   /* extended config */
>> 				  0x81000000 0 0  1 3 0  0 0x00008000   /* I/O */
>> 				  0x82000000 0 0  1 4 0  0 0x08000000   /* non-prefetchable memory */
>> 				  0xc2000000 0 0  1 5 0  0 0x08000000>; /* prefetchable memory */
>>
>> 			nvidia,ctrl-offset = <0x118>;
>> 			nvidia,num-lanes = <2>;
>> 		};
>
> Everybody seems to be happy with this approach, so I'll give it a shot.
> There is one thing I'm still unsure about, though. What if somebody uses
> the above scheme and maps the registers to the wrong port. The same goes
> for the nvidia,ctrl-offset property. It needs to match the register
> offset because they are directly related. I suppose we could leave that
> property away and look up the register via the port index (which, as
> Stephen already said, we'll have to do in other places anyway, unless we
> list all bit positions in the DT).
>
> Can we safely ignore such issues and assume the device tree to always be
> right? Should we just not care if somebody uses it wrongly?


To the extent that the device tree is the definitive answer about the 
system addressing, if you get it wrong, either the system will not work 
or the OS isn't paying attention to that part of the device tree, 
hardcoding the right answer.

>
> Thierry
>

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas - June 22, 2012, 5:13 p.m.
On Fri, Jun 22, 2012 at 10:53 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Friday 22 June 2012, Bjorn Helgaas wrote:
>> The requirement (if there is one) isn't anything related to PC-ness.
>> I just don't understand how things can actually work if two host
>> bridges both claim the same bus number.  If we do a config read to
>> that bus, both bridges should claim it and turn it into config cycles
>> on their respective root buses, and we should get two responses.  I
>> would expect the second response to cause an "unexpected response"
>> machine check or similar.
>
> But each PCI domain has its own config space, so if you do a config
> read for one bus, it's always relative to the domain.

Oh, I see!  I totally missed the fact that each host bridge was in its
own domain.  If each has its own domain, then the bus number aperture
can certainly be [bus 00-ff] and there's no problem.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 22, 2012, 5:14 p.m.
On Friday 22 June 2012, Arnd Bergmann wrote:
> The part that is specific to PCs is that they don't have PCI domains
> normally.

Scratch that, PCs have moved on beyond that long ago, I just wasn't
aware that they also do that now. That was only true before mmconfig.

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Warren - June 22, 2012, 5:28 p.m.
On 06/22/2012 11:00 AM, Stephen Warren wrote:
> On 06/22/2012 05:00 AM, Thierry Reding wrote:
> ...
>> Stephen: can you try to find out whether the Tegra PCIe controller 
>> indeed implements ECAM, or if this scheme is actually just a
>> proprietary variant?
> 
> Sure. I have added this request to the bug I filed requesting more
> complete PCIe host documentation.

I've received unofficial confirmation that we do indeed implement a
non-standard/non-ECAM mapping, and what's in our driver matches our HW.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - June 22, 2012, 9:08 p.m.
On Friday 22 June 2012, Bjorn Helgaas wrote:
> 
> On Fri, Jun 22, 2012 at 10:53 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Friday 22 June 2012, Bjorn Helgaas wrote:
> >> The requirement (if there is one) isn't anything related to PC-ness.
> >> I just don't understand how things can actually work if two host
> >> bridges both claim the same bus number.  If we do a config read to
> >> that bus, both bridges should claim it and turn it into config cycles
> >> on their respective root buses, and we should get two responses.  I
> >> would expect the second response to cause an "unexpected response"
> >> machine check or similar.
> >
> > But each PCI domain has its own config space, so if you do a config
> > read for one bus, it's always relative to the domain.
> 
> Oh, I see!  I totally missed the fact that each host bridge was in its
> own domain.  If each has its own domain, then the bus number aperture
> can certainly be [bus 00-ff] and there's no problem.

Right. On my side, it took me a while to figure out that you can
actually have multiple root ports in one domain ;-)

	Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas - June 23, 2012, 9:35 p.m.
On Fri, Jun 22, 2012 at 11:28 AM, Stephen Warren <swarren@wwwdotorg.org> wrote:
> On 06/22/2012 11:00 AM, Stephen Warren wrote:
>> On 06/22/2012 05:00 AM, Thierry Reding wrote:
>> ...
>>> Stephen: can you try to find out whether the Tegra PCIe controller
>>> indeed implements ECAM, or if this scheme is actually just a
>>> proprietary variant?
>>
>> Sure. I have added this request to the bug I filed requesting more
>> complete PCIe host documentation.
>
> I've received unofficial confirmation that we do indeed implement a
> non-standard/non-ECAM mapping, and what's in our driver matches our HW.

Interesting.  A generic ECAM/MMCONFIG solution might still be worth
thinking about.  We could easily have a hook to encapsulate the
address mapping function, while still sharing the resource management
(request_region(), etc).  Not any kind of requirement for the current
patch series, of course, just possible future work.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 25, 2012, 6:34 a.m.
On Fri, Jun 22, 2012 at 11:28:15AM -0600, Stephen Warren wrote:
> On 06/22/2012 11:00 AM, Stephen Warren wrote:
> > On 06/22/2012 05:00 AM, Thierry Reding wrote:
> > ...
> >> Stephen: can you try to find out whether the Tegra PCIe controller 
> >> indeed implements ECAM, or if this scheme is actually just a
> >> proprietary variant?
> > 
> > Sure. I have added this request to the bug I filed requesting more
> > complete PCIe host documentation.
> 
> I've received unofficial confirmation that we do indeed implement a
> non-standard/non-ECAM mapping, and what's in our driver matches our HW.

What I don't quite see yet is how the extended configuration space is
supposed to work with the current driver. The PCIE_CONF_* macros don't
provide for registers >= 256. Passing a value higher than that will mess
with the device function field.

Thierry
Stephen Warren - June 26, 2012, 5:22 p.m.
On 06/25/2012 12:34 AM, Thierry Reding wrote:
> On Fri, Jun 22, 2012 at 11:28:15AM -0600, Stephen Warren wrote:
>> On 06/22/2012 11:00 AM, Stephen Warren wrote:
>>> On 06/22/2012 05:00 AM, Thierry Reding wrote: ...
>>>> Stephen: can you try to find out whether the Tegra PCIe
>>>> controller indeed implements ECAM, or if this scheme is
>>>> actually just a proprietary variant?
>>> 
>>> Sure. I have added this request to the bug I filed requesting
>>> more complete PCIe host documentation.
>> 
>> I've received unofficial confirmation that we do indeed implement
>> a non-standard/non-ECAM mapping, and what's in our driver matches
>> our HW.
> 
> What I don't quite see yet is how the extended configuration space
> is supposed to work with the current driver. The PCIE_CONF_* macros
> don't provide for registers >= 256. Passing a value higher than
> that will mess with the device function field.

The current downstream driver is incorrect for that case. We should be
updating it (at least, I filed a bug to do this).

Here's how the HW works (I believe this information should be in some
future version of the TRM):

There are 3 address spaces:

* The CPU bus.

* An internal (to the PCIe controller) 40-bit address space.
Apparently the layout is HyperTransport-based. Whether HT defines
this, or whether our HW engineers are referring to our particular
implementation of HT, I'm not sure.

* The PCIe external bus.

Accesses from the CPU to the PCIe controller's 1GB aperture are mapped
into the 40-bit bus using the BAR configurations in the PCIe
controller; I believe this is what tegra_pcie_setup_translations() is
configuring in our downstream driver.

Accesses are then mapped from this 40-bit bus to the external 32-bit
bus, I believe using a hard-coded mapping, which I believe may be
inherited from (our implementation of?) HyperTransport

For config and extended config accesses the mapping from the internal
to external bus is as follows:

In the case of PCICFG space,
addr[39:28]=12'hFDF,
addr[27:24]=4'hE means type 0 and addr[27:24]=4'hF means type 1,
addr[23:16]=bus number
addr[15:11]=device number
addr[10:8]=function number
addr[7:0]=register number

In the case of EXTCFG space,
addr[39:32]=8'hFE,
addr[31:28]=4'h0 means type 0 and addr[31:28]=4'h1 means type 1,
addr[27:24]=upper register number (i.e. register number[11:8])
addr[23:16]=bus number
addr[15:11]=device number
addr[10:8]=function number
addr[7:0]=register number (i.e. register number[7:0])

(in actual fact, the HW matches the top 16 or 12 bits to determine
config/ext-config and transaction type, so the top two fields in the
lists above should really be considered merged)

I hope this helps!
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Thierry Reding - June 27, 2012, 6:19 a.m.
On Tue, Jun 26, 2012 at 11:22:32AM -0600, Stephen Warren wrote:
> On 06/25/2012 12:34 AM, Thierry Reding wrote:
> > On Fri, Jun 22, 2012 at 11:28:15AM -0600, Stephen Warren wrote:
> >> On 06/22/2012 11:00 AM, Stephen Warren wrote:
> >>> On 06/22/2012 05:00 AM, Thierry Reding wrote: ...
> >>>> Stephen: can you try to find out whether the Tegra PCIe
> >>>> controller indeed implements ECAM, or if this scheme is
> >>>> actually just a proprietary variant?
> >>> 
> >>> Sure. I have added this request to the bug I filed requesting
> >>> more complete PCIe host documentation.
> >> 
> >> I've received unofficial confirmation that we do indeed implement
> >> a non-standard/non-ECAM mapping, and what's in our driver matches
> >> our HW.
> > 
> > What I don't quite see yet is how the extended configuration space
> > is supposed to work with the current driver. The PCIE_CONF_* macros
> > don't provide for registers >= 256. Passing a value higher than
> > that will mess with the device function field.
> 
> The current downstream driver is incorrect for that case. We should be
> updating it (at least, I filed a bug to do this).
> 
> Here's how the HW works (I believe this information should be in some
> future version of the TRM):

Definitely. I'm having a hard time grasping all of this by just looking
at the code. Some sort of functional description (like what you provide
here) would be really useful.

> There are 3 address spaces:
> 
> * The CPU bus.
> 
> * An internal (to the PCIe controller) 40-bit address space.
> Apparently the layout is HyperTransport-based. Whether HT defines
> this, or whether our HW engineers are referring to our particular
> implementation of HT, I'm not sure.
> 
> * The PCIe external bus.
> 
> Accesses from the CPU to the PCIe controller's 1GB aperture are mapped
> into the 40-bit bus using the BAR configurations in the PCIe
> controller; I believe this is what tegra_pcie_setup_translations() is
> configuring in our downstream driver.
> 
> Accesses are then mapped from this 40-bit bus to the external 32-bit
> bus, I believe using a hard-coded mapping, which I believe may be
> inherited from (our implementation of?) HyperTransport

This seems to be what is referred to as the FPCI in the TRM and the
driver.

> For config and extended config accesses the mapping from the internal
> to external bus is as follows:
> 
> In the case of PCICFG space,
> addr[39:28]=12'hFDF,
> addr[27:24]=4'hE means type 0 and addr[27:24]=4'hF means type 1,
> addr[23:16]=bus number
> addr[15:11]=device number
> addr[10:8]=function number
> addr[7:0]=register number
> 
> In the case of EXTCFG space,
> addr[39:32]=8'hFE,
> addr[31:28]=4'h0 means type 0 and addr[31:28]=4'h1 means type 1,
> addr[27:24]=upper register number (i.e. register number[11:8])
> addr[23:16]=bus number
> addr[15:11]=device number
> addr[10:8]=function number
> addr[7:0]=register number (i.e. register number[7:0])
> 
> (in actual fact, the HW matches the top 16 or 12 bits to determine
> config/ext-config and transaction type, so the top two fields in the
> lists above should really be considered merged)

Yes, this actually matches with the driver. Config space accesses are
mapped to 0xfdff0000 and ext. config space accesses go to 0xfe100000.

> I hope this helps!

I don't know how this works out for what Bjorn had in mind, but at least
it'll allow the driver to be fixed for extended config space accesses.

Thierry

Patch

diff --git a/Documentation/devicetree/bindings/pci/tegra-pcie.txt b/Documentation/devicetree/bindings/pci/tegra-pcie.txt
new file mode 100644
index 0000000..10bbc40
--- /dev/null
+++ b/Documentation/devicetree/bindings/pci/tegra-pcie.txt
@@ -0,0 +1,23 @@ 
+NVIDIA Tegra PCIe controller
+
+Required properties:
+- compatible: "nvidia,tegra20-pcie"
+- reg: physical base address and length of the controller's registers
+- interrupts: the interrupt outputs of the controller
+
+Optional properties:
+- pex-clk-supply: supply voltage for internal reference clock
+- vdd-supply: power supply for controller (1.05V)
+
+Example:
+
+	pci@80000000 {
+		compatible = "nvidia,tegra20-pcie",
+		reg = <0x80000000 0x400000   /* PCIe AFI registers */
+		       0x80400000 0x010000>; /* PCI I/O space */
+		interrupts = < 0 98 0x04    /* controller interrupt */
+		               0 99 0x04 >; /* MSI interrupt */
+
+		pex-clk-supply = <&ldo0_reg>;
+		vdd-supply = <&pcie_reg>;
+	};
diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index fa56519..b212658 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -199,6 +199,15 @@ 
 		#size-cells = <0>;
 	};
 
+	pci {
+		compatible = "nvidia,tegra20-pcie";
+		reg = <0x80000000 0x400000   /* PCIe AFI registers */
+		       0x80400000 0x010000>; /* PCI I/O space */
+		interrupts = < 0 98 0x04    /* controller interrupt */
+		               0 99 0x04 >; /* MSI interrupt */
+		status = "disable";
+	};
+
 	usb@c5000000 {
 		compatible = "nvidia,tegra20-ehci", "usb-ehci";
 		reg = <0xc5000000 0x4000>;
diff --git a/arch/arm/mach-tegra/pcie.c b/arch/arm/mach-tegra/pcie.c
index 3c8c953..f24923b 100644
--- a/arch/arm/mach-tegra/pcie.c
+++ b/arch/arm/mach-tegra/pcie.c
@@ -37,6 +37,8 @@ 
 #include <linux/delay.h>
 #include <linux/export.h>
 #include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/regulator/consumer.h>
 
 #include <asm/sizes.h>
 #include <asm/mach/irq.h>
@@ -242,6 +244,9 @@  struct tegra_pcie_info {
 	struct clk		*pll_e;
 
 	struct tegra_pcie_msi	*msi;
+
+	struct regulator *pex_clk_supply;
+	struct regulator *vdd_supply;
 };
 
 static inline struct tegra_pcie_info *sys_to_pcie(struct pci_sys_data *sys)
@@ -1194,6 +1199,99 @@  static int tegra_pcie_disable_msi(struct platform_device *pdev)
 	return 0;
 }
 
+static int tegra_pcie_dt_init(struct platform_device *pdev)
+{
+	struct tegra_pcie_info *pcie = platform_get_drvdata(pdev);
+	int err;
+
+	if (!IS_ERR_OR_NULL(pcie->vdd_supply)) {
+		err = regulator_enable(pcie->vdd_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to enable VDD regulator: %d\n", err);
+			return err;
+		}
+	}
+
+	if (!IS_ERR_OR_NULL(pcie->pex_clk_supply)) {
+		err = regulator_enable(pcie->pex_clk_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to enable pex-clk regulator: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static int tegra_pcie_dt_exit(struct platform_device *pdev)
+{
+	struct tegra_pcie_info *pcie = platform_get_drvdata(pdev);
+	int err;
+
+	if (!IS_ERR_OR_NULL(pcie->pex_clk_supply)) {
+		err = regulator_disable(pcie->pex_clk_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to disable pex-clk regulator: %d\n",
+				err);
+			return err;
+		}
+	}
+
+	if (!IS_ERR_OR_NULL(pcie->vdd_supply)) {
+		err = regulator_disable(pcie->vdd_supply);
+		if (err < 0) {
+			dev_err(&pdev->dev,
+				"failed to disable VDD regulator: %d\n", err);
+			return err;
+		}
+	}
+
+	return 0;
+}
+
+static struct tegra_pcie_pdata *tegra_pcie_parse_dt(struct platform_device *pdev)
+{
+	struct tegra_pcie_info *pcie = platform_get_drvdata(pdev);
+	struct device_node *node = pdev->dev.of_node;
+	struct tegra_pcie_pdata *pdata;
+	unsigned int i;
+
+	pdata = devm_kzalloc(&pdev->dev, sizeof(*pdata), GFP_KERNEL);
+	if (!pdata)
+		return NULL;
+
+	pdata->init = tegra_pcie_dt_init;
+	pdata->exit = tegra_pcie_dt_exit;
+
+	if (of_find_property(node, "vdd-supply", NULL)) {
+		pcie->vdd_supply = regulator_get(&pdev->dev, "vdd");
+		if (IS_ERR_OR_NULL(pcie->vdd_supply))
+			return ERR_PTR(-EPROBE_DEFER);
+	}
+
+	if (of_find_property(node, "pex-clk-supply", NULL)) {
+		pcie->pex_clk_supply = regulator_get(&pdev->dev, "pex-clk");
+		if (IS_ERR_OR_NULL(pcie->pex_clk_supply))
+			return ERR_PTR(-EPROBE_DEFER);
+	}
+
+	for (i = 0; i < TEGRA_PCIE_MAX_PORTS; i++)
+		pdata->enable_ports[i] = true;
+
+	return pdata;
+}
+
+#ifdef CONFIG_OF
+static const struct of_device_id tegra_pcie_of_match[] = {
+	{ .compatible = "nvidia,tegra20-pcie", },
+	{ },
+};
+#endif
+
 static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 {
 	struct tegra_pcie_pdata *pdata = pdev->dev.platform_data;
@@ -1202,9 +1300,6 @@  static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 	struct hw_pci hw;
 	int err;
 
-	if (!pdata->enable_ports[0] && !pdata->enable_ports[1])
-		return -ENODEV;
-
 	pcie = devm_kzalloc(&pdev->dev, sizeof(*pcie), GFP_KERNEL);
 	if (!pcie)
 		return -ENOMEM;
@@ -1212,6 +1307,20 @@  static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, pcie);
 	pcie->dev = &pdev->dev;
 
+	if (IS_ENABLED(CONFIG_OF)) {
+		if (!pdata && pdev->dev.of_node) {
+			pdata = tegra_pcie_parse_dt(pdev);
+			if (IS_ERR(pdata))
+				return PTR_ERR(pdata);
+		}
+	}
+
+	if (!pdata)
+		return -ENODEV;
+
+	if (!pdata->enable_ports[0] && !pdata->enable_ports[1])
+		return -ENODEV;
+
 	pcibios_min_mem = 0;
 
 	err = tegra_pcie_get_resources(pdev);
@@ -1266,6 +1375,7 @@  static int __devinit tegra_pcie_probe(struct platform_device *pdev)
 
 static int __devexit tegra_pcie_remove(struct platform_device *pdev)
 {
+	struct tegra_pcie_info *pcie = platform_get_drvdata(pdev);
 	struct tegra_pcie_pdata *pdata = pdev->dev.platform_data;
 	int err;
 
@@ -1285,6 +1395,9 @@  static int __devexit tegra_pcie_remove(struct platform_device *pdev)
 			return err;
 	}
 
+	regulator_put(pcie->pex_clk_supply);
+	regulator_put(pcie->vdd_supply);
+
 	return 0;
 }
 
@@ -1292,6 +1405,7 @@  static struct platform_driver tegra_pcie_driver = {
 	.driver = {
 		.name = "tegra-pcie",
 		.owner = THIS_MODULE,
+		.of_match_table = of_match_ptr(tegra_pcie_of_match),
 	},
 	.probe = tegra_pcie_probe,
 	.remove = __devexit_p(tegra_pcie_remove),