[RFC,v0.001] PCI: Add support for tango PCIe controller

Submitted by Mason on March 17, 2017, 4:11 p.m.

Details

Message ID ab110ce2-8678-9b34-6bd1-6cfaaa7fea04@free.fr
State Superseded
Headers show

Commit Message

Mason March 17, 2017, 4:11 p.m.
This is version 0.001 pre-alpha.

I have several questions embedded as comments in the code.
Bjorn, can you tell me if something looks really wrong,
or if this driver can be accepted in this shape?

Regards.
---
 drivers/pci/host/pcie-tango.c | 125 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 drivers/pci/host/pcie-tango.c

Comments

Mason March 20, 2017, 2:14 p.m.
On 17/03/2017 17:11, Mason wrote:

> diff --git a/drivers/pci/host/pcie-tango.c b/drivers/pci/host/pcie-tango.c
> new file mode 100644
> index 000000000000..9d7fed75fd1d
> --- /dev/null
> +++ b/drivers/pci/host/pcie-tango.c
> @@ -0,0 +1,125 @@
> +#include <linux/pci-ecam.h>
> +
> +#define VENDOR_SIGMA	0x1105
> +
> +/*
> + * How do I reach dev->driver_data in the config accessors?
> + */
> +static void __iomem *mux;

	struct pci_config_window *cfg = bus->sysdata;
	dev_get_drvdata(cfg->parent);


/*
 * struct to hold the mappings of a config space window. This
 * is expected to be used as sysdata for PCI controllers that
 * use ECAM.
 */
struct pci_config_window {
	struct resource			res;
	struct resource			busr;
	void				*priv;
	struct pci_ecam_ops		*ops;
	union {
		void __iomem		*win;	/* 64-bit single mapping */
		void __iomem		**winp; /* 32-bit per-bus mapping */
	};
	struct device			*parent;/* ECAM res was from this dev */
};
Mason March 21, 2017, 10:15 a.m.
[ Adding iommu ML ]

On 17/03/2017 17:11, Mason wrote:

> + * QUIRK #5
> + * Only transfers within the BAR are forwarded to the host.
> + * By default, the DMA framework expects that
> + * PCI address 0x8000_0000 -> CPU address 0x8000_0000
> + * which is where DRAM0 is mapped.

I have an additional question on this topic.

In a typical system, PCI bus masters are able to access all of RAM,
unless there is some kind of IOMMU, right?

I suppose one may consider the above limitation ("Only transfers
within the BAR are forwarded to the host") as some form of weird
IOMMU? (There is, in fact, some remapping logic in the controller
setup which I haven't discussed so far.)

Since this SoC is used for TV, the media cartels mandate some way
to limit where PCI bus masters can peek/poke in RAM.

Regards.
Robin Murphy March 21, 2017, 11:36 a.m.
On 21/03/17 10:15, Mason wrote:
> [ Adding iommu ML ]
> 
> On 17/03/2017 17:11, Mason wrote:
> 
>> + * QUIRK #5
>> + * Only transfers within the BAR are forwarded to the host.
>> + * By default, the DMA framework expects that
>> + * PCI address 0x8000_0000 -> CPU address 0x8000_0000
>> + * which is where DRAM0 is mapped.
> 
> I have an additional question on this topic.
> 
> In a typical system, PCI bus masters are able to access all of RAM,
> unless there is some kind of IOMMU, right?

Typically, yes (as always, exceptions exist, but it happens that we
still need some work to actually support those cases properly).

> I suppose one may consider the above limitation ("Only transfers
> within the BAR are forwarded to the host") as some form of weird
> IOMMU? (There is, in fact, some remapping logic in the controller
> setup which I haven't discussed so far.)

Not really. If it's the 8x128MB region thing mentioned elsewhere, that's
far too coarse a granularity to be much use with the existing IOMMU API
(this vaguely reminds me of a similar discussion about programmable
interconnects ages ago). Unless it's actually got some sort of GART-type
thing or better capable of page-granularity mappings within a
significantly-sized region, I'd put that idea to bed.

> Since this SoC is used for TV, the media cartels mandate some way
> to limit where PCI bus masters can peek/poke in RAM.

FWIW, since that sounds like more of a box-ticking exercise than a real
practical concern, I'd point out that content protection is more or less
the poster child for TrustZone, and your TZASC should help tick that box
regardless.

Robin.

> 
> Regards.
>
Liviu Dudau March 21, 2017, 12:31 p.m.
On Tue, Mar 21, 2017 at 11:15:16AM +0100, Mason wrote:
> [ Adding iommu ML ]
> 
> On 17/03/2017 17:11, Mason wrote:
> 
> > + * QUIRK #5
> > + * Only transfers within the BAR are forwarded to the host.
> > + * By default, the DMA framework expects that
> > + * PCI address 0x8000_0000 -> CPU address 0x8000_0000
> > + * which is where DRAM0 is mapped.
> 
> I have an additional question on this topic.
> 
> In a typical system, PCI bus masters are able to access all of RAM,
> unless there is some kind of IOMMU, right?

Depends where the ATS (Address translation service) lives and what gets
programmed there, but usually one can restrict the RAM (or system address
space, to be more precise) that can be accessed from the PCI bus masters
by only providing PCI(e)-to-system bus mappings that make sense. Remember
that on any PCI bus you can only see addresses that are a subset of your
parent bus addresses. If the root PCI bus only allows access to a predefined
range of addresses, then that is all you can access in the system.

> 
> I suppose one may consider the above limitation ("Only transfers
> within the BAR are forwarded to the host") as some form of weird
> IOMMU? (There is, in fact, some remapping logic in the controller
> setup which I haven't discussed so far.)

Not sure I understand the quirk. What BAR are you referring to when you say
"within the BAR"? The bus master's? Are you saying that only if you write
to a PCI bus address XXXXXXX that is in the range assigned by the parent
bus then you can escape your bus hierarchy and target the system RAM? That
sounds backwards to me.

> 
> Since this SoC is used for TV, the media cartels mandate some way
> to limit where PCI bus masters can peek/poke in RAM.

And that is usually done by either having a TrustZone controller on top of
your PCI host bridge or by putting an IOMMU.

Best regards,
Liviu

> 
> Regards.
Mason March 21, 2017, 12:43 p.m.
On 21/03/2017 12:36, Robin Murphy wrote:

> On 21/03/17 10:15, Mason wrote:
>
>> I suppose one may consider the above limitation ("Only transfers
>> within the BAR are forwarded to the host") as some form of weird
>> IOMMU? (There is, in fact, some remapping logic in the controller
>> setup which I haven't discussed so far.)
> 
> Not really. If it's the 8x128MB region thing mentioned elsewhere, that's
> far too coarse a granularity to be much use with the existing IOMMU API
> (this vaguely reminds me of a similar discussion about programmable
> interconnects ages ago). Unless it's actually got some sort of GART-type
> thing or better capable of page-granularity mappings within a
> significantly-sized region, I'd put that idea to bed.

I had a feeling my use-case was too quirky for a sane API :-)

>> Since this SoC is used for TV, the media cartels mandate some way
>> to limit where PCI bus masters can peek/poke in RAM.
> 
> FWIW, since that sounds like more of a box-ticking exercise than a real
> practical concern, I'd point out that content protection is more or less
> the poster child for TrustZone, and your TZASC should help tick that box
> regardless.

Interesting. In fact, our Linux runs as the non-secure OS,
and we do have a custom "secure" OS running in parallel.

My knowledge of TZ is limited to "call this SMC to offline
that CPU", but our local TZ expert is CCed :-)

TZASC = TrustZone Address Space Controller

http://infocenter.arm.com/help/topic/com.arm.doc.ddi0431c/CHDBBGIC.html

Is there a TZASC embedded in every ARM core?
We're using Cortex-A9 MPCore r3p0 + PL310 r3p2

Regards.
Thibaud Cornic March 21, 2017, 12:53 p.m.
On 21/03/2017 13:43, Mason wrote:
> On 21/03/2017 12:36, Robin Murphy wrote:
>
>> On 21/03/17 10:15, Mason wrote:
>>
>>> I suppose one may consider the above limitation ("Only transfers
>>> within the BAR are forwarded to the host") as some form of weird
>>> IOMMU? (There is, in fact, some remapping logic in the controller
>>> setup which I haven't discussed so far.)
>> Not really. If it's the 8x128MB region thing mentioned elsewhere, that's
>> far too coarse a granularity to be much use with the existing IOMMU API
>> (this vaguely reminds me of a similar discussion about programmable
>> interconnects ages ago). Unless it's actually got some sort of GART-type
>> thing or better capable of page-granularity mappings within a
>> significantly-sized region, I'd put that idea to bed.
> I had a feeling my use-case was too quirky for a sane API :-)
>
>>> Since this SoC is used for TV, the media cartels mandate some way
>>> to limit where PCI bus masters can peek/poke in RAM.
>> FWIW, since that sounds like more of a box-ticking exercise than a real
>> practical concern, I'd point out that content protection is more or less
>> the poster child for TrustZone, and your TZASC should help tick that box
>> regardless.
> Interesting. In fact, our Linux runs as the non-secure OS,
> and we do have a custom "secure" OS running in parallel.
>
> My knowledge of TZ is limited to "call this SMC to offline
> that CPU", but our local TZ expert is CCed :-)
>
> TZASC = TrustZone Address Space Controller
>
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0431c/CHDBBGIC.html
>
> Is there a TZASC embedded in every ARM core?
> We're using Cortex-A9 MPCore r3p0 + PL310 r3p2
We don't have the TZASC, but a proprietary solution, on which the PCIe
can't be distinguished from other I/Os (e.g. SATA, USB).
But the requirement of limiting device accesses to only certain regions
of the DRAM is only for PCIe (it's special in the sense the device can
initiate accesses without any action on our chip's side).
Since we already had those 8x128MB remaps (which we now understand seem
wrong, as the root complex is not supposed to expose a BAR, but rather
forward all accesses to the internal bus), we simply put a lock on them,
and our partners were satisfied.

Regards.
Mason March 21, 2017, 1:01 p.m.
On 21/03/2017 13:31, Liviu Dudau wrote:

> On Tue, Mar 21, 2017 at 11:15:16AM +0100, Mason wrote:
>>
>> On 17/03/2017 17:11, Mason wrote:
>>
>>> + * QUIRK #5
>>> + * Only transfers within the BAR are forwarded to the host.
>>> + * By default, the DMA framework expects that
>>> + * PCI address 0x8000_0000 -> CPU address 0x8000_0000
>>> + * which is where DRAM0 is mapped.
>>
>> I have an additional question on this topic.
>>
>> In a typical system, PCI bus masters are able to access all of RAM,
>> unless there is some kind of IOMMU, right?
> 
> Depends where the ATS (Address translation service) lives and what gets
> programmed there, but usually one can restrict the RAM (or system address
> space, to be more precise) that can be accessed from the PCI bus masters
> by only providing PCI(e)-to-system bus mappings that make sense. Remember
> that on any PCI bus you can only see addresses that are a subset of your
> parent bus addresses. If the root PCI bus only allows access to a predefined
> range of addresses, then that is all you can access in the system.

Hmmm, where do I configure the range of addresses which the root
PCI bus allows access to? In the DT?

>> I suppose one may consider the above limitation ("Only transfers
>> within the BAR are forwarded to the host") as some form of weird
>> IOMMU? (There is, in fact, some remapping logic in the controller
>> setup which I haven't discussed so far.)
> 
> Not sure I understand the quirk. What BAR are you referring to when you say
> "within the BAR"? The bus master's? Are you saying that only if you write
> to a PCI bus address XXXXXXX that is in the range assigned by the parent
> bus then you can escape your bus hierarchy and target the system RAM? That
> sounds backwards to me.

My PCIe controller is b/d/f 0/0/0.
It ignores all PCI addresses that do not fall within the range
defined in BAR0 of b/d/f 0/0/0.

BAR0 has a configurable width of 2^(23+i) bytes, i < 8
It is implicitly split in 8 regions, which can map to
anywhere in CPU space. However, my understanding is that,
by default, the DMA framework expects PCI address X to
map to CPU address X...
(My understanding of that part is still a bit hazy.)

>> Since this SoC is used for TV, the media cartels mandate some way
>> to limit where PCI bus masters can peek/poke in RAM.
> 
> And that is usually done by either having a TrustZone controller on top of
> your PCI host bridge or by putting an IOMMU.

It seems we don't have this HW, therefore some kind of custom
HW quirk was required.

Regards.
Liviu Dudau March 21, 2017, 2:23 p.m.
On Tue, Mar 21, 2017 at 02:01:57PM +0100, Mason wrote:
> On 21/03/2017 13:31, Liviu Dudau wrote:
> 
> > On Tue, Mar 21, 2017 at 11:15:16AM +0100, Mason wrote:
> >>
> >> On 17/03/2017 17:11, Mason wrote:
> >>
> >>> + * QUIRK #5
> >>> + * Only transfers within the BAR are forwarded to the host.
> >>> + * By default, the DMA framework expects that
> >>> + * PCI address 0x8000_0000 -> CPU address 0x8000_0000
> >>> + * which is where DRAM0 is mapped.
> >>
> >> I have an additional question on this topic.
> >>
> >> In a typical system, PCI bus masters are able to access all of RAM,
> >> unless there is some kind of IOMMU, right?
> > 
> > Depends where the ATS (Address translation service) lives and what gets
> > programmed there, but usually one can restrict the RAM (or system address
> > space, to be more precise) that can be accessed from the PCI bus masters
> > by only providing PCI(e)-to-system bus mappings that make sense. Remember
> > that on any PCI bus you can only see addresses that are a subset of your
> > parent bus addresses. If the root PCI bus only allows access to a predefined
> > range of addresses, then that is all you can access in the system.
> 
> Hmmm, where do I configure the range of addresses which the root
> PCI bus allows access to? In the DT?

No, that should be the firmware's job. Or the secure world OS.

> 
> >> I suppose one may consider the above limitation ("Only transfers
> >> within the BAR are forwarded to the host") as some form of weird
> >> IOMMU? (There is, in fact, some remapping logic in the controller
> >> setup which I haven't discussed so far.)
> > 
> > Not sure I understand the quirk. What BAR are you referring to when you say
> > "within the BAR"? The bus master's? Are you saying that only if you write
> > to a PCI bus address XXXXXXX that is in the range assigned by the parent
> > bus then you can escape your bus hierarchy and target the system RAM? That
> > sounds backwards to me.
> 
> My PCIe controller is b/d/f 0/0/0.
> It ignores all PCI addresses that do not fall within the range
> defined in BAR0 of b/d/f 0/0/0.
> 
> BAR0 has a configurable width of 2^(23+i) bytes, i < 8
> It is implicitly split in 8 regions, which can map to
> anywhere in CPU space. However, my understanding is that,
> by default, the DMA framework expects PCI address X to
> map to CPU address X...
> (My understanding of that part is still a bit hazy.)

It looks very much like your PCIe controller (host bridge) has the configuration
registers on the wrong side of the bridge (PCIe one vs the more normal host side).
But otherwise it does look very much like an ATS system, where you program how the
PCIe bus addresses map into the system (you call them CPU) addresses. Do you also
have a way of controlling the direction of the mapping? (in other words, those 8
regions are only for translating request coming out of the PCIe bus into system
addresses, or can you also set the direct mapping of system address to PCIe address?).

As for DMA framework expectations, I'm not sure how that plays a role, unless you
have some DMA engines on the PCIe bus.

Best regards,
Liviu

> 
> >> Since this SoC is used for TV, the media cartels mandate some way
> >> to limit where PCI bus masters can peek/poke in RAM.
> > 
> > And that is usually done by either having a TrustZone controller on top of
> > your PCI host bridge or by putting an IOMMU.
> 
> It seems we don't have this HW, therefore some kind of custom
> HW quirk was required.
> 
> Regards.
Mason March 21, 2017, 4:19 p.m.
On 21/03/2017 15:23, Liviu Dudau wrote:

> On Tue, Mar 21, 2017 at 02:01:57PM +0100, Mason wrote:
>
>> My PCIe controller is b/d/f 0/0/0.
>> It ignores all PCI addresses that do not fall within the range
>> defined in BAR0 of b/d/f 0/0/0.
>>
>> BAR0 has a configurable width of 2^(23+i) bytes, i < 8
>> It is implicitly split in 8 regions, which can map to
>> anywhere in CPU space. However, my understanding is that,
>> by default, the DMA framework expects PCI address X to
>> map to CPU address X...
>> (My understanding of that part is still a bit hazy.)
> 
> It looks very much like your PCIe controller (host bridge) has the configuration
> registers on the wrong side of the bridge (PCIe one vs the more normal host side).
> But otherwise it does look very much like an ATS system, where you program how the
> PCIe bus addresses map into the system (you call them CPU) addresses. Do you also
> have a way of controlling the direction of the mapping? (in other words, those 8
> regions are only for translating request coming out of the PCIe bus into system
> addresses, or can you also set the direct mapping of system address to PCIe address?).

There are two (asymmetric) features for these mappings.

For bus-to-system accesses, the scheme described above
of "bus addresses within BAR0 fall into 1 of 8 regions,
and 8 registers define 8 target system addresses".

For system-to-bus accesses, there is a single "offset"
register which gets added to the system address, to form
the bus address. I just use a 0 offset.

(This is made slightly more complex on later chips because
someone in marketing thought it would look good on the
data sheet to support 4 GB of RAM in a SoC using 32-bit
processors. There is an additional cpu-to-foo mapping.)

> As for DMA framework expectations, I'm not sure how that plays a role, unless you
> have some DMA engines on the PCIe bus.

I think the USB3 PCIe card I'm using is a bus master, so
it's able to push data to RAM without any help from the
ARM core.

Regards.

P.S. something in your message is confusing my MUA, likely
your Mail-Followup-To field. My MUA thinks it should not
send you my reply. Is that really what you want?

Patch hide | download patch | download mbox

diff --git a/drivers/pci/host/pcie-tango.c b/drivers/pci/host/pcie-tango.c
new file mode 100644
index 000000000000..9d7fed75fd1d
--- /dev/null
+++ b/drivers/pci/host/pcie-tango.c
@@ -0,0 +1,125 @@ 
+#include <linux/pci-ecam.h>
+
+#define VENDOR_SIGMA	0x1105
+
+/*
+ * How do I reach dev->driver_data in the config accessors?
+ */
+static void __iomem *mux;
+
+static int smp8759_config_read(struct pci_bus *bus,
+		unsigned int devfn, int where, int size, u32 *val)
+{
+	int ret;
+
+	/*
+	 * QUIRK #1
+	 * Reads in configuration space outside devfn 0 return garbage.
+	 */
+	if (devfn != 0) {
+		*val = 0xffffffff; /* ~0 means "nothing here" right? */
+		return PCIBIOS_SUCCESSFUL; /* Should we return error or success? */
+	}
+
+	/*
+	 * QUIRK #2
+	 * The root complex advertizes a fake BAR, which is used to filter
+	 * bus-to-cpu requests. Hide it from Linux.
+	 */
+	if (where == PCI_BASE_ADDRESS_0 && bus->number == 0) {
+		*val = 0; /* 0 or ~0 to hide the BAR from Linux? */
+		return PCIBIOS_SUCCESSFUL; /* Should we return error or success? */
+	}
+
+	/*
+	 * QUIRK #3
+	 * Unfortunately, config and mem spaces are muxed.
+	 * Linux does not support such a setting, since drivers are free
+	 * to access mem space directly, at any time.
+	 * Therefore, we can only PRAY that config and mem space accesses
+	 * NEVER occur concurrently.
+	 */
+	writel(1, mux);
+	ret = pci_generic_config_read(bus, devfn, where, size, val);
+	writel(0, mux);
+
+	return ret;
+}
+
+static int smp8759_config_write(struct pci_bus *bus,
+		unsigned int devfn, int where, int size, u32 val)
+{
+	int ret;
+
+	writel(1, mux);
+	ret = pci_generic_config_write(bus, devfn, where, size, val);
+	writel(0, mux);
+
+	return ret;
+}
+
+static struct pci_ecam_ops smp8759_ecam_ops = {
+	.bus_shift	= 20,
+	.pci_ops	= {
+		.map_bus	= pci_ecam_map_bus,
+		.read		= smp8759_config_read,
+		.write		= smp8759_config_write,
+	}
+};
+
+static const struct of_device_id tango_pcie_ids[] = {
+	{ .compatible = "sigma,smp8759-pcie" },
+	{ /* sentinel */ },
+};
+
+static int tango_pcie_probe(struct platform_device *pdev)
+{
+	void __iomem *base;
+	struct resource *res;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+
+	mux = base + 0x48;
+
+	return pci_host_common_probe(pdev, &smp8759_ecam_ops);
+}
+
+static struct platform_driver tango_pcie_driver = {
+	.probe = tango_pcie_probe,
+	.driver = {
+		.name = KBUILD_MODNAME,
+		.of_match_table = tango_pcie_ids,
+	},
+};
+
+/*
+ * This should probably be module_platform_driver ?
+ */
+builtin_platform_driver(tango_pcie_driver);
+
+/*
+ * QUIRK #4
+ * The root complex advertizes the wrong device class.
+ * Header Type 1 are handled by PCI-to-PCI bridges.
+ */
+static void tango_fixup_class(struct pci_dev *dev)
+{
+	dev->class = PCI_CLASS_BRIDGE_PCI << 8;
+}
+DECLARE_PCI_FIXUP_EARLY(VENDOR_SIGMA, PCI_ANY_ID, tango_fixup_class);
+
+/*
+ * QUIRK #5
+ * Only transfers within the BAR are forwarded to the host.
+ * By default, the DMA framework expects that
+ * PCI address 0x8000_0000 -> CPU address 0x8000_0000
+ * which is where DRAM0 is mapped.
+ */
+static void tango_fixup_bar(struct pci_dev *dev)
+{
+        pci_write_config_dword(dev, PCI_BASE_ADDRESS_0, 0x80000000);
+}
+DECLARE_PCI_FIXUP_FINAL(VENDOR_SIGMA, PCI_ANY_ID, tango_fixup_bar);