diff mbox

[2/7] dt-bindings: PCI: Describe ATS property for root complex nodes

Message ID 20170524180143.19855-3-jean-philippe.brucker@arm.com
State Changes Requested, archived
Headers show

Commit Message

Jean-Philippe Brucker May 24, 2017, 6:01 p.m. UTC
Address Translation Service (ATS) is an extension to PCIe allowing
endpoints to manage their own IOTLB, called Address Translation Cache
(ATC). Instead of having every memory transaction processed by the IOMMU,
the endpoint can first send an Address Translation Requests for an IOVA,
obtain the corresponding Physical Address from the IOMMU and store it in
its ATC. Subsequent transactions to this memory region can be performed on
the PA, in which case they are marked 'translated' and (partially) bypass
the IOMMU.

Since the extension uses fields that were previously reserved in the
PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
system that doesn't fully support ATS.

To "old" root complexes that simply ignored the new AT field, an Address
Translation Request will look exactly like a Memory Read Request, so the
root bridge will forward a memory read to the IOMMU instead of a
translation request. If the access succeeds, the RC will send a Read
Completion, which looks like a Translation Completion, back to the
endpoint. As a result the endpoint might end up storing the content of
memory instead of a physical address in its ATC. In reality, it's more
likely that the size fields will be invalid and either end will detect the
error, but in any case, it is undesirable.

Add a way for firmware to tell the OS that ATS is supported by the PCI
root complex.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
---
 Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 ++++++++
 1 file changed, 8 insertions(+)

Comments

Joerg Roedel May 30, 2017, 10:01 a.m. UTC | #1
On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
> +- ats-supported: if present, the root complex supports the Address
> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
> +  Transaction Layer Packets, and forward Translation Completions or
> +  Invalidation Requests to endpoints.
> +
> +  Device drivers should not enable ATS in endpoints unless this property
> +  is present.

Device drivers should never enable ATS, this is the job of the IOMMU
driver. This should be clarified in the text.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean-Philippe Brucker May 30, 2017, 10:58 a.m. UTC | #2
On 30/05/17 11:01, Joerg Roedel wrote:
> On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
>> +- ats-supported: if present, the root complex supports the Address
>> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
>> +  Transaction Layer Packets, and forward Translation Completions or
>> +  Invalidation Requests to endpoints.
>> +
>> +  Device drivers should not enable ATS in endpoints unless this property
>> +  is present.
> 
> Device drivers should never enable ATS, this is the job of the IOMMU
> driver. This should be clarified in the text.

Right, I will change this.

Thanks,
Jean

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Herring May 31, 2017, 5:17 p.m. UTC | #3
On Tue, May 30, 2017 at 11:58:50AM +0100, Jean-Philippe Brucker wrote:
> On 30/05/17 11:01, Joerg Roedel wrote:
> > On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
> >> +- ats-supported: if present, the root complex supports the Address
> >> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
> >> +  Transaction Layer Packets, and forward Translation Completions or
> >> +  Invalidation Requests to endpoints.
> >> +
> >> +  Device drivers should not enable ATS in endpoints unless this property
> >> +  is present.
> > 
> > Device drivers should never enable ATS, this is the job of the IOMMU
> > driver. This should be clarified in the text.
> 
> Right, I will change this.

Driver details should not be in DT to begin with.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Herring May 31, 2017, 5:23 p.m. UTC | #4
On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
> Address Translation Service (ATS) is an extension to PCIe allowing
> endpoints to manage their own IOTLB, called Address Translation Cache
> (ATC). Instead of having every memory transaction processed by the IOMMU,
> the endpoint can first send an Address Translation Requests for an IOVA,
> obtain the corresponding Physical Address from the IOMMU and store it in
> its ATC. Subsequent transactions to this memory region can be performed on
> the PA, in which case they are marked 'translated' and (partially) bypass
> the IOMMU.
> 
> Since the extension uses fields that were previously reserved in the
> PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
> system that doesn't fully support ATS.
> 
> To "old" root complexes that simply ignored the new AT field, an Address
> Translation Request will look exactly like a Memory Read Request, so the
> root bridge will forward a memory read to the IOMMU instead of a
> translation request. If the access succeeds, the RC will send a Read
> Completion, which looks like a Translation Completion, back to the
> endpoint. As a result the endpoint might end up storing the content of
> memory instead of a physical address in its ATC. In reality, it's more
> likely that the size fields will be invalid and either end will detect the
> error, but in any case, it is undesirable.
> 
> Add a way for firmware to tell the OS that ATS is supported by the PCI
> root complex.

Can't firmware have already enabled ATS? Often for things like this, not 
present means "use firmware setting".

> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---
>  Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> index 0def586fdcdf..f21a68ec471a 100644
> --- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
> +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> @@ -44,6 +44,14 @@ Optional properties
>  - iommu-map-mask: A mask to be applied to each Requester ID prior to being
>    mapped to an IOMMU specifier per the iommu-map property.
>  
> +- ats-supported: if present, the root complex supports the Address
> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
> +  Transaction Layer Packets, and forward Translation Completions or
> +  Invalidation Requests to endpoints.

Why can't this be based on the compatible strings?

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean-Philippe Brucker June 1, 2017, 12:28 p.m. UTC | #5
On 31/05/17 18:23, Rob Herring wrote:
> On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
>> Address Translation Service (ATS) is an extension to PCIe allowing
>> endpoints to manage their own IOTLB, called Address Translation Cache
>> (ATC). Instead of having every memory transaction processed by the IOMMU,
>> the endpoint can first send an Address Translation Requests for an IOVA,
>> obtain the corresponding Physical Address from the IOMMU and store it in
>> its ATC. Subsequent transactions to this memory region can be performed on
>> the PA, in which case they are marked 'translated' and (partially) bypass
>> the IOMMU.
>>
>> Since the extension uses fields that were previously reserved in the
>> PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
>> system that doesn't fully support ATS.
>>
>> To "old" root complexes that simply ignored the new AT field, an Address
>> Translation Request will look exactly like a Memory Read Request, so the
>> root bridge will forward a memory read to the IOMMU instead of a
>> translation request. If the access succeeds, the RC will send a Read
>> Completion, which looks like a Translation Completion, back to the
>> endpoint. As a result the endpoint might end up storing the content of
>> memory instead of a physical address in its ATC. In reality, it's more
>> likely that the size fields will be invalid and either end will detect the
>> error, but in any case, it is undesirable.
>>
>> Add a way for firmware to tell the OS that ATS is supported by the PCI
>> root complex.
> 
> Can't firmware have already enabled ATS? Often for things like this, not 
> present means "use firmware setting".

I don't think it's up to firmware to enable ATS in endpoints, because it
depends on IOMMU properties (e.g. configured page size). It must also be
enabled after the PASID capability, which the OS may or may not want to
enable.

While endpoints have ATS capability and config register, there is no
architected mechanism in root complexes as far as I know. So firmware may
have a mechanism outside the OS scope to toggle ATS in the root complex.
If there is a bug and firmware cannot enable ATS, then the OS must be made
aware of it, so that it doesn't enable ATS in endpoints, or else we might
end up with silent memory corruption as described above. (Lack of ATS may
slow the system down but shouldn't be fatal.)

If the SMMU supports ATS, then the root complex attached to it will most
likely supports ATS. The switch in this patch simply allows firmware to
confirm that.

>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>> ---
>>  Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
>> index 0def586fdcdf..f21a68ec471a 100644
>> --- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
>> +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
>> @@ -44,6 +44,14 @@ Optional properties
>>  - iommu-map-mask: A mask to be applied to each Requester ID prior to being
>>    mapped to an IOMMU specifier per the iommu-map property.
>>  
>> +- ats-supported: if present, the root complex supports the Address
>> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
>> +  Transaction Layer Packets, and forward Translation Completions or
>> +  Invalidation Requests to endpoints.
> 
> Why can't this be based on the compatible strings?

Host controllers like the generic ECAM one should be able to advertise
ATS, if for instance the virtual IOMMU in Qemu offers a channel for ATS
invalidation. In that case we would have pci-host-{e,}cam-generic{-ats,}
compatible strings and double the number of compatible strings each time
we add a similar capability.

Thanks,
Jean
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rob Herring June 5, 2017, 5:20 p.m. UTC | #6
On Thu, Jun 01, 2017 at 01:28:01PM +0100, Jean-Philippe Brucker wrote:
> On 31/05/17 18:23, Rob Herring wrote:
> > On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
> >> Address Translation Service (ATS) is an extension to PCIe allowing
> >> endpoints to manage their own IOTLB, called Address Translation Cache
> >> (ATC). Instead of having every memory transaction processed by the IOMMU,
> >> the endpoint can first send an Address Translation Requests for an IOVA,
> >> obtain the corresponding Physical Address from the IOMMU and store it in
> >> its ATC. Subsequent transactions to this memory region can be performed on
> >> the PA, in which case they are marked 'translated' and (partially) bypass
> >> the IOMMU.
> >>
> >> Since the extension uses fields that were previously reserved in the
> >> PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
> >> system that doesn't fully support ATS.
> >>
> >> To "old" root complexes that simply ignored the new AT field, an Address
> >> Translation Request will look exactly like a Memory Read Request, so the
> >> root bridge will forward a memory read to the IOMMU instead of a
> >> translation request. If the access succeeds, the RC will send a Read
> >> Completion, which looks like a Translation Completion, back to the
> >> endpoint. As a result the endpoint might end up storing the content of
> >> memory instead of a physical address in its ATC. In reality, it's more
> >> likely that the size fields will be invalid and either end will detect the
> >> error, but in any case, it is undesirable.
> >>
> >> Add a way for firmware to tell the OS that ATS is supported by the PCI
> >> root complex.
> > 
> > Can't firmware have already enabled ATS? Often for things like this, not 
> > present means "use firmware setting".
> 
> I don't think it's up to firmware to enable ATS in endpoints, because it
> depends on IOMMU properties (e.g. configured page size). It must also be
> enabled after the PASID capability, which the OS may or may not want to
> enable.
> 
> While endpoints have ATS capability and config register, there is no
> architected mechanism in root complexes as far as I know. So firmware may
> have a mechanism outside the OS scope to toggle ATS in the root complex.
> If there is a bug and firmware cannot enable ATS, then the OS must be made
> aware of it, so that it doesn't enable ATS in endpoints, or else we might
> end up with silent memory corruption as described above. (Lack of ATS may
> slow the system down but shouldn't be fatal.)
> 
> If the SMMU supports ATS, then the root complex attached to it will most
> likely supports ATS. The switch in this patch simply allows firmware to
> confirm that.
> 
> >> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> >> ---
> >>  Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 ++++++++
> >>  1 file changed, 8 insertions(+)
> >>
> >> diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> >> index 0def586fdcdf..f21a68ec471a 100644
> >> --- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
> >> +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
> >> @@ -44,6 +44,14 @@ Optional properties
> >>  - iommu-map-mask: A mask to be applied to each Requester ID prior to being
> >>    mapped to an IOMMU specifier per the iommu-map property.
> >>  
> >> +- ats-supported: if present, the root complex supports the Address
> >> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
> >> +  Transaction Layer Packets, and forward Translation Completions or
> >> +  Invalidation Requests to endpoints.
> > 
> > Why can't this be based on the compatible strings?
> 
> Host controllers like the generic ECAM one should be able to advertise
> ATS, if for instance the virtual IOMMU in Qemu offers a channel for ATS
> invalidation. In that case we would have pci-host-{e,}cam-generic{-ats,}
> compatible strings and double the number of compatible strings each time
> we add a similar capability.

It would not double the compatibles. A given SoC will either support ATS 
or not, right? A given compatible will imply whether ATS is supported or 
not.

pci-host-{e,}cam-generic is a special case. I'm okay with having a 
property for that I suppose. We should not require this property though 
and allow for it to be implied by the SoC specific compatible as well.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean-Philippe Brucker June 6, 2017, 11:11 a.m. UTC | #7
On 05/06/17 18:20, Rob Herring wrote:
> On Thu, Jun 01, 2017 at 01:28:01PM +0100, Jean-Philippe Brucker wrote:
>> On 31/05/17 18:23, Rob Herring wrote:
>>> On Wed, May 24, 2017 at 07:01:38PM +0100, Jean-Philippe Brucker wrote:
>>>> Address Translation Service (ATS) is an extension to PCIe allowing
>>>> endpoints to manage their own IOTLB, called Address Translation Cache
>>>> (ATC). Instead of having every memory transaction processed by the IOMMU,
>>>> the endpoint can first send an Address Translation Requests for an IOVA,
>>>> obtain the corresponding Physical Address from the IOMMU and store it in
>>>> its ATC. Subsequent transactions to this memory region can be performed on
>>>> the PA, in which case they are marked 'translated' and (partially) bypass
>>>> the IOMMU.
>>>>
>>>> Since the extension uses fields that were previously reserved in the
>>>> PCIe Translation Layer Packet, it seems ill-advised to enabled it on a
>>>> system that doesn't fully support ATS.
>>>>
>>>> To "old" root complexes that simply ignored the new AT field, an Address
>>>> Translation Request will look exactly like a Memory Read Request, so the
>>>> root bridge will forward a memory read to the IOMMU instead of a
>>>> translation request. If the access succeeds, the RC will send a Read
>>>> Completion, which looks like a Translation Completion, back to the
>>>> endpoint. As a result the endpoint might end up storing the content of
>>>> memory instead of a physical address in its ATC. In reality, it's more
>>>> likely that the size fields will be invalid and either end will detect the
>>>> error, but in any case, it is undesirable.
>>>>
>>>> Add a way for firmware to tell the OS that ATS is supported by the PCI
>>>> root complex.
>>>
>>> Can't firmware have already enabled ATS? Often for things like this, not 
>>> present means "use firmware setting".
>>
>> I don't think it's up to firmware to enable ATS in endpoints, because it
>> depends on IOMMU properties (e.g. configured page size). It must also be
>> enabled after the PASID capability, which the OS may or may not want to
>> enable.
>>
>> While endpoints have ATS capability and config register, there is no
>> architected mechanism in root complexes as far as I know. So firmware may
>> have a mechanism outside the OS scope to toggle ATS in the root complex.
>> If there is a bug and firmware cannot enable ATS, then the OS must be made
>> aware of it, so that it doesn't enable ATS in endpoints, or else we might
>> end up with silent memory corruption as described above. (Lack of ATS may
>> slow the system down but shouldn't be fatal.)
>>
>> If the SMMU supports ATS, then the root complex attached to it will most
>> likely supports ATS. The switch in this patch simply allows firmware to
>> confirm that.
>>
>>>> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
>>>> ---
>>>>  Documentation/devicetree/bindings/pci/pci-iommu.txt | 8 ++++++++
>>>>  1 file changed, 8 insertions(+)
>>>>
>>>> diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>>> index 0def586fdcdf..f21a68ec471a 100644
>>>> --- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>>> +++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
>>>> @@ -44,6 +44,14 @@ Optional properties
>>>>  - iommu-map-mask: A mask to be applied to each Requester ID prior to being
>>>>    mapped to an IOMMU specifier per the iommu-map property.
>>>>  
>>>> +- ats-supported: if present, the root complex supports the Address
>>>> +  Translation Service (ATS). It is able to interpret the AT field in PCIe
>>>> +  Transaction Layer Packets, and forward Translation Completions or
>>>> +  Invalidation Requests to endpoints.
>>>
>>> Why can't this be based on the compatible strings?
>>
>> Host controllers like the generic ECAM one should be able to advertise
>> ATS, if for instance the virtual IOMMU in Qemu offers a channel for ATS
>> invalidation. In that case we would have pci-host-{e,}cam-generic{-ats,}
>> compatible strings and double the number of compatible strings each time
>> we add a similar capability.
> 
> It would not double the compatibles. A given SoC will either support ATS 
> or not, right? A given compatible will imply whether ATS is supported or 
> not.
> 
> pci-host-{e,}cam-generic is a special case. I'm okay with having a 
> property for that I suppose. We should not require this property though 
> and allow for it to be implied by the SoC specific compatible as well.

The property isn't useful for host-generic since a host like Qemu can
easily disable ATS in the IOMMU. It would only be here for consistency.
Changing IOMMU registers wouldn't be as simple for a firmware running on
real hardware.

The ats-supported property aimed to provide a mechanism identical to what
IORT provides in root complex nodes, since I have to add support for that
in Linux anyway. But it is not clear what their rationale is.

I couldn't see any more reason to add it, so I'll gladly drop the patch
and replace it with something based on compatible, if you think that the
case I described above (bug in hardware, firmware cannot enable ATS in
root complex) is superfluous.

Thanks,
Jean

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jean-Philippe Brucker June 20, 2017, 11:38 a.m. UTC | #8
On 06/06/2017 12:11 PM, Jean-Philippe Brucker wrote:
> On 05/06/17 18:20, Rob Herring wrote:
>> pci-host-{e,}cam-generic is a special case. I'm okay with having a 
>> property for that I suppose. We should not require this property though 
>> and allow for it to be implied by the SoC specific compatible as well.
> 
> The property isn't useful for host-generic since a host like Qemu can
> easily disable ATS in the IOMMU. It would only be here for consistency.
> Changing IOMMU registers wouldn't be as simple for a firmware running on
> real hardware.

Sorry I was mistaken here. AMD Seattle hardware, for instance, implements
a pure host-ecam-generic. So it's not exclusively used by Qemu. A quick
search in the kernel also brings up alpine and thunder2. So I guess I'll
move ats-supported documentation to host-generic-pci.txt.

Thanks,
Jean
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/pci/pci-iommu.txt b/Documentation/devicetree/bindings/pci/pci-iommu.txt
index 0def586fdcdf..f21a68ec471a 100644
--- a/Documentation/devicetree/bindings/pci/pci-iommu.txt
+++ b/Documentation/devicetree/bindings/pci/pci-iommu.txt
@@ -44,6 +44,14 @@  Optional properties
 - iommu-map-mask: A mask to be applied to each Requester ID prior to being
   mapped to an IOMMU specifier per the iommu-map property.
 
+- ats-supported: if present, the root complex supports the Address
+  Translation Service (ATS). It is able to interpret the AT field in PCIe
+  Transaction Layer Packets, and forward Translation Completions or
+  Invalidation Requests to endpoints.
+
+  Device drivers should not enable ATS in endpoints unless this property
+  is present.
+
 
 Example (1)
 ===========