mbox series

[RFC,0/4] Stop losing firmware-set DMA masks

Message ID cover.1531239284.git.robin.murphy@arm.com
Headers show
Series Stop losing firmware-set DMA masks | expand

Message

Robin Murphy July 10, 2018, 5:17 p.m. UTC
Whilst the common firmware code invoked by dma_configure() initialises
devices' DMA masks according to limitations described by the respective
properties ("dma-ranges" for OF and _DMA/IORT for ACPI), the nature of
the dma_set_mask() API leads to that information getting lost when
well-behaved drivers probe and set a 64-bit mask, since in general
there's no way to tell the difference between a firmware-described mask
(which should be respected) and whatever default may have come from the
bus code (which should be replaced outright). This can break DMA on
systems with certain IOMMU topologies (e.g. [1]) where the IOMMU driver
only knows its maximum supported address size, not how many of those
address bits might actually be wired up between any of its input
interfaces and the associated DMA master devices. Similarly, some PCIe
root complexes only have a 32-bit native interface on their host bridge,
which leads to the same DMA-address-truncation problem in systems with a
larger physical memory map and RAM above 4GB (e.g. [2]).

These patches attempt to deal with this in the simplest way possible by
generalising the specific quirk for 32-bit bridges into an arbitrary
mask which can then also be plumbed into the firmware code. In the
interest of being minimally invasive, I've only included a point fix
for the IOMMU issue as seen on arm64 - there may be further tweaks
needed in DMA ops to catch all possible incarnations of this problem,
but this initial RFC is mostly about the impact beyond the dma-mapping
subsystem itself.

Robin.


[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2018-May/580804.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-December/474443.html

Robin Murphy (4):
  dma-mapping: Generalise dma_32bit_limit flag
  ACPI/IORT: Set bus DMA mask as appropriate
  of/device: Set bus DMA mask as appropriate
  iommu/dma: Respect bus DMA limit for IOVAs

 arch/x86/kernel/pci-dma.c | 2 +-
 drivers/acpi/arm64/iort.c | 1 +
 drivers/iommu/dma-iommu.c | 3 +++
 drivers/of/device.c       | 1 +
 include/linux/device.h    | 6 +++---
 kernel/dma/direct.c       | 2 +-
 6 files changed, 10 insertions(+), 5 deletions(-)

Comments

Rob Herring July 11, 2018, 2:40 p.m. UTC | #1
On Tue, Jul 10, 2018 at 12:43 PM Robin Murphy <robin.murphy@arm.com> wrote:
>
> Whilst the common firmware code invoked by dma_configure() initialises
> devices' DMA masks according to limitations described by the respective
> properties ("dma-ranges" for OF and _DMA/IORT for ACPI), the nature of
> the dma_set_mask() API leads to that information getting lost when
> well-behaved drivers probe and set a 64-bit mask, since in general
> there's no way to tell the difference between a firmware-described mask
> (which should be respected) and whatever default may have come from the
> bus code (which should be replaced outright). This can break DMA on
> systems with certain IOMMU topologies (e.g. [1]) where the IOMMU driver
> only knows its maximum supported address size, not how many of those
> address bits might actually be wired up between any of its input
> interfaces and the associated DMA master devices. Similarly, some PCIe
> root complexes only have a 32-bit native interface on their host bridge,
> which leads to the same DMA-address-truncation problem in systems with a
> larger physical memory map and RAM above 4GB (e.g. [2]).
>
> These patches attempt to deal with this in the simplest way possible by
> generalising the specific quirk for 32-bit bridges into an arbitrary
> mask which can then also be plumbed into the firmware code. In the
> interest of being minimally invasive, I've only included a point fix
> for the IOMMU issue as seen on arm64 - there may be further tweaks
> needed in DMA ops to catch all possible incarnations of this problem,
> but this initial RFC is mostly about the impact beyond the dma-mapping
> subsystem itself.

Couldn't you set and use the device's parent's dma_mask instead. At
least for DT, we should always have a parent device representing the
bus. That would avoid further bloating of struct device.

Rob
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robin Murphy July 11, 2018, 4:03 p.m. UTC | #2
On 11/07/18 15:40, Rob Herring wrote:
> On Tue, Jul 10, 2018 at 12:43 PM Robin Murphy <robin.murphy@arm.com> wrote:
>>
>> Whilst the common firmware code invoked by dma_configure() initialises
>> devices' DMA masks according to limitations described by the respective
>> properties ("dma-ranges" for OF and _DMA/IORT for ACPI), the nature of
>> the dma_set_mask() API leads to that information getting lost when
>> well-behaved drivers probe and set a 64-bit mask, since in general
>> there's no way to tell the difference between a firmware-described mask
>> (which should be respected) and whatever default may have come from the
>> bus code (which should be replaced outright). This can break DMA on
>> systems with certain IOMMU topologies (e.g. [1]) where the IOMMU driver
>> only knows its maximum supported address size, not how many of those
>> address bits might actually be wired up between any of its input
>> interfaces and the associated DMA master devices. Similarly, some PCIe
>> root complexes only have a 32-bit native interface on their host bridge,
>> which leads to the same DMA-address-truncation problem in systems with a
>> larger physical memory map and RAM above 4GB (e.g. [2]).
>>
>> These patches attempt to deal with this in the simplest way possible by
>> generalising the specific quirk for 32-bit bridges into an arbitrary
>> mask which can then also be plumbed into the firmware code. In the
>> interest of being minimally invasive, I've only included a point fix
>> for the IOMMU issue as seen on arm64 - there may be further tweaks
>> needed in DMA ops to catch all possible incarnations of this problem,
>> but this initial RFC is mostly about the impact beyond the dma-mapping
>> subsystem itself.
> 
> Couldn't you set and use the device's parent's dma_mask instead. At
> least for DT, we should always have a parent device representing the
> bus. That would avoid further bloating of struct device.

But then if the parent device did have a non-trivial driver which calls 
dma_set_mask(), we'd be back at square 1 :/

More realistically, I don't think that's viable for ACPI, at least with 
IORT, since the memory address size limit belongs to the endpoint 
itself, thus two devices with the same nominal parent in the Linux 
device model could still have different limits (where in DT you'd have 
to have to insert intermediate simple-bus nodes to model the same 
topology with dma-ranges). Plus either way it seems somewhat fragile for 
PCI where the host bridge may be some distance up the hierarchy.

Robin.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robin Murphy July 11, 2018, 4:56 p.m. UTC | #3
On 10/07/18 19:04, Christoph Hellwig wrote:
> On Tue, Jul 10, 2018 at 06:17:16PM +0100, Robin Murphy wrote:
>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>> index 8be8106270c2..95e185347e34 100644
>> --- a/kernel/dma/direct.c
>> +++ b/kernel/dma/direct.c
>> @@ -183,7 +183,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
>>   	 * Various PCI/PCIe bridges have broken support for > 32bit DMA even
>>   	 * if the device itself might support it.
>>   	 */
>> -	if (dev->dma_32bit_limit && mask > DMA_BIT_MASK(32))
>> +	if (dev->bus_dma_mask && mask > dev->bus_dma_mask)
>>   		return 0;
> 
> The comment above this check needs an updated (or just be removed).

Right, I'll give it a tweak. I could also do with actually getting the 
field name correct in via_no_dac_cb()...

> Also we still have a few architectures not using dma-direct. I guess
> most were doing fine without such limits anyway, but at least arm
> will probably need an equivalent check.

Indeed, once we've found an approach that everyone's happy with we can 
have a more thorough audit of exactly where else it needs to be applied. 
FWIW I'm not aware of any 32-bit Arm systems affected by this*, but if 
they do exist then at least there's no risk of regression since they've 
always been busted.

Robin.


* Not counting the somewhat-similar StrongArm DMA controller bug where 
one bit in the *middle* of the mask is unusable. Let's keep that 
confined to the Arm dmabounce code and never speak of it...
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Robin Murphy July 11, 2018, 6:03 p.m. UTC | #4
On 10/07/18 19:04, Christoph Hellwig wrote:
> On Tue, Jul 10, 2018 at 06:17:17PM +0100, Robin Murphy wrote:
>> When an explicit DMA limit is described by firmware, we need to remember
>> it regardless of how drivers might subsequently update their devices'
>> masks. The new bus_dma_mask field does that.
> 
> Shouldn't we also stop presetting the dma mask after this?

I guess initialising the device masks here only really has any effect if 
drivers fail to set their own, so if we're getting stricter about that 
then it would make sense to stop; I'll add a couple of patches on top to 
clean that up.

Robin.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Christoph Hellwig July 12, 2018, 7:20 a.m. UTC | #5
On Wed, Jul 11, 2018 at 05:56:40PM +0100, Robin Murphy wrote:
> Indeed, once we've found an approach that everyone's happy with we can have 
> a more thorough audit of exactly where else it needs to be applied. FWIW 
> I'm not aware of any 32-bit Arm systems affected by this*, but if they do 
> exist then at least there's no risk of regression since they've always been 
> busted.

Ok, great.  I just assumed arm might be affected due to the fact that
it currently parses dma-ranges DT properties.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html