mbox series

[RFC,00/17] Solve iommu probe races around iommu_fwspec

Message ID 0-v1-5f734af130a3+34f-iommu_fwspec_jgg@nvidia.com
Headers show
Series Solve iommu probe races around iommu_fwspec | expand

Message

Jason Gunthorpe Nov. 3, 2023, 4:44 p.m. UTC
This is a more complete solution that the first attempt here:
https://lore.kernel.org/r/1698825902-10685-1-git-send-email-quic_zhenhuah@quicinc.com

I haven't been able to test this on any HW that touches these paths, so if
some people with HW can help get it in shape it can become non-RFC.


The iommu subsystem uses dev->iommu to store bits of information about the
attached iommu driver. This has been co-opted by the ACPI/OF code to also
be a place to pass around the iommu_fwspec before a driver is probed.

Since both are using the same pointers without any locking it triggers
races if there is concurrent driver loading:

     CPU0                                     CPU1
of_iommu_configure()                iommu_device_register()
 ..                                   bus_iommu_probe()
  iommu_fwspec_of_xlate()              __iommu_probe_device()
                                        iommu_init_device()
   dev_iommu_get()
                                          .. ops->probe fails, no fwspec ..
                                          dev_iommu_free()
   dev->iommu->fwspec    *crash*

My first attempt get correct locking here was to use the device_lock to
protect the entire *_iommu_configure() and iommu_probe() paths. This
allowed safe use of dev->iommu within those paths. Unfortuately enough
drivers abuse the of_iommu_configure() flow without proper locking and
this approach failed.

This approach removes touches of dev->iommu from the *_iommu_configure()
code. The few remaining required touches are moved into iommu.c and
protected with the existing iommu_probe_device_lock.

To do this we change *_iommu_configure() to hold the iommu_fwspec on the
stack while it is being built. Once it is fully formed the core code will
install it into the dev->iommu when it calls probe.

This also removes all the touches of iommu_ops from
the *_iommu_configure() paths and makes that mechanism private to the
iommu core.

A few more lockdep assertions are added to discourage future mis-use.

This is on github: https://github.com/jgunthorpe/linux/commits/iommu_fwspec

Jason Gunthorpe (17):
  iommu: Remove struct iommu_ops *iommu from arch_setup_dma_ops()
  of: Do not return struct iommu_ops from of_iommu_configure()
  of: Use -ENODEV consistently in of_iommu_configure()
  acpi: Do not return struct iommu_ops from acpi_iommu_configure_id()
  iommu: Make iommu_fwspec->ids a distinct allocation
  iommu: Add iommu_fwspec_alloc/dealloc()
  iommu: Add iommu_probe_device_fwspec()
  of: Do not use dev->iommu within of_iommu_configure()
  iommu: Add iommu_fwspec_append_ids()
  acpi: Do not use dev->iommu within acpi_iommu_configure()
  iommu: Hold iommu_probe_device_lock while calling ops->of_xlate
  iommu: Make iommu_ops_from_fwnode() static
  iommu: Remove dev_iommu_fwspec_set()
  iommu: Remove pointless iommu_fwspec_free()
  iommu: Add ops->of_xlate_fwspec()
  iommu: Mark dev_iommu_get() with lockdep
  iommu: Mark dev_iommu_priv_set() with a lockdep

 arch/arc/mm/dma.c                           |   2 +-
 arch/arm/mm/dma-mapping-nommu.c             |   2 +-
 arch/arm/mm/dma-mapping.c                   |  10 +-
 arch/arm64/mm/dma-mapping.c                 |   4 +-
 arch/mips/mm/dma-noncoherent.c              |   2 +-
 arch/riscv/mm/dma-noncoherent.c             |   2 +-
 drivers/acpi/arm64/iort.c                   |  39 ++--
 drivers/acpi/scan.c                         | 104 +++++----
 drivers/acpi/viot.c                         |  44 ++--
 drivers/hv/hv_common.c                      |   2 +-
 drivers/iommu/amd/iommu.c                   |   2 -
 drivers/iommu/apple-dart.c                  |   1 -
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c |   9 +-
 drivers/iommu/arm/arm-smmu/arm-smmu.c       |  23 +-
 drivers/iommu/intel/iommu.c                 |   2 -
 drivers/iommu/iommu.c                       | 227 +++++++++++++++-----
 drivers/iommu/of_iommu.c                    | 129 +++++------
 drivers/iommu/omap-iommu.c                  |   1 -
 drivers/iommu/tegra-smmu.c                  |   1 -
 drivers/iommu/virtio-iommu.c                |   8 +-
 drivers/of/device.c                         |  24 ++-
 include/acpi/acpi_bus.h                     |   8 +-
 include/linux/acpi_iort.h                   |   3 +-
 include/linux/acpi_viot.h                   |   5 +-
 include/linux/dma-map-ops.h                 |   4 +-
 include/linux/iommu.h                       |  46 ++--
 include/linux/of_iommu.h                    |  13 +-
 27 files changed, 417 insertions(+), 300 deletions(-)


base-commit: ab41f1aafb43c2555b358147b14b4d7b8105b452

Comments

André Draszik Nov. 8, 2023, 6:34 p.m. UTC | #1
Hi Jason,

On Fri, 2023-11-03 at 13:44 -0300, Jason Gunthorpe wrote:
> This is a more complete solution that the first attempt here:
> https://lore.kernel.org/r/1698825902-10685-1-git-send-email-quic_zhenhuah@quicinc.com
> 
> I haven't been able to test this on any HW that touches these paths, so if
> some people with HW can help get it in shape it can become non-RFC.

Thank you for this series.

Please note that we're also observing this issue on 6.1.
I think this series is a good candidate for a back port (slightly complicated by
the fact that various refactors have happened since).

For me, it's working fine so far on master, and I've also done my own back port
to 6.1 and am currently testing both. An official back port once finalised
could be useful, though :-)


Cheers,
Andre'
Jason Gunthorpe Nov. 8, 2023, 7:22 p.m. UTC | #2
On Wed, Nov 08, 2023 at 06:34:58PM +0000, André Draszik wrote:

> For me, it's working fine so far on master, and I've also done my own back port
> to 6.1 and am currently testing both. An official back port once finalised
> could be useful, though :-)

Great, I'll post a non-RFC version next week (LPC permitting)

BTW, kbuild 0-day caught your note in the other email and a bunch of
other wonky stuff I've fixed on the github version.

Thanks,
Jason
Zhenhua Huang Nov. 14, 2023, 4:56 a.m. UTC | #3
Thanks Jason.

On 2023/11/4 0:44, Jason Gunthorpe wrote:
> This is a more complete solution that the first attempt here:
> https://lore.kernel.org/r/1698825902-10685-1-git-send-email-quic_zhenhuah@quicinc.com
> 
> I haven't been able to test this on any HW that touches these paths, so if
> some people with HW can help get it in shape it can become non-RFC.

Thank you for addressing it quickly with a thorough way. I have 
backported it to Android common kernel 6.1 and tested basic sanity well.
I will share these to OEMs and see if they can reproduce further, thanks.

Thanks,
Zhenhua