diff mbox

[V8,07/11] iommu: of: Handle IOMMU lookup failure with deferred probing or error

Message ID 26defadf-6380-4af4-6323-b51198376bc1@codeaurora.org
State Not Applicable
Headers show

Commit Message

Sricharan Ramabadhran May 3, 2017, 10:24 a.m. UTC
Hi Robin,

On 5/3/2017 3:24 PM, Robin Murphy wrote:
> Hi Geert,
> 
> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>> Hi Sricharan,
>>
>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R <sricharan@codeaurora.org> wrote:
>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>>>
>>> Failures to look up an IOMMU when parsing the DT iommus property need to
>>> be handled separately from the .of_xlate() failures to support deferred
>>> probing.
>>>
>>> The lack of a registered IOMMU can be caused by the lack of a driver for
>>> the IOMMU, the IOMMU device probe not having been performed yet, having
>>> been deferred, or having failed.
>>>
>>> The first case occurs when the device tree describes the bus master and
>>> IOMMU topology correctly but no device driver exists for the IOMMU yet
>>> or the device driver has not been compiled in. Return NULL, the caller
>>> will configure the device without an IOMMU.
>>>
>>> The second and third cases are handled by deferring the probe of the bus
>>> master device which will eventually get reprobed after the IOMMU.
>>>
>>> The last case is currently handled by deferring the probe of the bus
>>> master device as well. A mechanism to either configure the bus master
>>> device without an IOMMU or to fail the bus master device probe depending
>>> on whether the IOMMU is optional or mandatory would be a good
>>> enhancement.
>>>
>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>> Signed-off-by: Laurent Pichart <laurent.pinchart+renesas@ideasonboard.com>
>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>
>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
>> properties in DT now fail to probe.
> 
> How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
> registered then they should merely defer until we reach the point of
> giving up and ignoring the IOMMU. Is it just that you have no other
> late-probing drivers or post-init module loads to kick the deferred
> queue after that point? I did try to find a way to explicitly kick it
> from a suitably late initcall, but there didn't seem to be any obvious
> public interface - anyone have any suggestions?
> 
> I think that's more of a general problem with the probe deferral
> mechanism itself (I've seen the same thing happen with some of the
> CoreSight stuff on Juno due to the number of inter-component
> dependencies) rather than any specific fault of this series.
> 

I was thinking of an additional check like below to avoid the
situation ?

From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
From: Sricharan R <sricharan@codeaurora.org>
Date: Wed, 3 May 2017 13:16:59 +0530
Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER

While returning EPROBE_DEFER for iommu masters
take in to account of iommu nodes that could be
marked in DT as 'status=disabled', in which case
simply return NULL and let the master's probe
continue rather than deferring.

Signed-off-by: Sricharan R <sricharan@codeaurora.org>
---
 drivers/iommu/of_iommu.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Sricharan Ramabadhran May 3, 2017, 11:13 a.m. UTC | #1
Hi,

On 5/3/2017 3:54 PM, Sricharan R wrote:
> Hi Robin,
> 
> On 5/3/2017 3:24 PM, Robin Murphy wrote:
>> Hi Geert,
>>
>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>>> Hi Sricharan,
>>>
>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R <sricharan@codeaurora.org> wrote:
>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>>>>
>>>> Failures to look up an IOMMU when parsing the DT iommus property need to
>>>> be handled separately from the .of_xlate() failures to support deferred
>>>> probing.
>>>>
>>>> The lack of a registered IOMMU can be caused by the lack of a driver for
>>>> the IOMMU, the IOMMU device probe not having been performed yet, having
>>>> been deferred, or having failed.
>>>>
>>>> The first case occurs when the device tree describes the bus master and
>>>> IOMMU topology correctly but no device driver exists for the IOMMU yet
>>>> or the device driver has not been compiled in. Return NULL, the caller
>>>> will configure the device without an IOMMU.
>>>>
>>>> The second and third cases are handled by deferring the probe of the bus
>>>> master device which will eventually get reprobed after the IOMMU.
>>>>
>>>> The last case is currently handled by deferring the probe of the bus
>>>> master device as well. A mechanism to either configure the bus master
>>>> device without an IOMMU or to fail the bus master device probe depending
>>>> on whether the IOMMU is optional or mandatory would be a good
>>>> enhancement.
>>>>
>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>> Signed-off-by: Laurent Pichart <laurent.pinchart+renesas@ideasonboard.com>
>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>
>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>>> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
>>> properties in DT now fail to probe.
>>
>> How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
>> registered then they should merely defer until we reach the point of
>> giving up and ignoring the IOMMU. Is it just that you have no other
>> late-probing drivers or post-init module loads to kick the deferred
>> queue after that point? I did try to find a way to explicitly kick it
>> from a suitably late initcall, but there didn't seem to be any obvious
>> public interface - anyone have any suggestions?
>>
>> I think that's more of a general problem with the probe deferral
>> mechanism itself (I've seen the same thing happen with some of the
>> CoreSight stuff on Juno due to the number of inter-component
>> dependencies) rather than any specific fault of this series.
>>
> 
> I was thinking of an additional check like below to avoid the
> situation ?
> 
> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> From: Sricharan R <sricharan@codeaurora.org>
> Date: Wed, 3 May 2017 13:16:59 +0530
> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> 
> While returning EPROBE_DEFER for iommu masters
> take in to account of iommu nodes that could be
> marked in DT as 'status=disabled', in which case
> simply return NULL and let the master's probe
> continue rather than deferring.
> 
> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> ---
>  drivers/iommu/of_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 9f44ee8..e6e9bec 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node *np)
> 
>         ops = iommu_ops_from_fwnode(fwnode);
>         if ((ops && !ops->of_xlate) ||
> +           !of_device_is_available(iommu_spec->np) ||
>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>                 return NULL;
> 

While same as the other 'status=disabled' patch [1], better not to
defer the probe itself in the case ?

[1] https://patchwork.kernel.org/patch/9681211/

Regards,
 Sricharan
Will Deacon May 15, 2017, 2:22 p.m. UTC | #2
Hi Sricharan,

On Wed, May 03, 2017 at 03:54:59PM +0530, Sricharan R wrote:
> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> > On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R <sricharan@codeaurora.org> wrote:
> >>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> >>>
> >>> Failures to look up an IOMMU when parsing the DT iommus property need to
> >>> be handled separately from the .of_xlate() failures to support deferred
> >>> probing.
> >>>
> >>> The lack of a registered IOMMU can be caused by the lack of a driver for
> >>> the IOMMU, the IOMMU device probe not having been performed yet, having
> >>> been deferred, or having failed.
> >>>
> >>> The first case occurs when the device tree describes the bus master and
> >>> IOMMU topology correctly but no device driver exists for the IOMMU yet
> >>> or the device driver has not been compiled in. Return NULL, the caller
> >>> will configure the device without an IOMMU.
> >>>
> >>> The second and third cases are handled by deferring the probe of the bus
> >>> master device which will eventually get reprobed after the IOMMU.
> >>>
> >>> The last case is currently handled by deferring the probe of the bus
> >>> master device as well. A mechanism to either configure the bus master
> >>> device without an IOMMU or to fail the bus master device probe depending
> >>> on whether the IOMMU is optional or mandatory would be a good
> >>> enhancement.
> >>>
> >>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>> Signed-off-by: Laurent Pichart <laurent.pinchart+renesas@ideasonboard.com>
> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>
> >> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
> >> properties in DT now fail to probe.
> > 
> > How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
> > registered then they should merely defer until we reach the point of
> > giving up and ignoring the IOMMU. Is it just that you have no other
> > late-probing drivers or post-init module loads to kick the deferred
> > queue after that point? I did try to find a way to explicitly kick it
> > from a suitably late initcall, but there didn't seem to be any obvious
> > public interface - anyone have any suggestions?
> > 
> > I think that's more of a general problem with the probe deferral
> > mechanism itself (I've seen the same thing happen with some of the
> > CoreSight stuff on Juno due to the number of inter-component
> > dependencies) rather than any specific fault of this series.
> > 
> 
> I was thinking of an additional check like below to avoid the
> situation ?
> 
> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> From: Sricharan R <sricharan@codeaurora.org>
> Date: Wed, 3 May 2017 13:16:59 +0530
> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> 
> While returning EPROBE_DEFER for iommu masters
> take in to account of iommu nodes that could be
> marked in DT as 'status=disabled', in which case
> simply return NULL and let the master's probe
> continue rather than deferring.
> 
> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> ---
>  drivers/iommu/of_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 9f44ee8..e6e9bec 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node *np)
> 
>         ops = iommu_ops_from_fwnode(fwnode);
>         if ((ops && !ops->of_xlate) ||
> +           !of_device_is_available(iommu_spec->np) ||
>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>                 return NULL;

Without this patch, v4.12-rc1 hangs on my Juno waiting to mount the root
filesystem. The problem is that the USB controller is behind an SMMU which
is marked as 'status = "disabled"' in the devicetree. Whilst there was a
separate thread with Ard about exactly what this means in terms of the DMA
ops used by upstream devices, your patch above fixes the regression and
I think should go in regardless. The DMA ops issue will likely require
an additional DT binding anyway, to advertise the behaviour of the
IOMMU when it is disabled.

Tested-by: Will Deacon <will.deacon@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>

Could you resend it as a proper patch, please?

Will
Laurent Pinchart May 15, 2017, 8:37 p.m. UTC | #3
Hi Sricharan,

On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> > On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
> >>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> >>> 
> >>> Failures to look up an IOMMU when parsing the DT iommus property need to
> >>> be handled separately from the .of_xlate() failures to support deferred
> >>> probing.
> >>> 
> >>> The lack of a registered IOMMU can be caused by the lack of a driver for
> >>> the IOMMU, the IOMMU device probe not having been performed yet, having
> >>> been deferred, or having failed.
> >>> 
> >>> The first case occurs when the device tree describes the bus master and
> >>> IOMMU topology correctly but no device driver exists for the IOMMU yet
> >>> or the device driver has not been compiled in. Return NULL, the caller
> >>> will configure the device without an IOMMU.
> >>> 
> >>> The second and third cases are handled by deferring the probe of the bus
> >>> master device which will eventually get reprobed after the IOMMU.
> >>> 
> >>> The last case is currently handled by deferring the probe of the bus
> >>> master device as well. A mechanism to either configure the bus master
> >>> device without an IOMMU or to fail the bus master device probe depending
> >>> on whether the IOMMU is optional or mandatory would be a good
> >>> enhancement.
> >>> 
> >>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>> Signed-off-by: Laurent Pichart
> >>> <laurent.pinchart+renesas@ideasonboard.com>
> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >> 
> >> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
> >> properties in DT now fail to probe.
> > 
> > How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
> > registered then they should merely defer until we reach the point of
> > giving up and ignoring the IOMMU. Is it just that you have no other
> > late-probing drivers or post-init module loads to kick the deferred
> > queue after that point? I did try to find a way to explicitly kick it
> > from a suitably late initcall, but there didn't seem to be any obvious
> > public interface - anyone have any suggestions?
> > 
> > I think that's more of a general problem with the probe deferral
> > mechanism itself (I've seen the same thing happen with some of the
> > CoreSight stuff on Juno due to the number of inter-component
> > dependencies) rather than any specific fault of this series.
> 
> I was thinking of an additional check like below to avoid the
> situation ?
> 
> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> From: Sricharan R <sricharan@codeaurora.org>
> Date: Wed, 3 May 2017 13:16:59 +0530
> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> 
> While returning EPROBE_DEFER for iommu masters
> take in to account of iommu nodes that could be
> marked in DT as 'status=disabled', in which case
> simply return NULL and let the master's probe
> continue rather than deferring.
> 
> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> ---
>  drivers/iommu/of_iommu.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index 9f44ee8..e6e9bec 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node
> *np)
> 
>         ops = iommu_ops_from_fwnode(fwnode);
>         if ((ops && !ops->of_xlate) ||
> +           !of_device_is_available(iommu_spec->np) ||
>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>                 return NULL;

This looks good to me, but won't be enough. The ipmmu-vmsa driver in v4.12-rc1 
doesn't call iommu_device_register() and thus won't be found by 
iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(), and thus 
will always be considered as absent.

I agree that the ipmmu-vmsa driver needs to be fixed, but it would have been 
nice to check existing IOMMU drivers before merging this patch series...

> >> This can be fixed by either:
> >>   - Disabling CONFIG_IPMMU_VMSA, or
> >>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
> >> 
> >> Note that this was a bit hard to investigate, as R-Car Gen3 support
> >> wasn't upstreamed yet, so bisection pointed to a merge commit.
Laurent Pinchart May 15, 2017, 9:34 p.m. UTC | #4
Hi Sricharan,

On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
> > On 5/3/2017 3:24 PM, Robin Murphy wrote:
> >> On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
> >>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> >>>> 
> >>>> Failures to look up an IOMMU when parsing the DT iommus property need
> >>>> to be handled separately from the .of_xlate() failures to support
> >>>> deferred probing.
> >>>> 
> >>>> The lack of a registered IOMMU can be caused by the lack of a driver
> >>>> for the IOMMU, the IOMMU device probe not having been performed yet,
> >>>> having been deferred, or having failed.
> >>>> 
> >>>> The first case occurs when the device tree describes the bus master
> >>>> and IOMMU topology correctly but no device driver exists for the IOMMU
> >>>> yet or the device driver has not been compiled in. Return NULL, the
> >>>> caller will configure the device without an IOMMU.
> >>>> 
> >>>> The second and third cases are handled by deferring the probe of the
> >>>> bus master device which will eventually get reprobed after the IOMMU.
> >>>> 
> >>>> The last case is currently handled by deferring the probe of the bus
> >>>> master device as well. A mechanism to either configure the bus master
> >>>> device without an IOMMU or to fail the bus master device probe
> >>>> depending on whether the IOMMU is optional or mandatory would be a good
> >>>> enhancement.
> >>>> 
> >>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>> Signed-off-by: Laurent Pichart
> >>>> <laurent.pinchart+renesas@ideasonboard.com>
> >>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>> 
> >>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >>> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
> >>> properties in DT now fail to probe.
> >> 
> >> How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
> >> registered then they should merely defer until we reach the point of
> >> giving up and ignoring the IOMMU. Is it just that you have no other
> >> late-probing drivers or post-init module loads to kick the deferred
> >> queue after that point? I did try to find a way to explicitly kick it
> >> from a suitably late initcall, but there didn't seem to be any obvious
> >> public interface - anyone have any suggestions?
> >> 
> >> I think that's more of a general problem with the probe deferral
> >> mechanism itself (I've seen the same thing happen with some of the
> >> CoreSight stuff on Juno due to the number of inter-component
> >> dependencies) rather than any specific fault of this series.
> > 
> > I was thinking of an additional check like below to avoid the
> > situation ?
> > 
> > From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> > From: Sricharan R <sricharan@codeaurora.org>
> > Date: Wed, 3 May 2017 13:16:59 +0530
> > Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> > 
> > While returning EPROBE_DEFER for iommu masters
> > take in to account of iommu nodes that could be
> > marked in DT as 'status=disabled', in which case
> > simply return NULL and let the master's probe
> > continue rather than deferring.
> > 
> > Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> > ---
> > 
> >  drivers/iommu/of_iommu.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> > index 9f44ee8..e6e9bec 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node
> > *np)
> > 
> >         ops = iommu_ops_from_fwnode(fwnode);
> >         if ((ops && !ops->of_xlate) ||
> > +           !of_device_is_available(iommu_spec->np) ||
> >             (!ops && !of_iommu_driver_present(iommu_spec->np)))
> >                 return NULL;
> 
> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found by
> iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(), and
> thus will always be considered as absent.
> 
> I agree that the ipmmu-vmsa driver needs to be fixed, but it would have been
> nice to check existing IOMMU drivers before merging this patch series...

Please pardon the question, but has this patch series been tested on ARM32 ?

When the device is probed the arch_setup_dma_ops() function is called. It sets 
the device's dma_ops and the mapping (in __arm_iommu_attach_device()). If 
probe is deferred, arch_teardown_dma_ops() is called which in turn calls 
arch_teardown_dma_ops(). This removes the mapping but doesn't touch the 
dma_ops. The next time the device is probed, arch_setup_dma_ops() bails out 
immediately as the dma_ops are already set, leaving us with a device bound to 
IOMMU operations but with no mapping. This oopses later as soon as the kernel 
tries to map memory for the device through the IOMMU.

I might be missing something obvious, but I don't see how this can work.

> >>> This can be fixed by either:
> >>>   - Disabling CONFIG_IPMMU_VMSA, or
> >>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
> >>> 
> >>> Note that this was a bit hard to investigate, as R-Car Gen3 support
> >>> wasn't upstreamed yet, so bisection pointed to a merge commit.
Sricharan Ramabadhran May 16, 2017, 2:23 a.m. UTC | #5
Hi Laurent,

On 2017-05-16 03:04, Laurent Pinchart wrote:
> Hi Sricharan,
> 
> On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
>> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
>> > On 5/3/2017 3:24 PM, Robin Murphy wrote:
>> >> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>> >>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
>> >>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>> >>>>
>> >>>> Failures to look up an IOMMU when parsing the DT iommus property need
>> >>>> to be handled separately from the .of_xlate() failures to support
>> >>>> deferred probing.
>> >>>>
>> >>>> The lack of a registered IOMMU can be caused by the lack of a driver
>> >>>> for the IOMMU, the IOMMU device probe not having been performed yet,
>> >>>> having been deferred, or having failed.
>> >>>>
>> >>>> The first case occurs when the device tree describes the bus master
>> >>>> and IOMMU topology correctly but no device driver exists for the IOMMU
>> >>>> yet or the device driver has not been compiled in. Return NULL, the
>> >>>> caller will configure the device without an IOMMU.
>> >>>>
>> >>>> The second and third cases are handled by deferring the probe of the
>> >>>> bus master device which will eventually get reprobed after the IOMMU.
>> >>>>
>> >>>> The last case is currently handled by deferring the probe of the bus
>> >>>> master device as well. A mechanism to either configure the bus master
>> >>>> device without an IOMMU or to fail the bus master device probe
>> >>>> depending on whether the IOMMU is optional or mandatory would be a good
>> >>>> enhancement.
>> >>>>
>> >>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> >>>> Signed-off-by: Laurent Pichart
>> >>>> <laurent.pinchart+renesas@ideasonboard.com>
>> >>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> >>>
>> >>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>> >>> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
>> >>> properties in DT now fail to probe.
>> >>
>> >> How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
>> >> registered then they should merely defer until we reach the point of
>> >> giving up and ignoring the IOMMU. Is it just that you have no other
>> >> late-probing drivers or post-init module loads to kick the deferred
>> >> queue after that point? I did try to find a way to explicitly kick it
>> >> from a suitably late initcall, but there didn't seem to be any obvious
>> >> public interface - anyone have any suggestions?
>> >>
>> >> I think that's more of a general problem with the probe deferral
>> >> mechanism itself (I've seen the same thing happen with some of the
>> >> CoreSight stuff on Juno due to the number of inter-component
>> >> dependencies) rather than any specific fault of this series.
>> >
>> > I was thinking of an additional check like below to avoid the
>> > situation ?
>> >
>> > From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
>> > From: Sricharan R <sricharan@codeaurora.org>
>> > Date: Wed, 3 May 2017 13:16:59 +0530
>> > Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
>> >
>> > While returning EPROBE_DEFER for iommu masters
>> > take in to account of iommu nodes that could be
>> > marked in DT as 'status=disabled', in which case
>> > simply return NULL and let the master's probe
>> > continue rather than deferring.
>> >
>> > Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> > ---
>> >
>> >  drivers/iommu/of_iommu.c | 1 +
>> >  1 file changed, 1 insertion(+)
>> >
>> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> > index 9f44ee8..e6e9bec 100644
>> > --- a/drivers/iommu/of_iommu.c
>> > +++ b/drivers/iommu/of_iommu.c
>> > @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct device_node
>> > *np)
>> >
>> >         ops = iommu_ops_from_fwnode(fwnode);
>> >         if ((ops && !ops->of_xlate) ||
>> > +           !of_device_is_available(iommu_spec->np) ||
>> >             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>> >                 return NULL;
>> 
>> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
>> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found 
>> by
>> iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(), 
>> and
>> thus will always be considered as absent.
>> 
>> I agree that the ipmmu-vmsa driver needs to be fixed, but it would 
>> have been
>> nice to check existing IOMMU drivers before merging this patch 
>> series...
> 
> Please pardon the question, but has this patch series been tested on 
> ARM32 ?
> 
> When the device is probed the arch_setup_dma_ops() function is called. 
> It sets
> the device's dma_ops and the mapping (in __arm_iommu_attach_device()). 
> If
> probe is deferred, arch_teardown_dma_ops() is called which in turn 
> calls
> arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
> dma_ops. The next time the device is probed, arch_setup_dma_ops() bails 
> out
> immediately as the dma_ops are already set, leaving us with a device 
> bound to
> IOMMU operations but with no mapping. This oopses later as soon as the 
> kernel
> tries to map memory for the device through the IOMMU.

Resetting the dma_ops for arm32 was added in this patch [1], which I 
missed
to send in the original series, but now have added to Russell's patch 
tracking
system.

[1] https://patchwork.kernel.org/patch/9434105/

Regards,
  Sricharan

> 
> I might be missing something obvious, but I don't see how this can 
> work.
> 
>> >>> This can be fixed by either:
>> >>>   - Disabling CONFIG_IPMMU_VMSA, or
>> >>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
>> >>>
>> >>> Note that this was a bit hard to investigate, as R-Car Gen3 support
>> >>> wasn't upstreamed yet, so bisection pointed to a merge commit.
Sricharan Ramabadhran May 16, 2017, 2:26 a.m. UTC | #6
Hi Will,

On 2017-05-15 19:52, Will Deacon wrote:
> Hi Sricharan,
> 
> On Wed, May 03, 2017 at 03:54:59PM +0530, Sricharan R wrote:
>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
>> > On 02/05/17 19:35, Geert Uytterhoeven wrote:
>> >> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R <sricharan@codeaurora.org> wrote:
>> >>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>> >>>
>> >>> Failures to look up an IOMMU when parsing the DT iommus property need to
>> >>> be handled separately from the .of_xlate() failures to support deferred
>> >>> probing.
>> >>>
>> >>> The lack of a registered IOMMU can be caused by the lack of a driver for
>> >>> the IOMMU, the IOMMU device probe not having been performed yet, having
>> >>> been deferred, or having failed.
>> >>>
>> >>> The first case occurs when the device tree describes the bus master and
>> >>> IOMMU topology correctly but no device driver exists for the IOMMU yet
>> >>> or the device driver has not been compiled in. Return NULL, the caller
>> >>> will configure the device without an IOMMU.
>> >>>
>> >>> The second and third cases are handled by deferring the probe of the bus
>> >>> master device which will eventually get reprobed after the IOMMU.
>> >>>
>> >>> The last case is currently handled by deferring the probe of the bus
>> >>> master device as well. A mechanism to either configure the bus master
>> >>> device without an IOMMU or to fail the bus master device probe depending
>> >>> on whether the IOMMU is optional or mandatory would be a good
>> >>> enhancement.
>> >>>
>> >>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> >>> Signed-off-by: Laurent Pichart <laurent.pinchart+renesas@ideasonboard.com>
>> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> >>
>> >> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>> >> As the IOMMU nodes in DT are not yet enabled, all devices having iommus
>> >> properties in DT now fail to probe.
>> >
>> > How exactly do they fail to probe? Per d7b0558230e4, if there are no ops
>> > registered then they should merely defer until we reach the point of
>> > giving up and ignoring the IOMMU. Is it just that you have no other
>> > late-probing drivers or post-init module loads to kick the deferred
>> > queue after that point? I did try to find a way to explicitly kick it
>> > from a suitably late initcall, but there didn't seem to be any obvious
>> > public interface - anyone have any suggestions?
>> >
>> > I think that's more of a general problem with the probe deferral
>> > mechanism itself (I've seen the same thing happen with some of the
>> > CoreSight stuff on Juno due to the number of inter-component
>> > dependencies) rather than any specific fault of this series.
>> >
>> 
>> I was thinking of an additional check like below to avoid the
>> situation ?
>> 
>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
>> From: Sricharan R <sricharan@codeaurora.org>
>> Date: Wed, 3 May 2017 13:16:59 +0530
>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
>> 
>> While returning EPROBE_DEFER for iommu masters
>> take in to account of iommu nodes that could be
>> marked in DT as 'status=disabled', in which case
>> simply return NULL and let the master's probe
>> continue rather than deferring.
>> 
>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> ---
>>  drivers/iommu/of_iommu.c | 1 +
>>  1 file changed, 1 insertion(+)
>> 
>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> index 9f44ee8..e6e9bec 100644
>> --- a/drivers/iommu/of_iommu.c
>> +++ b/drivers/iommu/of_iommu.c
>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct 
>> device_node *np)
>> 
>>         ops = iommu_ops_from_fwnode(fwnode);
>>         if ((ops && !ops->of_xlate) ||
>> +           !of_device_is_available(iommu_spec->np) ||
>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>>                 return NULL;
> 
> Without this patch, v4.12-rc1 hangs on my Juno waiting to mount the 
> root
> filesystem. The problem is that the USB controller is behind an SMMU 
> which
> is marked as 'status = "disabled"' in the devicetree. Whilst there was 
> a
> separate thread with Ard about exactly what this means in terms of the 
> DMA
> ops used by upstream devices, your patch above fixes the regression and
> I think should go in regardless. The DMA ops issue will likely require
> an additional DT binding anyway, to advertise the behaviour of the
> IOMMU when it is disabled.
> 
> Tested-by: Will Deacon <will.deacon@arm.com>
> Acked-by: Will Deacon <will.deacon@arm.com>
> 
> Could you resend it as a proper patch, please?

Sure, will send this as a separate patch.

Regards,
  Sricharan
Laurent Pinchart May 16, 2017, 7:17 a.m. UTC | #7
Hi Sricharan,

On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> On 2017-05-16 03:04, Laurent Pinchart wrote:
> > On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
> >> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
> >>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> >>>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >>>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
> >>>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> >>>>>> 
> >>>>>> Failures to look up an IOMMU when parsing the DT iommus property
> >>>>>> need to be handled separately from the .of_xlate() failures to
> >>>>>> support deferred probing.
> >>>>>> 
> >>>>>> The lack of a registered IOMMU can be caused by the lack of a driver
> >>>>>> for the IOMMU, the IOMMU device probe not having been performed yet,
> >>>>>> having been deferred, or having failed.
> >>>>>> 
> >>>>>> The first case occurs when the device tree describes the bus master
> >>>>>> and IOMMU topology correctly but no device driver exists for the
> >>>>>> IOMMU yet or the device driver has not been compiled in. Return NULL,
> >>>>>> the caller will configure the device without an IOMMU.
> >>>>>> 
> >>>>>> The second and third cases are handled by deferring the probe of the
> >>>>>> bus master device which will eventually get reprobed after the
> >>>>>> IOMMU.
> >>>>>> 
> >>>>>> The last case is currently handled by deferring the probe of the bus
> >>>>>> master device as well. A mechanism to either configure the bus
> >>>>>> master device without an IOMMU or to fail the bus master device probe
> >>>>>> depending on whether the IOMMU is optional or mandatory would be a
> >>>>>> good enhancement.
> >>>>>> 
> >>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>>>> Signed-off-by: Laurent Pichart
> >>>>>> <laurent.pinchart+renesas@ideasonboard.com>
> >>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>>>> 
> >>>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >>>>> As the IOMMU nodes in DT are not yet enabled, all devices having
> >>>>> iommus properties in DT now fail to probe.
> >>>> 
> >>>> How exactly do they fail to probe? Per d7b0558230e4, if there are no
> >>>> ops registered then they should merely defer until we reach the point
> >>>> of giving up and ignoring the IOMMU. Is it just that you have no other
> >>>> late-probing drivers or post-init module loads to kick the deferred
> >>>> queue after that point? I did try to find a way to explicitly kick it
> >>>> from a suitably late initcall, but there didn't seem to be any obvious
> >>>> public interface - anyone have any suggestions?
> >>>> 
> >>>> I think that's more of a general problem with the probe deferral
> >>>> mechanism itself (I've seen the same thing happen with some of the
> >>>> CoreSight stuff on Juno due to the number of inter-component
> >>>> dependencies) rather than any specific fault of this series.
> >>> 
> >>> I was thinking of an additional check like below to avoid the
> >>> situation ?
> >>> 
> >>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> >>> From: Sricharan R <sricharan@codeaurora.org>
> >>> Date: Wed, 3 May 2017 13:16:59 +0530
> >>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> >>> 
> >>> While returning EPROBE_DEFER for iommu masters
> >>> take in to account of iommu nodes that could be
> >>> marked in DT as 'status=disabled', in which case
> >>> simply return NULL and let the master's probe
> >>> continue rather than deferring.
> >>> 
> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>> ---
> >>> 
> >>>  drivers/iommu/of_iommu.c | 1 +
> >>>  1 file changed, 1 insertion(+)
> >>> 
> >>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> >>> index 9f44ee8..e6e9bec 100644
> >>> --- a/drivers/iommu/of_iommu.c
> >>> +++ b/drivers/iommu/of_iommu.c
> >>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct
> >>> device_node *np)
> >>> 
> >>>         ops = iommu_ops_from_fwnode(fwnode);
> >>>         if ((ops && !ops->of_xlate) ||
> >>> +           !of_device_is_available(iommu_spec->np) ||
> >>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
> >>>                 return NULL;
> >> 
> >> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
> >> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found
> >> by iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(),
> >> and thus will always be considered as absent.
> >> 
> >> I agree that the ipmmu-vmsa driver needs to be fixed, but it would
> >> have been nice to check existing IOMMU drivers before merging this patch
> >> series...
> > 
> > Please pardon the question, but has this patch series been tested on
> > ARM32 ?
> > 
> > When the device is probed the arch_setup_dma_ops() function is called.
> > It sets the device's dma_ops and the mapping (in
> > __arm_iommu_attach_device()). If probe is deferred,
> > arch_teardown_dma_ops() is called which in turn calls
> > arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
> > dma_ops. The next time the device is probed, arch_setup_dma_ops() bails
> > out immediately as the dma_ops are already set, leaving us with a device
> > bound to IOMMU operations but with no mapping. This oopses later as soon
> > as the kernel tries to map memory for the device through the IOMMU.
> 
> Resetting the dma_ops for arm32 was added in this patch [1], which I
> missed to send in the original series, but now have added to Russell's patch
> tracking system.

Thank you. I fear that won't be enough though.

> [1] https://patchwork.kernel.org/patch/9434105/

Quoting the patch:

> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
> ,dma_ops should be cleared in the teardown path. Otherwise
> this causes problem when the probe of device is retried after
> being deferred. The device's iommu structures are cleared
> after EPROBEDEFER error, but on the next try dma_ops will still
> be set to old value, which is not right.
>
> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> ---
>   arch/arm/mm/dma-mapping.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> index ab4f745..a40f03e 100644
> --- a/arch/arm/mm/dma-mapping.c
> +++ b/arch/arm/mm/dma-mapping.c
> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct device 
*dev)
>   	__arm_iommu_detach_device(dev);
>   	arm_iommu_release_mapping(mapping);
> +	set_dma_ops(dev, NULL);
>   }
>   #else

The subject mentions arch_teardown_dma_ops(), which I think is correct, but 
the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().

However, the situation is perhaps more complex. Note the check at the 
beginning of arch_setup_dma_ops():

	/*
	 * Don't override the dma_ops if they have already been set. Ideally
	 * this should be the only location where dma_ops are set, remove this
	 * check when all other callers of set_dma_ops will have disappeared.
	 */
	if (dev->dma_ops)
		return;

If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or 
arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will override 
them. To be safe you should only set them to NULL if they have been set by 
arch_setup_dma_ops(). More than that, arch_teardown_dma_ops() should probably 
not call arm_teardown_iommu_dma_ops() at all if the dma_ops were set by 
arm_iommu_attach_device() and not arch_teardown_dma_ops(). One option would be 
to add a field to struct dev_archdata to store that information. To avoid 
growing the structure, which is embedded in every struct device, you could 
possibly turn the dma_coherent bool into a bitfield.

@@ -19,7 +19,8 @@ struct dev_archdata {
 #ifdef CONFIG_XEN
 	const struct dma_map_ops *dev_dma_ops;
 #endif
-	bool dma_coherent;
+	bool dma_coherent:1;
+	bool dma_ops_setup:1;
 };
 
 struct omap_device;

I haven't checked, however, whether the dma_coherent field would need to be 
accessed atomically, so this might be a bad idea.

Last but not least, a fix must be merged in v4.12, and the sooner the better.

> > I might be missing something obvious, but I don't see how this can
> > work.
> > 
> >>>>> This can be fixed by either:
> >>>>>   - Disabling CONFIG_IPMMU_VMSA, or
> >>>>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
> >>>>> 
> >>>>> Note that this was a bit hard to investigate, as R-Car Gen3 support
> >>>>> wasn't upstreamed yet, so bisection pointed to a merge commit.
Sakari Ailus May 16, 2017, 9:47 a.m. UTC | #8
Hi Laurent,

On Tue, May 16, 2017 at 10:17:08AM +0300, Laurent Pinchart wrote:
> Hi Sricharan,
> 
> On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> > On 2017-05-16 03:04, Laurent Pinchart wrote:
> > > On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
> > >> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
> > >>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> > >>>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
> > >>>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
> > >>>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> > >>>>>> 
> > >>>>>> Failures to look up an IOMMU when parsing the DT iommus property
> > >>>>>> need to be handled separately from the .of_xlate() failures to
> > >>>>>> support deferred probing.
> > >>>>>> 
> > >>>>>> The lack of a registered IOMMU can be caused by the lack of a driver
> > >>>>>> for the IOMMU, the IOMMU device probe not having been performed yet,
> > >>>>>> having been deferred, or having failed.
> > >>>>>> 
> > >>>>>> The first case occurs when the device tree describes the bus master
> > >>>>>> and IOMMU topology correctly but no device driver exists for the
> > >>>>>> IOMMU yet or the device driver has not been compiled in. Return NULL,
> > >>>>>> the caller will configure the device without an IOMMU.
> > >>>>>> 
> > >>>>>> The second and third cases are handled by deferring the probe of the
> > >>>>>> bus master device which will eventually get reprobed after the
> > >>>>>> IOMMU.
> > >>>>>> 
> > >>>>>> The last case is currently handled by deferring the probe of the bus
> > >>>>>> master device as well. A mechanism to either configure the bus
> > >>>>>> master device without an IOMMU or to fail the bus master device probe
> > >>>>>> depending on whether the IOMMU is optional or mandatory would be a
> > >>>>>> good enhancement.
> > >>>>>> 
> > >>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> > >>>>>> Signed-off-by: Laurent Pichart
> > >>>>>> <laurent.pinchart+renesas@ideasonboard.com>
> > >>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> > >>>>> 
> > >>>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> > >>>>> As the IOMMU nodes in DT are not yet enabled, all devices having
> > >>>>> iommus properties in DT now fail to probe.
> > >>>> 
> > >>>> How exactly do they fail to probe? Per d7b0558230e4, if there are no
> > >>>> ops registered then they should merely defer until we reach the point
> > >>>> of giving up and ignoring the IOMMU. Is it just that you have no other
> > >>>> late-probing drivers or post-init module loads to kick the deferred
> > >>>> queue after that point? I did try to find a way to explicitly kick it
> > >>>> from a suitably late initcall, but there didn't seem to be any obvious
> > >>>> public interface - anyone have any suggestions?
> > >>>> 
> > >>>> I think that's more of a general problem with the probe deferral
> > >>>> mechanism itself (I've seen the same thing happen with some of the
> > >>>> CoreSight stuff on Juno due to the number of inter-component
> > >>>> dependencies) rather than any specific fault of this series.
> > >>> 
> > >>> I was thinking of an additional check like below to avoid the
> > >>> situation ?
> > >>> 
> > >>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
> > >>> From: Sricharan R <sricharan@codeaurora.org>
> > >>> Date: Wed, 3 May 2017 13:16:59 +0530
> > >>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> > >>> 
> > >>> While returning EPROBE_DEFER for iommu masters
> > >>> take in to account of iommu nodes that could be
> > >>> marked in DT as 'status=disabled', in which case
> > >>> simply return NULL and let the master's probe
> > >>> continue rather than deferring.
> > >>> 
> > >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> > >>> ---
> > >>> 
> > >>>  drivers/iommu/of_iommu.c | 1 +
> > >>>  1 file changed, 1 insertion(+)
> > >>> 
> > >>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> > >>> index 9f44ee8..e6e9bec 100644
> > >>> --- a/drivers/iommu/of_iommu.c
> > >>> +++ b/drivers/iommu/of_iommu.c
> > >>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct
> > >>> device_node *np)
> > >>> 
> > >>>         ops = iommu_ops_from_fwnode(fwnode);
> > >>>         if ((ops && !ops->of_xlate) ||
> > >>> +           !of_device_is_available(iommu_spec->np) ||
> > >>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
> > >>>                 return NULL;
> > >> 
> > >> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
> > >> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found
> > >> by iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(),
> > >> and thus will always be considered as absent.
> > >> 
> > >> I agree that the ipmmu-vmsa driver needs to be fixed, but it would
> > >> have been nice to check existing IOMMU drivers before merging this patch
> > >> series...
> > > 
> > > Please pardon the question, but has this patch series been tested on
> > > ARM32 ?
> > > 
> > > When the device is probed the arch_setup_dma_ops() function is called.
> > > It sets the device's dma_ops and the mapping (in
> > > __arm_iommu_attach_device()). If probe is deferred,
> > > arch_teardown_dma_ops() is called which in turn calls
> > > arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
> > > dma_ops. The next time the device is probed, arch_setup_dma_ops() bails
> > > out immediately as the dma_ops are already set, leaving us with a device
> > > bound to IOMMU operations but with no mapping. This oopses later as soon
> > > as the kernel tries to map memory for the device through the IOMMU.
> > 
> > Resetting the dma_ops for arm32 was added in this patch [1], which I
> > missed to send in the original series, but now have added to Russell's patch
> > tracking system.
> 
> Thank you. I fear that won't be enough though.
> 
> > [1] https://patchwork.kernel.org/patch/9434105/
> 
> Quoting the patch:
> 
> > arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
> > ,dma_ops should be cleared in the teardown path. Otherwise
> > this causes problem when the probe of device is retried after
> > being deferred. The device's iommu structures are cleared
> > after EPROBEDEFER error, but on the next try dma_ops will still
> > be set to old value, which is not right.
> >
> > Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> > Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> > ---
> >   arch/arm/mm/dma-mapping.c | 1 +
> >   1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> > index ab4f745..a40f03e 100644
> > --- a/arch/arm/mm/dma-mapping.c
> > +++ b/arch/arm/mm/dma-mapping.c
> > @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct device 
> *dev)
> >   	__arm_iommu_detach_device(dev);
> >   	arm_iommu_release_mapping(mapping);
> > +	set_dma_ops(dev, NULL);
> >   }
> >   #else
> 
> The subject mentions arch_teardown_dma_ops(), which I think is correct, but 
> the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
> 
> However, the situation is perhaps more complex. Note the check at the 
> beginning of arch_setup_dma_ops():
> 
> 	/*
> 	 * Don't override the dma_ops if they have already been set. Ideally
> 	 * this should be the only location where dma_ops are set, remove this
> 	 * check when all other callers of set_dma_ops will have disappeared.
> 	 */
> 	if (dev->dma_ops)
> 		return;
> 
> If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or 
> arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will override 
> them. To be safe you should only set them to NULL if they have been set by 
> arch_setup_dma_ops(). More than that, arch_teardown_dma_ops() should probably 
> not call arm_teardown_iommu_dma_ops() at all if the dma_ops were set by 
> arm_iommu_attach_device() and not arch_teardown_dma_ops(). One option would be 
> to add a field to struct dev_archdata to store that information. To avoid 
> growing the structure, which is embedded in every struct device, you could 
> possibly turn the dma_coherent bool into a bitfield.
> 
> @@ -19,7 +19,8 @@ struct dev_archdata {
>  #ifdef CONFIG_XEN
>  	const struct dma_map_ops *dev_dma_ops;
>  #endif
> -	bool dma_coherent;
> +	bool dma_coherent:1;
> +	bool dma_ops_setup:1;
>  };
>  
>  struct omap_device;
> 
> I haven't checked, however, whether the dma_coherent field would need to be 
> accessed atomically, so this might be a bad idea.

A bool bit field? :-)

I think I'd just use bool for both. I wouldn't expect dma_coherent change
once it has been set before device driver probe though.

If you like a bit field, then I'd propose making it unsigned int.
Sricharan Ramabadhran May 16, 2017, 1:40 p.m. UTC | #9
Hi Laurent,

On 2017-05-16 12:47, Laurent Pinchart wrote:
> Hi Sricharan,
> 
> On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
>> On 2017-05-16 03:04, Laurent Pinchart wrote:
>> > On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
>> >> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
>> >>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
>> >>>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>> >>>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
>> >>>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>> >>>>>>
>> >>>>>> Failures to look up an IOMMU when parsing the DT iommus property
>> >>>>>> need to be handled separately from the .of_xlate() failures to
>> >>>>>> support deferred probing.
>> >>>>>>
>> >>>>>> The lack of a registered IOMMU can be caused by the lack of a driver
>> >>>>>> for the IOMMU, the IOMMU device probe not having been performed yet,
>> >>>>>> having been deferred, or having failed.
>> >>>>>>
>> >>>>>> The first case occurs when the device tree describes the bus master
>> >>>>>> and IOMMU topology correctly but no device driver exists for the
>> >>>>>> IOMMU yet or the device driver has not been compiled in. Return NULL,
>> >>>>>> the caller will configure the device without an IOMMU.
>> >>>>>>
>> >>>>>> The second and third cases are handled by deferring the probe of the
>> >>>>>> bus master device which will eventually get reprobed after the
>> >>>>>> IOMMU.
>> >>>>>>
>> >>>>>> The last case is currently handled by deferring the probe of the bus
>> >>>>>> master device as well. A mechanism to either configure the bus
>> >>>>>> master device without an IOMMU or to fail the bus master device probe
>> >>>>>> depending on whether the IOMMU is optional or mandatory would be a
>> >>>>>> good enhancement.
>> >>>>>>
>> >>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>> >>>>>> Signed-off-by: Laurent Pichart
>> >>>>>> <laurent.pinchart+renesas@ideasonboard.com>
>> >>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> >>>>>
>> >>>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>> >>>>> As the IOMMU nodes in DT are not yet enabled, all devices having
>> >>>>> iommus properties in DT now fail to probe.
>> >>>>
>> >>>> How exactly do they fail to probe? Per d7b0558230e4, if there are no
>> >>>> ops registered then they should merely defer until we reach the point
>> >>>> of giving up and ignoring the IOMMU. Is it just that you have no other
>> >>>> late-probing drivers or post-init module loads to kick the deferred
>> >>>> queue after that point? I did try to find a way to explicitly kick it
>> >>>> from a suitably late initcall, but there didn't seem to be any obvious
>> >>>> public interface - anyone have any suggestions?
>> >>>>
>> >>>> I think that's more of a general problem with the probe deferral
>> >>>> mechanism itself (I've seen the same thing happen with some of the
>> >>>> CoreSight stuff on Juno due to the number of inter-component
>> >>>> dependencies) rather than any specific fault of this series.
>> >>>
>> >>> I was thinking of an additional check like below to avoid the
>> >>> situation ?
>> >>>
>> >>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
>> >>> From: Sricharan R <sricharan@codeaurora.org>
>> >>> Date: Wed, 3 May 2017 13:16:59 +0530
>> >>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
>> >>>
>> >>> While returning EPROBE_DEFER for iommu masters
>> >>> take in to account of iommu nodes that could be
>> >>> marked in DT as 'status=disabled', in which case
>> >>> simply return NULL and let the master's probe
>> >>> continue rather than deferring.
>> >>>
>> >>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> >>> ---
>> >>>
>> >>>  drivers/iommu/of_iommu.c | 1 +
>> >>>  1 file changed, 1 insertion(+)
>> >>>
>> >>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>> >>> index 9f44ee8..e6e9bec 100644
>> >>> --- a/drivers/iommu/of_iommu.c
>> >>> +++ b/drivers/iommu/of_iommu.c
>> >>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct
>> >>> device_node *np)
>> >>>
>> >>>         ops = iommu_ops_from_fwnode(fwnode);
>> >>>         if ((ops && !ops->of_xlate) ||
>> >>> +           !of_device_is_available(iommu_spec->np) ||
>> >>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>> >>>                 return NULL;
>> >>
>> >> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
>> >> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found
>> >> by iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(),
>> >> and thus will always be considered as absent.
>> >>
>> >> I agree that the ipmmu-vmsa driver needs to be fixed, but it would
>> >> have been nice to check existing IOMMU drivers before merging this patch
>> >> series...
>> >
>> > Please pardon the question, but has this patch series been tested on
>> > ARM32 ?
>> >
>> > When the device is probed the arch_setup_dma_ops() function is called.
>> > It sets the device's dma_ops and the mapping (in
>> > __arm_iommu_attach_device()). If probe is deferred,
>> > arch_teardown_dma_ops() is called which in turn calls
>> > arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
>> > dma_ops. The next time the device is probed, arch_setup_dma_ops() bails
>> > out immediately as the dma_ops are already set, leaving us with a device
>> > bound to IOMMU operations but with no mapping. This oopses later as soon
>> > as the kernel tries to map memory for the device through the IOMMU.
>> 
>> Resetting the dma_ops for arm32 was added in this patch [1], which I
>> missed to send in the original series, but now have added to Russell's 
>> patch
>> tracking system.
> 
> Thank you. I fear that won't be enough though.
> 
>> [1] https://patchwork.kernel.org/patch/9434105/
> 
> Quoting the patch:
> 
>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
>> ,dma_ops should be cleared in the teardown path. Otherwise
>> this causes problem when the probe of device is retried after
>> being deferred. The device's iommu structures are cleared
>> after EPROBEDEFER error, but on the next try dma_ops will still
>> be set to old value, which is not right.
>> 
>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   arch/arm/mm/dma-mapping.c | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index ab4f745..a40f03e 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct 
>> device
> *dev)
>>   	__arm_iommu_detach_device(dev);
>>   	arm_iommu_release_mapping(mapping);
>> +	set_dma_ops(dev, NULL);
>>   }
>>   #else
> 
> The subject mentions arch_teardown_dma_ops(), which I think is correct, 
> but
> the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
> 
> However, the situation is perhaps more complex. Note the check at the
> beginning of arch_setup_dma_ops():
> 
> 	/*
> 	 * Don't override the dma_ops if they have already been set. Ideally
> 	 * this should be the only location where dma_ops are set, remove this
> 	 * check when all other callers of set_dma_ops will have disappeared.
> 	 */
> 	if (dev->dma_ops)
> 		return;
> 
> If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
> arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will 
> override
> them. To be safe you should only set them to NULL if they have been set 
> by
> arch_setup_dma_ops(). More than that, arch_teardown_dma_ops() should 
> probably
> not call arm_teardown_iommu_dma_ops() at all if the dma_ops were set by
> arm_iommu_attach_device() and not arch_teardown_dma_ops(). One option 
> would be
> to add a field to struct dev_archdata to store that information. To 
> avoid
> growing the structure, which is embedded in every struct device, you 
> could
> possibly turn the dma_coherent bool into a bitfield.
> 
> @@ -19,7 +19,8 @@ struct dev_archdata {
>  #ifdef CONFIG_XEN
>  	const struct dma_map_ops *dev_dma_ops;
>  #endif
> -	bool dma_coherent;
> +	bool dma_coherent:1;
> +	bool dma_ops_setup:1;
>  };
> 
>  struct omap_device;
> 
> I haven't checked, however, whether the dma_coherent field would need 
> to be
> accessed atomically, so this might be a bad idea.
> 
> Last but not least, a fix must be merged in v4.12, and the sooner the 
> better.
> 

ho, yet another combination. This seems to be a problem with 
exynos_iommu,
ipmmu-vmsa, mtk_iommu_v1 which calls the arm_iommu_attach_device with 
its
own custom mapping. They are calling arm_iommu_attach_device from the
add_device callback and that is not always replayed when the reprobe 
happens
and these archs are storing the old mapping data in private structures 
which
might not be cleared in the teardown path. I will post the fix that you 
have
suggested.

Regards,
  Sricharan


>> > I might be missing something obvious, but I don't see how this can
>> > work.
>> >
>> >>>>> This can be fixed by either:
>> >>>>>   - Disabling CONFIG_IPMMU_VMSA, or
>> >>>>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
>> >>>>>
>> >>>>> Note that this was a bit hard to investigate, as R-Car Gen3 support
>> >>>>> wasn't upstreamed yet, so bisection pointed to a merge commit.
Robin Murphy May 16, 2017, 2:04 p.m. UTC | #10
Hi Laurent,

On 16/05/17 08:17, Laurent Pinchart wrote:
> Hi Sricharan,
> 
> On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
>> On 2017-05-16 03:04, Laurent Pinchart wrote:
>>> On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
>>>> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
>>>>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
>>>>>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
>>>>>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
>>>>>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
>>>>>>>>
>>>>>>>> Failures to look up an IOMMU when parsing the DT iommus property
>>>>>>>> need to be handled separately from the .of_xlate() failures to
>>>>>>>> support deferred probing.
>>>>>>>>
>>>>>>>> The lack of a registered IOMMU can be caused by the lack of a driver
>>>>>>>> for the IOMMU, the IOMMU device probe not having been performed yet,
>>>>>>>> having been deferred, or having failed.
>>>>>>>>
>>>>>>>> The first case occurs when the device tree describes the bus master
>>>>>>>> and IOMMU topology correctly but no device driver exists for the
>>>>>>>> IOMMU yet or the device driver has not been compiled in. Return NULL,
>>>>>>>> the caller will configure the device without an IOMMU.
>>>>>>>>
>>>>>>>> The second and third cases are handled by deferring the probe of the
>>>>>>>> bus master device which will eventually get reprobed after the
>>>>>>>> IOMMU.
>>>>>>>>
>>>>>>>> The last case is currently handled by deferring the probe of the bus
>>>>>>>> master device as well. A mechanism to either configure the bus
>>>>>>>> master device without an IOMMU or to fail the bus master device probe
>>>>>>>> depending on whether the IOMMU is optional or mandatory would be a
>>>>>>>> good enhancement.
>>>>>>>>
>>>>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
>>>>>>>> Signed-off-by: Laurent Pichart
>>>>>>>> <laurent.pinchart+renesas@ideasonboard.com>
>>>>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>>>>>
>>>>>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
>>>>>>> As the IOMMU nodes in DT are not yet enabled, all devices having
>>>>>>> iommus properties in DT now fail to probe.
>>>>>>
>>>>>> How exactly do they fail to probe? Per d7b0558230e4, if there are no
>>>>>> ops registered then they should merely defer until we reach the point
>>>>>> of giving up and ignoring the IOMMU. Is it just that you have no other
>>>>>> late-probing drivers or post-init module loads to kick the deferred
>>>>>> queue after that point? I did try to find a way to explicitly kick it
>>>>>> from a suitably late initcall, but there didn't seem to be any obvious
>>>>>> public interface - anyone have any suggestions?
>>>>>>
>>>>>> I think that's more of a general problem with the probe deferral
>>>>>> mechanism itself (I've seen the same thing happen with some of the
>>>>>> CoreSight stuff on Juno due to the number of inter-component
>>>>>> dependencies) rather than any specific fault of this series.
>>>>>
>>>>> I was thinking of an additional check like below to avoid the
>>>>> situation ?
>>>>>
>>>>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00 2001
>>>>> From: Sricharan R <sricharan@codeaurora.org>
>>>>> Date: Wed, 3 May 2017 13:16:59 +0530
>>>>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
>>>>>
>>>>> While returning EPROBE_DEFER for iommu masters
>>>>> take in to account of iommu nodes that could be
>>>>> marked in DT as 'status=disabled', in which case
>>>>> simply return NULL and let the master's probe
>>>>> continue rather than deferring.
>>>>>
>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>>> ---
>>>>>
>>>>>  drivers/iommu/of_iommu.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>> index 9f44ee8..e6e9bec 100644
>>>>> --- a/drivers/iommu/of_iommu.c
>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct
>>>>> device_node *np)
>>>>>
>>>>>         ops = iommu_ops_from_fwnode(fwnode);
>>>>>         if ((ops && !ops->of_xlate) ||
>>>>> +           !of_device_is_available(iommu_spec->np) ||
>>>>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
>>>>>                 return NULL;
>>>>
>>>> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
>>>> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found
>>>> by iommu_ops_from_fwnode(). Furthermore, it doesn't IOMMU_OF_DECLARE(),
>>>> and thus will always be considered as absent.
>>>>
>>>> I agree that the ipmmu-vmsa driver needs to be fixed, but it would
>>>> have been nice to check existing IOMMU drivers before merging this patch
>>>> series...
>>>
>>> Please pardon the question, but has this patch series been tested on
>>> ARM32 ?
>>>
>>> When the device is probed the arch_setup_dma_ops() function is called.
>>> It sets the device's dma_ops and the mapping (in
>>> __arm_iommu_attach_device()). If probe is deferred,
>>> arch_teardown_dma_ops() is called which in turn calls
>>> arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
>>> dma_ops. The next time the device is probed, arch_setup_dma_ops() bails
>>> out immediately as the dma_ops are already set, leaving us with a device
>>> bound to IOMMU operations but with no mapping. This oopses later as soon
>>> as the kernel tries to map memory for the device through the IOMMU.
>>
>> Resetting the dma_ops for arm32 was added in this patch [1], which I
>> missed to send in the original series, but now have added to Russell's patch
>> tracking system.
> 
> Thank you. I fear that won't be enough though.
> 
>> [1] https://patchwork.kernel.org/patch/9434105/
> 
> Quoting the patch:
> 
>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
>> ,dma_ops should be cleared in the teardown path. Otherwise
>> this causes problem when the probe of device is retried after
>> being deferred. The device's iommu structures are cleared
>> after EPROBEDEFER error, but on the next try dma_ops will still
>> be set to old value, which is not right.
>>
>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   arch/arm/mm/dma-mapping.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index ab4f745..a40f03e 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct device 
> *dev)
>>   	__arm_iommu_detach_device(dev);
>>   	arm_iommu_release_mapping(mapping);
>> +	set_dma_ops(dev, NULL);
>>   }
>>   #else
> 
> The subject mentions arch_teardown_dma_ops(), which I think is correct, but 
> the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
> 
> However, the situation is perhaps more complex. Note the check at the 
> beginning of arch_setup_dma_ops():
> 
> 	/*
> 	 * Don't override the dma_ops if they have already been set. Ideally
> 	 * this should be the only location where dma_ops are set, remove this
> 	 * check when all other callers of set_dma_ops will have disappeared.
> 	 */
> 	if (dev->dma_ops)
> 		return;
> 
> If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or 
> arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will override 
> them. To be safe you should only set them to NULL if they have been set by 
> arch_setup_dma_ops(). More than that, arch_teardown_dma_ops() should probably 
> not call arm_teardown_iommu_dma_ops() at all if the dma_ops were set by 
> arm_iommu_attach_device() and not arch_teardown_dma_ops().

Under what circumstances is that an issue? We'll only be tearing down
the DMA ops when unbinding the driver, and I think it would be erroneous
to expect the device to retain much state after that. Everything else
would be set up from scratch again if it get reprobed later, so why not
the DMA ops?

Robin.

> One option would be 
> to add a field to struct dev_archdata to store that information. To avoid 
> growing the structure, which is embedded in every struct device, you could 
> possibly turn the dma_coherent bool into a bitfield.
> 
> @@ -19,7 +19,8 @@ struct dev_archdata {
>  #ifdef CONFIG_XEN
>  	const struct dma_map_ops *dev_dma_ops;
>  #endif
> -	bool dma_coherent;
> +	bool dma_coherent:1;
> +	bool dma_ops_setup:1;
>  };
>  
>  struct omap_device;
> 
> I haven't checked, however, whether the dma_coherent field would need to be 
> accessed atomically, so this might be a bad idea.
> 
> Last but not least, a fix must be merged in v4.12, and the sooner the better.
> 
>>> I might be missing something obvious, but I don't see how this can
>>> work.
>>>
>>>>>>> This can be fixed by either:
>>>>>>>   - Disabling CONFIG_IPMMU_VMSA, or
>>>>>>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
>>>>>>>
>>>>>>> Note that this was a bit hard to investigate, as R-Car Gen3 support
>>>>>>> wasn't upstreamed yet, so bisection pointed to a merge commit.
>
Laurent Pinchart May 16, 2017, 2:06 p.m. UTC | #11
Hi Sricharan,

On Tuesday 16 May 2017 19:10:03 sricharan@codeaurora.org wrote:
> On 2017-05-16 12:47, Laurent Pinchart wrote:
> > On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> >> On 2017-05-16 03:04, Laurent Pinchart wrote:
> >>> On Monday 15 May 2017 23:37:16 Laurent Pinchart wrote:
> >>>> On Wednesday 03 May 2017 15:54:59 Sricharan R wrote:
> >>>>> On 5/3/2017 3:24 PM, Robin Murphy wrote:
> >>>>>> On 02/05/17 19:35, Geert Uytterhoeven wrote:
> >>>>>>> On Fri, Feb 3, 2017 at 4:48 PM, Sricharan R wrote:
> >>>>>>>> From: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> >>>>>>>> 
> >>>>>>>> Failures to look up an IOMMU when parsing the DT iommus property
> >>>>>>>> need to be handled separately from the .of_xlate() failures to
> >>>>>>>> support deferred probing.
> >>>>>>>> 
> >>>>>>>> The lack of a registered IOMMU can be caused by the lack of a
> >>>>>>>> driver for the IOMMU, the IOMMU device probe not having been
> >>>>>>>> performed yet, having been deferred, or having failed.
> >>>>>>>> 
> >>>>>>>> The first case occurs when the device tree describes the bus
> >>>>>>>> master and IOMMU topology correctly but no device driver exists for
> >>>>>>>> the IOMMU yet or the device driver has not been compiled in. Return
> >>>>>>>> NULL, the caller will configure the device without an IOMMU.
> >>>>>>>> 
> >>>>>>>> The second and third cases are handled by deferring the probe of
> >>>>>>>> the bus master device which will eventually get reprobed after the
> >>>>>>>> IOMMU.
> >>>>>>>> 
> >>>>>>>> The last case is currently handled by deferring the probe of the
> >>>>>>>> bus master device as well. A mechanism to either configure the bus
> >>>>>>>> master device without an IOMMU or to fail the bus master device
> >>>>>>>> probe depending on whether the IOMMU is optional or mandatory would
> >>>>>>>> be a good enhancement.
> >>>>>>>> 
> >>>>>>>> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
> >>>>>>>> Signed-off-by: Laurent Pichart
> >>>>>>>> <laurent.pinchart+renesas@ideasonboard.com>
> >>>>>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>>>>>> 
> >>>>>>> This patch broke Renesas R-Car Gen3 platforms in renesas-drivers.
> >>>>>>> As the IOMMU nodes in DT are not yet enabled, all devices having
> >>>>>>> iommus properties in DT now fail to probe.
> >>>>>> 
> >>>>>> How exactly do they fail to probe? Per d7b0558230e4, if there are no
> >>>>>> ops registered then they should merely defer until we reach the
> >>>>>> point of giving up and ignoring the IOMMU. Is it just that you have
> >>>>>> no other late-probing drivers or post-init module loads to kick the
> >>>>>> deferred queue after that point? I did try to find a way to
> >>>>>> explicitly kick it from a suitably late initcall, but there didn't
> >>>>>> seem to be any obvious public interface - anyone have any
> >>>>>> suggestions?
> >>>>>> 
> >>>>>> I think that's more of a general problem with the probe deferral
> >>>>>> mechanism itself (I've seen the same thing happen with some of the
> >>>>>> CoreSight stuff on Juno due to the number of inter-component
> >>>>>> dependencies) rather than any specific fault of this series.
> >>>>> 
> >>>>> I was thinking of an additional check like below to avoid the
> >>>>> situation ?
> >>>>> 
> >>>>> From 499b6e662f60f23740b8880882b0a16f16434501 Mon Sep 17 00:00:00
> >>>>> 2001
> >>>>> From: Sricharan R <sricharan@codeaurora.org>
> >>>>> Date: Wed, 3 May 2017 13:16:59 +0530
> >>>>> Subject: [PATCH] iommu: of: Fix check for returning EPROBE_DEFER
> >>>>> 
> >>>>> While returning EPROBE_DEFER for iommu masters
> >>>>> take in to account of iommu nodes that could be
> >>>>> marked in DT as 'status=disabled', in which case
> >>>>> simply return NULL and let the master's probe
> >>>>> continue rather than deferring.
> >>>>> 
> >>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>>>> ---
> >>>>> 
> >>>>>  drivers/iommu/of_iommu.c | 1 +
> >>>>>  1 file changed, 1 insertion(+)
> >>>>> 
> >>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> >>>>> index 9f44ee8..e6e9bec 100644
> >>>>> --- a/drivers/iommu/of_iommu.c
> >>>>> +++ b/drivers/iommu/of_iommu.c
> >>>>> @@ -118,6 +118,7 @@ static bool of_iommu_driver_present(struct
> >>>>> device_node *np)
> >>>>> 
> >>>>>         ops = iommu_ops_from_fwnode(fwnode);
> >>>>>         if ((ops && !ops->of_xlate) ||
> >>>>> +           !of_device_is_available(iommu_spec->np) ||
> >>>>>             (!ops && !of_iommu_driver_present(iommu_spec->np)))
> >>>>>                 return NULL;
> >>>> 
> >>>> This looks good to me, but won't be enough. The ipmmu-vmsa driver in
> >>>> v4.12-rc1 doesn't call iommu_device_register() and thus won't be found
> >>>> by iommu_ops_from_fwnode(). Furthermore, it doesn't
> >>>> IOMMU_OF_DECLARE(),
> >>>> and thus will always be considered as absent.
> >>>> 
> >>>> I agree that the ipmmu-vmsa driver needs to be fixed, but it would
> >>>> have been nice to check existing IOMMU drivers before merging this
> >>>> patch series...
> >>> 
> >>> Please pardon the question, but has this patch series been tested on
> >>> ARM32 ?
> >>> 
> >>> When the device is probed the arch_setup_dma_ops() function is called.
> >>> It sets the device's dma_ops and the mapping (in
> >>> __arm_iommu_attach_device()). If probe is deferred,
> >>> arch_teardown_dma_ops() is called which in turn calls
> >>> arch_teardown_dma_ops(). This removes the mapping but doesn't touch the
> >>> dma_ops. The next time the device is probed, arch_setup_dma_ops() bails
> >>> out immediately as the dma_ops are already set, leaving us with a
> >>> device bound to IOMMU operations but with no mapping. This oopses later
> >>> as soon as the kernel tries to map memory for the device through the
> >>> IOMMU.
> >> 
> >> Resetting the dma_ops for arm32 was added in this patch [1], which I
> >> missed to send in the original series, but now have added to Russell's
> >> patch tracking system.
> > 
> > Thank you. I fear that won't be enough though.
> > 
> >> [1] https://patchwork.kernel.org/patch/9434105/
> > 
> > Quoting the patch:
> >
> >> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
> >> ,dma_ops should be cleared in the teardown path. Otherwise
> >> this causes problem when the probe of device is retried after
> >> being deferred. The device's iommu structures are cleared
> >> after EPROBEDEFER error, but on the next try dma_ops will still
> >> be set to old value, which is not right.
> >> 
> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> >> ---
> >> 
> >>   arch/arm/mm/dma-mapping.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >> index ab4f745..a40f03e 100644
> >> --- a/arch/arm/mm/dma-mapping.c
> >> +++ b/arch/arm/mm/dma-mapping.c
> >> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct
> >> device *dev)
> > 
> >>   	__arm_iommu_detach_device(dev);
> >>   	arm_iommu_release_mapping(mapping);
> >> +	set_dma_ops(dev, NULL);
> >>   }
> >>   #else
> > 
> > The subject mentions arch_teardown_dma_ops(), which I think is correct,
> > but the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
> > 
> > However, the situation is perhaps more complex. Note the check at the
> > 
> > beginning of arch_setup_dma_ops():
> > 	/*
> > 	 * Don't override the dma_ops if they have already been set. Ideally
> > 	 * this should be the only location where dma_ops are set, remove this
> > 	 * check when all other callers of set_dma_ops will have disappeared.
> > 	 */
> > 	if (dev->dma_ops)
> > 		return;
> > 
> > If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
> > arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will
> > override them. To be safe you should only set them to NULL if they have
> > been set by arch_setup_dma_ops(). More than that, arch_teardown_dma_ops()
> > should probably not call arm_teardown_iommu_dma_ops() at all if the
> > dma_ops were set by arm_iommu_attach_device() and not
> > arch_teardown_dma_ops(). One option would be to add a field to struct
> > dev_archdata to store that information. To avoid growing the structure,
> > which is embedded in every struct device, you could possibly turn the
> > dma_coherent bool into a bitfield.
> > 
> > @@ -19,7 +19,8 @@ struct dev_archdata {
> >  #ifdef CONFIG_XEN
> >  	const struct dma_map_ops *dev_dma_ops;
> >  #endif
> > -	bool dma_coherent;
> > +	bool dma_coherent:1;
> > +	bool dma_ops_setup:1;
> >  };
> >  
> >  struct omap_device;
> > 
> > I haven't checked, however, whether the dma_coherent field would need
> > to be accessed atomically, so this might be a bad idea.
> > 
> > Last but not least, a fix must be merged in v4.12, and the sooner the
> > better.
> 
> ho, yet another combination. This seems to be a problem with exynos_iommu,
> ipmmu-vmsa, mtk_iommu_v1 which calls the arm_iommu_attach_device with its
> own custom mapping. They are calling arm_iommu_attach_device from the
> add_device callback and that is not always replayed when the reprobe happens
> and these archs are storing the old mapping data in private structures which
> might not be cleared in the teardown path.

Yes, I know, it's messy :-/ There's a handful of non-IOMMU drivers calling 
arm_iommu_attach_device() directly too. All these should be fixed, but in the 
meantime, let's try not to break them.

> I will post the fix that you have suggested.

Thank you. You might want to use an unsigned int bitfield instead of a bool 
bitfield as Sakari suggested. It would be nice to check the code setting the 
dma_coherent field to make sure there will be no race with code setting the 
new dma_ops_setup field (which might not be the best name, feel free to rename 
it).

I have successfully test the patch, let me know if there's anything else I can 
do to help.

> >>> I might be missing something obvious, but I don't see how this can
> >>> work.
> >>> 
> >>>>>>> This can be fixed by either:
> >>>>>>>   - Disabling CONFIG_IPMMU_VMSA, or
> >>>>>>>   - Reverting commit 7b07cbefb68d486f (but keeping "int ret = 0;").
> >>>>>>> 
> >>>>>>> Note that this was a bit hard to investigate, as R-Car Gen3 support
> >>>>>>> wasn't upstreamed yet, so bisection pointed to a merge commit.
Laurent Pinchart May 16, 2017, 2:10 p.m. UTC | #12
Hi Robin,

On Tuesday 16 May 2017 15:04:55 Robin Murphy wrote:
> On 16/05/17 08:17, Laurent Pinchart wrote:
> > On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:

[snip]

> >> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
> >> ,dma_ops should be cleared in the teardown path. Otherwise
> >> this causes problem when the probe of device is retried after
> >> being deferred. The device's iommu structures are cleared
> >> after EPROBEDEFER error, but on the next try dma_ops will still
> >> be set to old value, which is not right.
> >> 
> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> >> ---
> >> 
> >>   arch/arm/mm/dma-mapping.c | 1 +
> >>   1 file changed, 1 insertion(+)
> >> 
> >> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >> index ab4f745..a40f03e 100644
> >> --- a/arch/arm/mm/dma-mapping.c
> >> +++ b/arch/arm/mm/dma-mapping.c
> >> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct
> >> device *dev)
> > 
> >>   	__arm_iommu_detach_device(dev);
> >>   	arm_iommu_release_mapping(mapping);
> >> +	set_dma_ops(dev, NULL);
> >>   }
> >>   #else
> > 
> > The subject mentions arch_teardown_dma_ops(), which I think is correct,
> > but the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
> > 
> > However, the situation is perhaps more complex. Note the check at the
> > beginning of arch_setup_dma_ops():
> > 	/*
> > 	 * Don't override the dma_ops if they have already been set. Ideally
> > 	 * this should be the only location where dma_ops are set, remove this
> > 	 * check when all other callers of set_dma_ops will have disappeared.
> > 	 */
> > 	if (dev->dma_ops)
> > 		return;
> > 
> > If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
> > arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will
> > override them. To be safe you should only set them to NULL if they have
> > been set by arch_setup_dma_ops(). More than that, arch_teardown_dma_ops()
> > should probably not call arm_teardown_iommu_dma_ops() at all if the
> > dma_ops were set by arm_iommu_attach_device() and not
> > arch_teardown_dma_ops().
> 
> Under what circumstances is that an issue? We'll only be tearing down
> the DMA ops when unbinding the driver,

Or when deferring probe.

> and I think it would be erroneous to expect the device to retain much state
> after that. Everything else would be set up from scratch again if it get
> reprobed later, so why not the DMA ops?

Because the DMA ops might have been set elsewhere than arch_setup_dma_ops(). 
If you look at the patch that added the above warning, its commit message 
states

commit 26b37b946a5c2658dbc37dd5d6df40aaa9685d70 (iommu-joerg/arm/core)
Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Date:   Fri May 15 02:00:02 2015 +0300

    arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
    
    The arch_setup_dma_ops() function is in charge of setting dma_ops with a
    call to set_dma_ops(). set_dma_ops() is also called from
    
    - highbank and mvebu bus notifiers
    - dmabounce (to be replaced with swiotlb)
    - arm_iommu_attach_device
    
    (arm_iommu_attach_device is itself called from IOMMU and bus master
    device drivers)
    
    To allow the arch_setup_dma_ops() call to be moved from device add time
    to device probe time we must ensure that dma_ops already setup by any of
    the above callers will not be overriden.
    
    Aftering replacing dmabounce with swiotlb, converting IOMMU drivers to
    of_xlate and taking care of highbank and mvebu, the workaround should be
    removed.

I'm concerned about potentially breaking these if we unconditionally remove 
the DMA ops and mapping.
Sricharan Ramabadhran May 16, 2017, 2:29 p.m. UTC | #13
Hi,

On 2017-05-16 19:40, Laurent Pinchart wrote:
> Hi Robin,
> 
> On Tuesday 16 May 2017 15:04:55 Robin Murphy wrote:
>> On 16/05/17 08:17, Laurent Pinchart wrote:
>> > On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> 
> [snip]
> 
>> >> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
>> >> ,dma_ops should be cleared in the teardown path. Otherwise
>> >> this causes problem when the probe of device is retried after
>> >> being deferred. The device's iommu structures are cleared
>> >> after EPROBEDEFER error, but on the next try dma_ops will still
>> >> be set to old value, which is not right.
>> >>
>> >> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>> >> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>> >> ---
>> >>
>> >>   arch/arm/mm/dma-mapping.c | 1 +
>> >>   1 file changed, 1 insertion(+)
>> >>
>> >> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> >> index ab4f745..a40f03e 100644
>> >> --- a/arch/arm/mm/dma-mapping.c
>> >> +++ b/arch/arm/mm/dma-mapping.c
>> >> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct
>> >> device *dev)
>> >
>> >>   	__arm_iommu_detach_device(dev);
>> >>   	arm_iommu_release_mapping(mapping);
>> >> +	set_dma_ops(dev, NULL);
>> >>   }
>> >>   #else
>> >
>> > The subject mentions arch_teardown_dma_ops(), which I think is correct,
>> > but the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
>> >
>> > However, the situation is perhaps more complex. Note the check at the
>> > beginning of arch_setup_dma_ops():
>> > 	/*
>> > 	 * Don't override the dma_ops if they have already been set. Ideally
>> > 	 * this should be the only location where dma_ops are set, remove this
>> > 	 * check when all other callers of set_dma_ops will have disappeared.
>> > 	 */
>> > 	if (dev->dma_ops)
>> > 		return;
>> >
>> > If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
>> > arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will
>> > override them. To be safe you should only set them to NULL if they have
>> > been set by arch_setup_dma_ops(). More than that, arch_teardown_dma_ops()
>> > should probably not call arm_teardown_iommu_dma_ops() at all if the
>> > dma_ops were set by arm_iommu_attach_device() and not
>> > arch_teardown_dma_ops().
>> 
>> Under what circumstances is that an issue? We'll only be tearing down
>> the DMA ops when unbinding the driver,
> 
> Or when deferring probe.
> 
>> and I think it would be erroneous to expect the device to retain much 
>> state
>> after that. Everything else would be set up from scratch again if it 
>> get
>> reprobed later, so why not the DMA ops?
> 
> Because the DMA ops might have been set elsewhere than 
> arch_setup_dma_ops().
> If you look at the patch that added the above warning, its commit 
> message
> states
> 
> commit 26b37b946a5c2658dbc37dd5d6df40aaa9685d70 (iommu-joerg/arm/core)
> Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> Date:   Fri May 15 02:00:02 2015 +0300
> 
>     arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
> 
>     The arch_setup_dma_ops() function is in charge of setting dma_ops 
> with a
>     call to set_dma_ops(). set_dma_ops() is also called from
> 
>     - highbank and mvebu bus notifiers
>     - dmabounce (to be replaced with swiotlb)
>     - arm_iommu_attach_device
> 
>     (arm_iommu_attach_device is itself called from IOMMU and bus master
>     device drivers)
> 
>     To allow the arch_setup_dma_ops() call to be moved from device add 
> time
>     to device probe time we must ensure that dma_ops already setup by 
> any of
>     the above callers will not be overriden.
> 
>     Aftering replacing dmabounce with swiotlb, converting IOMMU drivers 
> to
>     of_xlate and taking care of highbank and mvebu, the workaround 
> should be
>     removed.
> 
> I'm concerned about potentially breaking these if we unconditionally 
> remove
> the DMA ops and mapping.

arch_teardown_dma_ops does nothing if there is
no mapping (not behind iommu), dma_ops without iommu is ok.
But when the arm_iommu_create_mapping/arm_iommu_attach_device
was called previously in the iommu driver, after we teardown,
that path in the iommu driver which called those functions is not
replayed.

Regards,
Laurent Pinchart May 16, 2017, 2:46 p.m. UTC | #14
Hi Sricharan,

On Tuesday 16 May 2017 19:59:01 sricharan@codeaurora.org wrote:
> On 2017-05-16 19:40, Laurent Pinchart wrote:
> > On Tuesday 16 May 2017 15:04:55 Robin Murphy wrote:
> >> On 16/05/17 08:17, Laurent Pinchart wrote:
> >> > On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> > [snip]
> > 
> >>>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops(),
> >>>> dma_ops should be cleared in the teardown path. Otherwise
> >>>> this causes problem when the probe of device is retried after
> >>>> being deferred. The device's iommu structures are cleared
> >>>> after EPROBEDEFER error, but on the next try dma_ops will still
> >>>> be set to old value, which is not right.
> >>>> 
> >>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
> >>>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
> >>>> ---
> >>>> 
> >>>>   arch/arm/mm/dma-mapping.c | 1 +
> >>>>   1 file changed, 1 insertion(+)
> >>>> 
> >>>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
> >>>> index ab4f745..a40f03e 100644
> >>>> --- a/arch/arm/mm/dma-mapping.c
> >>>> +++ b/arch/arm/mm/dma-mapping.c
> >>>> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct
> >>>> device *dev)
> >>>>
> >>>>   	__arm_iommu_detach_device(dev);
> >>>>   	arm_iommu_release_mapping(mapping);
> >>>> +	set_dma_ops(dev, NULL);
> >>>>   }
> >>>>   #else
> >>> 
> >>> The subject mentions arch_teardown_dma_ops(), which I think is correct,
> >>> but the patch adds the set_dma_ops() call to
> >>> arm_teardown_iommu_dma_ops().
> >>> 
> >>> However, the situation is perhaps more complex. Note the check at the
> >>> beginning of arch_setup_dma_ops():
> >>> 
> >>> 	/*
> >>> 	 * Don't override the dma_ops if they have already been set. Ideally
> >>> 	 * this should be the only location where dma_ops are set, remove this
> >>> 	 * check when all other callers of set_dma_ops will have disappeared.
> >>> 	 */
> >>> 	if (dev->dma_ops)
> >>> 		return;
> >>> 
> >>> If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
> >>> arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will
> >>> override them. To be safe you should only set them to NULL if they have
> >>> been set by arch_setup_dma_ops(). More than that,
> >>> arch_teardown_dma_ops()
> >>> should probably not call arm_teardown_iommu_dma_ops() at all if the
> >>> dma_ops were set by arm_iommu_attach_device() and not
> >>> arch_teardown_dma_ops().
> >> 
> >> Under what circumstances is that an issue? We'll only be tearing down
> >> the DMA ops when unbinding the driver,
> > 
> > Or when deferring probe.
> > 
> >> and I think it would be erroneous to expect the device to retain much
> >> state after that. Everything else would be set up from scratch again if
> >> it get reprobed later, so why not the DMA ops?
> > 
> > Because the DMA ops might have been set elsewhere than
> > arch_setup_dma_ops(). If you look at the patch that added the above
> > warning, its commit message states
> > 
> > commit 26b37b946a5c2658dbc37dd5d6df40aaa9685d70 (iommu-joerg/arm/core)
> > Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> > Date:   Fri May 15 02:00:02 2015 +0300
> > 
> >     arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
> >     
> >     The arch_setup_dma_ops() function is in charge of setting dma_ops
> >     with a call to set_dma_ops(). set_dma_ops() is also called from
> >     
> >     - highbank and mvebu bus notifiers
> >     - dmabounce (to be replaced with swiotlb)
> >     - arm_iommu_attach_device
> >     
> >     (arm_iommu_attach_device is itself called from IOMMU and bus master
> >     device drivers)
> >     
> >     To allow the arch_setup_dma_ops() call to be moved from device add
> >     time to device probe time we must ensure that dma_ops already setup by
> >     any of the above callers will not be overriden.
> >     
> >     Aftering replacing dmabounce with swiotlb, converting IOMMU drivers
> >     to of_xlate and taking care of highbank and mvebu, the workaround
> >     should be removed.
> > 
> > I'm concerned about potentially breaking these if we unconditionally
> > remove the DMA ops and mapping.
> 
> arch_teardown_dma_ops does nothing if there is no mapping (not behind
> iommu), dma_ops without iommu is ok. But when the
> arm_iommu_create_mapping/arm_iommu_attach_device was called previously in
> the iommu driver, after we teardown, that path in the iommu driver which
> called those functions is not replayed.

I've had a look at the code in more details, and I'm not sure how we've 
reached the current situation (as I haven't followed the multiple versions of 
this patch series due to lack of time), but it's a very big mess.

arch_setup_dma_ops() is currently not the only way to create a mapping and 
attach it to a device. Not only that, but with the current ARM32 dma-mapping 
implementation, it can't be.

arch_setup_dma_ops() will create a separate mapping for every device. That's 
certainly fine when every device has its own IOMMU instance (or at least its 
own TLB in a shared IOMMU), but that's far from being true in all cases. 
Renesas R-Car Gen2 hardware, for instance, share IOMMUs and TLBs between 
multiple devices. That's why the ipmmu-vmsa driver creates the mapping 
manually with arm_iommu_create_mapping() and attaches it to the device with 
arm_iommu_attach_device(). We need to keep supporting this mechanism until the 
ARM32 dma-mapping API is brought in line with the ARM64 implementation that 
lets the IOMMU core manage the IOVA address space.

We're unfortunately far from that, so, for the time being, we need to keep 
supporting arm_iommu_attach_device(). How that can work with deferred probing, 
I will let you figure it out, but we need to at least fix the breakage for 
v4.12.
Robin Murphy May 16, 2017, 2:52 p.m. UTC | #15
On 16/05/17 15:10, Laurent Pinchart wrote:
> Hi Robin,
> 
> On Tuesday 16 May 2017 15:04:55 Robin Murphy wrote:
>> On 16/05/17 08:17, Laurent Pinchart wrote:
>>> On Tuesday 16 May 2017 07:53:57 sricharan@codeaurora.org wrote:
> 
> [snip]
> 
>>>> arch_teardown_dma_ops() being the inverse of arch_setup_dma_ops()
>>>> ,dma_ops should be cleared in the teardown path. Otherwise
>>>> this causes problem when the probe of device is retried after
>>>> being deferred. The device's iommu structures are cleared
>>>> after EPROBEDEFER error, but on the next try dma_ops will still
>>>> be set to old value, which is not right.
>>>>
>>>> Signed-off-by: Sricharan R <sricharan@codeaurora.org>
>>>> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
>>>> ---
>>>>
>>>>   arch/arm/mm/dma-mapping.c | 1 +
>>>>   1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>>>> index ab4f745..a40f03e 100644
>>>> --- a/arch/arm/mm/dma-mapping.c
>>>> +++ b/arch/arm/mm/dma-mapping.c
>>>> @@ -2358,6 +2358,7 @@ static void arm_teardown_iommu_dma_ops(struct
>>>> device *dev)
>>>
>>>>   	__arm_iommu_detach_device(dev);
>>>>   	arm_iommu_release_mapping(mapping);
>>>> +	set_dma_ops(dev, NULL);
>>>>   }
>>>>   #else
>>>
>>> The subject mentions arch_teardown_dma_ops(), which I think is correct,
>>> but the patch adds the set_dma_ops() call to arm_teardown_iommu_dma_ops().
>>>
>>> However, the situation is perhaps more complex. Note the check at the
>>> beginning of arch_setup_dma_ops():
>>> 	/*
>>> 	 * Don't override the dma_ops if they have already been set. Ideally
>>> 	 * this should be the only location where dma_ops are set, remove this
>>> 	 * check when all other callers of set_dma_ops will have disappeared.
>>> 	 */
>>> 	if (dev->dma_ops)
>>> 		return;
>>>
>>> If you set the dma_ops to NULL in arm_teardown_iommu_dma_ops() or
>>> arch_teardown_dma_ops(), the next call to arch_setup_dma_ops() will
>>> override them. To be safe you should only set them to NULL if they have
>>> been set by arch_setup_dma_ops(). More than that, arch_teardown_dma_ops()
>>> should probably not call arm_teardown_iommu_dma_ops() at all if the
>>> dma_ops were set by arm_iommu_attach_device() and not
>>> arch_teardown_dma_ops().
>>
>> Under what circumstances is that an issue? We'll only be tearing down
>> the DMA ops when unbinding the driver,
> 
> Or when deferring probe.
> 
>> and I think it would be erroneous to expect the device to retain much state
>> after that. Everything else would be set up from scratch again if it get
>> reprobed later, so why not the DMA ops?
> 
> Because the DMA ops might have been set elsewhere than arch_setup_dma_ops(). 
> If you look at the patch that added the above warning, its commit message 
> states
> 
> commit 26b37b946a5c2658dbc37dd5d6df40aaa9685d70 (iommu-joerg/arm/core)
> Author: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
> Date:   Fri May 15 02:00:02 2015 +0300
> 
>     arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
>     
>     The arch_setup_dma_ops() function is in charge of setting dma_ops with a
>     call to set_dma_ops(). set_dma_ops() is also called from
>     
>     - highbank and mvebu bus notifiers
>     - dmabounce (to be replaced with swiotlb)
>     - arm_iommu_attach_device
>     
>     (arm_iommu_attach_device is itself called from IOMMU and bus master
>     device drivers)
>     
>     To allow the arch_setup_dma_ops() call to be moved from device add time
>     to device probe time we must ensure that dma_ops already setup by any of
>     the above callers will not be overriden.
>     
>     Aftering replacing dmabounce with swiotlb, converting IOMMU drivers to
>     of_xlate and taking care of highbank and mvebu, the workaround should be
>     removed.
> 
> I'm concerned about potentially breaking these if we unconditionally remove 
> the DMA ops and mapping.

Ah, sorry, I see now - it was taking a long time to page the 32-bit code
back in, and I'd forgotten the specifics of the mess of competing
"default domain" notions. Indeed, it's not the device's driver expecting
any state to be preserved as I got stuck on, it's the IOMMU driver,
which does "know better" to an extent, expecting its changes to the
struct device to stick for the lifetime of that structure.

I agree there shouldn't be a disparity - arch_setup_dma_ops() only does
things given certain circumstances, so arch_teardown_dma_ops() should
only undo them under the same. With probe-deferral in place I'll be
reviving my work to convert this path over to IOMMU API default domains,
which will make some of these issues go away again, but in the meantime
I also agree that the most expedient fix is indeed to add a flag to say
whether the dma ops were automatically set or not (this is implicitly
true on arm64, which was partly what was tripping me up).

Robin.
diff mbox

Patch

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 9f44ee8..e6e9bec 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -118,6 +118,7 @@  static bool of_iommu_driver_present(struct device_node *np)

        ops = iommu_ops_from_fwnode(fwnode);
        if ((ops && !ops->of_xlate) ||
+           !of_device_is_available(iommu_spec->np) ||
            (!ops && !of_iommu_driver_present(iommu_spec->np)))
                return NULL;