diff mbox series

[v2] pinctrl: qcom: Assign irq_disable/eoi conditionally

Message ID 20200309125207.571840-1-linus.walleij@linaro.org
State New
Headers show
Series [v2] pinctrl: qcom: Assign irq_disable/eoi conditionally | expand

Commit Message

Linus Walleij March 9, 2020, 12:52 p.m. UTC
The hierarchical parts of MSM pinctrl/GPIO is only
used when the device tree has a "wakeup-parent" as
a phandle, but the .irq_disable and .irq_eoi are anyway
assigned leading to semantic problems on elder
Qualcomm chipsets.

When the drivers/mfd/qcom-pm8xxx.c driver calls
chained_irq_exit() that call will in turn call chip->irq_eoi()
which is set to irq_chip_eoi_parent() by default on a
hierachical IRQ chip, and the parent is pinctrl-msm.c
so that will in turn unconditionally call
irq_chip_eoi_parent() again, but its parent is invalid
so we get the following crash:

 Unnable to handle kernel NULL pointer dereference at
 virtual address 00000010
 pgd = (ptrval)
 [00000010] *pgd=00000000
 Internal error: Oops: 5 [#1] PREEMPT SMP ARM
 (...)
 PC is at irq_chip_eoi_parent+0x4/0x10
 LR is at pm8xxx_irq_handler+0x1b4/0x2d8

If we solve this crash by avoiding to call up to
irq_chip_eoi_parent(), the machine will hang and get
reset by the watchdog, because of semantic issues,
probably inside irq_chip.

As a solution, just assign the .irq_disable and .irq_eoi
condtionally if we are actually using a wakeup parent.

Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Lina Iyer <ilina@codeaurora.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: stable@vger.kernel.org
Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
---
ChangeLog v1->v2:
- Noticed that the previous solution doesn't actually work,
  the machine hangs and reboots intead (even if it got rid of
  the most obvious crash). Make a more thorough solution that
  completely avoids using these callbacks if we don't have
  a parent.
- v1 was called "Guard irq_eoi()"
---
 drivers/pinctrl/qcom/pinctrl-msm.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

Comments

Marc Zyngier March 9, 2020, 2:54 p.m. UTC | #1
Hi Linus,

On 2020-03-09 12:52, Linus Walleij wrote:
> The hierarchical parts of MSM pinctrl/GPIO is only
> used when the device tree has a "wakeup-parent" as
> a phandle, but the .irq_disable and .irq_eoi are anyway
> assigned leading to semantic problems on elder
> Qualcomm chipsets.
> 
> When the drivers/mfd/qcom-pm8xxx.c driver calls
> chained_irq_exit() that call will in turn call chip->irq_eoi()
> which is set to irq_chip_eoi_parent() by default on a
> hierachical IRQ chip, and the parent is pinctrl-msm.c
> so that will in turn unconditionally call
> irq_chip_eoi_parent() again, but its parent is invalid
> so we get the following crash:
> 
>  Unnable to handle kernel NULL pointer dereference at
>  virtual address 00000010
>  pgd = (ptrval)
>  [00000010] *pgd=00000000
>  Internal error: Oops: 5 [#1] PREEMPT SMP ARM
>  (...)
>  PC is at irq_chip_eoi_parent+0x4/0x10
>  LR is at pm8xxx_irq_handler+0x1b4/0x2d8
> 
> If we solve this crash by avoiding to call up to
> irq_chip_eoi_parent(), the machine will hang and get
> reset by the watchdog, because of semantic issues,
> probably inside irq_chip.
> 
> As a solution, just assign the .irq_disable and .irq_eoi
> condtionally if we are actually using a wakeup parent.
> 
> Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
> Cc: Lina Iyer <ilina@codeaurora.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Stephen Boyd <swboyd@chromium.org>
> Cc: stable@vger.kernel.org
> Fixes: e35a6ae0eb3a ("pinctrl/msm: Setup GPIO chip in hierarchy")
> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
> ---
> ChangeLog v1->v2:
> - Noticed that the previous solution doesn't actually work,
>   the machine hangs and reboots intead (even if it got rid of
>   the most obvious crash). Make a more thorough solution that
>   completely avoids using these callbacks if we don't have
>   a parent.

What is the problem with disable exactly?

> - v1 was called "Guard irq_eoi()"
> ---
>  drivers/pinctrl/qcom/pinctrl-msm.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c
> b/drivers/pinctrl/qcom/pinctrl-msm.c
> index 9a8daa256a32..fe3c53ae25f4 100644
> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> @@ -1100,11 +1100,9 @@ static int msm_gpio_init(struct msm_pinctrl 
> *pctrl)
> 
>  	pctrl->irq_chip.name = "msmgpio";
>  	pctrl->irq_chip.irq_enable = msm_gpio_irq_enable;
> -	pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;

I find it really odd to have the enable callback, but not the disable.
What is the rational for that? Can we drop the enable as well for old
platforms and only use mask/unmask instead?

>  	pctrl->irq_chip.irq_mask = msm_gpio_irq_mask;
>  	pctrl->irq_chip.irq_unmask = msm_gpio_irq_unmask;
>  	pctrl->irq_chip.irq_ack = msm_gpio_irq_ack;
> -	pctrl->irq_chip.irq_eoi = irq_chip_eoi_parent;
>  	pctrl->irq_chip.irq_set_type = msm_gpio_irq_set_type;
>  	pctrl->irq_chip.irq_set_wake = msm_gpio_irq_set_wake;
>  	pctrl->irq_chip.irq_request_resources = msm_gpio_irq_reqres;
> @@ -1118,7 +1116,8 @@ static int msm_gpio_init(struct msm_pinctrl 
> *pctrl)
>  		if (!chip->irq.parent_domain)
>  			return -EPROBE_DEFER;
>  		chip->irq.child_to_parent_hwirq = msm_gpio_wakeirq;
> -
> +		pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;
> +		pctrl->irq_chip.irq_eoi = irq_chip_eoi_parent;
>  		/*
>  		 * Let's skip handling the GPIOs, if the parent irqchip
>  		 * is handling the direct connect IRQ of the GPIO.

Thanks,

         M.
Linus Walleij March 9, 2020, 3:03 p.m. UTC | #2
On Mon, Mar 9, 2020 at 3:54 PM Marc Zyngier <maz@kernel.org> wrote:

> On 2020-03-09 12:52, Linus Walleij wrote:

> > ChangeLog v1->v2:
> > - Noticed that the previous solution doesn't actually work,
> >   the machine hangs and reboots intead (even if it got rid of
> >   the most obvious crash). Make a more thorough solution that
> >   completely avoids using these callbacks if we don't have
> >   a parent.
>
> What is the problem with disable exactly?

There is no problem with .irq_disable, the system still works
if I keep that. But since the original patch added these two
callbacks for hierarchical I just moved them both to be
conditional.

The .irq_eoi callback is the culprit.

> >       pctrl->irq_chip.name = "msmgpio";
> >       pctrl->irq_chip.irq_enable = msm_gpio_irq_enable;
> > -     pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;
>
> I find it really odd to have the enable callback, but not the disable.
> What is the rational for that? Can we drop the enable as well for old
> platforms and only use mask/unmask instead?

Hm I'm just working with the regression, and before the
patch I'm fixing the driver actually had just the .irq_enable
callback, so I'm restoring that state.

Would you prefer a patch where I just move the assignment
of the .irq_eoi callback to be conditional?

I have no idea *why* .irq_eoi() locks up the system, I suspect
one of those irqchip internal semantics that are sometimes
not entirely clear.

Yours,
Linus Walleij
Marc Zyngier March 9, 2020, 3:15 p.m. UTC | #3
On 2020-03-09 15:03, Linus Walleij wrote:
> On Mon, Mar 9, 2020 at 3:54 PM Marc Zyngier <maz@kernel.org> wrote:
> 
>> On 2020-03-09 12:52, Linus Walleij wrote:
> 
>> > ChangeLog v1->v2:
>> > - Noticed that the previous solution doesn't actually work,
>> >   the machine hangs and reboots intead (even if it got rid of
>> >   the most obvious crash). Make a more thorough solution that
>> >   completely avoids using these callbacks if we don't have
>> >   a parent.
>> 
>> What is the problem with disable exactly?
> 
> There is no problem with .irq_disable, the system still works
> if I keep that. But since the original patch added these two
> callbacks for hierarchical I just moved them both to be
> conditional.
> 
> The .irq_eoi callback is the culprit.
> 
>> >       pctrl->irq_chip.name = "msmgpio";
>> >       pctrl->irq_chip.irq_enable = msm_gpio_irq_enable;
>> > -     pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;
>> 
>> I find it really odd to have the enable callback, but not the disable.
>> What is the rational for that? Can we drop the enable as well for old
>> platforms and only use mask/unmask instead?
> 
> Hm I'm just working with the regression, and before the
> patch I'm fixing the driver actually had just the .irq_enable
> callback, so I'm restoring that state.
> 
> Would you prefer a patch where I just move the assignment
> of the .irq_eoi callback to be conditional?

I'd rather we have the minimal change that makes your system runnable.
If making irq_eoi depend on some QC magic, fine by me. Having an 
unbalanced
enable/disable setup looks pretty fragile.

> I have no idea *why* .irq_eoi() locks up the system, I suspect
> one of those irqchip internal semantics that are sometimes
> not entirely clear.

I don't think anyone knows what they are outside of the usual QC 
circles.

         M.
diff mbox series

Patch

diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
index 9a8daa256a32..fe3c53ae25f4 100644
--- a/drivers/pinctrl/qcom/pinctrl-msm.c
+++ b/drivers/pinctrl/qcom/pinctrl-msm.c
@@ -1100,11 +1100,9 @@  static int msm_gpio_init(struct msm_pinctrl *pctrl)
 
 	pctrl->irq_chip.name = "msmgpio";
 	pctrl->irq_chip.irq_enable = msm_gpio_irq_enable;
-	pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;
 	pctrl->irq_chip.irq_mask = msm_gpio_irq_mask;
 	pctrl->irq_chip.irq_unmask = msm_gpio_irq_unmask;
 	pctrl->irq_chip.irq_ack = msm_gpio_irq_ack;
-	pctrl->irq_chip.irq_eoi = irq_chip_eoi_parent;
 	pctrl->irq_chip.irq_set_type = msm_gpio_irq_set_type;
 	pctrl->irq_chip.irq_set_wake = msm_gpio_irq_set_wake;
 	pctrl->irq_chip.irq_request_resources = msm_gpio_irq_reqres;
@@ -1118,7 +1116,8 @@  static int msm_gpio_init(struct msm_pinctrl *pctrl)
 		if (!chip->irq.parent_domain)
 			return -EPROBE_DEFER;
 		chip->irq.child_to_parent_hwirq = msm_gpio_wakeirq;
-
+		pctrl->irq_chip.irq_disable = msm_gpio_irq_disable;
+		pctrl->irq_chip.irq_eoi = irq_chip_eoi_parent;
 		/*
 		 * Let's skip handling the GPIOs, if the parent irqchip
 		 * is handling the direct connect IRQ of the GPIO.