pinctrl: mcp23s08: Allocate irq_chip dynamic

Message ID 20190111162516.28197-1-poeschel@lemonage.de
State New
Headers show
Series
  • pinctrl: mcp23s08: Allocate irq_chip dynamic
Related show

Commit Message

Lars Poeschel Jan. 11, 2019, 4:25 p.m.
Keeping the irq_chip definition static shares it with multiple instances
of the mcp23s08 gpiochip in the system. This is bad and now we get this
warning from gpiolib core:

"detected irqchip that is shared with multiple gpiochips: please fix the
driver."

Hence, move the irq_chip definition from being driver static into the
struct mcp23s08. So a unique irq_chip is used for each gpiochip
instance.

Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
---
 drivers/pinctrl/pinctrl-mcp23s08.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

Comments

Linus Walleij Jan. 21, 2019, 1:19 p.m. | #1
On Fri, Jan 11, 2019 at 4:16 PM Lars Poeschel <poeschel@lemonage.de> wrote:

> Keeping the irq_chip definition static shares it with multiple instances
> of the mcp23s08 gpiochip in the system. This is bad and now we get this
> warning from gpiolib core:
>
> "detected irqchip that is shared with multiple gpiochips: please fix the
> driver."
>
> Hence, move the irq_chip definition from being driver static into the
> struct mcp23s08. So a unique irq_chip is used for each gpiochip
> instance.
>
> Signed-off-by: Lars Poeschel <poeschel@lemonage.de>

Patch applied.

Yours,
Linus Walleij
Jan Kundrát March 5, 2019, 3:19 p.m. | #2
On pondělí 21. ledna 2019 14:19:49 CET, Linus Walleij wrote:
> On Fri, Jan 11, 2019 at 4:16 PM Lars Poeschel <poeschel@lemonage.de> wrote:
>
>> Keeping the irq_chip definition static shares it with multiple instances
>> of the mcp23s08 gpiochip in the system. This is bad and now we get this
>> warning from gpiolib core:
>> 
>> "detected irqchip that is shared with multiple gpiochips: please fix the
>> driver."
>> 
>> Hence, move the irq_chip definition from being driver static into the
>> struct mcp23s08. So a unique irq_chip is used for each gpiochip
>> instance.
>> 
>> Signed-off-by: Lars Poeschel <poeschel@lemonage.de>
>
> Patch applied.

Tested-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Fixes: 171948ea33e14 (gpiolib: check if irqchip already has the irq hook 
replacements)

This commit should probably go to 5.0-stable and 4.20-stable as well 
because commit 171948ea33e14 first appeared in 4.20. I have been using 4.19 
previsouly, which works, but 5.0 oopses on boot for me with two MCP23S17 
chips (sorry, I just don't know how to get more than 10 lines of stack 
trace in this oops):

[    1.011926] gpio gpiochip3: (mcp23s17.1): detected irqchip that is 
shared with multiple gpiochips: please fix the driver.
[    1.022936] GPIO line 486 (I2C XOR ready) hogged as input
[    1.028550] Unable to handle kernel NULL pointer dereference at virtual 
address 00000054
[    1.036675] pgd = (ptrval)
[    1.039388] [00000054] *pgd=00000000
[    1.042980] Internal error: Oops: 5 [#1] SMP ARM
[    1.047613] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0 #1
[    1.053460] Hardware name: Marvell Armada 380/385 (Device Tree)
[    1.059401] PC is at __dev_printk+0x10/0x84
[    1.063595] LR is at _dev_info+0x44/0x68
[    1.067528] pc : [<c0435b50>]    lr : [<c0436764>]    psr: 20000013
[    1.073811] sp : ef4afbb4  ip : ef4afbbc  fp : eed0a040
[    1.079048] r10: 00000000  r9 : c0b32834  r8 : 00000000
[    1.084286] r7 : c016f8cc  r6 : 00000000  r5 : ef9f7ba0  r4 : c0b07bc8
[    1.090829] r3 : ef4afbd8  r2 : ef4afbbc  r1 : 00000008  r0 : c08cf6fc
[    1.097374] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment 
none
[    1.104528] Control: 10c5387d  Table: 0000404a  DAC: 00000051
[    1.110289] Process swapper/0 (pid: 1, stack limit = 0x(ptrval))
[    1.116311] Stack: (0xef4afbb4 to 0xef4b0000)
[    1.120679] fba0:                                              c0436764 
ef4afbd8 c08f7134
[    1.128881] fbc0: ef4afbb8 36415da1 c0bbc194 eed0a08c c03bce58 c08f7134 
ef714c40 00000000
[    1.137082] fbe0: eed0a08c 00000002 eecf9840 eed0a064 c0b07bc8 eed0a400 
eed0a08c 00000002
[    1.145283] fc00: eecf9840 c03b9484 00000000 00000001 00000000 00000000 
00004c4c 36415da1
[    1.153485] fc20: 00000000 00000003 eed0a400 00000002 00000000 c09019b8 
ef4afc5c c03b97ac
[    1.161686] fc40: 00000001 ffffffff 00000010 00000001 0000010c c0b07bc8 
00000002 00000006
[    1.169887] fc60: ffffffff 36415da1 00000000 eed0a400 00000000 c0b32768 
00000000 00000000
[    1.178088] fc80: c0b32778 00000000 00000000 c04d0218 c0bb8590 eed0a400 
c0bb859c c04391e8
[    1.186290] fca0: eed0a400 c0b32778 ef4afd1c c0439848 00000001 00000000 
00000000 c043965c
[    1.194491] fcc0: eed0a400 c0b32778 c0b32768 00000000 c0b07bc8 ef4afd1c 
c0439848 00000001
[    1.202692] fce0: 00000000 00000000 00000000 c0437ca4 00000000 ef54cc6c 
ef5c8938 36415da1
[    1.210894] fd00: eed0a400 eed0a400 c0b07bc8 eed0a434 eee79c00 c04394c4 
c093b958 eed0a400
[    1.219095] fd20: 00000001 36415da1 eed0a408 eed0a400 c0b43fd4 eee79c00 
eed0a400 c0437e60
[    1.227296] fd40: eed0a408 c0b07bc8 00000000 c0436244 00000006 c04d56f8 
00000000 eed0a400
[    1.235497] fd60: 000001be 36415da1 eed0a400 eee79c00 00000000 ef575410 
00000001 ef9f7bec
[    1.243699] fd80: c0b07bc8 c04d2620 eee79c00 ef9f7ba0 eed0a400 00000000 
00000001 c04d2e4c
[    1.251900] fda0: 00000000 c05cff48 c0901258 c0913c9c c0913ca8 c0913dd4 
00989680 36415da1
[    1.260102] fdc0: eee79f48 eee79c00 ef575400 eee79f48 ef575410 00000002 
c0b07bc8 f0997680
[    1.268303] fde0: c08d00d8 c04d618c 00000000 c02ab604 c08f7a40 c09141b8 
eee79c00 c0b07bc8
[    1.276504] fe00: 00000003 36415da1 c08fed94 ef575410 00000000 c0b449bc 
00000000 00000000
[    1.284706] fe20: c0b449bc 00000000 c0a5185c c043a8c4 c0bb8590 ef575410 
c0bb859c 00000000
[    1.292907] fe40: 00000000 c04391e8 ef575410 c0b449bc ef575444 c0439768 
00000000 c0a5183c
[    1.301108] fe60: c0a004a8 c043965c ef9f7144 c0439768 00000000 ef575410 
c0b449bc ef575444
[    1.309310] fe80: c0439768 00000000 c0a5183c c0a004a8 c0a5185c c0439844 
ef579a34 c0b07bc8
[    1.317511] fea0: c0b449bc c043771c 00000000 ef484d58 ef579a34 36415da1 
c0b3b638 c0b449bc
[    1.325712] fec0: ef737c80 c0b3b638 00000000 c0438114 c0914250 c0439fa8 
c0b449bc c0b449bc
[    1.333914] fee0: c0b07bc8 c0a2f38c ffffe000 c0439f78 c0b62260 c0b07bc8 
c0a2f38c c0a0101c
[    1.342115] ff00: efffcdb7 c013dd34 c0941a58 c0921800 00000000 00000006 
00000006 c08d5294
[    1.350316] ff20: 00000000 c0b07bc8 c08e2294 c08d5308 c0b64e00 efffcda4 
efffcdb2 36415da1
[    1.358518] ff40: c0a51840 c0b62260 c0a60a80 36415da1 c0b62260 c0a60c44 
00000007 c0b64e00
[    1.366719] ff60: c0b64e00 c0a0139c 00000006 00000006 00000000 c0a004a8 
000000b8 00000000
[    1.374920] ff80: 00000000 00000000 c071f84c 00000000 00000000 00000000 
00000000 00000000
[    1.383121] ffa0: 00000000 c071f854 00000000 c01010e8 00000000 00000000 
00000000 00000000
[    1.391322] ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 
00000000 00000000
[    1.399523] ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 
00000000 00000000
[    1.407729] [<c0435b50>] (__dev_printk) from [<ef4afbb8>] (0xef4afbb8)
[    1.414276] Code: e3510000 0a00001a e52de004 e1a0c002 (e591304c) 
[    1.420405] ---[ end trace 1243f20a0e53cc41 ]---
[    1.425040] Kernel panic - not syncing: Fatal exception
[    1.430282] CPU1: stopping
[    1.433000] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G      D           
5.0.0 #1
[    1.440242] Hardware name: Marvell Armada 380/385 (Device Tree)
[    1.446188] [<c01103ac>] (unwind_backtrace) from [<c010c1e4>] 
(show_stack+0x10/0x14)
[    1.453958] [<c010c1e4>] (show_stack) from [<c0708af0>] 
(dump_stack+0x88/0x9c)
[    1.461205] [<c0708af0>] (dump_stack) from [<c010f0b8>] 
(handle_IPI+0x348/0x380)
[    1.468626] [<c010f0b8>] (handle_IPI) from [<c03af568>] 
(gic_handle_irq+0x8c/0x90)
[    1.476220] [<c03af568>] (gic_handle_irq) from [<c0101a0c>] 
(__irq_svc+0x6c/0x90)
[    1.483723] Exception stack(0xef4cdf60 to 0xef4cdfa8)
[    1.488789] df60: 00000000 00002418 ef9e4de0 c0118940 ffffe000 c0b07bf0 
c0b07c30 00000002
[    1.496991] df80: 00000000 c0b07bc8 c0a69a00 c0b61f0c 00000000 ef4cdfb0 
c0108c94 c0108c98
[    1.505191] dfa0: 60000013 ffffffff
[    1.508693] [<c0101a0c>] (__irq_svc) from [<c0108c98>] 
(arch_cpu_idle+0x38/0x3c)
[    1.516113] [<c0108c98>] (arch_cpu_idle) from [<c014ccd4>] 
(do_idle+0x1c4/0x1fc)
[    1.523533] [<c014ccd4>] (do_idle) from [<c014cfa8>] 
(cpu_startup_entry+0x18/0x20)
[    1.531124] [<c014cfa8>] (cpu_startup_entry) from [<0010252c>] 
(0x10252c)
[    1.537933] Rebooting in 10 seconds..

If I cherry-pick the following two commits to my 5.0, my board boots once 
again:

16f4372fd7a5 pinctrl: mcp23s08: use struct_size() in devm_kzalloc()
19ab5ca9b77d pinctrl: mcp23s08: Allocate irq_chip dynamic

While I am not getting a complete stack trace, it seems likely that the 
culprit is indeed 171948ea33e14. It does not add a mere warning, it also 
changes behavior so that a chip's irq_enable and irq_disable are not 
properly initialized. There's no error code in this function, so the rest 
of the code cannot reasonably catch this new behavior. My recommendation 
would be to actually get rid of that early return, but if the point is to 
*really* get people to notice, well, I noticed :).

As on "why am I hitting this while nobody else did", my board has a shared 
IRQ line which is active during bootup, perhaps that might have something 
to do with this -- I don't know.

Thanks for this fixup, Lars.

Cheers,
Jan
Linus Walleij March 8, 2019, 9:15 a.m. | #3
On Tue, Mar 5, 2019 at 4:19 PM Jan Kundrát <jan.kundrat@cesnet.cz> wrote:

> This commit should probably go to 5.0-stable and 4.20-stable as well

I doubt that this commit is fixing a regression, but who knows...
> If I cherry-pick the following two commits to my 5.0, my board boots once
> again:
>
> 16f4372fd7a5 pinctrl: mcp23s08: use struct_size() in devm_kzalloc()
> 19ab5ca9b77d pinctrl: mcp23s08: Allocate irq_chip dynamic

Hm weird.

> As on "why am I hitting this while nobody else did", my board has a shared
> IRQ line which is active during bootup, perhaps that might have something
> to do with this -- I don't know.

Sounds like the code in mcp23s08 .probe() should write to all
registers clearing and disabling interrupts before proceeding to
request interrupts or register the irqchip, else it will fire immediately
and cause oopses.

Yours,
Linus Walleij
Jan Kundrát March 8, 2019, 12:30 p.m. | #4
On pátek 8. března 2019 10:15:50 CET, Linus Walleij wrote:
> On Tue, Mar 5, 2019 at 4:19 PM Jan Kundrát <jan.kundrat@cesnet.cz> wrote:
>
>> This commit should probably go to 5.0-stable and 4.20-stable as well
>
> I doubt that this commit is fixing a regression, but who knows...
>> If I cherry-pick the following two commits to my 5.0, my board boots once
>> again:
>> 
>> 16f4372fd7a5 pinctrl: mcp23s08: use struct_size() in devm_kzalloc()
>> 19ab5ca9b77d pinctrl: mcp23s08: Allocate irq_chip dynamic
>
> Hm weird.

My working theory is that this is because I have two physical chips on the 
same SPI CS. If I am correct, it works like this between 4.20 and current 
gpio-next:

- GPIO expander #1 is initialized, including the irqchip
- GPIO expander #2 is initialized, and the irqchip initialization fails
- an interrupt on a shared IRQ line which is common to GPIO expanders #1 
and #2 and one other unrelated device starts to be processed
- GPIO expander #1 says "not mine interrupt"
- GPIO expander #2 attempts to dereference a null pointer becase since 
171948ea33e14, the gpiochip->irq.irq_enable/disable are not initialized

>> As on "why am I hitting this while nobody else did", my board has a shared
>> IRQ line which is active during bootup, perhaps that might have something
>> to do with this -- I don't know.
>
> Sounds like the code in mcp23s08 .probe() should write to all
> registers clearing and disabling interrupts before proceeding to
> request interrupts or register the irqchip, else it will fire immediately
> and cause oopses.

I do not think that this would help me. My IRQ line is shared with another 
chip, so it can be physically asserted even if this particular GPIO 
expander doesn't generate an interrupt.

I suspect that this configuration (multiple MCP23S17 on one SPI CS, and an 
IRQ line shared with something else) is quite rare...

With kind regards,
Jan

Patch

diff --git a/drivers/pinctrl/pinctrl-mcp23s08.c b/drivers/pinctrl/pinctrl-mcp23s08.c
index b03481ef99a1..3b1785e56174 100644
--- a/drivers/pinctrl/pinctrl-mcp23s08.c
+++ b/drivers/pinctrl/pinctrl-mcp23s08.c
@@ -68,6 +68,7 @@  struct mcp23s08 {
 	struct mutex		lock;
 
 	struct gpio_chip	chip;
+	struct irq_chip		irq_chip;
 
 	struct regmap		*regmap;
 	struct device		*dev;
@@ -607,15 +608,6 @@  static void mcp23s08_irq_bus_unlock(struct irq_data *data)
 	mutex_unlock(&mcp->lock);
 }
 
-static struct irq_chip mcp23s08_irq_chip = {
-	.name = "gpio-mcp23xxx",
-	.irq_mask = mcp23s08_irq_mask,
-	.irq_unmask = mcp23s08_irq_unmask,
-	.irq_set_type = mcp23s08_irq_set_type,
-	.irq_bus_lock = mcp23s08_irq_bus_lock,
-	.irq_bus_sync_unlock = mcp23s08_irq_bus_unlock,
-};
-
 static int mcp23s08_irq_setup(struct mcp23s08 *mcp)
 {
 	struct gpio_chip *chip = &mcp->chip;
@@ -645,7 +637,7 @@  static int mcp23s08_irqchip_setup(struct mcp23s08 *mcp)
 	int err;
 
 	err =  gpiochip_irqchip_add_nested(chip,
-					   &mcp23s08_irq_chip,
+					   &mcp->irq_chip,
 					   0,
 					   handle_simple_irq,
 					   IRQ_TYPE_NONE);
@@ -656,7 +648,7 @@  static int mcp23s08_irqchip_setup(struct mcp23s08 *mcp)
 	}
 
 	gpiochip_set_nested_irqchip(chip,
-				    &mcp23s08_irq_chip,
+				    &mcp->irq_chip,
 				    mcp->irq);
 
 	return 0;
@@ -1042,6 +1034,13 @@  static int mcp230xx_probe(struct i2c_client *client,
 		return -ENOMEM;
 
 	mcp->irq = client->irq;
+	mcp->irq_chip.name = dev_name(&client->dev);
+	mcp->irq_chip.irq_mask = mcp23s08_irq_mask;
+	mcp->irq_chip.irq_unmask = mcp23s08_irq_unmask;
+	mcp->irq_chip.irq_set_type = mcp23s08_irq_set_type;
+	mcp->irq_chip.irq_bus_lock = mcp23s08_irq_bus_lock;
+	mcp->irq_chip.irq_bus_sync_unlock = mcp23s08_irq_bus_unlock;
+
 	status = mcp23s08_probe_one(mcp, &client->dev, client, client->addr,
 				    id->driver_data, pdata->base, 0);
 	if (status)
@@ -1152,6 +1151,13 @@  static int mcp23s08_probe(struct spi_device *spi)
 		chips--;
 		data->mcp[addr] = &data->chip[chips];
 		data->mcp[addr]->irq = spi->irq;
+		data->mcp[addr]->irq_chip.name = dev_name(&spi->dev);
+		data->mcp[addr]->irq_chip.irq_mask = mcp23s08_irq_mask;
+		data->mcp[addr]->irq_chip.irq_unmask = mcp23s08_irq_unmask;
+		data->mcp[addr]->irq_chip.irq_set_type = mcp23s08_irq_set_type;
+		data->mcp[addr]->irq_chip.irq_bus_lock = mcp23s08_irq_bus_lock;
+		data->mcp[addr]->irq_chip.irq_bus_sync_unlock =
+			mcp23s08_irq_bus_unlock;
 		status = mcp23s08_probe_one(data->mcp[addr], &spi->dev, spi,
 					    0x40 | (addr << 1), type,
 					    pdata->base, addr);