Message ID | 1370630768-4077-2-git-send-email-grygorii.strashko@ti.com |
---|---|
State | Rejected |
Headers | show |
Grygorii Strashko <grygorii.strashko@ti.com> writes: > From: Kevin Hilman <khilman@deeprootsystems.com> > > Currently, runtime PM is used to keep the device enabled only during > active transfers and for a configurable runtime PM autosuspend timout > after an xfer. > > In addition to idling the device, driver's ->runtime_suspend() method > currently disables device interrupts when idle. However, on some SoCs > (notably OMAP4+), the I2C hardware may shared with other coprocessors. > This means that the MPU will still recieve interrupts if a coprocessor > is using the I2C device. To avoid this, also disable interrupts at > the MPU INTC when idling the device in ->runtime_suspend() (and > re-enable them in ->runtime_resume().) This part based on an original > patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with > a coprocessor, this driver still needs hwspinlock support added. > > More over, this patch fixes the kernel boot failure which happens when > CONFIG_SENSORS_LM75=y: > > [ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184() > [ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2 > [ 2.264373] Modules linked in: > [ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64 > [ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14) > [ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) > [ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40) > [ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) > [ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) > [ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c) > [ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) > [ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30) > [ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac) > [ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c) > [ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c) > [ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08) > [ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1 > [ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8 > [ 2.405609] 5f00: 20000113 ffffffff > [ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) > [ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4) > [ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38) > [ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c) > [ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c) > [ 2.448364] ---[ end trace da8cf92c433d1616 ]--- > > The root casue of crash is race between omap_i2c_runtime_suspend() > and omap_i2c_isr_thread() If there's a race here, then it's not the same problem which is described above, unless the CPU2 IRQ is happening because of a shared user of I2C, in which case it doesn't need any additional explanation. > > CPU1: CPU2: > |-omap_i2c_xfer | > |- pm_runtime_put_autosuspend() | > |-timeout |-omap_i2c_isr() > |-return IRQ_WAKE_THREAD; > |-omap_i2c_runtime_suspend() | > |-omap_i2c_isr_thread() > |- oops is here - I2C module is in idle state If this is happening, I don't think it's related to $SUBJECT patch (but is probably hidden by it.) Instead, what probably needs to happen in this is case is that omap_i2c_isr() should be doing a "mark last busy" to reset the autosuspend timeout. And, that should be done in a separate patch. > Cc: Nishanth Menon <nm@ti.com> > Signed-off-by: Kevin Hilman <khilman@linaro.org> Both the From: and Signed-off should be from my TI address since the work was done while I was working for TI. Also, if you change the original patch/changelog, you should add a line here like: [grygorii.strashko@ti.com]: updated changlog > Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-i2c" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/07/2013 11:51 PM, Kevin Hilman wrote: > Grygorii Strashko <grygorii.strashko@ti.com> writes: > >> From: Kevin Hilman <khilman@deeprootsystems.com> >> >> Currently, runtime PM is used to keep the device enabled only during >> active transfers and for a configurable runtime PM autosuspend timout >> after an xfer. >> >> In addition to idling the device, driver's ->runtime_suspend() method >> currently disables device interrupts when idle. However, on some SoCs >> (notably OMAP4+), the I2C hardware may shared with other coprocessors. >> This means that the MPU will still recieve interrupts if a coprocessor >> is using the I2C device. To avoid this, also disable interrupts at >> the MPU INTC when idling the device in ->runtime_suspend() (and >> re-enable them in ->runtime_resume().) This part based on an original >> patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with >> a coprocessor, this driver still needs hwspinlock support added. >> >> More over, this patch fixes the kernel boot failure which happens when >> CONFIG_SENSORS_LM75=y: >> >> [ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184() >> [ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2 >> [ 2.264373] Modules linked in: >> [ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64 >> [ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14) >> [ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) >> [ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40) >> [ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) >> [ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) >> [ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c) >> [ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) >> [ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30) >> [ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac) >> [ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c) >> [ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c) >> [ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08) >> [ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1 >> [ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8 >> [ 2.405609] 5f00: 20000113 ffffffff >> [ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) >> [ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4) >> [ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38) >> [ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c) >> [ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c) >> [ 2.448364] ---[ end trace da8cf92c433d1616 ]--- >> >> The root casue of crash is race between omap_i2c_runtime_suspend() >> and omap_i2c_isr_thread() > If there's a race here, then it's not the same problem which is > described above, unless the CPU2 IRQ is happening because of a shared > user of I2C, in which case it doesn't need any additional explanation. no shared users here >> CPU1: CPU2: >> |-omap_i2c_xfer | >> |- pm_runtime_put_autosuspend() | >> |-timeout |-omap_i2c_isr() >> |-return IRQ_WAKE_THREAD; >> |-omap_i2c_runtime_suspend() | >> |-omap_i2c_isr_thread() >> |- oops is here - I2C module is in idle state > If this is happening, I don't think it's related to $SUBJECT patch (but > is probably hidden by it.) I can remove "fix spurious IRQs: " from $SUBJECT. What do you think? > > Instead, what probably needs to happen in this is case is that > omap_i2c_isr() should be doing a "mark last busy" to reset the > autosuspend timeout. And, that should be done in a separate patch. Yes - from one side. From other side, this patch prevents such situation to happen. disable_irq(_dev->irq); - waits for any pending IRQ handlers for this interrupt to complete before returning. If we are in .runtime_idle() callback - it means I2C transaction has been finished (and It doesn't matter successfully or not) and we don't want to receive IRQs any more. As I mentioned in cover-latter this happens because PM runtime auto-suspend isn't working properly during the boot: [ 23.190002] omap_i2c 48350000.i2c: ==== i2c_add_numbered_adapter [ 23.204681] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz [ 23.211669] omap_i2c 48350000.i2c: ====== rpm_suspend elapsed 0 last_busy 4294939616 [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939716 last_busy 4294939616 autosuspend_delay 1000 [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939700 [ 23.237274] omap_i2c 48350000.i2c: ====== rpm_suspend mod_timer expires 4294939700 --- the timer scheduled to 84 jiffes [ 23.246185] omap_i2c 48350000.i2c: ==== pm_runtime_put_autosuspend [ 23.253448] omap-dma-engine 4a056000.dma-controller: allocating channel for 62 [ 23.261138] omap-dma-engine 4a056000.dma-controller: allocating channel for 61 [ 23.269500] omap_i2c 48070000.i2c: ==== omap_i2c_runtime_resume [ 23.275817] omap_i2c 48350000.i2c: ==== omap_i2c_runtime_suspend --- and expired after ~40 ms So, there are no guaranty that this race will not happen even if I'll do "mark last busy" in HW IRQ handler (, because in high CPU load use case the threaded IRQ can be delayed. I2C driver should not receive IRQs after finishing the transfer. >> Cc: Nishanth Menon <nm@ti.com> >> Signed-off-by: Kevin Hilman <khilman@linaro.org> > Both the From: and Signed-off should be from my TI address since the > work was done while I was working for TI. > > Also, if you change the original patch/changelog, you should add a line > here like: > > [grygorii.strashko@ti.com]: updated changlog ok. >> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> > Kevin Actually, I can confirm, that this patch really fixes "spurious IRQs" issues when M3 Ducati is enabled :). Sorry, for delayed reply - I've had problems with my e-mail. - grygorii -- To unsubscribe from this list: send the line "unsubscribe linux-i2c" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Wed, Jun 19, 2013 at 09:35:38PM +0300, Grygorii Strashko wrote: > On 06/07/2013 11:51 PM, Kevin Hilman wrote: > >Grygorii Strashko <grygorii.strashko@ti.com> writes: > > > >>From: Kevin Hilman <khilman@deeprootsystems.com> > >> > >>Currently, runtime PM is used to keep the device enabled only during > >>active transfers and for a configurable runtime PM autosuspend timout > >>after an xfer. > >> > >>In addition to idling the device, driver's ->runtime_suspend() method > >>currently disables device interrupts when idle. However, on some SoCs > >>(notably OMAP4+), the I2C hardware may shared with other coprocessors. > >>This means that the MPU will still recieve interrupts if a coprocessor > >>is using the I2C device. To avoid this, also disable interrupts at > >>the MPU INTC when idling the device in ->runtime_suspend() (and > >>re-enable them in ->runtime_resume().) This part based on an original > >>patch from Shubhrajyoti Datta. NOTE: for proper sharing the I2C with > >>a coprocessor, this driver still needs hwspinlock support added. > >> > >>More over, this patch fixes the kernel boot failure which happens when > >>CONFIG_SENSORS_LM75=y: > >> > >>[ 2.251220] WARNING: at drivers/bus/omap_l3_noc.c:113 l3_interrupt_handler+0x140/0x184() > >>[ 2.257781] L3 custom error: MASTER:MPU TARGET:L4 PER2 > >>[ 2.264373] Modules linked in: > >>[ 2.268463] CPU: 0 PID: 2 Comm: kthreadd Not tainted 3.10.0-rc4-00034-g042dd60-dirty #64 > >>[ 2.270385] [<c001a80c>] (unwind_backtrace+0x0/0xf0) from [<c0017238>] (show_stack+0x10/0x14) > >>[ 2.286102] [<c0017238>] (show_stack+0x10/0x14) from [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) > >>[ 2.295593] [<c0040fd0>] (warn_slowpath_common+0x4c/0x68) from [<c0041080>] (warn_slowpath_fmt+0x30/0x40) > >>[ 2.299987] [<c0041080>] (warn_slowpath_fmt+0x30/0x40) from [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) > >>[ 2.315582] [<c02c5ed0>] (l3_interrupt_handler+0x140/0x184) from [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) > >>[ 2.323364] [<c009ffb8>] (handle_irq_event_percpu+0x58/0x258) from [<c00a01f4>] (handle_irq_event+0x3c/0x5c) > >>[ 2.327819] [<c00a01f4>] (handle_irq_event+0x3c/0x5c) from [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) > >>[ 2.337829] [<c00a2f7c>] (handle_fasteoi_irq+0xbc/0x164) from [<c009f978>] (generic_handle_irq+0x20/0x30) > >>[ 2.357513] [<c009f978>] (generic_handle_irq+0x20/0x30) from [<c0014168>] (handle_IRQ+0x4c/0xac) > >>[ 2.366821] [<c0014168>] (handle_IRQ+0x4c/0xac) from [<c00085b4>] (gic_handle_irq+0x28/0x5c) > >>[ 2.375762] [<c00085b4>] (gic_handle_irq+0x28/0x5c) from [<c04f08a4>] (__irq_svc+0x44/0x5c) > >>[ 2.379821] Exception stack(0xdb085ec0 to 0xdb085f08) > >>[ 2.389953] 5ec0: 00000001 00000001 00000000 db07ea00 40000113 db2317a8 db084000 000002f1 > >>[ 2.389953] 5ee0: db07ea00 00000000 00000000 00000000 c04fd990 db085f08 c009170c c04f03e8 > >>[ 2.405609] 5f00: 20000113 ffffffff > >>[ 2.408355] [<c04f08a4>] (__irq_svc+0x44/0x5c) from [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) > >>[ 2.418304] [<c04f03e8>] (_raw_spin_unlock_irqrestore+0x34/0x44) from [<c00403c0>] (do_fork+0xa4/0x2d4) > >>[ 2.427795] [<c00403c0>] (do_fork+0xa4/0x2d4) from [<c0040620>] (kernel_thread+0x30/0x38) > >>[ 2.437805] [<c0040620>] (kernel_thread+0x30/0x38) from [<c0063888>] (kthreadd+0xd4/0x13c) > >>[ 2.448364] [<c0063888>] (kthreadd+0xd4/0x13c) from [<c0013308>] (ret_from_fork+0x14/0x2c) > >>[ 2.448364] ---[ end trace da8cf92c433d1616 ]--- > >> > >>The root casue of crash is race between omap_i2c_runtime_suspend() > >> and omap_i2c_isr_thread() > >If there's a race here, then it's not the same problem which is > >described above, unless the CPU2 IRQ is happening because of a shared > >user of I2C, in which case it doesn't need any additional explanation. > no shared users here > >>CPU1: CPU2: > >>|-omap_i2c_xfer | > >> |- pm_runtime_put_autosuspend() | > >> |-timeout |-omap_i2c_isr() > >> |-return IRQ_WAKE_THREAD; > >> |-omap_i2c_runtime_suspend() | > >> |-omap_i2c_isr_thread() > >> |- oops is here - I2C module is in idle state > >If this is happening, I don't think it's related to $SUBJECT patch (but > >is probably hidden by it.) > I can remove "fix spurious IRQs: " from $SUBJECT. What do you think? > > > >Instead, what probably needs to happen in this is case is that > >omap_i2c_isr() should be doing a "mark last busy" to reset the > >autosuspend timeout. And, that should be done in a separate patch. this doesn't make sense... mark last busy is done after the I2C message is actually complete, so is pm_runtime_put_autosuspend() which is done following mark_last_busy. autosuspend timer shouldn't be firing since put won't be called until we're dead sure I2C message (all of them, in fact) are finalized. > Yes - from one side. From other side, this patch prevents such > situation to happen. > disable_irq(_dev->irq); - waits for any pending IRQ handlers for this > interrupt > to complete before returning. IRQ subsystem makes sure to disable our IRQ line while our handler is running and, because of IRQF_ONESHOT, IRQ will only be re-enabled after thread has finished. > If we are in .runtime_idle() callback - it means I2C transaction has > been finished we don't even use runtime_idle() and your statement is wrong, btw. If you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put is called only after 200 gets, no put-get ping-pong), your ->runtime_resume() gets called once, your ->runtime_suspend() gets called once but your ->runtime_idle() will get called 200 times. assuming I2C transaction has finished is wrong here. What if we decide to add pm_runtime_get() on hardirq handler and pm_runtime_put() to the end of the thread handler ? If you mask IRQs on ->runtime_idle(), you might miss some events. > As I mentioned in cover-latter this happens because PM runtime > auto-suspend isn't > working properly during the boot: > > [ 23.190002] omap_i2c 48350000.i2c: ==== i2c_add_numbered_adapter > [ 23.204681] omap_i2c 48350000.i2c: bus 3 rev0.11 at 400 kHz > [ 23.211669] omap_i2c 48350000.i2c: ====== rpm_suspend elapsed 0 last_busy 4294939616 > [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939716 last_busy 4294939616 autosuspend_delay 1000 > [ 23.219879] omap_i2c 48350000.i2c: ====== rpm_suspend expires 4294939700 > [ 23.237274] omap_i2c 48350000.i2c: ====== rpm_suspend mod_timer expires 4294939700 > --- the timer scheduled to 84 jiffes > [ 23.246185] omap_i2c 48350000.i2c: ==== pm_runtime_put_autosuspend > [ 23.253448] omap-dma-engine 4a056000.dma-controller: allocating channel for 62 > [ 23.261138] omap-dma-engine 4a056000.dma-controller: allocating channel for 61 > [ 23.269500] omap_i2c 48070000.i2c: ==== omap_i2c_runtime_resume > [ 23.275817] omap_i2c 48350000.i2c: ==== omap_i2c_runtime_suspend > --- and expired after ~40 ms > > So, there are no guaranty that this race will not happen even if I'll > do "mark last busy" > in HW IRQ handler (, because in high CPU load use case the threaded > IRQ can be delayed. then the bug is on pm_runtime time keeping, no ? Not really on the I2C driver.
Felipe Balbi <balbi@ti.com> writes: [...] > If you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put > is called only after 200 gets, no put-get ping-pong), your > ->runtime_resume() gets called once, your ->runtime_suspend() gets > called once but your ->runtime_idle() will get called 200 times. No. The driver's ->runtime_idle() will only be called when the usecount goes to zero. Kevin -- To unsubscribe from this list: send the line "unsubscribe linux-i2c" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Wed, Jun 19, 2013 at 01:01:28PM -0700, Kevin Hilman wrote: > Felipe Balbi <balbi@ti.com> writes: > > [...] > > > If you have 200 pm_runtime_get() followed by 200 pm_runtime_put() (put > > is called only after 200 gets, no put-get ping-pong), your > > ->runtime_resume() gets called once, your ->runtime_suspend() gets > > called once but your ->runtime_idle() will get called 200 times. > > No. The driver's ->runtime_idle() will only be called when the usecount > goes to zero. Indeed, just re-read the code.
diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c index 64c26f9..97844ff 100644 --- a/drivers/i2c/busses/i2c-omap.c +++ b/drivers/i2c/busses/i2c-omap.c @@ -1290,6 +1290,8 @@ static int omap_i2c_runtime_suspend(struct device *dev) struct platform_device *pdev = to_platform_device(dev); struct omap_i2c_dev *_dev = platform_get_drvdata(pdev); + disable_irq(_dev->irq); + _dev->iestate = omap_i2c_read_reg(_dev, OMAP_I2C_IE_REG); if (_dev->scheme == OMAP_I2C_SCHEME_0) @@ -1320,6 +1322,8 @@ static int omap_i2c_runtime_resume(struct device *dev) __omap_i2c_init(_dev); + enable_irq(_dev->irq); + return 0; } #endif /* CONFIG_PM_RUNTIME */