Message ID | 20150119191524.GC10570@htj.dyndns.org |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On 01/19/2015 01:15 PM, Tejun Heo wrote: > Hello, > > I massaged it a bit and applied to libata/for-3.19-fixes. > > Thanks. > ---- 8< ---- > From ce7514526742c0898b837d4395f515b79dfb5a12 Mon Sep 17 00:00:00 2001 > From: David Jeffery <djeffery@redhat.com> > Date: Mon, 19 Jan 2015 13:03:25 -0600 > Subject: [PATCH] libata: prevent HSM state change race between ISR and PIO > > It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to > HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE > and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG(). > > This problem is hard to reproduce making this patch hard to verify, but this > fix will prevent the race. > > I have not been able to reproduce the problem, but here is a crash dump from > a 2.6.32 kernel. > > On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE: > > crash> struct ata_port.hsm_task_state ffff881c1121c000 > hsm_task_state = 0 > > Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(), > which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value. > > PID: 11053 TASK: ffff8816e846cae0 CPU: 0 COMMAND: "sshd" > #0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b > #1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92 > #2 [ffff88008ba03a90] oops_end at ffffffff8152b510 > #3 [ffff88008ba03ac0] die at ffffffff81010e0b > #4 [ffff88008ba03af0] do_trap at ffffffff8152ad74 > #5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95 > #6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b > [exception RIP: ata_sff_hsm_move+317] > RIP: ffffffff813a77ad RSP: ffff88008ba03ca0 RFLAGS: 00010097 > RAX: 0000000000000000 RBX: ffff881c1121dc60 RCX: 0000000000000000 > RDX: ffff881c1121dd10 RSI: ffff881c1121dc60 RDI: ffff881c1121c000 > RBP: ffff88008ba03d00 R8: 0000000000000000 R9: 000000000000002e > R10: 000000000001003f R11: 000000000000009b R12: ffff881c1121c000 > R13: 0000000000000000 R14: 0000000000000050 R15: ffff881c1121dd78 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 > #7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd > #8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e > #9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0 > --- <IRQ stack> --- > [exception RIP: pipe_poll+48] > RIP: ffffffff81192780 RSP: ffff880f26d459b8 RFLAGS: 00000246 > RAX: 0000000000000000 RBX: ffff880f26d459c8 RCX: 0000000000000000 > RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff881a0539fa80 > RBP: ffffffff8100bb8e R8: ffff8803b23324a0 R9: 0000000000000000 > R10: ffff880f26d45dd0 R11: 0000000000000008 R12: ffffffff8109b646 > R13: ffff880f26d45948 R14: 0000000000000246 R15: 0000000000000246 > ORIG_RAX: ffffffffffffff10 CS: 0010 SS: 0018 > RIP: 00007f26017435c3 RSP: 00007fffe020c420 RFLAGS: 00000206 > RAX: 0000000000000017 RBX: ffffffff8100b072 RCX: 00007fffe020c45c > RDX: 00007f2604a3f120 RSI: 00007f2604a3f140 RDI: 000000000000000d > RBP: 0000000000000000 R8: 00007fffe020e570 R9: 0101010101010101 > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffe020e5f0 > R13: 00007fffe020e5f4 R14: 00007f26045f373c R15: 00007fffe020e5e0 > ORIG_RAX: 0000000000000017 CS: 0033 SS: 002b > > Somewhere between the ata_sff_hsm_move() check and the ata_sff_host_intr() check, the value changed. > On examining the other cpus to see what else was running, another cpu was running the error handler > routines: > > PID: 326 TASK: ffff881c11014aa0 CPU: 1 COMMAND: "scsi_eh_1" > #0 [ffff88008ba27e90] crash_nmi_callback at ffffffff8102fee6 > #1 [ffff88008ba27ea0] notifier_call_chain at ffffffff8152d515 > #2 [ffff88008ba27ee0] atomic_notifier_call_chain at ffffffff8152d57a > #3 [ffff88008ba27ef0] notify_die at ffffffff810a154e > #4 [ffff88008ba27f20] do_nmi at ffffffff8152b1db > #5 [ffff88008ba27f50] nmi at ffffffff8152aaa0 > [exception RIP: _spin_lock_irqsave+47] > RIP: ffffffff8152a1ff RSP: ffff881c11a73aa0 RFLAGS: 00000006 > RAX: 0000000000000001 RBX: ffff881c1121deb8 RCX: 0000000000000000 > RDX: 0000000000000246 RSI: 0000000000000020 RDI: ffff881c122612d8 > RBP: ffff881c11a73aa0 R8: ffff881c17083800 R9: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffff881c1121c000 > R13: 000000000000001f R14: ffff881c1121dd50 R15: ffff881c1121dc60 > ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 > --- <NMI exception stack> --- > #6 [ffff881c11a73aa0] _spin_lock_irqsave at ffffffff8152a1ff > #7 [ffff881c11a73aa8] ata_exec_internal_sg at ffffffff81396fb5 > #8 [ffff881c11a73b58] ata_exec_internal at ffffffff81397109 > #9 [ffff881c11a73bd8] atapi_eh_request_sense at ffffffff813a34eb > > Before it tried to acquire a spinlock, ata_exec_internal_sg() called ata_sff_flush_pio_task(). > This function will set ap->hsm_task_state to HSM_ST_IDLE, and has no locking around setting this > value. ata_sff_flush_pio_task() can then race with the interrupt handler and potentially set > HSM_ST_IDLE at a fatal moment, which will trigger a kernel BUG. > > v2: Fixup comment in ata_sff_flush_pio_task() > > tj: Further updated comment. Use ap->lock instead of shost lock and > use the [un]lock_irq variant instead of the irqsave/restore one. > > Signed-off-by: David Milburn <dmilburn@redhat.com> > Signed-off-by: Tejun Heo <tj@kernel.org> > Cc: stable@vger.kernel.org > --- > drivers/ata/libata-sff.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c > index db90aa3..2e86e3b 100644 > --- a/drivers/ata/libata-sff.c > +++ b/drivers/ata/libata-sff.c > @@ -1333,7 +1333,19 @@ void ata_sff_flush_pio_task(struct ata_port *ap) > DPRINTK("ENTER\n"); > > cancel_delayed_work_sync(&ap->sff_pio_task); > + > + /* > + * We wanna reset the HSM state to IDLE. If we do so without > + * grabbing the port lock, critical sections protected by it which > + * expect the HSM state to stay stable may get surprised. For > + * example, we may set IDLE in between the time > + * __ata_sff_port_intr() checks for HSM_ST_IDLE and before it calls > + * ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG(). > + */ > + spin_lock_irq(ap->lock); > ap->hsm_task_state = HSM_ST_IDLE; > + spin_unlock_irq(ap->lock); > + Hi Tejun, Wouldn't we still have to use host lock instead of port lock to sync with ISR? Thanks, David > ap->sff_pio_task_link = NULL; > > if (ata_msg_ctl(ap)) > -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 19, 2015 at 01:37:19PM -0600, David Milburn wrote: > Wouldn't we still have to use host lock instead of port lock to sync > with ISR? They're the same lock for SFF controllers. If I'm not mistaken, ahci is currently the only one which splits the host lock when MMSI is enabled. Thanks.
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c index db90aa3..2e86e3b 100644 --- a/drivers/ata/libata-sff.c +++ b/drivers/ata/libata-sff.c @@ -1333,7 +1333,19 @@ void ata_sff_flush_pio_task(struct ata_port *ap) DPRINTK("ENTER\n"); cancel_delayed_work_sync(&ap->sff_pio_task); + + /* + * We wanna reset the HSM state to IDLE. If we do so without + * grabbing the port lock, critical sections protected by it which + * expect the HSM state to stay stable may get surprised. For + * example, we may set IDLE in between the time + * __ata_sff_port_intr() checks for HSM_ST_IDLE and before it calls + * ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG(). + */ + spin_lock_irq(ap->lock); ap->hsm_task_state = HSM_ST_IDLE; + spin_unlock_irq(ap->lock); + ap->sff_pio_task_link = NULL; if (ata_msg_ctl(ap))