Message ID | 4B550F9F.80503@kernel.org |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On 01/18/2010 08:49 PM, Tejun Heo wrote: > Traditional IDE interface sucks in that it doesn't have a reliable IRQ > pending bit, so if the controller raises IRQ while the driver is > expecting it not to, the IRQ won't be cleared and eventually the IRQ > line will be killed by interrupt subsystem. Some controllers have > non-standard mechanism to indicate IRQ pending so that this condition > can be detected and worked around. > > This patch adds an optional operation ->sff_irq_check() which will be > called for each port from the ata_sff_interrupt() if an unexpected > interrupt is received. If the operation returns %true, > ->sff_check_status() and ->sff_irq_clear() will be cleared for the > port. Note that this doesn't mark the interrupt as handled so it > won't prevent IRQ subsystem from killing the IRQ if this mechanism > fails to clear the spurious IRQ. > > This patch also implements ->sff_irq_check() for ata_piix. Note that > this adds slight overhead to shared IRQ operation as IRQs which are > destined for other controllers will trigger extra register accesses to > check whether IDE interrupt is pending but this solves rare screaming > IRQ cases and for some curious reason also helps weird BIOS related > glitch on Samsung n130 as reported in bko#14314. > > http://bugzilla.kernel.org/show_bug.cgi?id=14314 > > * piix_base_ops dropped as suggested by Sergei. > > * Spurious IRQ detection doesn't kick in anymore if polling qc is in > progress. This provides less protection but some controllers have > possible data corruption issues if the wrong register is accessed > while a command is in progress. > > Signed-off-by: Tejun Heo<tj@kernel.org> > Reported-by: Johannes Stezenbach<js@sig21.net> > Reported-by: Hans Werner<hwerner4@gmx.de> > Cc: Alan Cox<alan@lxorguk.ukuu.org.uk> > Cc: Sergei Shtylyov<sshtylyov@ru.mvista.com> > --- > Jeff, these two are for #upstream but generated on top of > #upstream-fixes because the piix sata 32bit bmdma patch is currently > only in #upstream-fixes. Please pull -fixes into #upstream before > applying these two. Thanks. > > drivers/ata/ata_piix.c | 20 +++++++++++++++----- > drivers/ata/libata-sff.c | 35 ++++++++++++++++++++++++++++++++--- > include/linux/libata.h | 1 + > 3 files changed, 48 insertions(+), 8 deletions(-) applied, with comment: Overall, as long as the drive is in Bus-Idle mode, it should be safe to go ahead and read Status, for pretty much every controller and drive. I would make exception only for the new SATA FIS-based controllers, where we know that hitting Status is likely both pointless and wasteful, as well as being superfluous because the newer FIS-based controllers all have irq status registers. Additionally, I think we should have a "fast-timeout" and "slow-timeout", whereby we check Status after a short period (5 seconds?) to make sure we did not lose an interrupt. If Status is !BSY, then we can proceed with handling qc success/failure immediately. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On 01/21/2010 04:33 AM, Jeff Garzik wrote: > Overall, as long as the drive is in Bus-Idle mode, it should be safe to > go ahead and read Status, for pretty much every controller and drive. Hmmm... I was a bit worried about the case Alan mentioned several times where access to AltStatus while data transfer is going on can lead to silent data corruption. > I would make exception only for the new SATA FIS-based controllers, > where we know that hitting Status is likely both pointless and wasteful, > as well as being superfluous because the newer FIS-based controllers all > have irq status registers. FIS-based ones need their own interrupt handlers anyway so, fortunately, things like irq_check callback isn't necessary to begin with. :-) > Additionally, I think we should have a "fast-timeout" and > "slow-timeout", whereby we check Status after a short period (5 > seconds?) to make sure we did not lose an interrupt. If Status is !BSY, > then we can proceed with handling qc success/failure immediately. Does this happen often? What I find more common is just plain timeouts, so I think it would improve our exception latency if we apply different timeouts for each trial. ie. For the first RW try, set the timeout to 7 secs. For the second, 15 and then to 30. This wouldn't harm the correctness while allowing libata to react much faster to transient failures. Another thing is I can think of which can improve our robustness is dynamic irqpoll support such that when screaming IRQ happens, IRQ subsystem not only shuts down the IRQ line but also begins selectively irqpolling it. Thanks.
> Hmmm... I was a bit worried about the case Alan mentioned several > times where access to AltStatus while data transfer is going on can > lead to silent data corruption. PIIX4 Erratum #15 440MX Erratum #13 PIIX4E Erratum #16 http://www.intel.com/design/chipsets/specupdt/29063508.pdf -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/20/2010 06:45 PM, Tejun Heo wrote: > Hello, > > On 01/21/2010 04:33 AM, Jeff Garzik wrote: >> Overall, as long as the drive is in Bus-Idle mode, it should be safe to >> go ahead and read Status, for pretty much every controller and drive. > > Hmmm... I was a bit worried about the case Alan mentioned several > times where access to AltStatus while data transfer is going on can > lead to silent data corruption. If a drive is in Bus-Idle, as I mentioned, then there is no active data transfer. >> I would make exception only for the new SATA FIS-based controllers, >> where we know that hitting Status is likely both pointless and wasteful, >> as well as being superfluous because the newer FIS-based controllers all >> have irq status registers. > > FIS-based ones need their own interrupt handlers anyway so, > fortunately, things like irq_check callback isn't necessary to begin > with. :-) Yep. >> Additionally, I think we should have a "fast-timeout" and >> "slow-timeout", whereby we check Status after a short period (5 >> seconds?) to make sure we did not lose an interrupt. If Status is !BSY, >> then we can proceed with handling qc success/failure immediately. > > Does this happen often? What I find more common is just plain > timeouts, so I think it would improve our exception latency if we > apply different timeouts for each trial. ie. For the first RW try, > set the timeout to 7 secs. For the second, 15 and then to 30. This > wouldn't harm the correctness while allowing libata to react much > faster to transient failures. Lost interrupts do not happen often, but they do happen. Google finds plenty of examples. > Another thing is I can think of which can improve our robustness is > dynamic irqpoll support such that when screaming IRQ happens, IRQ > subsystem not only shuts down the IRQ line but also begins selectively > irqpolling it. Does this ever happen when data transfer is active? AFAIK this happens during probe or reset or set-xfer or bus-idle or some other auxiliary moment in time. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/22/2010 01:52 AM, Jeff Garzik wrote: >> Hmmm... I was a bit worried about the case Alan mentioned several >> times where access to AltStatus while data transfer is going on can >> lead to silent data corruption. > > If a drive is in Bus-Idle, as I mentioned, then there is no active data > transfer. Oh, you meant reading status instead of altstatus? >> Does this happen often? What I find more common is just plain >> timeouts, so I think it would improve our exception latency if we >> apply different timeouts for each trial. ie. For the first RW try, >> set the timeout to 7 secs. For the second, 15 and then to 30. This >> wouldn't harm the correctness while allowing libata to react much >> faster to transient failures. > > Lost interrupts do not happen often, but they do happen. Google finds > plenty of examples. Yeap, but genuine timeouts seem to happen more commonly and shortening initial timeout would also work for non-SFF controllers, so I think it would be better to do that instead. >> Another thing is I can think of which can improve our robustness is >> dynamic irqpoll support such that when screaming IRQ happens, IRQ >> subsystem not only shuts down the IRQ line but also begins selectively >> irqpolling it. > > Does this ever happen when data transfer is active? AFAIK this happens > during probe or reset or set-xfer or bus-idle or some other auxiliary > moment in time. It usually does but there are other components too. A USB host in my x61s often causes IRQ storm after STR cycle. There also was a strange I2C device which shared IRQ line with ATA controller and killed the IRQ line when the system status changed. It's just that with shareable IRQs, there's no reason the kernel should be this vulnereable to these not-so-uncommon failure modes. Thanks.
Index: ata/include/linux/libata.h =================================================================== --- ata.orig/include/linux/libata.h +++ ata/include/linux/libata.h @@ -857,6 +857,7 @@ struct ata_port_operations { unsigned int (*sff_data_xfer)(struct ata_device *dev, unsigned char *buf, unsigned int buflen, int rw); u8 (*sff_irq_on)(struct ata_port *); + bool (*sff_irq_check)(struct ata_port *); void (*sff_irq_clear)(struct ata_port *); void (*bmdma_setup)(struct ata_queued_cmd *qc); Index: ata/drivers/ata/ata_piix.c =================================================================== --- ata.orig/drivers/ata/ata_piix.c +++ ata/drivers/ata/ata_piix.c @@ -173,6 +173,7 @@ static int piix_sidpr_scr_read(struct at unsigned int reg, u32 *val); static int piix_sidpr_scr_write(struct ata_link *link, unsigned int reg, u32 val); +static bool piix_irq_check(struct ata_port *ap); #ifdef CONFIG_PM static int piix_pci_device_suspend(struct pci_dev *pdev, pm_message_t mesg); static int piix_pci_device_resume(struct pci_dev *pdev); @@ -309,8 +310,13 @@ static struct scsi_host_template piix_sh ATA_BMDMA_SHT(DRV_NAME), }; -static struct ata_port_operations piix_pata_ops = { +static struct ata_port_operations piix_sata_ops = { .inherits = &ata_bmdma32_port_ops, + .sff_irq_check = piix_irq_check, +}; + +static struct ata_port_operations piix_pata_ops = { + .inherits = &piix_sata_ops, .cable_detect = ata_cable_40wire, .set_piomode = piix_set_piomode, .set_dmamode = piix_set_dmamode, @@ -328,10 +334,6 @@ static struct ata_port_operations ich_pa .set_dmamode = ich_set_dmamode, }; -static struct ata_port_operations piix_sata_ops = { - .inherits = &ata_bmdma32_port_ops, -}; - static struct ata_port_operations piix_sidpr_sata_ops = { .inherits = &piix_sata_ops, .hardreset = sata_std_hardreset, @@ -962,6 +964,14 @@ static int piix_sidpr_scr_write(struct a return 0; } +static bool piix_irq_check(struct ata_port *ap) +{ + if (unlikely(!ap->ioaddr.bmdma_addr)) + return false; + + return ap->ops->bmdma_status(ap) & ATA_DMA_INTR; +} + #ifdef CONFIG_PM static int piix_broken_suspend(void) { Index: ata/drivers/ata/libata-sff.c =================================================================== --- ata.orig/drivers/ata/libata-sff.c +++ ata/drivers/ata/libata-sff.c @@ -1760,7 +1760,7 @@ irqreturn_t ata_sff_interrupt(int irq, v { struct ata_host *host = dev_instance; unsigned int i; - unsigned int handled = 0; + unsigned int handled = 0, polling = 0; unsigned long flags; /* TODO: make _irqsave conditional on x86 PCI IDE legacy mode */ @@ -1774,8 +1774,37 @@ irqreturn_t ata_sff_interrupt(int irq, v continue; qc = ata_qc_from_tag(ap, ap->link.active_tag); - if (qc && !(qc->tf.flags & ATA_TFLAG_POLLING)) - handled |= ata_sff_host_intr(ap, qc); + if (qc) { + if (!(qc->tf.flags & ATA_TFLAG_POLLING)) + handled |= ata_sff_host_intr(ap, qc); + else + polling |= 1 << i; + } + } + + /* + * If no port was expecting IRQ but the controller is actually + * asserting IRQ line, nobody cared will ensue. Check IRQ + * pending status if available and clear spurious IRQ. + */ + if (!handled) { + for (i = 0; i < host->n_ports; i++) { + struct ata_port *ap = host->ports[i]; + + if (polling & (1 << i)) + continue; + + if (!ap->ops->sff_irq_check || + !ap->ops->sff_irq_check(ap)) + continue; + + if (printk_ratelimit()) + ata_port_printk(ap, KERN_INFO, + "clearing spurious IRQ\n"); + + ap->ops->sff_check_status(ap); + ap->ops->sff_irq_clear(ap); + } } spin_unlock_irqrestore(&host->lock, flags);
Traditional IDE interface sucks in that it doesn't have a reliable IRQ pending bit, so if the controller raises IRQ while the driver is expecting it not to, the IRQ won't be cleared and eventually the IRQ line will be killed by interrupt subsystem. Some controllers have non-standard mechanism to indicate IRQ pending so that this condition can be detected and worked around. This patch adds an optional operation ->sff_irq_check() which will be called for each port from the ata_sff_interrupt() if an unexpected interrupt is received. If the operation returns %true, ->sff_check_status() and ->sff_irq_clear() will be cleared for the port. Note that this doesn't mark the interrupt as handled so it won't prevent IRQ subsystem from killing the IRQ if this mechanism fails to clear the spurious IRQ. This patch also implements ->sff_irq_check() for ata_piix. Note that this adds slight overhead to shared IRQ operation as IRQs which are destined for other controllers will trigger extra register accesses to check whether IDE interrupt is pending but this solves rare screaming IRQ cases and for some curious reason also helps weird BIOS related glitch on Samsung n130 as reported in bko#14314. http://bugzilla.kernel.org/show_bug.cgi?id=14314 * piix_base_ops dropped as suggested by Sergei. * Spurious IRQ detection doesn't kick in anymore if polling qc is in progress. This provides less protection but some controllers have possible data corruption issues if the wrong register is accessed while a command is in progress. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Johannes Stezenbach <js@sig21.net> Reported-by: Hans Werner <hwerner4@gmx.de> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> --- Jeff, these two are for #upstream but generated on top of #upstream-fixes because the piix sata 32bit bmdma patch is currently only in #upstream-fixes. Please pull -fixes into #upstream before applying these two. Thanks. drivers/ata/ata_piix.c | 20 +++++++++++++++----- drivers/ata/libata-sff.c | 35 ++++++++++++++++++++++++++++++++--- include/linux/libata.h | 1 + 3 files changed, 48 insertions(+), 8 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html