Message ID | 1276443098-20653-12-git-send-email-tj@kernel.org |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On 06/13/2010 05:31 PM, Tejun Heo wrote: > Thanks to its age, ATA is very susceptible to IRQ delivery problems in > both directions - lost and spurious interrupts. In traditional PATA, > the IRQ line is ultimately out of the controller and driver's control. > Even relatively new SATA isn't free from these issues. Many > controllers still emulate the traditional IDE interface which doesn't > have reliable way to indicate interrupt pending state and there also > is an issue regarding the interpretation of nIEN on both sides of the > cable. > > Most of these problems can be worked around by using the new IRQ > expecting mechanism without adding noticeable overhead. In ATA, > almost all operations are initiated by the host and the controller > signals progress or completion using IRQ. IRQ expecting can easily be > added in libata core and applied to all libata drivers. > > Signed-off-by: Tejun Heo <tj@kernel.org> Jeff, can you please ack this change if this looks okay to you? Also, would it okay to route this through irq tree? Thanks.
On 06/13/2010 11:31 AM, Tejun Heo wrote: > Thanks to its age, ATA is very susceptible to IRQ delivery problems in > both directions - lost and spurious interrupts. In traditional PATA, > the IRQ line is ultimately out of the controller and driver's control. > Even relatively new SATA isn't free from these issues. Many > controllers still emulate the traditional IDE interface which doesn't > have reliable way to indicate interrupt pending state and there also > is an issue regarding the interpretation of nIEN on both sides of the > cable. > > Most of these problems can be worked around by using the new IRQ > expecting mechanism without adding noticeable overhead. In ATA, > almost all operations are initiated by the host and the controller > signals progress or completion using IRQ. IRQ expecting can easily be > added in libata core and applied to all libata drivers. > > Signed-off-by: Tejun Heo<tj@kernel.org> > --- > drivers/ata/libata-core.c | 15 +++++++++++++-- > drivers/ata/libata-eh.c | 4 +++- > drivers/ata/libata-sff.c | 37 +++++++++++++++++++------------------ > include/linux/libata.h | 2 ++ > 4 files changed, 37 insertions(+), 21 deletions(-) > > diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c > index ddf8e48..9a0aaa0 100644 > --- a/drivers/ata/libata-core.c > +++ b/drivers/ata/libata-core.c > @@ -4972,6 +4972,8 @@ void ata_qc_complete(struct ata_queued_cmd *qc) > { > struct ata_port *ap = qc->ap; > > + unexpect_irq(ap->irq_expect, false); > + > /* XXX: New EH and old EH use different mechanisms to > * synchronize EH with regular execution path. > * Unconditional use of unexpect_irq() here seems incorrect for some cases, such as sata_mv's use, where ata_qc_complete() is called multiple times rather than a singleton ata_qc_complete_multiple() call. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Jeff. On 06/25/2010 02:22 AM, Jeff Garzik wrote: >> @@ -4972,6 +4972,8 @@ void ata_qc_complete(struct ata_queued_cmd *qc) >> { >> struct ata_port *ap = qc->ap; >> >> + unexpect_irq(ap->irq_expect, false); >> + >> /* XXX: New EH and old EH use different mechanisms to >> * synchronize EH with regular execution path. >> * > > Unconditional use of unexpect_irq() here seems incorrect for some cases, > such as sata_mv's use, where ata_qc_complete() is called multiple times > rather than a singleton ata_qc_complete_multiple() call. Indeed, sata_mv is calling ata_qc_complete() directly multiple times. I still think calling unexpect_irq() from ata_qc_complete() is correct as ata_qc_complete() is always a good indicator of completion events. What's missing is a way for sata_mv to indicate that it has more events to expect for, which under the current implementation only sata_mv interrupt handler can determine. I'll see if I can convert it to use ata_qc_complete_multiple() instead. Thanks.
On 06/25/2010 03:44 AM, Tejun Heo wrote: > Hello, Jeff. > > On 06/25/2010 02:22 AM, Jeff Garzik wrote: >>> @@ -4972,6 +4972,8 @@ void ata_qc_complete(struct ata_queued_cmd *qc) >>> { >>> struct ata_port *ap = qc->ap; >>> >>> + unexpect_irq(ap->irq_expect, false); >>> + >>> /* XXX: New EH and old EH use different mechanisms to >>> * synchronize EH with regular execution path. >>> * >> >> Unconditional use of unexpect_irq() here seems incorrect for some cases, >> such as sata_mv's use, where ata_qc_complete() is called multiple times >> rather than a singleton ata_qc_complete_multiple() call. > > Indeed, sata_mv is calling ata_qc_complete() directly multiple times. > I still think calling unexpect_irq() from ata_qc_complete() is correct > as ata_qc_complete() is always a good indicator of completion events. My basic point is that you are implicitly changing the entire ata_qc_complete() API, and associated underlying assumptions. The existing assumption, since libata day #0, is that ata_qc_complete() works entirely within the scope of a single qc -- thus enabling multiple calls for a single controller interrupt. Your change greatly widens the scope to an entire port. This isn't just an issue with sata_mv, that was just the easy example I remember off the top of my head. sata_fsl and sata_nv also make the same assumption. And it's a reasonable assumption, IMO. I think an unexpect_irq() call is more appropriate outside ata_qc_complete(). Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On 06/25/2010 11:48 AM, Jeff Garzik wrote: > My basic point is that you are implicitly changing the entire > ata_qc_complete() API, and associated underlying assumptions. > > The existing assumption, since libata day #0, is that ata_qc_complete() > works entirely within the scope of a single qc -- thus enabling multiple > calls for a single controller interrupt. Your change greatly widens the > scope to an entire port. Yeah, I'm changing that and it actually reduces code. > This isn't just an issue with sata_mv, that was just the easy example I > remember off the top of my head. sata_fsl and sata_nv also make the > same assumption. And it's a reasonable assumption, IMO. Yeah, already updating all of them. > I think an unexpect_irq() call is more appropriate outside > ata_qc_complete(). The choices we have here are.... 1. Update completion API so that libata core layer has enough information to decide expect/unexpect events. 2. Add expect/unexpect calls to individual drivers. I think #1 is much better now and in the long run. The code actually looks better too. Thanks.
On 06/25/2010 03:44 AM, Tejun Heo wrote: > I still think calling unexpect_irq() from ata_qc_complete() is correct > as ata_qc_complete() is always a good indicator of completion events. No, it's not. ata_qc_complete() is an indicator that _one_ completion event occurred, not _all_ completion events for that port. Converting drivers to use ata_qc_complete_multiple() completely misses the point: ata_qc_complete_multiple() is doing exactly the same thing as those drivers: calling ata_qc_complete() in a loop. ata_qc_complete() is -- as its name implies -- completing a single qc. Your patch introduces an unconditional controller-wide unexpect_irq() call. It's a layering violation. You can trivially trace through ata_qc_complete_multiple() after patch #11 is applied, and see the result... for $N completion bits passed to ata_qc_complete_multiple(), you call unexpect_irq() expect_irq() in rapid succession $N times, once for each ata_qc_complete() call in the loop of ata_qc_complete_multiple(). For something that is not needed for modern SATA controllers. The preferred solution would be something that only touches legacy controllers, namely: *) create ata_port_complete(), with the implementation ata_qc_complete() unexpect_irq() *) then call ata_port_complete() in the legacy code that needs unexpect_irq() We don't want to burden modern SATA drivers with the overhead of dealing with silly PATA/SATA1 legacy irq nastiness, particularly the ugliness of calling unexpect_irq() + expect_irq() for a number of NCQ commands, in a tight loop! Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 06/25/2010 11:45 PM, Jeff Garzik wrote: > ata_qc_complete_multiple(), you call > unexpect_irq() > expect_irq() Correction, unexpect_irq() x N expect_irq() for N NCQ commands. Controller-wide irq twiddling in the function that completes a single ATA command is painful. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Jeff. On 06/26/2010 05:45 AM, Jeff Garzik wrote: > No, it's not. ata_qc_complete() is an indicator that _one_ completion > event occurred, not _all_ completion events for that port. Well, it can indicte the start of cluster of completions, which is the necessary information anyway. From the second call on, it's a simple flag test and return. I doubt it will affect anything even w/ high performance SSDs but please read on. > Converting drivers to use ata_qc_complete_multiple() completely misses > the point: ata_qc_complete_multiple() is doing exactly the same thing > as those drivers: calling ata_qc_complete() in a loop. The point of ata_qc_complete_multiple() is giving libata core layer better visibility into how commands are being executed, which in turn allows (continued below) > ata_qc_complete() is -- as its name implies -- completing a single qc. > Your patch introduces an unconditional controller-wide unexpect_irq() > call. It's a layering violation. > > You can trivially trace through ata_qc_complete_multiple() after patch > #11 is applied, and see the result... for $N completion bits passed to > ata_qc_complete_multiple(), you call > unexpect_irq() > expect_irq() > in rapid succession $N times, once for each ata_qc_complete() call in > the loop of ata_qc_complete_multiple(). For something that is not > needed for modern SATA controllers. ata_qc_complete_multiple() call [un]expect_irq() only once by introducing an internal completion function w/o irq expect handling, say ata_qc_complete_raw() and making both ata_qc_complete() and ata_qc_complete_multiple() simple wrapper around it w/ irq expect handling. > The preferred solution would be something that only touches legacy > controllers, namely: > > *) create ata_port_complete(), with the implementation > > ata_qc_complete() > unexpect_irq() > > *) then call ata_port_complete() in the legacy code that needs > unexpect_irq() > > We don't want to burden modern SATA drivers with the overhead of > dealing with silly PATA/SATA1 legacy irq nastiness, particularly the > ugliness of calling I think we're much better off applying it to all the drivers. IRQ expecting is very cheap and scalable and there definitely are plenty of IRQ delivery problems with modern controllers although their patterns tend to be different from legacy ones. Plus, it will also be useful for power state predictions. > unexpect_irq() + expect_irq() > > for a number of NCQ commands, in a tight loop! As I wrote above, I don't think N*unexpect_irq() really matters but with ata_qc_complete_multiple() conversion, there will only be single pair of expect/unexpect for each cluster of completions anyway. Thanks.
On 06/26/2010 04:31 AM, Tejun Heo wrote: > Well, it can indicte the start of cluster of completions, which is the > necessary information anyway. From the second call on, it's a simple > flag test and return. I doubt it will affect anything even w/ high > performance SSDs but please read on. Yes, and your patch calls unexpect_irq() at the _start_ of a cluster of completions. That is nonsensical, because it reflects the /opposite/ of the present ATA bus state, when multiple commands are in flight. > ata_qc_complete_multiple() call [un]expect_irq() only once by > introducing an internal completion function w/o irq expect handling, > say ata_qc_complete_raw() and making both ata_qc_complete() and > ata_qc_complete_multiple() simple wrapper around it w/ irq expect > handling. Yes, this fixes problem, but it is better to create a wrapper path for the legacy PATA/SATA1 that uses irq-expecting, and a fast path for modern controllers that do not use it. > On 06/26/2010 05:45 AM, Jeff Garzik wrote: >> We don't want to burden modern SATA drivers with the overhead of >> dealing with silly PATA/SATA1 legacy irq nastiness, particularly the >> ugliness of calling > > I think we're much better off applying it to all the drivers. IRQ > expecting is very cheap and scalable and there definitely are plenty > of IRQ delivery problems with modern controllers although their > patterns tend to be different from legacy ones. Plus, it will also be > useful for power state predictions. Modern SATA/SAS controllers, and their drivers, already have well defined methods of acknowledging interrupts, even unexpected ones, in ways that do not need this core manipulation. This is over-engineering, punishing all modern chipsets moving forward regardless of their design, by unconditionally requiring this behavior of all libata drivers. Just like the rest of libata's layered driver architecture, it should be straightforward to apply this only to SFF/BMDMA chipsets, then tackle odd cases as needs arise. Modern controllers acknowledge interrupts sanely, and always "expect" an interrupt when you include interrupt events like hotplug, even if the ATA bus itself is idle. There is no need to burden the millions of ahci users with irq-expecting, for example. With regards to power state predictions, it is only useful if you are accurately reflecting the ATA bus state (idle or not) at all times. As mentioned above, this patch clearly creates a condition where unexpect_irq() is called when commands remain in flight, and libata is expecting further command completions. IOW, patch #11 says "we are not expecting irq" when we are. At least a halfway sane approach would be to track bus-idle status, and trigger useful code when that status changes (idle->active or active->idle). Perhaps LED, power state, and irq-expecting could all use such a triggering mechanism. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Jeff. On 06/26/2010 11:16 AM, Jeff Garzik wrote: > On 06/26/2010 04:31 AM, Tejun Heo wrote: >> Well, it can indicte the start of cluster of completions, which is the >> necessary information anyway. From the second call on, it's a simple >> flag test and return. I doubt it will affect anything even w/ high >> performance SSDs but please read on. > > Yes, and your patch calls unexpect_irq() at the _start_ of a cluster of > completions. That is nonsensical, because it reflects the /opposite/ of > the present ATA bus state, when multiple commands are in flight. That's actually what we wanna know. I'll talk about it below. >> ata_qc_complete_multiple() call [un]expect_irq() only once by >> introducing an internal completion function w/o irq expect handling, >> say ata_qc_complete_raw() and making both ata_qc_complete() and >> ata_qc_complete_multiple() simple wrapper around it w/ irq expect >> handling. > > Yes, this fixes problem, but it is better to create a wrapper path for > the legacy PATA/SATA1 that uses irq-expecting, and a fast path for > modern controllers that do not use it. > >> On 06/26/2010 05:45 AM, Jeff Garzik wrote: >>> We don't want to burden modern SATA drivers with the overhead of >>> dealing with silly PATA/SATA1 legacy irq nastiness, particularly the >>> ugliness of calling >> >> I think we're much better off applying it to all the drivers. IRQ >> expecting is very cheap and scalable and there definitely are plenty >> of IRQ delivery problems with modern controllers although their >> patterns tend to be different from legacy ones. Plus, it will also be >> useful for power state predictions. > > Modern SATA/SAS controllers, and their drivers, already have well > defined methods of acknowledging interrupts, even unexpected ones, in > ways that do not need this core manipulation. This is over-engineering, > punishing all modern chipsets moving forward regardless of their design, > by unconditionally requiring this behavior of all libata drivers. Unacked irqs are primarily handled by spurious IRQ handling. IRQ expecting is more about lost interrupts and we have enough lost interrupt cases even on new controllers w/ native interface, both transient and non-transient. One of the goals of this whole IRQ exception handling was to make it dumb easy for drivers to use which also included makes things cheap enough so that they can be called from hot paths. Both expect and unexpect_irq() are very cheap once the IRQ delivery is verified. If the processor is taking an interrupt in the first place, this amount of logic shouldn't matter at all. There really isn't punishment to avoid and IMHO not doing it for native controllers is an over optimization. It gains almost nothing while losing meaningful protection. > Just like the rest of libata's layered driver architecture, it should be > straightforward to apply this only to SFF/BMDMA chipsets, then tackle > odd cases as needs arise. > > Modern controllers acknowledge interrupts sanely, and always "expect" an > interrupt when you include interrupt events like hotplug, even if the > ATA bus itself is idle. There is no need to burden the millions of ahci > users with irq-expecting, for example. I'm not saying applying it to only SFF/BMDMA is difficult, just that it's better to apply it to all drivers in this case. IRQ expecting is to protect against misdelivered / lost IRQs and we do have them for ahci, sil24 or whatever too. It would of course be silly to pay significant performance overhead for such protection but as I stated above, it's _really_ cheap. If the driver is taking an interrupt and accessing harddware and even if compared only against the general complexity of generic IRQ and libata code, the cost of IRQ [un]expect is negligible and designed precisely that way to allow use cases like this. > With regards to power state predictions, it is only useful if you are > accurately reflecting the ATA bus state (idle or not) at all times. As > mentioned above, this patch clearly creates a condition where > unexpect_irq() is called when commands remain in flight, and libata is > expecting further command completions. > > IOW, patch #11 says "we are not expecting irq" when we are. > > At least a halfway sane approach would be to track bus-idle status, and > trigger useful code when that status changes (idle->active or > active->idle). Perhaps LED, power state, and irq-expecting could all > use such a triggering mechanism. Continuing from the response to the first paragraph. The IRQ expecting code isn't interested in the bus state, it's interested only in the IRQ events and that's what it's expecting. The same applies to power state prediction too, so please consider the following NCQ command execution sequence. 1. issue tags 0, 1, 2, 3 2. IRQ triggers, tags 0, 2 complete 3. IRQ triggers, tags 1, 3 completes For IRQ expecting, both 1-2 and 2-3 are segments to expect for and for power state transition too, as it's IRQ itself which forces the cpu to come out of sleep state. The reason why I said unexpect in ata_qc_complete() is okay is that it can still delimit each segment as long as we have proper irq_expect() call at the beginning of each segment (all other unexpect calls are ignored). But, that's kind of moot point as we can easily do single pair. Thanks.
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index ddf8e48..9a0aaa0 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4972,6 +4972,8 @@ void ata_qc_complete(struct ata_queued_cmd *qc) { struct ata_port *ap = qc->ap; + unexpect_irq(ap->irq_expect, false); + /* XXX: New EH and old EH use different mechanisms to * synchronize EH with regular execution path. * @@ -5087,6 +5089,9 @@ int ata_qc_complete_multiple(struct ata_port *ap, u32 qc_active) done_mask &= ~(1 << tag); } + if (ap->qc_active) + expect_irq(ap->irq_expect); + return nr_done; } @@ -5153,6 +5158,7 @@ void ata_qc_issue(struct ata_queued_cmd *qc) qc->err_mask |= ap->ops->qc_issue(qc); if (unlikely(qc->err_mask)) goto err; + expect_irq(ap->irq_expect); return; sg_err: @@ -6185,8 +6191,13 @@ int ata_host_activate(struct ata_host *host, int irq, if (rc) return rc; - for (i = 0; i < host->n_ports; i++) - ata_port_desc(host->ports[i], "irq %d", irq); + for (i = 0; i < host->n_ports; i++) { + struct ata_port *ap = host->ports[i]; + + if (!ata_port_is_dummy(ap)) + ap->irq_expect = init_irq_expect(irq, host); + ata_port_desc(ap, "irq %d%s", irq, ap->irq_expect ? "+" : ""); + } rc = ata_host_register(host, sht); /* if failed, just free the IRQ and leave ports alone */ diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c index f77a673..f1ae3ec 100644 --- a/drivers/ata/libata-eh.c +++ b/drivers/ata/libata-eh.c @@ -619,8 +619,10 @@ void ata_scsi_error(struct Scsi_Host *host) * handler doesn't diddle with those qcs. This must * be done atomically w.r.t. setting QCFLAG_FAILED. */ - if (nr_timedout) + if (nr_timedout) { + unexpect_irq(ap->irq_expect, true); __ata_port_freeze(ap); + } spin_unlock_irqrestore(ap->lock, flags); diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c index efa4a18..cc96f36 100644 --- a/drivers/ata/libata-sff.c +++ b/drivers/ata/libata-sff.c @@ -2388,7 +2388,8 @@ int ata_pci_sff_activate_host(struct ata_host *host, struct device *dev = host->dev; struct pci_dev *pdev = to_pci_dev(dev); const char *drv_name = dev_driver_string(host->dev); - int legacy_mode = 0, rc; + struct ata_port *ap[2] = { host->ports[0], host->ports[1] }; + int legacy_mode = 0, i, rc; rc = ata_host_start(host); if (rc) @@ -2422,29 +2423,29 @@ int ata_pci_sff_activate_host(struct ata_host *host, if (rc) goto out; - ata_port_desc(host->ports[0], "irq %d", pdev->irq); - ata_port_desc(host->ports[1], "irq %d", pdev->irq); + for (i = 0; i < 2; i++) { + if (!ata_port_is_dummy(ap[i])) + ap[i]->irq_expect = + init_irq_expect(pdev->irq, host); + ata_port_desc(ap[i], "irq %d%s", + pdev->irq, ap[i]->irq_expect ? "+" : ""); + } } else if (legacy_mode) { - if (!ata_port_is_dummy(host->ports[0])) { - rc = devm_request_irq(dev, ATA_PRIMARY_IRQ(pdev), - irq_handler, IRQF_SHARED, - drv_name, host); - if (rc) - goto out; + unsigned int irqs[2] = { ATA_PRIMARY_IRQ(pdev), + ATA_SECONDARY_IRQ(pdev) }; - ata_port_desc(host->ports[0], "irq %d", - ATA_PRIMARY_IRQ(pdev)); - } + for (i = 0; i < 2; i++) { + if (ata_port_is_dummy(ap[i])) + continue; - if (!ata_port_is_dummy(host->ports[1])) { - rc = devm_request_irq(dev, ATA_SECONDARY_IRQ(pdev), - irq_handler, IRQF_SHARED, - drv_name, host); + rc = devm_request_irq(dev, irqs[i], irq_handler, + IRQF_SHARED, drv_name, host); if (rc) goto out; - ata_port_desc(host->ports[1], "irq %d", - ATA_SECONDARY_IRQ(pdev)); + ap[i]->irq_expect = init_irq_expect(irqs[i], host); + ata_port_desc(ap[i], "irq %d%s", + irqs[i], ap[i]->irq_expect ? "+" : ""); } } diff --git a/include/linux/libata.h b/include/linux/libata.h index b85f3ff..3f5f159 100644 --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -751,6 +751,8 @@ struct ata_port { struct ata_host *host; struct device *dev; + struct irq_expect *irq_expect; /* for irq expecting */ + struct delayed_work hotplug_task; struct work_struct scsi_rescan_task;
Thanks to its age, ATA is very susceptible to IRQ delivery problems in both directions - lost and spurious interrupts. In traditional PATA, the IRQ line is ultimately out of the controller and driver's control. Even relatively new SATA isn't free from these issues. Many controllers still emulate the traditional IDE interface which doesn't have reliable way to indicate interrupt pending state and there also is an issue regarding the interpretation of nIEN on both sides of the cable. Most of these problems can be worked around by using the new IRQ expecting mechanism without adding noticeable overhead. In ATA, almost all operations are initiated by the host and the controller signals progress or completion using IRQ. IRQ expecting can easily be added in libata core and applied to all libata drivers. Signed-off-by: Tejun Heo <tj@kernel.org> --- drivers/ata/libata-core.c | 15 +++++++++++++-- drivers/ata/libata-eh.c | 4 +++- drivers/ata/libata-sff.c | 37 +++++++++++++++++++------------------ include/linux/libata.h | 2 ++ 4 files changed, 37 insertions(+), 21 deletions(-)