Message ID | 20101117062958.GA2894@havoc.gtf.org |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On Tue, Nov 16, 2010 at 10:29 PM, Jeff Garzik <jeff@garzik.org> wrote: > > + spin_lock(shost->host_lock); > + scsi_cmd_get_serial(shost, cmd); > spin_unlock(shost->host_lock); This is just sad. How important is that serial number? So important that we need to do a spinlock over it here? And it _must_ be per-shost? Because if you made it per-ap, you could easily just move that logic down to after you get the ap lock. Just add a "unsigned int serial_number" into the ata_port struct, and do the same (trivial) logic for getting a non-zero serial number there: n = ap->serial_number; if (!++n) n = 1 ap->serial_number = n; cmd->serial_number = n; or something. And then you'd _really_ not need to touch that totally pointless host lock at all. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/17/2010 01:44 AM, Linus Torvalds wrote: > On Tue, Nov 16, 2010 at 10:29 PM, Jeff Garzik<jeff@garzik.org> wrote: >> >> + spin_lock(shost->host_lock); >> + scsi_cmd_get_serial(shost, cmd); >> spin_unlock(shost->host_lock); > > This is just sad. > > How important is that serial number? So important that we need to do a > spinlock over it here? And it _must_ be per-shost? Quite unimportant. Quoting James from http://marc.info/?l=linux-kernel&m=128949079704323&w=2 > There are only a few drivers left that actually make use of a serial > number. Of those, the only modern ones are qla4, lpfc, mpt2sas and > megaraid. > > So the next logical step seems to be eliminate the overloading of the > serial number zero value, which removes the last mid-layer use (dpt_i2o > seems to abuse this unnecessarily as well), then the serial number code > can be pushed down into the queuecommand routines of only those drivers > that actually use it. None of the modern ones seems to have a > legitimate use, so I think their uses can mostly be eliminated. Thus, > we might be able to get away with a simple queuecommand push down and > never bother with atomics for this (since it's unlikely the legacy users > would convert away from a lock wrapping their queuecommand routines). Looking solely at the SCSI code (ie. ignoring LLD code), it seems like the magic number zero for serial_number is signaling a boolean condition "are we an EH command?" EH tests this at the very beginning of the abort/reset/explode error handling sequence, presumably to avoid recursive EH invocations (scsi_try_to_abort_cmd). So maybe an EH expert (Tejun?) can correct me here, but I think we may be able to completely the lock/get-serial/unlock sequence from libata, as long as scsi_init_cmd_errh() reliably sets an "I am an EH command" flag. Would be nice if true... Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/17/2010 03:08 AM, Jeff Garzik wrote: > So maybe an EH expert (Tejun?) can correct me here, but I think we may > be able to completely the lock/get-serial/unlock sequence from libata, ^^^ er, completely remove -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Jeff, Linus. On 11/17/2010 09:08 AM, Jeff Garzik wrote: > Looking solely at the SCSI code (ie. ignoring LLD code), it seems > like the magic number zero for serial_number is signaling a boolean > condition "are we an EH command?" > > EH tests this at the very beginning of the abort/reset/explode error > handling sequence, presumably to avoid recursive EH invocations > (scsi_try_to_abort_cmd). > > So maybe an EH expert (Tejun?) can correct me here, but I think we > may be able to completely the lock/get-serial/unlock sequence from > libata, as long as scsi_init_cmd_errh() reliably sets an "I am an EH > command" flag. > > Would be nice if true... Yeah, it's actually nice (for once). libata doesn't use or care about scmd->serial_number at all. The SCSI EH path you mentioned above is not applicable as libata implements its eh_strategy_handler and SCSI only calls scsi_try_to_abort_cmd() for the default EH handler, scsi_unjam_host(). We'll need to test a bit to make sure everything is okay but I'm fairly certain removing it won't break anything fundamental. If something breaks at all, it would be some silly easy-to-fix thing. Thanks.
On 11/17/2010 05:01 AM, Tejun Heo wrote: > Hello, Jeff, Linus. > > On 11/17/2010 09:08 AM, Jeff Garzik wrote: >> Looking solely at the SCSI code (ie. ignoring LLD code), it seems >> like the magic number zero for serial_number is signaling a boolean >> condition "are we an EH command?" >> >> EH tests this at the very beginning of the abort/reset/explode error >> handling sequence, presumably to avoid recursive EH invocations >> (scsi_try_to_abort_cmd). >> >> So maybe an EH expert (Tejun?) can correct me here, but I think we >> may be able to completely the lock/get-serial/unlock sequence from >> libata, as long as scsi_init_cmd_errh() reliably sets an "I am an EH >> command" flag. >> >> Would be nice if true... > > Yeah, it's actually nice (for once). libata doesn't use or care about > scmd->serial_number at all. The SCSI EH path you mentioned above is > not applicable as libata implements its eh_strategy_handler and SCSI > only calls scsi_try_to_abort_cmd() for the default EH handler, > scsi_unjam_host(). > > We'll need to test a bit to make sure everything is okay but I'm > fairly certain removing it won't break anything fundamental. If > something breaks at all, it would be some silly easy-to-fix thing. It would be surprising if there is breakage, because serial_number is only tested in two places in the generic kernel: scsi_cmd_get_serial() -- where it simply avoids the zero value -- and scsi_try_to_abort_cmd(). Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2010-11-17 at 03:08 -0500, Jeff Garzik wrote: > On 11/17/2010 01:44 AM, Linus Torvalds wrote: > > On Tue, Nov 16, 2010 at 10:29 PM, Jeff Garzik<jeff@garzik.org> wrote: > >> > >> + spin_lock(shost->host_lock); > >> + scsi_cmd_get_serial(shost, cmd); > >> spin_unlock(shost->host_lock); > > > > This is just sad. > > > > How important is that serial number? So important that we need to do a > > spinlock over it here? And it _must_ be per-shost? > > Quite unimportant. Quoting James from > http://marc.info/?l=linux-kernel&m=128949079704323&w=2 > > There are only a few drivers left that actually make use of a serial > > number. Of those, the only modern ones are qla4, lpfc, mpt2sas and > > megaraid. > > > > So the next logical step seems to be eliminate the overloading of the > > serial number zero value, which removes the last mid-layer use (dpt_i2o > > seems to abuse this unnecessarily as well), then the serial number code > > can be pushed down into the queuecommand routines of only those drivers > > that actually use it. None of the modern ones seems to have a > > legitimate use, so I think their uses can mostly be eliminated. Thus, > > we might be able to get away with a simple queuecommand push down and > > never bother with atomics for this (since it's unlikely the legacy users > > would convert away from a lock wrapping their queuecommand routines). > > > Looking solely at the SCSI code (ie. ignoring LLD code), it seems like > the magic number zero for serial_number is signaling a boolean condition > "are we an EH command?" > > EH tests this at the very beginning of the abort/reset/explode error > handling sequence, presumably to avoid recursive EH invocations > (scsi_try_to_abort_cmd). > > So maybe an EH expert (Tejun?) can correct me here, but I think we may > be able to completely the lock/get-serial/unlock sequence from libata, > as long as scsi_init_cmd_errh() reliably sets an "I am an EH command" flag. > > Would be nice if true... Right ... I couldn't persuade anyone else to do it, so I'll probably end up coding it. It looks like the serial number zero check might be bogus. If so I'll remove it, then push the serial number acquisition down only into the locked queuecommand of only those drivers that actually use it (which were listed in the quoted email, and which might have a big impetus to remove it if the use is trivial). Then we can begin unwinding the locking. James -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Right ... I couldn't persuade anyone else to do it, so I'll probably end > up coding it. It looks like the serial number zero check might be > bogus. If so I'll remove it, then push the serial number acquisition > down only into the locked queuecommand of only those drivers that > actually use it (which were listed in the quoted email, and which might > have a big impetus to remove it if the use is trivial). Then we can > begin unwinding the locking. In dc395x, eata_pio, in2000, lpfc, megaraid_legacy, megaraid_mbox, megaraid_sas, mesh, ncr53c8xx, qla1280, qla4xxx, wd33c93 and i2o_scsi use it in printks only, and all this can be safely removed. Except for EH that only leaves dpt_i2o, eata, mpt2sas, u14-34f, mptscsi as non-trivial users. If you sort out the EH side I'll volunteer to kill the printks and add private serial numbers to the five drivers mentioned above. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2010-11-17 at 10:47 -0500, Christoph Hellwig wrote: > > Right ... I couldn't persuade anyone else to do it, so I'll probably end > > up coding it. It looks like the serial number zero check might be > > bogus. If so I'll remove it, then push the serial number acquisition > > down only into the locked queuecommand of only those drivers that > > actually use it (which were listed in the quoted email, and which might > > have a big impetus to remove it if the use is trivial). Then we can > > begin unwinding the locking. > > In dc395x, eata_pio, in2000, lpfc, megaraid_legacy, megaraid_mbox, > megaraid_sas, mesh, ncr53c8xx, qla1280, qla4xxx, wd33c93 and i2o_scsi > use it in printks only, and all this can be safely removed. > > Except for EH that only leaves dpt_i2o, eata, mpt2sas, u14-34f, mptscsi > as non-trivial users. If you sort out the EH side I'll volunteer to > kill the printks and add private serial numbers to the five drivers > mentioned above. You're on (you were cc'd on the patch) ... I was planning just to move the scsi_cmnd_get_serial() call into the locked queuecommand routine, but a more permanent solution might be better. James -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c index 19835d3..d72340e 100644 --- a/drivers/ata/libata-scsi.c +++ b/drivers/ata/libata-scsi.c @@ -3166,8 +3166,8 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd, /** * ata_scsi_queuecmd - Issue SCSI cdb to libata-managed device + * @shost: SCSI host destination for given command * @cmd: SCSI command to be sent - * @done: Completion function, called when command is complete * * In some cases, this function translates SCSI commands into * ATA taskfiles, and queues the taskfiles to be sent to @@ -3183,36 +3183,41 @@ static inline int __ata_scsi_queuecmd(struct scsi_cmnd *scmd, * Return value from __ata_scsi_queuecmd() if @cmd can be queued, * 0 otherwise. */ -static int ata_scsi_queuecmd_lck(struct scsi_cmnd *cmd, void (*done)(struct scsi_cmnd *)) +int ata_scsi_queuecmd(struct Scsi_Host *shost, struct scsi_cmnd *cmd) { struct ata_port *ap; struct ata_device *dev; struct scsi_device *scsidev = cmd->device; - struct Scsi_Host *shost = scsidev->host; int rc = 0; + unsigned long irq_flags; - ap = ata_shost_to_port(shost); + local_irq_save(irq_flags); + spin_lock(shost->host_lock); + scsi_cmd_get_serial(shost, cmd); spin_unlock(shost->host_lock); + + ap = ata_shost_to_port(shost); + spin_lock(ap->lock); ata_scsi_dump_cdb(ap, cmd); dev = ata_scsi_find_dev(ap, scsidev); if (likely(dev)) - rc = __ata_scsi_queuecmd(cmd, done, dev); + rc = __ata_scsi_queuecmd(cmd, cmd->scsi_done, dev); else { cmd->result = (DID_BAD_TARGET << 16); - done(cmd); + cmd->scsi_done(cmd); } spin_unlock(ap->lock); - spin_lock(shost->host_lock); + + local_irq_restore(irq_flags); + return rc; } -DEF_SCSI_QCMD(ata_scsi_queuecmd) - /** * ata_scsi_simulate - simulate SCSI command on ATA device * @dev: the target device
This is in #upstream-fixes... Each libata command is now two lock-ops less expensive... for free! Should be nice for multi-core SSD users etc. (not a pull request, Linus, just an FYI) commit 02ab5c49284c91901cad783921e816297c92ebf8 Author: Jeff Garzik <jeff@garzik.org> Date: Wed Nov 17 01:19:11 2010 -0500 [libata] kill unlock+relock cycle in SCSI queuecommand hook Now that SCSI's ->queuecommand hook is unlocked, ATA no longer needs to unlock the SCSI host_lock prior to work, and re-lock the host_lock once our work is complete. Signed-off-by: Jeff Garzik <jgarzik@redhat.com> drivers/ata/libata-scsi.c | 23 ++++++++++++++--------- 1 file changed, 14 insertions(+), 9 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html