Message ID | 20140108005607.GB29556@linux.intel.com |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 1/7/2014 7:56 PM, Todd E Brandt wrote: > On resume, the ATA port driver currently waits until the AHCI > controller finishes executing the port wakeup command. This patch > changes the ata_port_resume callback to issue the wakeup and then > return immediately, thus allowing the next device in the pm queue > to resume. Any commands issued to the AHCI hardware during the > wakeup will be queued up and executed once the port is physically > online. Thus no information is lost. > > This patch only changes the behavior of the resume callback, not > restore, thaw, or runtime-resume. This is because thaw and restore > are used after a suspend-to-disk, which means that an image needs > to be read from swap and reloaded into RAM. The swap disk will > always need to be fully restored/thawed in order for resume to > continue. > > The return value from ata_resume_async is now an indicator of > whether the resume was initiated, rather than if it was completed. > I'm letting the ata driver assume control over its own error > reporting in this case (which it does already). If you look at the > ata_port resume code you'll see that the ata_port_resume callback > returns the status of the ahci_port_resume callback, which is > always 0. So I don't see any harm in ignoring it. > > If somebody requests suspend while ATA port resume is still > running, the request is queued until the resume has completed. I've > tested that case very heavily. Basically if you do two suspends in > a row (within seconds of each other) you lose any performance > benefit, but that's a use case that should happen only very rarerly > (and wouldn't be expected to be of any power benefit since too many > sequential suspends actually takes more power than just letting the > machine stay in run mode). I think my patch for this "libata: resume in the background" was a LOT simpler. It just used the existing behavior of becoming async when the async argument was !NULL, and fixed the one caller that didn't actually care about the result to pass in a pointer anyhow. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJSztZ+AAoJEI5FoCIzSKrwQ6QH/jA1kIDsDC5LBQDGtDjZ/cvB hLddAkDmqswfMXDmtD+u5lQuc+a0GbjINvrMYtq6bilDuPO39ccR//jUL69PULDo qcn3RMjzSIIqyfCiKj6iAIZyuTTR9SLgA9F9V2b0RBPn1FFD84y+qEitzrqb+/z6 pBjVYbbjfpj+Sfr00dbk7a6BTzB2FHTSEwA/Kv5gM2MJKGPrjmML8awl3gSG+lGT mOYV8yR+kycrnkp8Rcr+7hNb6LfJlKWBh33UA4TKxrE5WZg0nX+kU83BoCpRgvCm nfYynvakJ1E0Wz2fiYQq1NeNy1ZUZ7iyDdsl70KslaAhPFBQUH/tsoXL1syHSRo= =rXVR -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday, January 09, 2014 9:04 AM, Phillip Susi wrote: > I think my patch for this "libata: resume in the background" was a LOT > simpler. It just used the existing behavior of becoming async when > the async argument was !NULL, and fixed the one caller that didn't > actually care about the result to pass in a pointer anyhow. Yes yours is simpler, but it also opens a potential memory issue by passing a static int as the return location for the error value. I think it's just safer to tell the callback to attempt no return value at all, and for that you need to expand it into two arguments, one for selection, the other for the output address.-- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 On 01/10/2014 06:11 PM, Brandt, Todd E wrote: > Yes yours is simpler, but it also opens a potential memory issue > by passing a static int as the return location for the error value. > I think it's just safer to tell the callback to attempt no return > value at all, and for that you need to expand it into two > arguments, one for selection, the other for the output address. What sort of memory issue? Also isn't there a system NULL page somewhere that could be used? -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCgAGBQJS0Ki0AAoJEI5FoCIzSKrwzAUH/3PXkCbb1zW6ObzBVOS8I4vM SjeE9unHqTlFbFL8M/sTWV+LJTKnIpEBEfV7QXFENDKPi+o58Aq1PanvW0Pu/R5+ /hOR08ry7HMRqK+Q/J+CynjoMQO3Qfw0elIVQUkSsayqdeA2QeLMDbWLxQhh9j85 o5AD7VsW1X8M7l1yzCUvl5A68M/ZR9dapW3RlErRmTdECZE5qLTN2RKezfboM5lo i4y8IWadF7tou4vbqoGQd/JVgmgc4SRlal7HAqbVWuwmumslMGGBNpZTcgY7ffvT PLHnx8txmeK2ELVg/aRm34j2TdI8mAYzIQ2pJ7hKwRlBc5W8ho4XHEy9o1jDaLg= =MCaR -----END PGP SIGNATURE----- -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > On 01/10/2014 06:11 PM, Brandt, Todd E wrote: >> Yes yours is simpler, but it also opens a potential memory issue >> by passing a static int as the return location for the error value. >> I think it's just safer to tell the callback to attempt no return >> value at all, and for that you need to expand it into two >> arguments, one for selection, the other for the output address. > > What sort of memory issue? Also isn't there a system NULL page > somewhere that could be used? > I think the static variable is ok. We can be sure that all eh threads are torn down before libata.ko is unloaded. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Tue, Jan 07, 2014 at 04:56:07PM -0800, Todd E Brandt wrote: > On resume, the ATA port driver currently waits until the AHCI controller > finishes executing the port wakeup command. This patch changes the Is there anything ahci specific about this? There shouldn't be. > This patch only changes the behavior of the resume callback, not restore, > thaw, or runtime-resume. This is because thaw and restore are used after a > suspend-to-disk, which means that an image needs to be read from swap and > reloaded into RAM. The swap disk will always need to be fully restored/thawed > in order for resume to continue. If the system has multiple devices, wouldn't that still reduce the latency? Do we really need to deviate behavior among different resume/unfreeze paths? > The return value from ata_resume_async is now an indicator of whether > the resume was initiated, rather than if it was completed. I'm letting the ata > driver assume control over its own error reporting in this case (which it does > already). If you look at the ata_port resume code you'll see that the > ata_port_resume callback returns the status of the ahci_port_resume callback, > which is always 0. So I don't see any harm in ignoring it. I've been always kinda doubtful about the usefulness of resume return code. It's not like there's anything resume path can do differently upon being notified that resume of a device failed. Just reporting success and deal with error conditions as usual should work fine, right? Well, in fact, I don't think even the return code of suspend is useful. Failing suspend is most likely wrong. The only action PM core can take is aborting suspend and AFAICS none of the libata drivers has suspend failure conditions whose recoverability is made worse by just proceeding with suspend. The hardware is recycled after all anyway. The device is inoperational anyway, so aborting suspend doens't buy us anything other than failing suspend, which is almost always wrong. So, I'd actually prefer just removing all returns from suspend/resume operations, but yeah this might be out of scope for this series. > +static int ata_port_resume_async(struct device *dev) > +{ > + int rc; > + > + rc = ata_port_resume_common(dev, PMSG_RESUME, true); > + if (!rc) { > + pm_runtime_disable(dev); > + pm_runtime_set_active(dev); > + pm_runtime_enable(dev); > + } > + > + return rc; > } So, can't just everything become async? Are there cases where we *need* synchronous PM behaviors? Thanks.
On Sat, Jan 11, 2014 at 02:13:15PM -0500, Tejun Heo wrote: > Hello, > > On Tue, Jan 07, 2014 at 04:56:07PM -0800, Todd E Brandt wrote: > > On resume, the ATA port driver currently waits until the AHCI controller > > finishes executing the port wakeup command. This patch changes the > > Is there anything ahci specific about this? There shouldn't be. > > > This patch only changes the behavior of the resume callback, not restore, > > thaw, or runtime-resume. This is because thaw and restore are used after a > > suspend-to-disk, which means that an image needs to be read from swap and > > reloaded into RAM. The swap disk will always need to be fully restored/thawed > > in order for resume to continue. > > If the system has multiple devices, wouldn't that still reduce the > latency? Do we really need to deviate behavior among different > resume/unfreeze paths? I see your point, why have two paths if one will do. The only thing that worries me is that the PM resume from hibernate function doesn't have an error handler. What happens when it tries to read the image from swap and the disk is still spinning up? The scsi layer has an error handler so it just keeps retrying every few seconds, but the PM core reads directly from the swap disk's block device. > > > The return value from ata_resume_async is now an indicator of whether > > the resume was initiated, rather than if it was completed. I'm letting the ata > > driver assume control over its own error reporting in this case (which it does > > already). If you look at the ata_port resume code you'll see that the > > ata_port_resume callback returns the status of the ahci_port_resume callback, > > which is always 0. So I don't see any harm in ignoring it. > > I've been always kinda doubtful about the usefulness of resume return > code. It's not like there's anything resume path can do differently > upon being notified that resume of a device failed. Just reporting > success and deal with error conditions as usual should work fine, > right? > > Well, in fact, I don't think even the return code of suspend is > useful. Failing suspend is most likely wrong. The only action PM > core can take is aborting suspend and AFAICS none of the libata > drivers has suspend failure conditions whose recoverability is made > worse by just proceeding with suspend. The hardware is recycled after > all anyway. The device is inoperational anyway, so aborting suspend > doens't buy us anything other than failing suspend, which is almost > always wrong. So, I'd actually prefer just removing all returns from > suspend/resume operations, but yeah this might be out of scope for > this series. I think the only potential value of the return is to tell the PM core that a suspend or resume can't be initiated at all. But since that case never actually comes up in the ata code, it would appear to be essentially useless. > > > +static int ata_port_resume_async(struct device *dev) > > +{ > > + int rc; > > + > > + rc = ata_port_resume_common(dev, PMSG_RESUME, true); > > + if (!rc) { > > + pm_runtime_disable(dev); > > + pm_runtime_set_active(dev); > > + pm_runtime_enable(dev); > > + } > > + > > + return rc; > > } > > So, can't just everything become async? Are there cases where we > *need* synchronous PM behaviors? I think suspend still needs to be synchronous, because the PM core needs to be sure that the disks are actually spun down before it can attempt to shut the system down. I'm adding Raphael to the thread. Raphael, is this correct? > > Thanks. > > -- > tejun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote: > On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote: > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA512 > > > > On 01/10/2014 06:11 PM, Brandt, Todd E wrote: > >> Yes yours is simpler, but it also opens a potential memory issue > >> by passing a static int as the return location for the error value. > >> I think it's just safer to tell the callback to attempt no return > >> value at all, and for that you need to expand it into two > >> arguments, one for selection, the other for the output address. > > > > What sort of memory issue? Also isn't there a system NULL page > > somewhere that could be used? > > > > I think the static variable is ok. We can be sure that all eh threads > are torn down before libata.ko is unloaded. Actually there's one other reason. In the ata_port_request_pm function it checks to see if there's a previous resume operation pending, and if so it calls ata_port_wait_eh in order to wait for it to complete before issuing the new suspend. If you just use the (int*)async parameter it will return immediately and defer to the caller to try again, like is does with SAS. But in our case we *don't* try again, so it would result in the resume being skipped. There needs to be a new case where the caller wants the call to be asynchronous, and it wants ata_port_request_pm to do its own waiting, but doesn't care about the return value. Thus the additional parameter. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Mon, Jan 13, 2014 at 11:55:44AM -0800, Todd E Brandt wrote: > I see your point, why have two paths if one will do. The only thing that > worries me is that the PM resume from hibernate function doesn't have > an error handler. What happens when it tries to read the image from swap > and the disk is still spinning up? The scsi layer has an error handler so The request gets blocked on EH. > it just keeps retrying every few seconds, but the PM core reads directly > from the swap disk's block device. Why would that matter? Resume is handled by EH. While EH is in progress, all commands are blocked. Am I missing something? > > So, can't just everything become async? Are there cases where we > > *need* synchronous PM behaviors? > > I think suspend still needs to be synchronous, because the PM core needs > to be sure that the disks are actually spun down before it can attempt > to shut the system down. I'm adding Raphael to the thread. Raphael, is > this correct? Yeah, we definitely should wait for suspend to complete before entering suspend state. I was referring to the resume path. Thanks.
On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt <todd.e.brandt@linux.intel.com> wrote: > On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote: >> On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote: >> > -----BEGIN PGP SIGNED MESSAGE----- >> > Hash: SHA512 >> > >> > On 01/10/2014 06:11 PM, Brandt, Todd E wrote: >> >> Yes yours is simpler, but it also opens a potential memory issue >> >> by passing a static int as the return location for the error value. >> >> I think it's just safer to tell the callback to attempt no return >> >> value at all, and for that you need to expand it into two >> >> arguments, one for selection, the other for the output address. >> > >> > What sort of memory issue? Also isn't there a system NULL page >> > somewhere that could be used? >> > >> >> I think the static variable is ok. We can be sure that all eh threads >> are torn down before libata.ko is unloaded. > > Actually there's one other reason. In the ata_port_request_pm function it > checks to see if there's a previous resume operation pending, and if so > it calls ata_port_wait_eh in order to wait for it to complete before > issuing the new suspend. If you just use the (int*)async parameter it > will return immediately and defer to the caller to try again, like is does > with SAS. But in our case we *don't* try again, so it would result in the > resume being skipped. There needs to be a new case where the caller wants > the call to be asynchronous, and it wants ata_port_request_pm to do its > own waiting, but doesn't care about the return value. Thus the additional > parameter. I think that is specifically for the libata case of a suspend request arriving while an async resume is still in flight. Given libata suspends are synchronous I do not think we have the reverse problem of resume requests arriving while a suspend is in flight. However, it might be worth a WARN_ON_ONCE() to document that assumption. In the libsas case suspends are asynchronous, but they are flushed by libsas before any resumes are processed, so there should not be conflicts. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 13, 2014 at 03:30:07PM -0500, Tejun Heo wrote: > Hello, > > On Mon, Jan 13, 2014 at 11:55:44AM -0800, Todd E Brandt wrote: > > I see your point, why have two paths if one will do. The only thing that > > worries me is that the PM resume from hibernate function doesn't have > > an error handler. What happens when it tries to read the image from swap > > and the disk is still spinning up? The scsi layer has an error handler so > > The request gets blocked on EH. > > > it just keeps retrying every few seconds, but the PM core reads directly > > from the swap disk's block device. > > Why would that matter? Resume is handled by EH. While EH is in > progress, all commands are blocked. Am I missing something? No, I am. You're right, there shouldn't be an issue. I'm just sketchy about new changes that I haven't tested. Lemme code up the change and do some stress testing to be sure there are no issues. > > > > So, can't just everything become async? Are there cases where we > > > *need* synchronous PM behaviors? > > > > I think suspend still needs to be synchronous, because the PM core needs > > to be sure that the disks are actually spun down before it can attempt > > to shut the system down. I'm adding Raphael to the thread. Raphael, is > > this correct? > > Yeah, we definitely should wait for suspend to complete before > entering suspend state. I was referring to the resume path. Ahh, sorry, yea I think async should work for the entire resume pathway. Would you be willing to accept this ata patch separately from the scsi one? It wouldn't provide any performance benefit on its own, but would pave the way for performance benefit by removing the ATA resume lag time (Which would then put the ball in the scsi layer's court to finish the optimization). I've tested both patches separately and both work without issue. Also, this patch appears to be a pre-requisite to both mine and Phillip Susi's solution, so it would really hope move things along regardless of which direction is ultimately chosen in the SCSI layer. > > Thanks. > > -- > tejun -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 13, 2014 at 12:37:01PM -0800, Dan Williams wrote: > On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt > <todd.e.brandt@linux.intel.com> wrote: > > On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote: > >> On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote: > >> > -----BEGIN PGP SIGNED MESSAGE----- > >> > Hash: SHA512 > >> > > >> > On 01/10/2014 06:11 PM, Brandt, Todd E wrote: > >> >> Yes yours is simpler, but it also opens a potential memory issue > >> >> by passing a static int as the return location for the error value. > >> >> I think it's just safer to tell the callback to attempt no return > >> >> value at all, and for that you need to expand it into two > >> >> arguments, one for selection, the other for the output address. > >> > > >> > What sort of memory issue? Also isn't there a system NULL page > >> > somewhere that could be used? > >> > > >> > >> I think the static variable is ok. We can be sure that all eh threads > >> are torn down before libata.ko is unloaded. > > > > Actually there's one other reason. In the ata_port_request_pm function it > > checks to see if there's a previous resume operation pending, and if so > > it calls ata_port_wait_eh in order to wait for it to complete before > > issuing the new suspend. If you just use the (int*)async parameter it > > will return immediately and defer to the caller to try again, like is does > > with SAS. But in our case we *don't* try again, so it would result in the > > resume being skipped. There needs to be a new case where the caller wants > > the call to be asynchronous, and it wants ata_port_request_pm to do its > > own waiting, but doesn't care about the return value. Thus the additional > > parameter. > > I think that is specifically for the libata case of a suspend request > arriving while an async resume is still in flight. Given libata Accoring to the comments it's for a previous resume, not a previous suspend. /* Previous resume operation might still be in * progress. Wait for PM_PENDING to clear. */ if (ap->pflags & ATA_PFLAG_PM_PENDING) { if (async) { *async = -EAGAIN; return 0; } ata_port_wait_eh(ap); WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING); } I'm going to assume that it was added for a reason, so I've structured my patch in such a way that it doesn't alter the existing logic. Removing that particular check would be a completely different discussion and is out of the scope of this patch. > suspends are synchronous I do not think we have the reverse problem of > resume requests arriving while a suspend is in flight. However, it > might be worth a WARN_ON_ONCE() to document that assumption. > > In the libsas case suspends are asynchronous, but they are flushed by > libsas before any resumes are processed, so there should not be > conflicts. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 13, 2014 at 3:51 PM, Todd E Brandt <todd.e.brandt@linux.intel.com> wrote: > On Mon, Jan 13, 2014 at 12:37:01PM -0800, Dan Williams wrote: >> On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt >> > Actually there's one other reason. In the ata_port_request_pm function it >> > checks to see if there's a previous resume operation pending, and if so >> > it calls ata_port_wait_eh in order to wait for it to complete before >> > issuing the new suspend. If you just use the (int*)async parameter it >> > will return immediately and defer to the caller to try again, like is does >> > with SAS. But in our case we *don't* try again, so it would result in the >> > resume being skipped. There needs to be a new case where the caller wants >> > the call to be asynchronous, and it wants ata_port_request_pm to do its >> > own waiting, but doesn't care about the return value. Thus the additional >> > parameter. >> >> I think that is specifically for the libata case of a suspend request >> arriving while an async resume is still in flight. Given libata > > Accoring to the comments it's for a previous resume, not a previous suspend. > > /* Previous resume operation might still be in > * progress. Wait for PM_PENDING to clear. > */ > if (ap->pflags & ATA_PFLAG_PM_PENDING) { > if (async) { > *async = -EAGAIN; > return 0; > } > ata_port_wait_eh(ap); > WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING); > } Right, but that should only happen when ata_port_request_pm() is called for 'suspend' and 'resume' is in flight. I think it's a core device-power-management bug if ->resume() is called twice without an intervening ->suspend(). > I'm going to assume that it was added for a reason, so I've structured my > patch in such a way that it doesn't alter the existing logic. Removing that > particular check would be a completely different discussion and is out of > the scope of this patch. You're right that logic should stay, I don't think removing it was ever proposed? -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, Todd. On Mon, Jan 13, 2014 at 03:30:26PM -0800, Todd E Brandt wrote: > Ahh, sorry, yea I think async should work for the entire resume pathway. Would > you be willing to accept this ata patch separately from the scsi one? It > wouldn't provide any performance benefit on its own, but would pave the way > for performance benefit by removing the ATA resume lag time (Which would then > put the ball in the scsi layer's court to finish the optimization). I've > tested both patches separately and both work without issue. Also, this patch > appears to be a pre-requisite to both mine and Phillip Susi's solution, so > it would really hope move things along regardless of which direction is > ultimately chosen in the SCSI layer. Let's do it together. It's not like this involves huge number of patches which would be helped by staging things. Thanks.
This patch reduces S3 resume time from 10+ seconds to less than a second on systems with SATA drives. It does this by making ata port and scsi disk resume non-blocking. This is another resend of the patch I send out in early December. I've addressed all the feedback I received from list members, so all I really need is a yay or nay from Tejun Huo and James Bottomley (since it touches both subsystems). The notes below include the performance results and patch description. [Computer One Test - 10.5X speedup on S3 resume] Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7 Cpu: Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz Ahci: Intel C600/X79 series chipset 6-Port SATA AHCI (r5) Disk config (using all 6 ahci ports): 240 GB SSD, 3 TB HD, 500 GB HD, DVD-ROM (with cd inserted), 2 TB HD, 1 TB HD Resume time: [unpatched 11656ms] [patched 1110ms] [Computer Two Test - 12X speedup on S3 resume] Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7 Cpu: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz Ahci: Intel 7 Series/C210 Series Chipset Family 6-port SATA [AHCI mode] (r4) Disk config (using 1 ahci port): 320 GB Hard Disk Resume time: [unpatched 5416 ms] [patched 448 ms] [Computer Three Test - 7.8X speedup on S3 resume] Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7 Cpu: Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz Ahci: Intel Lynx Point 6-port SATA Controller 1 [AHCI mode] (r2) Disk config (using 2 ahci ports): DVD-ROM (empty), 500 GB Hard Disk Resume time: [unpatched 5385ms] [patched 688ms] The essential issue behind hard disks' lengthy resume time is the ata port driver blocking until the ATA port hardware is finished coming online. So the kernel isn't really doing anything during all those seconds that the disks are resuming, it's just blocking until the hardware says it's ready to accept commands. Applying this patch set allows SATA disks to resume asynchronously without holding up system resume, thus allowing the UI to come online much more quickly. There may be a short period after resume where the disks are still spinning up in the background, but it will only momentarily affect applications using that disk. The patch set has two parts which apply to ata_port_resume and sd_resume respectively. Both are required to achieve any real performance benefit, but they will still function independantly without a performance hit. ata_port_resume patch (1/2): sd_resume patch (2/2): v3: Jan 14, 2014 [in response to Tejun Huo] - respun the ata patch to merge the entire code path into async resume i.e. resume, restore, and thaw. - re-tested the changes in hibernate to verify there are no issues - respun the sd patch to apply clean over 3.13 rc8, no other changes v2: [in response to Oliver Neukum] - Added scsi cmd error reporting through the scsi_sense_hdr [in response to James Bottomley] - unified the sd_resume code path by just making sd_resume async. Thus all three resume callbacks: resume, restore, and runtime-resume will use it, but there is a performance benefit for resume only. [in response to Bartlomiej Zolnierkiewicz] - unified the __ata_port_resume_common code path to include an async parameter. This way there's no need to re-implement ata_port_request_pm. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Please start a new thread when you're posting a new version of the whole series && don't use the same patch title for the patches. They do different things. They need different names. Thanks.
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 75b9367..4819b93 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -5312,7 +5312,7 @@ bool ata_link_offline(struct ata_link *link) #ifdef CONFIG_PM static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg, unsigned int action, unsigned int ehi_flags, - int *async) - int *async) + bool async, int *async_result) { struct ata_link *link; unsigned long flags; @@ -5322,8 +5322,8 @@ static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg, * progress. Wait for PM_PENDING to clear. */ if (ap->pflags & ATA_PFLAG_PM_PENDING) { - if (async) { - *async = -EAGAIN; + if (async && async_result) { + *async_result = -EAGAIN; return 0; } ata_port_wait_eh(ap); @@ -5335,7 +5335,7 @@ static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg, ap->pm_mesg = mesg; if (async) - ap->pm_result = async; + ap->pm_result = async_result; else ap->pm_result = &rc; @@ -5358,7 +5358,8 @@ static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg, return rc; } -static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg, int *async) +static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg, + bool async, int *async_result) { /* * On some hardware, device fails to respond after spun down @@ -5370,14 +5371,14 @@ static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg, int */ unsigned int ehi_flags = ATA_EHI_QUIET | ATA_EHI_NO_AUTOPSY | ATA_EHI_NO_RECOVERY; - return ata_port_request_pm(ap, mesg, 0, ehi_flags, async); + return ata_port_request_pm(ap, mesg, 0, ehi_flags, async, async_result); } static int ata_port_suspend_common(struct device *dev, pm_message_t mesg) { struct ata_port *ap = to_ata_port(dev); - return __ata_port_suspend_common(ap, mesg, NULL); + return __ata_port_suspend_common(ap, mesg, false, NULL); } static int ata_port_suspend(struct device *dev) @@ -5402,27 +5403,42 @@ static int ata_port_poweroff(struct device *dev) } static int __ata_port_resume_common(struct ata_port *ap, pm_message_t mesg, - int *async) + bool async, int *async_result) { int rc; rc = ata_port_request_pm(ap, mesg, ATA_EH_RESET, - ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, async); + ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, async, async_result); return rc; } -static int ata_port_resume_common(struct device *dev, pm_message_t mesg) +static int ata_port_resume_common(struct device *dev, pm_message_t mesg, + bool async) { struct ata_port *ap = to_ata_port(dev); - return __ata_port_resume_common(ap, mesg, NULL); + return __ata_port_resume_common(ap, mesg, async, NULL); +} + +static int ata_port_resume_async(struct device *dev) +{ + int rc; + + rc = ata_port_resume_common(dev, PMSG_RESUME, true); + if (!rc) { + pm_runtime_disable(dev); + pm_runtime_set_active(dev); + pm_runtime_enable(dev); + } + + return rc; } static int ata_port_resume(struct device *dev) { int rc; - rc = ata_port_resume_common(dev, PMSG_RESUME); + rc = ata_port_resume_common(dev, PMSG_RESUME, false); if (!rc) { pm_runtime_disable(dev); pm_runtime_set_active(dev); @@ -5463,12 +5479,12 @@ static int ata_port_runtime_suspend(struct device *dev) static int ata_port_runtime_resume(struct device *dev) { - return ata_port_resume_common(dev, PMSG_AUTO_RESUME); + return ata_port_resume_common(dev, PMSG_AUTO_RESUME, false); } static const struct dev_pm_ops ata_port_pm_ops = { .suspend = ata_port_suspend, - .resume = ata_port_resume, + .resume = ata_port_resume_async, .freeze = ata_port_do_freeze, .thaw = ata_port_resume, .poweroff = ata_port_poweroff, @@ -5486,13 +5502,13 @@ static const struct dev_pm_ops ata_port_pm_ops = { */ int ata_sas_port_async_suspend(struct ata_port *ap, int *async) { - return __ata_port_suspend_common(ap, PMSG_SUSPEND, async); + return __ata_port_suspend_common(ap, PMSG_SUSPEND, true, async); } EXPORT_SYMBOL_GPL(ata_sas_port_async_suspend); int ata_sas_port_async_resume(struct ata_port *ap, int *async) { - return __ata_port_resume_common(ap, PMSG_RESUME, async); + return __ata_port_resume_common(ap, PMSG_RESUME, true, async); } EXPORT_SYMBOL_GPL(ata_sas_port_async_resume); -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in