diff mbox

[PATCH/RESEND,v2,1/2] Hard disk S3 resume time optimization

Message ID 20140108005607.GB29556@linux.intel.com
State Not Applicable
Delegated to: David Miller
Headers show

Commit Message

Todd Brandt Jan. 8, 2014, 12:56 a.m. UTC
On resume, the ATA port driver currently waits until the AHCI controller
finishes executing the port wakeup command. This patch changes the
ata_port_resume callback to issue the wakeup and then return immediately,
thus allowing the next device in the pm queue to resume. Any commands
issued to the AHCI hardware during the wakeup will be queued up and
executed once the port is physically online. Thus no information is lost.

This patch only changes the behavior of the resume callback, not restore,
thaw, or runtime-resume. This is because thaw and restore are used after a
suspend-to-disk, which means that an image needs to be read from swap and
reloaded into RAM. The swap disk will always need to be fully restored/thawed
in order for resume to continue.

The return value from ata_resume_async is now an indicator of whether 
the resume was initiated, rather than if it was completed. I'm letting the ata
driver assume control over its own error reporting in this case (which it does
already). If you look at the ata_port resume code you'll see that the
ata_port_resume callback returns the status of the ahci_port_resume callback,
which is always 0. So I don't see any harm in ignoring it.

If somebody requests suspend while ATA port resume is still running, the
request is queued until the resume has completed. I've tested
that case very heavily. Basically if you do two suspends in a row (within
seconds of each other) you lose any performance benefit, but that's a use
case that should happen only very rarerly (and wouldn't be expected to
be of any power benefit since too many sequential suspends actually
takes more power than just letting the machine stay in run mode).

Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

 drivers/ata/libata-core.c | 48 ++++++++++++++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 16 deletions(-)

the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Phillip Susi Jan. 9, 2014, 5:03 p.m. UTC | #1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/7/2014 7:56 PM, Todd E Brandt wrote:
> On resume, the ATA port driver currently waits until the AHCI
> controller finishes executing the port wakeup command. This patch
> changes the ata_port_resume callback to issue the wakeup and then
> return immediately, thus allowing the next device in the pm queue
> to resume. Any commands issued to the AHCI hardware during the
> wakeup will be queued up and executed once the port is physically
> online. Thus no information is lost.
> 
> This patch only changes the behavior of the resume callback, not
> restore, thaw, or runtime-resume. This is because thaw and restore
> are used after a suspend-to-disk, which means that an image needs
> to be read from swap and reloaded into RAM. The swap disk will
> always need to be fully restored/thawed in order for resume to
> continue.
> 
> The return value from ata_resume_async is now an indicator of
> whether the resume was initiated, rather than if it was completed.
> I'm letting the ata driver assume control over its own error
> reporting in this case (which it does already). If you look at the
> ata_port resume code you'll see that the ata_port_resume callback
> returns the status of the ahci_port_resume callback, which is
> always 0. So I don't see any harm in ignoring it.
> 
> If somebody requests suspend while ATA port resume is still
> running, the request is queued until the resume has completed. I've
> tested that case very heavily. Basically if you do two suspends in
> a row (within seconds of each other) you lose any performance
> benefit, but that's a use case that should happen only very rarerly
> (and wouldn't be expected to be of any power benefit since too many
> sequential suspends actually takes more power than just letting the
> machine stay in run mode).

I think my patch for this "libata: resume in the background" was a LOT
simpler.  It just used the existing behavior of becoming async when
the async argument was !NULL, and fixed the one caller that didn't
actually care about the result to pass in a pointer anyhow.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSztZ+AAoJEI5FoCIzSKrwQ6QH/jA1kIDsDC5LBQDGtDjZ/cvB
hLddAkDmqswfMXDmtD+u5lQuc+a0GbjINvrMYtq6bilDuPO39ccR//jUL69PULDo
qcn3RMjzSIIqyfCiKj6iAIZyuTTR9SLgA9F9V2b0RBPn1FFD84y+qEitzrqb+/z6
pBjVYbbjfpj+Sfr00dbk7a6BTzB2FHTSEwA/Kv5gM2MJKGPrjmML8awl3gSG+lGT
mOYV8yR+kycrnkp8Rcr+7hNb6LfJlKWBh33UA4TKxrE5WZg0nX+kU83BoCpRgvCm
nfYynvakJ1E0Wz2fiYQq1NeNy1ZUZ7iyDdsl70KslaAhPFBQUH/tsoXL1syHSRo=
=rXVR
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Brandt, Todd E Jan. 10, 2014, 11:11 p.m. UTC | #2
On Thursday, January 09, 2014 9:04 AM, Phillip Susi wrote:
> I think my patch for this "libata: resume in the background" was a LOT
> simpler.  It just used the existing behavior of becoming async when
> the async argument was !NULL, and fixed the one caller that didn't
> actually care about the result to pass in a pointer anyhow.

Yes yours is simpler, but it also opens a potential memory issue by passing a static int as the return location for the error value. I think it's just safer to tell the callback to attempt no return value at all, and for that you need to expand it into two arguments, one for selection, the other for the output address.--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Phillip Susi Jan. 11, 2014, 2:13 a.m. UTC | #3
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 01/10/2014 06:11 PM, Brandt, Todd E wrote:
> Yes yours is simpler, but it also opens a potential memory issue
> by passing a static int as the return location for the error value.
> I think it's just safer to tell the callback to attempt no return
> value at all, and for that you need to expand it into two
> arguments, one for selection, the other for the output address.

What sort of memory issue?  Also isn't there a system NULL page
somewhere that could be used?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBCgAGBQJS0Ki0AAoJEI5FoCIzSKrwzAUH/3PXkCbb1zW6ObzBVOS8I4vM
SjeE9unHqTlFbFL8M/sTWV+LJTKnIpEBEfV7QXFENDKPi+o58Aq1PanvW0Pu/R5+
/hOR08ry7HMRqK+Q/J+CynjoMQO3Qfw0elIVQUkSsayqdeA2QeLMDbWLxQhh9j85
o5AD7VsW1X8M7l1yzCUvl5A68M/ZR9dapW3RlErRmTdECZE5qLTN2RKezfboM5lo
i4y8IWadF7tou4vbqoGQd/JVgmgc4SRlal7HAqbVWuwmumslMGGBNpZTcgY7ffvT
PLHnx8txmeK2ELVg/aRm34j2TdI8mAYzIQ2pJ7hKwRlBc5W8ho4XHEy9o1jDaLg=
=MCaR
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams Jan. 11, 2014, 3:13 a.m. UTC | #4
On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On 01/10/2014 06:11 PM, Brandt, Todd E wrote:
>> Yes yours is simpler, but it also opens a potential memory issue
>> by passing a static int as the return location for the error value.
>> I think it's just safer to tell the callback to attempt no return
>> value at all, and for that you need to expand it into two
>> arguments, one for selection, the other for the output address.
>
> What sort of memory issue?  Also isn't there a system NULL page
> somewhere that could be used?
>

I think the static variable is ok.  We can be sure that all eh threads
are torn down before libata.ko is unloaded.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Jan. 11, 2014, 7:13 p.m. UTC | #5
Hello,

On Tue, Jan 07, 2014 at 04:56:07PM -0800, Todd E Brandt wrote:
> On resume, the ATA port driver currently waits until the AHCI controller
> finishes executing the port wakeup command. This patch changes the

Is there anything ahci specific about this?  There shouldn't be.

> This patch only changes the behavior of the resume callback, not restore,
> thaw, or runtime-resume. This is because thaw and restore are used after a
> suspend-to-disk, which means that an image needs to be read from swap and
> reloaded into RAM. The swap disk will always need to be fully restored/thawed
> in order for resume to continue.

If the system has multiple devices, wouldn't that still reduce the
latency?  Do we really need to deviate behavior among different
resume/unfreeze paths?

> The return value from ata_resume_async is now an indicator of whether 
> the resume was initiated, rather than if it was completed. I'm letting the ata
> driver assume control over its own error reporting in this case (which it does
> already). If you look at the ata_port resume code you'll see that the
> ata_port_resume callback returns the status of the ahci_port_resume callback,
> which is always 0. So I don't see any harm in ignoring it.

I've been always kinda doubtful about the usefulness of resume return
code.  It's not like there's anything resume path can do differently
upon being notified that resume of a device failed.  Just reporting
success and deal with error conditions as usual should work fine,
right?

Well, in fact, I don't think even the return code of suspend is
useful.  Failing suspend is most likely wrong.  The only action PM
core can take is aborting suspend and AFAICS none of the libata
drivers has suspend failure conditions whose recoverability is made
worse by just proceeding with suspend.  The hardware is recycled after
all anyway.  The device is inoperational anyway, so aborting suspend
doens't buy us anything other than failing suspend, which is almost
always wrong.  So, I'd actually prefer just removing all returns from
suspend/resume operations, but yeah this might be out of scope for
this series.

> +static int ata_port_resume_async(struct device *dev)
> +{
> +       int rc;
> +
> +       rc = ata_port_resume_common(dev, PMSG_RESUME, true);
> +       if (!rc) {
> +               pm_runtime_disable(dev);
> +               pm_runtime_set_active(dev);
> +               pm_runtime_enable(dev);
> +       }
> +
> +       return rc;
>  }

So, can't just everything become async?  Are there cases where we
*need* synchronous PM behaviors?

Thanks.
Todd Brandt Jan. 13, 2014, 7:55 p.m. UTC | #6
On Sat, Jan 11, 2014 at 02:13:15PM -0500, Tejun Heo wrote:
> Hello,
> 
> On Tue, Jan 07, 2014 at 04:56:07PM -0800, Todd E Brandt wrote:
> > On resume, the ATA port driver currently waits until the AHCI controller
> > finishes executing the port wakeup command. This patch changes the
> 
> Is there anything ahci specific about this?  There shouldn't be.
> 
> > This patch only changes the behavior of the resume callback, not restore,
> > thaw, or runtime-resume. This is because thaw and restore are used after a
> > suspend-to-disk, which means that an image needs to be read from swap and
> > reloaded into RAM. The swap disk will always need to be fully restored/thawed
> > in order for resume to continue.
> 
> If the system has multiple devices, wouldn't that still reduce the
> latency?  Do we really need to deviate behavior among different
> resume/unfreeze paths?

I see your point, why have two paths if one will do. The only thing that
worries me is that the PM resume from hibernate function doesn't have
an error handler. What happens when it tries to read the image from swap
and the disk is still spinning up? The scsi layer has an error handler so
it just keeps retrying every few seconds, but the PM core reads directly
from the swap disk's block device.

> 
> > The return value from ata_resume_async is now an indicator of whether 
> > the resume was initiated, rather than if it was completed. I'm letting the ata
> > driver assume control over its own error reporting in this case (which it does
> > already). If you look at the ata_port resume code you'll see that the
> > ata_port_resume callback returns the status of the ahci_port_resume callback,
> > which is always 0. So I don't see any harm in ignoring it.
> 
> I've been always kinda doubtful about the usefulness of resume return
> code.  It's not like there's anything resume path can do differently
> upon being notified that resume of a device failed.  Just reporting
> success and deal with error conditions as usual should work fine,
> right?
> 
> Well, in fact, I don't think even the return code of suspend is
> useful.  Failing suspend is most likely wrong.  The only action PM
> core can take is aborting suspend and AFAICS none of the libata
> drivers has suspend failure conditions whose recoverability is made
> worse by just proceeding with suspend.  The hardware is recycled after
> all anyway.  The device is inoperational anyway, so aborting suspend
> doens't buy us anything other than failing suspend, which is almost
> always wrong.  So, I'd actually prefer just removing all returns from
> suspend/resume operations, but yeah this might be out of scope for
> this series.

I think the only potential value of the return is to tell the PM core that a
suspend or resume can't be initiated at all. But since that case never
actually comes up in the ata code, it would appear to be essentially useless.

> 
> > +static int ata_port_resume_async(struct device *dev)
> > +{
> > +       int rc;
> > +
> > +       rc = ata_port_resume_common(dev, PMSG_RESUME, true);
> > +       if (!rc) {
> > +               pm_runtime_disable(dev);
> > +               pm_runtime_set_active(dev);
> > +               pm_runtime_enable(dev);
> > +       }
> > +
> > +       return rc;
> >  }
> 
> So, can't just everything become async?  Are there cases where we
> *need* synchronous PM behaviors?

I think suspend still needs to be synchronous, because the PM core needs
to be sure that the disks are actually spun down before it can attempt
to shut the system down. I'm adding Raphael to the thread. Raphael, is 
this correct?

> 
> Thanks.
> 
> -- 
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Todd Brandt Jan. 13, 2014, 8:06 p.m. UTC | #7
On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote:
> On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA512
> >
> > On 01/10/2014 06:11 PM, Brandt, Todd E wrote:
> >> Yes yours is simpler, but it also opens a potential memory issue
> >> by passing a static int as the return location for the error value.
> >> I think it's just safer to tell the callback to attempt no return
> >> value at all, and for that you need to expand it into two
> >> arguments, one for selection, the other for the output address.
> >
> > What sort of memory issue?  Also isn't there a system NULL page
> > somewhere that could be used?
> >
> 
> I think the static variable is ok.  We can be sure that all eh threads
> are torn down before libata.ko is unloaded.

Actually there's one other reason. In the ata_port_request_pm function it
checks to see if there's a previous resume operation pending, and if so
it calls ata_port_wait_eh in order to wait for it to complete before 
issuing the new suspend. If you just use the (int*)async parameter it
will return immediately and defer to the caller to try again, like is does
with SAS. But in our case we *don't* try again, so it would result in the
resume being skipped. There needs to be a new case where the caller wants 
the call to be asynchronous, and it wants ata_port_request_pm to do its 
own waiting, but doesn't care about the return value. Thus the additional
parameter.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Jan. 13, 2014, 8:30 p.m. UTC | #8
Hello,

On Mon, Jan 13, 2014 at 11:55:44AM -0800, Todd E Brandt wrote:
> I see your point, why have two paths if one will do. The only thing that
> worries me is that the PM resume from hibernate function doesn't have
> an error handler. What happens when it tries to read the image from swap
> and the disk is still spinning up? The scsi layer has an error handler so

The request gets blocked on EH.

> it just keeps retrying every few seconds, but the PM core reads directly
> from the swap disk's block device.

Why would that matter?  Resume is handled by EH.  While EH is in
progress, all commands are blocked.  Am I missing something?

> > So, can't just everything become async?  Are there cases where we
> > *need* synchronous PM behaviors?
> 
> I think suspend still needs to be synchronous, because the PM core needs
> to be sure that the disks are actually spun down before it can attempt
> to shut the system down. I'm adding Raphael to the thread. Raphael, is 
> this correct?

Yeah, we definitely should wait for suspend to complete before
entering suspend state.  I was referring to the resume path.

Thanks.
Dan Williams Jan. 13, 2014, 8:37 p.m. UTC | #9
On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt
<todd.e.brandt@linux.intel.com> wrote:
> On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote:
>> On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote:
>> > -----BEGIN PGP SIGNED MESSAGE-----
>> > Hash: SHA512
>> >
>> > On 01/10/2014 06:11 PM, Brandt, Todd E wrote:
>> >> Yes yours is simpler, but it also opens a potential memory issue
>> >> by passing a static int as the return location for the error value.
>> >> I think it's just safer to tell the callback to attempt no return
>> >> value at all, and for that you need to expand it into two
>> >> arguments, one for selection, the other for the output address.
>> >
>> > What sort of memory issue?  Also isn't there a system NULL page
>> > somewhere that could be used?
>> >
>>
>> I think the static variable is ok.  We can be sure that all eh threads
>> are torn down before libata.ko is unloaded.
>
> Actually there's one other reason. In the ata_port_request_pm function it
> checks to see if there's a previous resume operation pending, and if so
> it calls ata_port_wait_eh in order to wait for it to complete before
> issuing the new suspend. If you just use the (int*)async parameter it
> will return immediately and defer to the caller to try again, like is does
> with SAS. But in our case we *don't* try again, so it would result in the
> resume being skipped. There needs to be a new case where the caller wants
> the call to be asynchronous, and it wants ata_port_request_pm to do its
> own waiting, but doesn't care about the return value. Thus the additional
> parameter.

I think that is specifically for the libata case of a suspend request
arriving while an async resume is still in flight.  Given libata
suspends are synchronous I do not think we have the reverse problem of
resume requests arriving while a suspend is in flight.  However, it
might be worth a WARN_ON_ONCE() to document that assumption.

In the libsas case suspends are asynchronous, but they are flushed by
libsas before any resumes are processed, so there should not be
conflicts.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Todd Brandt Jan. 13, 2014, 11:30 p.m. UTC | #10
On Mon, Jan 13, 2014 at 03:30:07PM -0500, Tejun Heo wrote:
> Hello,
> 
> On Mon, Jan 13, 2014 at 11:55:44AM -0800, Todd E Brandt wrote:
> > I see your point, why have two paths if one will do. The only thing that
> > worries me is that the PM resume from hibernate function doesn't have
> > an error handler. What happens when it tries to read the image from swap
> > and the disk is still spinning up? The scsi layer has an error handler so
> 
> The request gets blocked on EH.
> 
> > it just keeps retrying every few seconds, but the PM core reads directly
> > from the swap disk's block device.
> 
> Why would that matter?  Resume is handled by EH.  While EH is in
> progress, all commands are blocked.  Am I missing something?

No, I am. You're right, there shouldn't be an issue. I'm just sketchy about
new changes that I haven't tested. Lemme code up the change and do some
stress testing to be sure there are no issues.

> 
> > > So, can't just everything become async?  Are there cases where we
> > > *need* synchronous PM behaviors?
> > 
> > I think suspend still needs to be synchronous, because the PM core needs
> > to be sure that the disks are actually spun down before it can attempt
> > to shut the system down. I'm adding Raphael to the thread. Raphael, is 
> > this correct?
> 
> Yeah, we definitely should wait for suspend to complete before
> entering suspend state.  I was referring to the resume path.

Ahh, sorry, yea I think async should work for the entire resume pathway. Would
you be willing to accept this ata patch separately from the scsi one? It
wouldn't provide any performance benefit on its own, but would pave the way
for performance benefit by removing the ATA resume lag time (Which would then
put the ball in the scsi layer's court to finish the optimization). I've 
tested both patches separately and both work without issue. Also, this patch
appears to be a pre-requisite to both mine and Phillip Susi's solution, so 
it would really hope move things along regardless of which direction is
ultimately chosen in the SCSI layer.

> 
> Thanks.
> 
> -- 
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Todd Brandt Jan. 13, 2014, 11:51 p.m. UTC | #11
On Mon, Jan 13, 2014 at 12:37:01PM -0800, Dan Williams wrote:
> On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt
> <todd.e.brandt@linux.intel.com> wrote:
> > On Fri, Jan 10, 2014 at 07:13:11PM -0800, Dan Williams wrote:
> >> On Fri, Jan 10, 2014 at 6:13 PM, Phillip Susi <psusi@ubuntu.com> wrote:
> >> > -----BEGIN PGP SIGNED MESSAGE-----
> >> > Hash: SHA512
> >> >
> >> > On 01/10/2014 06:11 PM, Brandt, Todd E wrote:
> >> >> Yes yours is simpler, but it also opens a potential memory issue
> >> >> by passing a static int as the return location for the error value.
> >> >> I think it's just safer to tell the callback to attempt no return
> >> >> value at all, and for that you need to expand it into two
> >> >> arguments, one for selection, the other for the output address.
> >> >
> >> > What sort of memory issue?  Also isn't there a system NULL page
> >> > somewhere that could be used?
> >> >
> >>
> >> I think the static variable is ok.  We can be sure that all eh threads
> >> are torn down before libata.ko is unloaded.
> >
> > Actually there's one other reason. In the ata_port_request_pm function it
> > checks to see if there's a previous resume operation pending, and if so
> > it calls ata_port_wait_eh in order to wait for it to complete before
> > issuing the new suspend. If you just use the (int*)async parameter it
> > will return immediately and defer to the caller to try again, like is does
> > with SAS. But in our case we *don't* try again, so it would result in the
> > resume being skipped. There needs to be a new case where the caller wants
> > the call to be asynchronous, and it wants ata_port_request_pm to do its
> > own waiting, but doesn't care about the return value. Thus the additional
> > parameter.
> 
> I think that is specifically for the libata case of a suspend request
> arriving while an async resume is still in flight.  Given libata

Accoring to the comments it's for a previous resume, not a previous suspend.

        /* Previous resume operation might still be in
         * progress.  Wait for PM_PENDING to clear.
         */
        if (ap->pflags & ATA_PFLAG_PM_PENDING) {
                if (async) {
                        *async = -EAGAIN;
                        return 0;
                }
                ata_port_wait_eh(ap);
                WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
        }

I'm going to assume that it was added for a reason, so I've structured my
patch in such a way that it doesn't alter the existing logic. Removing that
particular check would be a completely different discussion and is out of
the scope of this patch.

> suspends are synchronous I do not think we have the reverse problem of
> resume requests arriving while a suspend is in flight.  However, it
> might be worth a WARN_ON_ONCE() to document that assumption.
> 
> In the libsas case suspends are asynchronous, but they are flushed by
> libsas before any resumes are processed, so there should not be
> conflicts.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dan Williams Jan. 14, 2014, 12:05 a.m. UTC | #12
On Mon, Jan 13, 2014 at 3:51 PM, Todd E Brandt
<todd.e.brandt@linux.intel.com> wrote:
> On Mon, Jan 13, 2014 at 12:37:01PM -0800, Dan Williams wrote:
>> On Mon, Jan 13, 2014 at 12:06 PM, Todd E Brandt
>> > Actually there's one other reason. In the ata_port_request_pm function it
>> > checks to see if there's a previous resume operation pending, and if so
>> > it calls ata_port_wait_eh in order to wait for it to complete before
>> > issuing the new suspend. If you just use the (int*)async parameter it
>> > will return immediately and defer to the caller to try again, like is does
>> > with SAS. But in our case we *don't* try again, so it would result in the
>> > resume being skipped. There needs to be a new case where the caller wants
>> > the call to be asynchronous, and it wants ata_port_request_pm to do its
>> > own waiting, but doesn't care about the return value. Thus the additional
>> > parameter.
>>
>> I think that is specifically for the libata case of a suspend request
>> arriving while an async resume is still in flight.  Given libata
>
> Accoring to the comments it's for a previous resume, not a previous suspend.
>
>         /* Previous resume operation might still be in
>          * progress.  Wait for PM_PENDING to clear.
>          */
>         if (ap->pflags & ATA_PFLAG_PM_PENDING) {
>                 if (async) {
>                         *async = -EAGAIN;
>                         return 0;
>                 }
>                 ata_port_wait_eh(ap);
>                 WARN_ON(ap->pflags & ATA_PFLAG_PM_PENDING);
>         }

Right, but that should only happen when ata_port_request_pm() is
called for 'suspend' and 'resume' is in flight.  I think it's a core
device-power-management bug if ->resume() is called twice without an
intervening ->suspend().

> I'm going to assume that it was added for a reason, so I've structured my
> patch in such a way that it doesn't alter the existing logic. Removing that
> particular check would be a completely different discussion and is out of
> the scope of this patch.

You're right that logic should stay, I don't think removing it was
ever proposed?
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Jan. 14, 2014, 2:31 p.m. UTC | #13
Hello, Todd.

On Mon, Jan 13, 2014 at 03:30:26PM -0800, Todd E Brandt wrote:
> Ahh, sorry, yea I think async should work for the entire resume pathway. Would
> you be willing to accept this ata patch separately from the scsi one? It
> wouldn't provide any performance benefit on its own, but would pave the way
> for performance benefit by removing the ATA resume lag time (Which would then
> put the ball in the scsi layer's court to finish the optimization). I've 
> tested both patches separately and both work without issue. Also, this patch
> appears to be a pre-requisite to both mine and Phillip Susi's solution, so 
> it would really hope move things along regardless of which direction is
> ultimately chosen in the SCSI layer.

Let's do it together.  It's not like this involves huge number of
patches which would be helped by staging things.

Thanks.
Todd Brandt Jan. 15, 2014, 12:31 a.m. UTC | #14
This patch reduces S3 resume time from 10+ seconds to less than a second
on systems with SATA drives. It does this by making ata port and scsi disk
resume non-blocking.

This is another resend of the patch I send out in early December. I've
addressed all the feedback I received from list members, so all I really
need is a yay or nay from Tejun Huo and James Bottomley (since it touches
both subsystems).

The notes below include the performance results and patch description.

[Computer One Test - 10.5X speedup on S3 resume]
Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7
Cpu: Intel(R) Core(TM) i7-3960X CPU @ 3.30GHz
Ahci: Intel C600/X79 series chipset 6-Port SATA AHCI (r5)
Disk config (using all 6 ahci ports): 
240 GB SSD, 3 TB HD, 500 GB HD, DVD-ROM (with cd inserted), 2 TB HD, 1 TB HD
Resume time: [unpatched 11656ms] [patched 1110ms]

[Computer Two Test - 12X speedup on S3 resume]
Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7
Cpu: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Ahci: Intel 7 Series/C210 Series Chipset Family 6-port SATA [AHCI mode] (r4)
Disk config (using 1 ahci port): 320 GB Hard Disk
Resume time: [unpatched 5416 ms] [patched 448 ms]

[Computer Three Test - 7.8X speedup on S3 resume]
Platform: Ubuntu Raring Ringtail (13.04) kernel 3.11.0-rc7
Cpu: Intel(R) Core(TM) i7-4770S CPU @ 3.10GHz
Ahci: Intel Lynx Point 6-port SATA Controller 1 [AHCI mode] (r2)
Disk config (using 2 ahci ports): DVD-ROM (empty), 500 GB Hard Disk
Resume time: [unpatched 5385ms] [patched 688ms]

The essential issue behind hard disks' lengthy resume time is the ata port
driver blocking until the ATA port hardware is finished coming online. So
the kernel isn't really doing anything during all those seconds that the
disks are resuming, it's just blocking until the hardware says it's ready
to accept commands. Applying this patch set allows SATA disks to resume
asynchronously without holding up system resume, thus allowing the UI to
come online much more quickly. There may be a short period after resume
where the disks are still spinning up in the background, but it will only
momentarily affect applications using that disk.

The patch set has two parts which apply to ata_port_resume and sd_resume
respectively. Both are required to achieve any real performance benefit,
but they will still function independantly without a performance hit.

ata_port_resume patch (1/2):
sd_resume patch (2/2):

v3: Jan 14, 2014
 [in response to Tejun Huo]
 - respun the ata patch to merge the entire code path into async resume
   i.e. resume, restore, and thaw.
 - re-tested the changes in hibernate to verify there are no issues
 - respun the sd patch to apply clean over 3.13 rc8, no other changes
v2:
 [in response to Oliver Neukum]
 - Added scsi cmd error reporting through the scsi_sense_hdr
 [in response to James Bottomley]
 - unified the sd_resume code path by just making sd_resume async. Thus all
   three resume callbacks: resume, restore, and runtime-resume will use it,
   but there is a performance benefit for resume only.
 [in response to Bartlomiej Zolnierkiewicz]
 - unified the __ata_port_resume_common code path to include an async
   parameter. This way there's no need to re-implement ata_port_request_pm.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tejun Heo Jan. 15, 2014, 12:55 p.m. UTC | #15
Please start a new thread when you're posting a new version of the
whole series && don't use the same patch title for the patches.  They
do different things.  They need different names.

Thanks.
diff mbox

Patch

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 75b9367..4819b93 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5312,7 +5312,7 @@  bool ata_link_offline(struct ata_link *link)
 #ifdef CONFIG_PM
 static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
                               unsigned int action, unsigned int ehi_flags,
-                              int *async)
-                              int *async)
+                              bool async, int *async_result)
 {
        struct ata_link *link;
        unsigned long flags;
@@ -5322,8 +5322,8 @@  static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
         * progress.  Wait for PM_PENDING to clear.
         */
        if (ap->pflags & ATA_PFLAG_PM_PENDING) {
-               if (async) {
-                       *async = -EAGAIN;
+               if (async && async_result) {
+                       *async_result = -EAGAIN;
                        return 0;
                }
                ata_port_wait_eh(ap);
@@ -5335,7 +5335,7 @@  static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,

        ap->pm_mesg = mesg;
        if (async)
-               ap->pm_result = async;
+               ap->pm_result = async_result;
        else
                ap->pm_result = &rc;

@@ -5358,7 +5358,8 @@  static int ata_port_request_pm(struct ata_port *ap, pm_message_t mesg,
        return rc;
 }

-static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg, int *async)
+static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg,
+                                    bool async, int *async_result)
 {
        /*
         * On some hardware, device fails to respond after spun down
@@ -5370,14 +5371,14 @@  static int __ata_port_suspend_common(struct ata_port *ap, pm_message_t mesg, int
         */
        unsigned int ehi_flags = ATA_EHI_QUIET | ATA_EHI_NO_AUTOPSY |
                                 ATA_EHI_NO_RECOVERY;
-       return ata_port_request_pm(ap, mesg, 0, ehi_flags, async);
+       return ata_port_request_pm(ap, mesg, 0, ehi_flags, async, async_result);
 }

 static int ata_port_suspend_common(struct device *dev, pm_message_t mesg)
 {
        struct ata_port *ap = to_ata_port(dev);

-       return __ata_port_suspend_common(ap, mesg, NULL);
+       return __ata_port_suspend_common(ap, mesg, false, NULL);
 }

 static int ata_port_suspend(struct device *dev)
@@ -5402,27 +5403,42 @@  static int ata_port_poweroff(struct device *dev)
 }

 static int __ata_port_resume_common(struct ata_port *ap, pm_message_t mesg,
-                                   int *async)
+                                   bool async, int *async_result)
 {
        int rc;

        rc = ata_port_request_pm(ap, mesg, ATA_EH_RESET,
-               ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, async);
+               ATA_EHI_NO_AUTOPSY | ATA_EHI_QUIET, async, async_result);
        return rc;
 }

-static int ata_port_resume_common(struct device *dev, pm_message_t mesg)
+static int ata_port_resume_common(struct device *dev, pm_message_t mesg,
+                                 bool async)
 {
        struct ata_port *ap = to_ata_port(dev);

-       return __ata_port_resume_common(ap, mesg, NULL);
+       return __ata_port_resume_common(ap, mesg, async, NULL);
+}
+
+static int ata_port_resume_async(struct device *dev)
+{
+       int rc;
+
+       rc = ata_port_resume_common(dev, PMSG_RESUME, true);
+       if (!rc) {
+               pm_runtime_disable(dev);
+               pm_runtime_set_active(dev);
+               pm_runtime_enable(dev);
+       }
+
+       return rc;
 }

 static int ata_port_resume(struct device *dev)
 {
        int rc;

-       rc = ata_port_resume_common(dev, PMSG_RESUME);
+       rc = ata_port_resume_common(dev, PMSG_RESUME, false);
        if (!rc) {
                pm_runtime_disable(dev);
                pm_runtime_set_active(dev);
@@ -5463,12 +5479,12 @@  static int ata_port_runtime_suspend(struct device *dev)

 static int ata_port_runtime_resume(struct device *dev)
 {
-       return ata_port_resume_common(dev, PMSG_AUTO_RESUME);
+       return ata_port_resume_common(dev, PMSG_AUTO_RESUME, false);
 }

 static const struct dev_pm_ops ata_port_pm_ops = {
        .suspend = ata_port_suspend,
-       .resume = ata_port_resume,
+       .resume = ata_port_resume_async,
        .freeze = ata_port_do_freeze,
        .thaw = ata_port_resume,
        .poweroff = ata_port_poweroff,
@@ -5486,13 +5502,13 @@  static const struct dev_pm_ops ata_port_pm_ops = {
  */
 int ata_sas_port_async_suspend(struct ata_port *ap, int *async)
 {
-       return __ata_port_suspend_common(ap, PMSG_SUSPEND, async);
+       return __ata_port_suspend_common(ap, PMSG_SUSPEND, true, async);
 }
 EXPORT_SYMBOL_GPL(ata_sas_port_async_suspend);

 int ata_sas_port_async_resume(struct ata_port *ap, int *async)
 {
-       return __ata_port_resume_common(ap, PMSG_RESUME, async);
+       return __ata_port_resume_common(ap, PMSG_RESUME, true, async);
 }
 EXPORT_SYMBOL_GPL(ata_sas_port_async_resume);
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in