REQ_PM vs REQ_TYPE_PM_RESUME

Message ID	Pine.LNX.4.44L0.1401071057130.1296-100000@iolanthe.rowland.org
State	Not Applicable
Delegated to:	David Miller
Headers	show Return-Path: <linux-ide-owner@vger.kernel.org> Date: Tue, 7 Jan 2014 11:08:22 -0500 (EST) From: Alan Stern <stern@rowland.harvard.edu> To: Phillip Susi <psusi@ubuntu.com> cc: Aaron Lu <aaron.lu@intel.com>, Sujit Reddy Thumma <sthumma@codeaurora.org>, <todd.e.brandt@linux.intel.com>, <tj@kernel.org>, <JBottomley@parallels.com>, <linux-ide@vger.kernel.org>, <linux-scsi@vger.kernel.org>, Linux-pm mailing list <linux-pm@vger.kernel.org>, "Rafael J. Wysocki" <rjw@rjwysocki.net> Subject: Re: REQ_PM vs REQ_TYPE_PM_RESUME In-Reply-To: <52CC20A0.7060606@ubuntu.com> Message-ID: <Pine.LNX.4.44L0.1401071057130.1296-100000@iolanthe.rowland.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-ide-owner@vger.kernel.org Precedence: bulk

Alan Stern Jan. 7, 2014, 4:08 p.m. UTC

On Tue, 7 Jan 2014, Phillip Susi wrote:

> On 1/7/2014 10:25 AM, Alan Stern wrote:
> > This doesn't seem like a good idea.  The way to speed up resumes is
> > to allow sd's resume routine to return while the disk is still
> > spinning up (i.e., make the spin-up asynchronous).  There already
> > have been patches submitted to do this; I don't know what happened
> > to them.
> 
> Sure, if that is your *only* goal.  I also want the disk to not spin
> up *at all* if possible.  There's no sense spinning up all of your
> disks every time you resume when you very rarely access some of them.

Okay, that's a different matter.  There's a much simpler way to 
accomplish this.  The patch below will avoid spinning up drives that 
were already in runtime suspend when the system sleep started.  
(If a drive wasn't in runtime suspend then presumably it was used 
recently; therefore it's likely to be used again in the near future and 
so it _should_ be spun up.)

Warning: This patch is completely untested.  I didn't even try to
compile it.  Still, it should give you a good idea as to what is really
needed here.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 7, 2014, 4:37 p.m. UTC | #1

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/7/2014 11:08 AM, Alan Stern wrote:
> Okay, that's a different matter.  There's a much simpler way to 
> accomplish this.  The patch below will avoid spinning up drives
> that were already in runtime suspend when the system sleep started.
>  (If a drive wasn't in runtime suspend then presumably it was used
>  recently; therefore it's likely to be used again in the near
> future and so it _should_ be spun up.)

This is a poor assumption.  You may not be using autosuspend at all,
or with a rather long timeout, so it is fairly normal to not be
runtime suspended at suspend time, and yet not likely to access the
disk for some time after resume.  It also still suffers from the issue
of claiming the device is runtime suspended while the disk has in
fact, spun up of its own accord.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzC1fAAoJEI5FoCIzSKrwO5oIAJaROYiUSrapbSo40/YLLuPH
Y+5ECV7JW1hVSK7n3MyYls5YS6DiBQpjlhZ0h7XXoRWzUuT8GtU69wufjgln9IkZ
0zTDj7idDBzbBVBzdhLRsposIWWldsnpfYqvOPZsCd/xQe/jhnamKfBCSr4gYHy0
/FklPdPa2HXwApTNEz2fpleNiqdl6lbDEUkqI/sWkbjKrGVb57CAu8MaruPd2Frh
lkE3Dw7OfgQcngU8pU6kj4otRaBMEoEfIN7gMQ2CNont5JICDf4CvrZXawvKD3/1
Vg72E7DEUkNOCorqGywPcbV8838joEefmcsBc73w7Cn0rZC/5OmgrPyIUfgaS9k=
=bu2W
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 7, 2014, 6:05 p.m. UTC | #2

On Tue, 7 Jan 2014, Phillip Susi wrote:

> On 1/7/2014 11:08 AM, Alan Stern wrote:
> > Okay, that's a different matter.  There's a much simpler way to 
> > accomplish this.  The patch below will avoid spinning up drives
> > that were already in runtime suspend when the system sleep started.
> >  (If a drive wasn't in runtime suspend then presumably it was used
> >  recently; therefore it's likely to be used again in the near
> > future and so it _should_ be spun up.)
> 
> This is a poor assumption.  You may not be using autosuspend at all,
> or with a rather long timeout, so it is fairly normal to not be
> runtime suspended at suspend time, and yet not likely to access the
> disk for some time after resume.

I disagree with your argument.  If you aren't using autosuspend at all 
then you don't mind having your disks spin all the time.  Therefore you 
don't mind if the disks spin up during system resume.

If you're using a long timeout and you don't like the way the timer 
gets reset by a system sleep, then you have set the timeout too long.

Now, a more reasonable argument might be that for some disks, the
kernel doesn't need to do an explicit spin-up or spin-down (for runtime
suspend) at all; instead we can rely on the drive's internal power
management.  In fact there already is a "manage_start_stop" attribute
to control this.  (Well, almost -- if the attribute is set to 0 then
the kernel won't issue a spin-down command even for system suspend.)

>  It also still suffers from the issue
> of claiming the device is runtime suspended while the disk has in
> fact, spun up of its own accord.

I don't understand.  Under what circumstances would this happen?  Are 
you saying this would happen during system resume?  Why would the disk 
spin up of its own accord at that time?

And if it does spin up of its own accord, what makes you think the 
kernel can do anything to prevent it from spinning up?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 7, 2014, 6:43 p.m. UTC | #3

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/7/2014 1:05 PM, Alan Stern wrote:
> I disagree with your argument.  If you aren't using autosuspend at
> all then you don't mind having your disks spin all the time.
> Therefore you don't mind if the disks spin up during system
> resume.

I prefer to manually spin the disk down when I don't plan on using it
for a while instead of letting it auto suspend using a timeout since
the timeout often either is too short and spins the disk down before
I'm done with it, or too long, and doesn't let it sleep as much as it
could.  Yet, I do mind that the disk must spin up whenever I resume,
even though I don't plan on using it any time soon.

> If you're using a long timeout and you don't like the way the timer
>  gets reset by a system sleep, then you have set the timeout too
> long.

It isn't reset, it simply hasn't timed out at the time I suspend.
This isn't very hard to do with even only a 5 minute auto suspend
delay.  If I do something with the disk then suspend the whole system
3 minutes later, it would spin up again when I resume even though I
finished my use of the disk before suspending the system.

> Now, a more reasonable argument might be that for some disks, the 
> kernel doesn't need to do an explicit spin-up or spin-down (for
> runtime suspend) at all; instead we can rely on the drive's
> internal power management.  In fact there already is a
> "manage_start_stop" attribute to control this.  (Well, almost -- if
> the attribute is set to 0 then the kernel won't issue a spin-down
> command even for system suspend.)

This flag is broken and unusable and should be removed.  First, SCSI
disks *require* the start command after a suspend, and secondly, not
stopping the drive before suspend causes an emergency head retract,
which damages the drive.

> I don't understand.  Under what circumstances would this happen?
> Are you saying this would happen during system resume?  Why would
> the disk spin up of its own accord at that time?

By default, ATA disks spin up on their own when they are powered on.

> And if it does spin up of its own accord, what makes you think the
>  kernel can do anything to prevent it from spinning up?

It can't.  What it should do is notice when this has happened and not
claim the device is runtime suspended when it is in fact, spinning.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzErYAAoJEI5FoCIzSKrw8R4H/RZZfPxQYcFEw1oVlkVY1gdM
TCi0MJoSBVAg9Iepplbl3RPZKb4DR9W+k6ICcZyOdw155B8mRzznn9RYoqkDEXX8
u2mqGfqx4TRf1OJo8jGKyJhPW6tJjwHM+G5Dx4WO6XAoVtwAvJHrFiNfuG9wQa1t
DswH0IOm6ZfEH0Z4hjeq6vnyEtmB3ecrMpvplM5+tRchE1SuvP+OkD3J6EXIlL3c
J7+b7pmu3Cx/+2vz92NlWHLpFXeC1fnOAy2+jJhwia1X3tVtf+wN1KIrQBK9Lq+i
Dmgmd++DUr4//CwKftPUx/cmv/RUgSnfd7MLfHZvxJrBOsOHFoI8BYbDHqoGWEU=
=O0Xo
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 7, 2014, 7:18 p.m. UTC | #4

On Tue, 7 Jan 2014, Phillip Susi wrote:

> On 1/7/2014 1:05 PM, Alan Stern wrote:
> > I disagree with your argument.  If you aren't using autosuspend at
> > all then you don't mind having your disks spin all the time.
> > Therefore you don't mind if the disks spin up during system
> > resume.
> 
> I prefer to manually spin the disk down when I don't plan on using it
> for a while instead of letting it auto suspend using a timeout since
> the timeout often either is too short and spins the disk down before
> I'm done with it, or too long, and doesn't let it sleep as much as it
> could.  Yet, I do mind that the disk must spin up whenever I resume,
> even though I don't plan on using it any time soon.

I don't know how you manually spin down your disk, but one reasonable
way to do it is to set the autosuspend timeout to 0.  If you use this
approach and apply my patch, the disk should remain spun down when the
system resumes.

> > If you're using a long timeout and you don't like the way the timer
> >  gets reset by a system sleep, then you have set the timeout too
> > long.
> 
> It isn't reset, it simply hasn't timed out at the time I suspend.

On the contrary, it is reset.  See below.

> This isn't very hard to do with even only a 5 minute auto suspend
> delay.  If I do something with the disk then suspend the whole system
> 3 minutes later, it would spin up again when I resume even though I
> finished my use of the disk before suspending the system.

Right.  If you hadn't put the whole system to sleep, the disk would
have gone into runtime suspend 2 minutes later (assuming you didn't use
it in the meantime).  Whereas since you did put the whole system to
sleep, the disk will go into runtime suspend 5 minutes after the system
wakes up -- not 2 minutes later.  I.e., the timer has been reset from 2 
minutes back to 5.

How is the kernel supposed to know that you finished using the disk
before suspending the system?  After all, by setting the autosuspend
timeout to 5 minutes, you have already told the kernel to keep the disk
spun up if it has been used any time in the last 5 minutes, and you
used it only 3 minutes ago.  As far as the kernel can tell, you still
want the disk to be spun up and ready for use.  That remains true at 
the time of the system resume.

> > Now, a more reasonable argument might be that for some disks, the 
> > kernel doesn't need to do an explicit spin-up or spin-down (for
> > runtime suspend) at all; instead we can rely on the drive's
> > internal power management.  In fact there already is a
> > "manage_start_stop" attribute to control this.  (Well, almost -- if
> > the attribute is set to 0 then the kernel won't issue a spin-down
> > command even for system suspend.)
> 
> This flag is broken and unusable and should be removed.

Not at all.

>  First, SCSI
> disks *require* the start command after a suspend, and secondly, not
> stopping the drive before suspend causes an emergency head retract,
> which damages the drive.

You're forgetting that the same sd driver manages other types of 
devices than disk drives.  Card readers and USB flash drives have no 
heads to retract and no spinning platters.  They don't need START STOP 
commands (in fact, some of them probably would crash if they received 
such a command).

> > I don't understand.  Under what circumstances would this happen?
> > Are you saying this would happen during system resume?  Why would
> > the disk spin up of its own accord at that time?
> 
> By default, ATA disks spin up on their own when they are powered on.

Irrelevant, unless you are also claiming that all ATA disks always get
powered on whenever the system resumes, something which is not at all
evident to me.  Particularly if the disk was in runtime suspend before
the system went to sleep.

> > And if it does spin up of its own accord, what makes you think the
> >  kernel can do anything to prevent it from spinning up?
> 
> It can't.  What it should do is notice when this has happened and not
> claim the device is runtime suspended when it is in fact, spinning.

Now I'm really getting confused.  Here, you are saying that ATA disks
will always spin up, all by themselves, whenever the system resumes,
and there's nothing the kernel can do to prevent it.  But earlier you
wrote:

> Sure, if that is your *only* goal.  I also want the disk to not spin
> up *at all* if possible.  There's no sense spinning up all of your
> disks every time you resume when you very rarely access some of them.

Isn't this a contradiction?  Or are you making a distinction between 
ATA and non-ATA disks?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 7, 2014, 11:47 p.m. UTC | #5

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On 01/07/2014 02:18 PM, Alan Stern wrote:
> I don't know how you manually spin down your disk, but one
> reasonable way to do it is to set the autosuspend timeout to 0.  If
> you use this approach and apply my patch, the disk should remain
> spun down when the system resumes.

The traditional method before the recently added block pm feature was
with hdparm -y.

> Right.  If you hadn't put the whole system to sleep, the disk
> would have gone into runtime suspend 2 minutes later (assuming you
> didn't use it in the meantime).  Whereas since you did put the
> whole system to sleep, the disk will go into runtime suspend 5
> minutes after the system wakes up -- not 2 minutes later.  I.e.,
> the timer has been reset from 2 minutes back to 5.

What resets the timer?  It should only be reset when IO is done.  And
yes, it is exactly that wakeup and turn around and suspend again that
I'm trying to avoid; it puts unnecessary wear and tear on the disk.

> How is the kernel supposed to know that you finished using the
> disk before suspending the system?  After all, by setting the
> autosuspend timeout to 5 minutes, you have already told the kernel
> to keep the disk spun up if it has been used any time in the last 5
> minutes, and you used it only 3 minutes ago.  As far as the kernel
> can tell, you still want the disk to be spun up and ready for use.
> That remains true at the time of the system resume.

That's the whole point: it doesn't know.  What it does know is that
the disk is spun down on resume, and it has no reason to believe that
you will need it again soon.  This is especially true given that your
typical single ATA disk system will spin up the drive on its own
anyhow, so on systems with multiple scsi or ATA disks that explicitly
have been set to power up in standby, it is at least an even bet that
you won't be touching them all soon.

> You're forgetting that the same sd driver manages other types of 
> devices than disk drives.  Card readers and USB flash drives have
> no heads to retract and no spinning platters.  They don't need
> START STOP commands (in fact, some of them probably would crash if
> they received such a command).

They are irrelevant since they ignore the command anyhow.

> Irrelevant, unless you are also claiming that all ATA disks always
> get powered on whenever the system resumes, something which is not
> at all evident to me.  Particularly if the disk was in runtime
> suspend before the system went to sleep.

I'm saying that most do.  It doesn't require 100% to be relevant.
There will also be plenty of disks at least for some time that are
still using the internal standby timer rather than runtime pm.

> Now I'm really getting confused.  Here, you are saying that ATA
> disks will always spin up, all by themselves, whenever the system
> resumes, and there's nothing the kernel can do to prevent it.  But
> earlier you wrote:

No, I'm saying some ( most in fact ) do, and some do not.  Both cases
need to be handled.  Runtime pm must not indicate the disk is
suspended when it spins up on its own after power on, which ATA disks
do by default.  Oddly, the SCSI standard says that SCSI disks can be
configured behave this way as well rather than requiring an explicit
start command, though it doesn't say how and I've not heard of a way
to do this.  I suppose this also applies to the USB flash drives and
SD card readers you mentioned that are managed by sd.

> Isn't this a contradiction?  Or are you making a distinction
> between ATA and non-ATA disks?

What?  I have some disk drives that I rarely access.  I simply don't
want them starting up and spinning back down every time I resume.  In
my case they are ATA with PuiS enabled, but it applies to SCSI the
same way.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBCgAGBQJSzJItAAoJEI5FoCIzSKrwhCQIAIclA7u+FJ1PSZRYWcvFQg0B
n/WIupSCAXfiSo4uVZpC9m4W8TR9CawEorIPZGE+G/Hv+zRz3YKqA3OOuOQc1S5i
5IK019yY4EsOHsUK8RlBsgKu1bW0SdFsa73COzqT65bxNqUb5PUK5ed/Z1Kg7cTn
QM7jCMYz0wE7cqAQQ8eV56Y1Nolb6T3WmqKqoGl+qJj+IvebCuJ9vsYXf7MhSd9b
yb2Iq/ThR81vm68c6+KOFQkOkL3zPZ6QWCh5Xj4mRkvXtsd7htNKpxTkM7OC6azF
Z0I5xreArN9SQ8GLHtzB2Lrs+SzlMSvIE1HKQdieBcy//LImk8DajbddupJy7Bk=
=dYPC
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 8, 2014, 5:46 p.m. UTC | #6

On Tue, 7 Jan 2014, Phillip Susi wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
> 
> On 01/07/2014 02:18 PM, Alan Stern wrote:
> > I don't know how you manually spin down your disk, but one
> > reasonable way to do it is to set the autosuspend timeout to 0.  If
> > you use this approach and apply my patch, the disk should remain
> > spun down when the system resumes.
> 
> The traditional method before the recently added block pm feature was
> with hdparm -y.

A real problem here is that hdparm does not interact with the runtime 
PM part of the kernel.  They work independently and can interfere with 
each other.

> > Right.  If you hadn't put the whole system to sleep, the disk
> > would have gone into runtime suspend 2 minutes later (assuming you
> > didn't use it in the meantime).  Whereas since you did put the
> > whole system to sleep, the disk will go into runtime suspend 5
> > minutes after the system wakes up -- not 2 minutes later.  I.e.,
> > the timer has been reset from 2 minutes back to 5.
> 
> What resets the timer?  It should only be reset when IO is done.  And

I/O _was_ done.  To spin up the disk requires sending it a START 
command.  That's I/O.

Okay, you may not think that is a valid argument.  Still, the timer has
to be set to _something_, and during the resume we don't know how much
time was remaining as of the suspend.  (I suppose we could store this 
information when the suspend occurs, but currently we don't.)

> yes, it is exactly that wakeup and turn around and suspend again that
> I'm trying to avoid; it puts unnecessary wear and tear on the disk.

In essence, you want your disk's state to move directly from unpowered
(system asleep) to runtime-suspended with no intermediate active -- but
only if the disk doesn't spin up all by itself.

One problem: How can the kernel tell whether the disk has spun up by
itself?  I don't think it can be done at the SCSI level.  Can it be
done at the ATA level (and without using the SCSI stack, as that would
require the disk to go out of runtime suspend and spin up anyway)?

> > How is the kernel supposed to know that you finished using the
> > disk before suspending the system?  After all, by setting the
> > autosuspend timeout to 5 minutes, you have already told the kernel
> > to keep the disk spun up if it has been used any time in the last 5
> > minutes, and you used it only 3 minutes ago.  As far as the kernel
> > can tell, you still want the disk to be spun up and ready for use.
> > That remains true at the time of the system resume.
> 
> That's the whole point: it doesn't know.  What it does know is that
> the disk is spun down on resume, and it has no reason to believe that
> you will need it again soon.

Actually, it does have a reason.  Namely, you told it earlier to assume 
the disk will be needed again if it was used within the last 5 
minutes.  At the time of the system resume, the disk was last used 3 
minutes ago (not counting the time the system was asleep).

>  This is especially true given that your
> typical single ATA disk system will spin up the drive on its own
> anyhow, so on systems with multiple scsi or ATA disks that explicitly
> have been set to power up in standby, it is at least an even bet that
> you won't be touching them all soon.

Are you assuming these disks were all runtime-active before the system 
sleep?  If so, the kernel is justified in assuming that you will be 
touching them all soon.

> > You're forgetting that the same sd driver manages other types of 
> > devices than disk drives.  Card readers and USB flash drives have
> > no heads to retract and no spinning platters.  They don't need
> > START STOP commands (in fact, some of them probably would crash if
> > they received such a command).
> 
> They are irrelevant since they ignore the command anyhow.

Not the ones that crash when they receive it.  But this point is moot
since manage_start_stop doesn't do what you want anyway.

> > Now I'm really getting confused.  Here, you are saying that ATA
> > disks will always spin up, all by themselves, whenever the system
> > resumes, and there's nothing the kernel can do to prevent it.  But
> > earlier you wrote:
> 
> No, I'm saying some ( most in fact ) do, and some do not.  Both cases
> need to be handled.  Runtime pm must not indicate the disk is
> suspended when it spins up on its own after power on, which ATA disks
> do by default.  Oddly, the SCSI standard says that SCSI disks can be
> configured behave this way as well rather than requiring an explicit
> start command, though it doesn't say how and I've not heard of a way
> to do this.  I suppose this also applies to the USB flash drives and
> SD card readers you mentioned that are managed by sd.

No, it doesn't.  They don't spin, so they can't be configured either to 
spin up or not to spin up after power-on.

> > Isn't this a contradiction?  Or are you making a distinction
> > between ATA and non-ATA disks?
> 
> What?  I have some disk drives that I rarely access.  I simply don't
> want them starting up and spinning back down every time I resume.  In
> my case they are ATA with PuiS enabled, but it applies to SCSI the
> same way.

What is PuiS?

In the end, it sounds like what you want can be done fairly easily.  
But only if the kernel has a reliable way to tell whether or not a disk
is spinning (or is spinning up).

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 8, 2014, 6:31 p.m. UTC | #7

On Wed, 8 Jan 2014, Alan Stern wrote:

> In essence, you want your disk's state to move directly from unpowered
> (system asleep) to runtime-suspended with no intermediate active -- but
> only if the disk doesn't spin up all by itself.

> In the end, it sounds like what you want can be done fairly easily.  

It turns out there is a snag.  The PM core is set up such that during
system sleep transitions, there's no way to tell whether or not
userspace has permitted runtime PM for a device.  (This is because the
core wants to prevent untimely runtime suspends from occurring in the
middle of a system sleep transition, and to do so it uses the same
mechanism as it does when userspace disallows runtime suspend.)

As a result, we must not go directly from unpowered to
runtime-suspended.  Not without some sort of change to the PM core.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 8, 2014, 8:20 p.m. UTC | #8

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/8/2014 12:46 PM, Alan Stern wrote:
> I/O _was_ done.  To spin up the disk requires sending it a START 
> command.  That's I/O.

Yes, currently, but that's exactly what I'm trying to get rid of.

> In essence, you want your disk's state to move directly from
> unpowered (system asleep) to runtime-suspended with no intermediate
> active -- but only if the disk doesn't spin up all by itself.
> 
> One problem: How can the kernel tell whether the disk has spun up
> by itself?  I don't think it can be done at the SCSI level.  Can it
> be done at the ATA level (and without using the SCSI stack, as that
> would require the disk to go out of runtime suspend and spin up
> anyway)?

You issue a REQUEST SENSE command and that returns status indicating
whether the drive is stopped, or in standby.  See my patches.  One of
them fixes the scsi-ata translation to issue the ATA CHECK power
command to see if the drive is suspended, as SAT-3 calls for.

> Actually, it does have a reason.  Namely, you told it earlier to
> assume the disk will be needed again if it was used within the last
> 5 minutes.  At the time of the system resume, the disk was last
> used 3 minutes ago (not counting the time the system was asleep).

You're looking at it backwards.  You aren't telling it to assume the
disk will be needed again if it was used within the last 5 minutes;
you are telling it that it can assume it *won't* be needed again soon
if not used in the last 5 minutes, and therefore, it is ok to shut it
down.

Also the act of suspending the system is a pretty clear breaking point
in general activity.  You normally stop your work and quiesce the
system before you suspend it.  If I finish my work and copy it to my
backup drive, then suspend the system, and my wife comes by an hour
later to browse the web, she's not going to be touching my backup disk.

> Are you assuming these disks were all runtime-active before the
> system sleep?  If so, the kernel is justified in assuming that you
> will be touching them all soon.

No it isn't.  Absence of evidence is not evidence of absence.  I
listed several use cases where the drive won't be runtime suspended
before the system suspend, yet will not be accessed any time soon
after resume.

> Not the ones that crash when they receive it.  But this point is
> moot since manage_start_stop doesn't do what you want anyway.

Any that crash when they get it either are already unusable or have a
quirk registered to inhibit the command, which would still apply.

> No, it doesn't.  They don't spin, so they can't be configured
> either to spin up or not to spin up after power-on.

Which means we shouldn't be setting the runtime status to suspended
after system resume... that's the second case.

> What is PuiS?

Power up in Standby.

> In the end, it sounds like what you want can be done fairly easily.
>  But only if the kernel has a reliable way to tell whether or not a
> disk is spinning (or is spinning up).

That's why my patches are attempting to do: use REQUEST SENSE to ask
the disk if it is spinning up or not.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzbMqAAoJEI5FoCIzSKrwCYwIAKMwlRovIdfO6rSgZL7pufFG
LOV/9apHzqldPPLJ5BPk0A69Ok/TimrQax06HTPyGJ1/MmOrEcjJzb482mJ4ixpx
b7DP3De+RCEUifrfIlI/U+fWLHpw4DOYToSy5ZjNZbF/cDz43k85arAwRK/Bkb++
C2F+YfeQyLcR0KjToj8MK9evlGY6oKfEYx9QCjBEgiaXRjIHfu0AuLn7y7AATK6n
pFBzgq4cNSalob9I4WdwejzJz/RzqQFujtknB+nAaKhESqLCoclJLSzwNQyuxcRz
qT3DckTyTBF7vGmYayaeERnukqWnDEBAQOc8qtOxx2rbYpivNHiqIc8sfrRoaJc=
=2kXd
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 8, 2014, 9:21 p.m. UTC | #9

On Wed, 8 Jan 2014, Phillip Susi wrote:

> > In essence, you want your disk's state to move directly from
> > unpowered (system asleep) to runtime-suspended with no intermediate
> > active -- but only if the disk doesn't spin up all by itself.
> > 
> > One problem: How can the kernel tell whether the disk has spun up
> > by itself?  I don't think it can be done at the SCSI level.  Can it
> > be done at the ATA level (and without using the SCSI stack, as that
> > would require the disk to go out of runtime suspend and spin up
> > anyway)?
> 
> You issue a REQUEST SENSE command and that returns status indicating
> whether the drive is stopped, or in standby.  See my patches.  One of

I never saw your patches.  Where were they posted?

If you issue the REQUEST SENSE command in the usual way (going through
the SCSI and block layers), and the disk is already in runtime suspend,
it won't do what you want.  The block layer won't deliver requests
until the device leaves the RPM_SUSPENDED state.  In addition, when it
receives the command the block layer will submit a deferred
runtime-resume request, which rather defeats the whole purpose.

(I guess you actually saw some of this happen, and that's what led to
this email thread in the first place.)

It's a knotty situation.  The only way to find out whether the disk is
spinning is to ask it, which requires doing I/O, which requires
spinning up the disk.  Maybe we need to add a mechanism to the block
layer's runtime PM implementation for addressing just this situation.

> them fixes the scsi-ata translation to issue the ATA CHECK power
> command to see if the drive is suspended, as SAT-3 calls for.

What happens with non-ATA disks?

> > Actually, it does have a reason.  Namely, you told it earlier to
> > assume the disk will be needed again if it was used within the last
> > 5 minutes.  At the time of the system resume, the disk was last
> > used 3 minutes ago (not counting the time the system was asleep).
> 
> You're looking at it backwards.  You aren't telling it to assume the
> disk will be needed again if it was used within the last 5 minutes;
> you are telling it that it can assume it *won't* be needed again soon
> if not used in the last 5 minutes, and therefore, it is ok to shut it
> down.

Okay, yes, the autosuspend timeout affects when devices should be
powered down, not when they should be powered back up.

> Also the act of suspending the system is a pretty clear breaking point
> in general activity.  You normally stop your work and quiesce the
> system before you suspend it.  If I finish my work and copy it to my
> backup drive, then suspend the system, and my wife comes by an hour
> later to browse the web, she's not going to be touching my backup disk.

While that's true, it's also true that any userspace processes running 
before the system suspend will continue running, with no perceptible 
break, after the system wakes up.  If they were accessing the disk 
before, they will most likely continue to access it afterward.

But never mind all this; it's a pointless discussion.  The real 
question is how to implement what you want.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 8, 2014, 9:50 p.m. UTC | #10

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/8/2014 4:21 PM, Alan Stern wrote:
> I never saw your patches.  Where were they posted?

Higher up in this thread on linux-ide and linux-scsi.  Subject was Let
sleeping disks lie.

> If you issue the REQUEST SENSE command in the usual way (going
> through the SCSI and block layers), and the disk is already in
> runtime suspend, it won't do what you want.  The block layer won't
> deliver requests until the device leaves the RPM_SUSPENDED state.
> In addition, when it receives the command the block layer will
> submit a deferred runtime-resume request, which rather defeats the
> whole purpose.

Right, which is why I left the device in the active state and used the
pre/post_suspend functions to change the queue into the RPM_SUSPENDING
state, then either complete the transition to RPM_SUSPENDED, or bail
out and go back to RPM_ACTIVE.

> What happens with non-ATA disks?

Same thing: they return a sense status that either says they are
suspended, stopped, or normal.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzcg2AAoJEI5FoCIzSKrw7wcIAJLiliMYnoHLm/dGWHhhfcnb
ELJLLAhhWsOIQmHiV1svdy0F7EuKi3KF9qJ9JwCaAspO0w8q24UGhvQW20uSH9Md
JliKhn6eBY/NFTctaVP2tnOcc4vevCndJfEScBlWxI82MWMC1TV3lA7xPtcJ5ocX
LMBLPXpr0qypisgmu/tAnwTPQVyU4WUkgMmG9us4w3BgCkvI/oXf4oDKFkCNgUye
FX16NTZPR6iaIK+YJMG3uKCSD4CQjTnKJNXGR89XKJG+Z9v04jyQgWs7kgvqCov9
F5j+yueyMVlveGPHmNwhElHEPT2UWSoyQbonuOrYQZ+Db+xJBOxzoY8Jgq9lWVI=
=+vwf
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Aaron Lu Jan. 9, 2014, 1:29 a.m. UTC | #11

On 01/09/2014 05:21 AM, Alan Stern wrote:
> On Wed, 8 Jan 2014, Phillip Susi wrote:
>> You issue a REQUEST SENSE command and that returns status indicating
>> whether the drive is stopped, or in standby.  See my patches.  One of
> 
> I never saw your patches.  Where were they posted?
> 
> If you issue the REQUEST SENSE command in the usual way (going through
> the SCSI and block layers), and the disk is already in runtime suspend,
> it won't do what you want.  The block layer won't deliver requests
> until the device leaves the RPM_SUSPENDED state.  In addition, when it
> receives the command the block layer will submit a deferred
> runtime-resume request, which rather defeats the whole purpose.
> 
> (I guess you actually saw some of this happen, and that's what led to
> this email thread in the first place.)
> 
> It's a knotty situation.  The only way to find out whether the disk is
> spinning is to ask it, which requires doing I/O, which requires
> spinning up the disk.  Maybe we need to add a mechanism to the block
> layer's runtime PM implementation for addressing just this situation.

I think it's knotty because the runtime PM status is a view from the
kernel/host side, i.e. it is runtime suspended if it is not being used,
no matter which power state it is in. The trigger for the PM state
transition are all based on this, if we want to do it the other way
around(update device's runtime PM status based on device's actual power
state), we are in a knotty situation.
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Rafael J. Wysocki Jan. 9, 2014, 12:17 p.m. UTC | #12

On Thursday, January 09, 2014 09:29:59 AM Aaron Lu wrote:
> On 01/09/2014 05:21 AM, Alan Stern wrote:
> > On Wed, 8 Jan 2014, Phillip Susi wrote:
> >> You issue a REQUEST SENSE command and that returns status indicating
> >> whether the drive is stopped, or in standby.  See my patches.  One of
> > 
> > I never saw your patches.  Where were they posted?
> > 
> > If you issue the REQUEST SENSE command in the usual way (going through
> > the SCSI and block layers), and the disk is already in runtime suspend,
> > it won't do what you want.  The block layer won't deliver requests
> > until the device leaves the RPM_SUSPENDED state.  In addition, when it
> > receives the command the block layer will submit a deferred
> > runtime-resume request, which rather defeats the whole purpose.
> > 
> > (I guess you actually saw some of this happen, and that's what led to
> > this email thread in the first place.)
> > 
> > It's a knotty situation.  The only way to find out whether the disk is
> > spinning is to ask it, which requires doing I/O, which requires
> > spinning up the disk.  Maybe we need to add a mechanism to the block
> > layer's runtime PM implementation for addressing just this situation.
> 
> I think it's knotty because the runtime PM status is a view from the
> kernel/host side, i.e. it is runtime suspended if it is not being used,
> no matter which power state it is in. The trigger for the PM state
> transition are all based on this, if we want to do it the other way
> around(update device's runtime PM status based on device's actual power
> state), we are in a knotty situation.

Agreed.

Rafael J. Wysocki Jan. 9, 2014, 1:18 p.m. UTC | #13

On Thursday, January 09, 2014 01:17:12 PM Rafael J. Wysocki wrote:
> On Thursday, January 09, 2014 09:29:59 AM Aaron Lu wrote:
> > On 01/09/2014 05:21 AM, Alan Stern wrote:
> > > On Wed, 8 Jan 2014, Phillip Susi wrote:
> > >> You issue a REQUEST SENSE command and that returns status indicating
> > >> whether the drive is stopped, or in standby.  See my patches.  One of
> > > 
> > > I never saw your patches.  Where were they posted?
> > > 
> > > If you issue the REQUEST SENSE command in the usual way (going through
> > > the SCSI and block layers), and the disk is already in runtime suspend,
> > > it won't do what you want.  The block layer won't deliver requests
> > > until the device leaves the RPM_SUSPENDED state.  In addition, when it
> > > receives the command the block layer will submit a deferred
> > > runtime-resume request, which rather defeats the whole purpose.
> > > 
> > > (I guess you actually saw some of this happen, and that's what led to
> > > this email thread in the first place.)
> > > 
> > > It's a knotty situation.  The only way to find out whether the disk is
> > > spinning is to ask it, which requires doing I/O, which requires
> > > spinning up the disk.  Maybe we need to add a mechanism to the block
> > > layer's runtime PM implementation for addressing just this situation.
> > 
> > I think it's knotty because the runtime PM status is a view from the
> > kernel/host side, i.e. it is runtime suspended if it is not being used,
> > no matter which power state it is in. The trigger for the PM state
> > transition are all based on this, if we want to do it the other way
> > around(update device's runtime PM status based on device's actual power
> > state), we are in a knotty situation.
> 
> Agreed.

That said during system resume, when we are trying to avoid resuming the
device to save time/energy, it makes sense to check its physical state and
do a pm_request_resume() if that is "on" to avoid a situation in which the
device is physically "on", but its runtime PM status is "suspended" and it
never gets powered down because of that.

Thanks!

Alan Stern Jan. 9, 2014, 3:40 p.m. UTC | #14

On Thu, 9 Jan 2014, Aaron Lu wrote:

> > It's a knotty situation.  The only way to find out whether the disk is
> > spinning is to ask it, which requires doing I/O, which requires
> > spinning up the disk.  Maybe we need to add a mechanism to the block
> > layer's runtime PM implementation for addressing just this situation.
> 
> I think it's knotty because the runtime PM status is a view from the
> kernel/host side, i.e. it is runtime suspended if it is not being used,
> no matter which power state it is in. The trigger for the PM state
> transition are all based on this, if we want to do it the other way
> around(update device's runtime PM status based on device's actual power
> state), we are in a knotty situation.

No, that's not the problem.  The problem is that the block layer does
not permit any requests to go through if the queue's rpm_status is
RPM_SUSPENDED.

We should change the code in the block layer.  blk_pm_peek_request 
should allow requests with REQ_PM set to go through, no matter what the 
rpm_status is.  Then the disk driver could query the disk (to see 
whether it is spinning) while the disk remains in runtime suspend.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 9, 2014, 3:53 p.m. UTC | #15

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/9/2014 10:40 AM, Alan Stern wrote:
> We should change the code in the block layer.  blk_pm_peek_request
>  should allow requests with REQ_PM set to go through, no matter
> what the rpm_status is.  Then the disk driver could query the disk
> (to see whether it is spinning) while the disk remains in runtime
> suspend.

No, because if the disk is RPM_SUSPENDED, the port may be
RPM_SUSPENDED, which likely means that you CAN'T send down requests to
the device.

We need to put the device into one of the transitioning states to
block other IO, without allowing the port to suspend.


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzsXvAAoJEI5FoCIzSKrwg30H/A52cy6+GAzE372KVMkP/bI+
LQe4FTTIjo8PDBuYeASzmjdiD8x51++w/QfUQEtSTEhr7qWhyXdMbb1B6eGuDlv/
KqOBTTuhC7eiLArGLTGyLRiLOLEiQb7lHStw4ZCpQpZ88f9pNPdcLCslp8zrz4Of
6pt2lLztdG8zexmKyu0Zq/oSe6ajKqVgrSqjoCO9jQL7NUecZNlBnOnsF4jri0MY
GPk3HIb0gZjtVVfeWAiEE9s4MpRlMYHfx+dg8uxNJiAhJcMQeHJLCTKAKyEMKG68
45Ehqp5XITJG8Mx/Mb6gOzA/mFDShzqRiA0i6MtKBcnr+zNeRnWKRV1UbDKcrxk=
=B5lj
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 9, 2014, 4:14 p.m. UTC | #16

On Thu, 9 Jan 2014, Phillip Susi wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 1/9/2014 10:40 AM, Alan Stern wrote:
> > We should change the code in the block layer.  blk_pm_peek_request
> >  should allow requests with REQ_PM set to go through, no matter
> > what the rpm_status is.  Then the disk driver could query the disk
> > (to see whether it is spinning) while the disk remains in runtime
> > suspend.
> 
> No, because if the disk is RPM_SUSPENDED, the port may be
> RPM_SUSPENDED, which likely means that you CAN'T send down requests to
> the device.

That's true, but it isn't a problem.  We know that requests with REQ_PM
are sent only at certain, controlled times.  In particular, the only
time such a request would be sent while the disk is RPM_SUSPENDED is
during a system resume.

> We need to put the device into one of the transitioning states to
> block other IO, without allowing the port to suspend.

No, we only need to make sure that the port doesn't go into runtime 
suspend while we carry out the "is the disk spinning" test.

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Phillip Susi Jan. 9, 2014, 4:34 p.m. UTC | #17

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 1/9/2014 11:14 AM, Alan Stern wrote:
> That's true, but it isn't a problem.  We know that requests with
> REQ_PM are sent only at certain, controlled times.  In particular,
> the only time such a request would be sent while the disk is
> RPM_SUSPENDED is during a system resume.

Yes, but if the disk and port were both already RPM_SUSPENDED when the
system was suspended, and so the port is still RPM_SUSPENDED when the
disk's system resume method is called, then it can't communicate with
the disk.

>> We need to put the device into one of the transitioning states
>> to block other IO, without allowing the port to suspend.
> 
> No, we only need to make sure that the port doesn't go into runtime
>  suspend while we carry out the "is the disk spinning" test.

Well how else do you do that other than with the transition states?


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJSzs+RAAoJEI5FoCIzSKrw1KoH+gJ4c1iXqOUUqJfF2xoxrF6k
Fv3KyTNMn8Y9RF12o5OlfQHwPkwhplIR7uBd5agnd9cXR2owRVLgAE34/ggEEOOq
NgaRwfNBbD3Q83I/6aEZ7C7A0wbWyrA+w2aaBIe5RzTslH3IYcePZvnKUZuG+/0A
rAR/IVyNL5e6JDPgvSkx7/LmxLVb3M4NkYeCMHvl8Gmg+bvg2S+R3TAcUTvRzRv4
gwtkbYGkyS/NmXc86Vi7Z2pOcIyZT3T0CyzlSqAIjljYSFJbzY+PRP4D4gAOXqyd
vqrg+PtxUYlQ0POdYRo0TtCe2N5nk4RI3jwiyD0U1/GFgcY/uNNJOtPaiQOLxOU=
=cMp1
-----END PGP SIGNATURE-----
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Alan Stern Jan. 9, 2014, 5:06 p.m. UTC | #18

On Thu, 9 Jan 2014, Phillip Susi wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> On 1/9/2014 11:14 AM, Alan Stern wrote:
> > That's true, but it isn't a problem.  We know that requests with
> > REQ_PM are sent only at certain, controlled times.  In particular,
> > the only time such a request would be sent while the disk is
> > RPM_SUSPENDED is during a system resume.
> 
> Yes, but if the disk and port were both already RPM_SUSPENDED when the
> system was suspended, and so the port is still RPM_SUSPENDED when the
> disk's system resume method is called, then it can't communicate with
> the disk.

The port should not be runtime suspended during system resume.  It 
should always go to runtime active.
 
> >> We need to put the device into one of the transitioning states
> >> to block other IO, without allowing the port to suspend.
> > 
> > No, we only need to make sure that the port doesn't go into runtime
> >  suspend while we carry out the "is the disk spinning" test.
> 
> Well how else do you do that other than with the transition states?

pm_runtime_get_noresume(dev->parent).

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

REQ_PM vs REQ_TYPE_PM_RESUME

Commit Message

Comments

Patch