mbox series

[v7,00/23] Fix libata suspend/resume handling and code cleanup

Message ID 20230926081507.69346-1-dlemoal@kernel.org
Headers show
Series Fix libata suspend/resume handling and code cleanup | expand

Message

Damien Le Moal Sept. 26, 2023, 8:14 a.m. UTC
The first 9 patches of this series fix several issues with suspend/resume
power management operations in scsi and libata. The most significant
changes introduced are in patch 4 and 5, where the manage_start_stop
flag of scsi devices is split into the manage_system_start_stop and
manage_runtime_start_stop flags to allow keeping scsi runtime power
operations for spining up/down ATA devices but have libata do its own
system suspend/resume device power state management using EH.

The remaining patches are code cleanup that do not introduce any
significant functional change.

This series was tested on qemu and on various PCs and servers. I am
CC-ing people who recently reported issues with suspend/resume.
Additional testing would be much appreciated.

Changes from v6:
 * Changed patch 9 to use a bool for the new suspended flag.

Changes from v5:
 * Typo and style corrections in patch 4 commit message
 * Changed patch 9 to use a new flag to track a disk suspended state
   instead of using the scsi device state
 * Added review tags

Changes from v4:
 * Remove ata_scsi_dev_alloc() function in patch 3, coding it directly
   in ata_scsi_slave_alloc()
 * Correct typo in patch 19 commit message
 * Added Tested and review tags

Changes from v3:
 * Corrected patch 1 (typo in commit message and WARN_ON() removal)
 * Changed path 3 as suggested by Niklas (moved definition of
   ->slave_alloc)
 * Rebased on rc2
 * Added review tags

Changes from v2:
 * Added patch 4 as simply disabling manage_start_stop from libata was
   breaking individual disk runtime suspend/autosuspend. Patch 5 was
   reworked accordingly to the changes in patch 4.
 * Fixed patch 3: applying the link creation was missing and the link
   creation itself was also incorrect, preventing sd probe to execute
   correctly. Thanks to Geert for testing and reporting this issue.
 * Split the "Fix delayed scsi_rescan_device() execution" patch into
   patch 6 (scsi part) and patch 7 (ata part).
 * Modified patch 9 to not call sd_shutdown() from sd_remove() for
   devices that are not running.
 * Added Chia-Lin Tested tag to unchanged patches

Changes from v1:
 * Added patch 8 and 9 to fix compilation warnings with W=1
 * Addressed John comment in patch 19
 * Fixed patch 20 commit message (Sergei)
 * Added Hannes Review tag

Damien Le Moal (23):
  ata: libata-core: Fix ata_port_request_pm() locking
  ata: libata-core: Fix port and device removal
  ata: libata-scsi: link ata port and scsi device
  scsi: sd: Differentiate system and runtime start/stop management
  ata: libata-scsi: Disable scsi device manage_system_start_stop
  scsi: Do not attempt to rescan suspended devices
  ata: libata-scsi: Fix delayed scsi_rescan_device() execution
  ata: libata-core: Do not register PM operations for SAS ports
  scsi: sd: Do not issue commands to suspended disks on shutdown
  ata: libata-core: Fix compilation warning in ata_dev_config_ncq()
  ata: libata-eh: Fix compilation warning in ata_eh_link_report()
  scsi: Remove scsi device no_start_on_resume flag
  ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
  ata: libata-core: Synchronize ata_port_detach() with hotplug
  ata: libata-core: Detach a port devices on shutdown
  ata: libata-core: Remove ata_port_suspend_async()
  ata: libata-core: Remove ata_port_resume_async()
  ata: libata-core: Do not poweroff runtime suspended ports
  ata: libata-core: Do not resume runtime suspended ports
  ata: libata-sata: Improve ata_sas_slave_configure()
  ata: libata-eh: Improve reset error messages
  ata: libata-eh: Reduce "disable device" message verbosity
  ata: libata: Cleanup inline DMA helper functions

 drivers/ata/libata-core.c      | 242 +++++++++++++++++++++++++--------
 drivers/ata/libata-eh.c        |  76 +++++++++--
 drivers/ata/libata-sata.c      |   5 +-
 drivers/ata/libata-scsi.c      | 142 ++++++++++---------
 drivers/ata/libata-transport.c |   9 +-
 drivers/ata/libata.h           |   6 +
 drivers/firewire/sbp2.c        |   9 +-
 drivers/scsi/scsi_scan.c       |  18 ++-
 drivers/scsi/sd.c              | 102 +++++++++++---
 drivers/scsi/sd.h              |   1 +
 include/linux/libata.h         |  26 ++--
 include/scsi/scsi_device.h     |   4 +-
 include/scsi/scsi_host.h       |   2 +-
 13 files changed, 457 insertions(+), 185 deletions(-)

Comments

Geert Uytterhoeven Sept. 28, 2023, 12:26 p.m. UTC | #1
Hi Damien,

(oops, found a two-day old email still in draft)

On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@kernel.org> wrote:
> The first 9 patches of this series fix several issues with suspend/resume
> power management operations in scsi and libata. The most significant
> changes introduced are in patch 4 and 5, where the manage_start_stop
> flag of scsi devices is split into the manage_system_start_stop and
> manage_runtime_start_stop flags to allow keeping scsi runtime power
> operations for spining up/down ATA devices but have libata do its own
> system suspend/resume device power state management using EH.
>
> The remaining patches are code cleanup that do not introduce any
> significant functional change.
>
> This series was tested on qemu and on various PCs and servers. I am
> CC-ing people who recently reported issues with suspend/resume.
> Additional testing would be much appreciated.

JFTR, with current libata/for-next[*], I saw the following with
rcar-sata, once (interesting lines marked with "!"):

    PM: suspend entry (s2idle)
    Filesystems sync: 0.026 seconds
    Freezing user space processes
 !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
    Freezing user space processes completed (elapsed 0.007 seconds)
 !  ata1.00: VERIFY failed (err_mask=0x4)
    OOM killer disabled.
 !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
    Freezing remaining freezable tasks
 !  ata1.00: revalidation failed (errno=-5)
    Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
    sd 0:0:0:0: [sda] Synchronizing SCSI cache
    ata1: link resume succeeded after 1 retries
    ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    ata1.00: configured for UDMA/133
    ata1.00: Entering active power mode
    ata1.00: Entering standby power mode
    ravb e6800000.ethernet eth0: Link is Down
    Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
    OOM killer enabled.
    Restarting tasks ... done.
    random: crng reseeded on system resumption
    PM: suspend exit
    ata1: link resume succeeded after 1 retries
    ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    ata1.00: Entering active power mode
    ata1.00: configured for UDMA/133
    ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off

Regardless, the disk worked fine after resume.

Note that I saw this only once.

[*] Commit f940258b63da95f4 ("ata: libata: Cleanup inline DMA helper
functions").
    No idea which version of this series that corresponds to, adding
    Link:-tags to your commits would make it easier to track that.
    https://www.kernel.org/doc/html/latest/maintainer/configure-git.html#creating-commit-links-to-lore-kernel-org

Gr{oetje,eeting}s,

                        Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Damien Le Moal Sept. 28, 2023, 12:39 p.m. UTC | #2
On 2023/09/28 14:26, Geert Uytterhoeven wrote:
> Hi Damien,
> 
> (oops, found a two-day old email still in draft)
> 
> On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@kernel.org> wrote:
>> The first 9 patches of this series fix several issues with suspend/resume
>> power management operations in scsi and libata. The most significant
>> changes introduced are in patch 4 and 5, where the manage_start_stop
>> flag of scsi devices is split into the manage_system_start_stop and
>> manage_runtime_start_stop flags to allow keeping scsi runtime power
>> operations for spining up/down ATA devices but have libata do its own
>> system suspend/resume device power state management using EH.
>>
>> The remaining patches are code cleanup that do not introduce any
>> significant functional change.
>>
>> This series was tested on qemu and on various PCs and servers. I am
>> CC-ing people who recently reported issues with suspend/resume.
>> Additional testing would be much appreciated.
> 
> JFTR, with current libata/for-next[*], I saw the following with
> rcar-sata, once (interesting lines marked with "!"):
> 
>     PM: suspend entry (s2idle)
>     Filesystems sync: 0.026 seconds
>     Freezing user space processes
>  !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
>     Freezing user space processes completed (elapsed 0.007 seconds)
>  !  ata1.00: VERIFY failed (err_mask=0x4)
>     OOM killer disabled.
>  !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
>     Freezing remaining freezable tasks
>  !  ata1.00: revalidation failed (errno=-5)
>     Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
>     sd 0:0:0:0: [sda] Synchronizing SCSI cache
>     ata1: link resume succeeded after 1 retries
>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>     ata1.00: configured for UDMA/133
>     ata1.00: Entering active power mode
>     ata1.00: Entering standby power mode
>     ravb e6800000.ethernet eth0: Link is Down
>     Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
> PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
>     OOM killer enabled.
>     Restarting tasks ... done.
>     random: crng reseeded on system resumption
>     PM: suspend exit
>     ata1: link resume succeeded after 1 retries
>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>     ata1.00: Entering active power mode
>     ata1.00: configured for UDMA/133
>     ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
> 
> Regardless, the disk worked fine after resume.

Looks like ata-EH was still doing a resume while the suspend was on-going...
Hmm.. Will need to look at that. Regardless, I am going to send the fixes for
rc4 given that they vastly improve things as they are. But there may still be
some corner cases that are not handled.
Thanks for testing.

> 
> Note that I saw this only once.
> 
> [*] Commit f940258b63da95f4 ("ata: libata: Cleanup inline DMA helper
> functions").
>     No idea which version of this series that corresponds to, adding
>     Link:-tags to your commits would make it easier to track that.
>     https://www.kernel.org/doc/html/latest/maintainer/configure-git.html#creating-commit-links-to-lore-kernel-org
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Damien Le Moal Sept. 29, 2023, 1:37 p.m. UTC | #3
On 2023/09/28 14:26, Geert Uytterhoeven wrote:
> Hi Damien,
> 
> (oops, found a two-day old email still in draft)
> 
> On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@kernel.org> wrote:
>> The first 9 patches of this series fix several issues with suspend/resume
>> power management operations in scsi and libata. The most significant
>> changes introduced are in patch 4 and 5, where the manage_start_stop
>> flag of scsi devices is split into the manage_system_start_stop and
>> manage_runtime_start_stop flags to allow keeping scsi runtime power
>> operations for spining up/down ATA devices but have libata do its own
>> system suspend/resume device power state management using EH.
>>
>> The remaining patches are code cleanup that do not introduce any
>> significant functional change.
>>
>> This series was tested on qemu and on various PCs and servers. I am
>> CC-ing people who recently reported issues with suspend/resume.
>> Additional testing would be much appreciated.
> 
> JFTR, with current libata/for-next[*], I saw the following with
> rcar-sata, once (interesting lines marked with "!"):
> 
>     PM: suspend entry (s2idle)
>     Filesystems sync: 0.026 seconds
>     Freezing user space processes
>  !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
>     Freezing user space processes completed (elapsed 0.007 seconds)
>  !  ata1.00: VERIFY failed (err_mask=0x4)
>     OOM killer disabled.
>  !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
>     Freezing remaining freezable tasks
>  !  ata1.00: revalidation failed (errno=-5)
>     Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
>     sd 0:0:0:0: [sda] Synchronizing SCSI cache
>     ata1: link resume succeeded after 1 retries
>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>     ata1.00: configured for UDMA/133
>     ata1.00: Entering active power mode
>     ata1.00: Entering standby power mode
>     ravb e6800000.ethernet eth0: Link is Down
>     Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
> PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
>     OOM killer enabled.
>     Restarting tasks ... done.
>     random: crng reseeded on system resumption
>     PM: suspend exit
>     ata1: link resume succeeded after 1 retries
>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>     ata1.00: Entering active power mode
>     ata1.00: configured for UDMA/133
>     ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
> 
> Regardless, the disk worked fine after resume.
> 
> Note that I saw this only once.

I think I found the reason for this, but to confirm, were you doing a suspend
right after resuming the system ? If yes, that I think I exactly understand the
issue and why you saw it only once (it is a subtle race with scheduling
libata-EH suspend/resume operations). I will send a fix next week.
Geert Uytterhoeven Sept. 29, 2023, 1:56 p.m. UTC | #4
Hi Damien,

On Fri, Sep 29, 2023 at 3:37 PM Damien Le Moal <dlemoal@kernel.org> wrote:
> On 2023/09/28 14:26, Geert Uytterhoeven wrote:
> > On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@kernel.org> wrote:
> >> The first 9 patches of this series fix several issues with suspend/resume
> >> power management operations in scsi and libata. The most significant
> >> changes introduced are in patch 4 and 5, where the manage_start_stop
> >> flag of scsi devices is split into the manage_system_start_stop and
> >> manage_runtime_start_stop flags to allow keeping scsi runtime power
> >> operations for spining up/down ATA devices but have libata do its own
> >> system suspend/resume device power state management using EH.
> >>
> >> The remaining patches are code cleanup that do not introduce any
> >> significant functional change.
> >>
> >> This series was tested on qemu and on various PCs and servers. I am
> >> CC-ing people who recently reported issues with suspend/resume.
> >> Additional testing would be much appreciated.
> >
> > JFTR, with current libata/for-next[*], I saw the following with
> > rcar-sata, once (interesting lines marked with "!"):
> >
> >     PM: suspend entry (s2idle)
> >     Filesystems sync: 0.026 seconds
> >     Freezing user space processes
> >  !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
> >     Freezing user space processes completed (elapsed 0.007 seconds)
> >  !  ata1.00: VERIFY failed (err_mask=0x4)
> >     OOM killer disabled.
> >  !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
> >     Freezing remaining freezable tasks
> >  !  ata1.00: revalidation failed (errno=-5)
> >     Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
> >     sd 0:0:0:0: [sda] Synchronizing SCSI cache
> >     ata1: link resume succeeded after 1 retries
> >     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >     ata1.00: configured for UDMA/133
> >     ata1.00: Entering active power mode
> >     ata1.00: Entering standby power mode
> >     ravb e6800000.ethernet eth0: Link is Down
> >     Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
> > PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
> >     OOM killer enabled.
> >     Restarting tasks ... done.
> >     random: crng reseeded on system resumption
> >     PM: suspend exit
> >     ata1: link resume succeeded after 1 retries
> >     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >     ata1.00: Entering active power mode
> >     ata1.00: configured for UDMA/133
> >     ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
> >
> > Regardless, the disk worked fine after resume.
> >
> > Note that I saw this only once.
>
> I think I found the reason for this, but to confirm, were you doing a suspend
> right after resuming the system ? If yes, that I think I exactly understand the
> issue and why you saw it only once (it is a subtle race with scheduling
> libata-EH suspend/resume operations). I will send a fix next week.

Now you ask that, yes there was a system suspend before.

Relevant log with timing info:

    [  130.177616] PM: suspend exit
    [  130.257981] ata1: link resume succeeded after 1 retries
    [  130.376714] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
    [  130.388525] ata1.00: Entering active power mode

so the drive should have been ready here.

    [  140.452669] PM: suspend entry (s2idle)
    [  140.488313] Filesystems sync: 0.026 seconds
    [  140.515957] Freezing user space processes
    [  140.518209] ata1.00: qc timeout after 10000 msecs (cmd 0x40)
    [  140.523384] Freezing user space processes completed (elapsed
0.007 seconds)
    [  140.527718] ata1.00: VERIFY failed (err_mask=0x4)
    [  140.532541] OOM killer disabled.
    [  140.537270] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
    [  140.542069] Freezing remaining freezable tasks
    [  140.546784] ata1.00: revalidation failed (errno=-5)

Gr{oetje,eeting}s,

                        Geert
Damien Le Moal Sept. 29, 2023, 2:29 p.m. UTC | #5
On 2023/09/29 15:56, Geert Uytterhoeven wrote:
> Hi Damien,
> 
> On Fri, Sep 29, 2023 at 3:37 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>> On 2023/09/28 14:26, Geert Uytterhoeven wrote:
>>> On Tue, Sep 26, 2023 at 10:15 AM Damien Le Moal <dlemoal@kernel.org> wrote:
>>>> The first 9 patches of this series fix several issues with suspend/resume
>>>> power management operations in scsi and libata. The most significant
>>>> changes introduced are in patch 4 and 5, where the manage_start_stop
>>>> flag of scsi devices is split into the manage_system_start_stop and
>>>> manage_runtime_start_stop flags to allow keeping scsi runtime power
>>>> operations for spining up/down ATA devices but have libata do its own
>>>> system suspend/resume device power state management using EH.
>>>>
>>>> The remaining patches are code cleanup that do not introduce any
>>>> significant functional change.
>>>>
>>>> This series was tested on qemu and on various PCs and servers. I am
>>>> CC-ing people who recently reported issues with suspend/resume.
>>>> Additional testing would be much appreciated.
>>>
>>> JFTR, with current libata/for-next[*], I saw the following with
>>> rcar-sata, once (interesting lines marked with "!"):
>>>
>>>     PM: suspend entry (s2idle)
>>>     Filesystems sync: 0.026 seconds
>>>     Freezing user space processes
>>>  !  ata1.00: qc timeout after 10000 msecs (cmd 0x40)
>>>     Freezing user space processes completed (elapsed 0.007 seconds)
>>>  !  ata1.00: VERIFY failed (err_mask=0x4)
>>>     OOM killer disabled.
>>>  !  ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
>>>     Freezing remaining freezable tasks
>>>  !  ata1.00: revalidation failed (errno=-5)
>>>     Freezing remaining freezable tasks completed (elapsed 0.002 seconds)
>>>     sd 0:0:0:0: [sda] Synchronizing SCSI cache
>>>     ata1: link resume succeeded after 1 retries
>>>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>     ata1.00: configured for UDMA/133
>>>     ata1.00: Entering active power mode
>>>     ata1.00: Entering standby power mode
>>>     ravb e6800000.ethernet eth0: Link is Down
>>>     Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: attached
>>> PHY driver (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=136)
>>>     OOM killer enabled.
>>>     Restarting tasks ... done.
>>>     random: crng reseeded on system resumption
>>>     PM: suspend exit
>>>     ata1: link resume succeeded after 1 retries
>>>     ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>     ata1.00: Entering active power mode
>>>     ata1.00: configured for UDMA/133
>>>     ravb e6800000.ethernet eth0: Link is Up - 1Gbps/Full - flow control off
>>>
>>> Regardless, the disk worked fine after resume.
>>>
>>> Note that I saw this only once.
>>
>> I think I found the reason for this, but to confirm, were you doing a suspend
>> right after resuming the system ? If yes, that I think I exactly understand the
>> issue and why you saw it only once (it is a subtle race with scheduling
>> libata-EH suspend/resume operations). I will send a fix next week.
> 
> Now you ask that, yes there was a system suspend before.
> 
> Relevant log with timing info:
> 
>     [  130.177616] PM: suspend exit
>     [  130.257981] ata1: link resume succeeded after 1 retries
>     [  130.376714] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>     [  130.388525] ata1.00: Entering active power mode
> 
> so the drive should have been ready here.

yep.

> 
>     [  140.452669] PM: suspend entry (s2idle)

Then suspend again 10s later.

>     [  140.488313] Filesystems sync: 0.026 seconds
>     [  140.515957] Freezing user space processes
>     [  140.518209] ata1.00: qc timeout after 10000 msecs (cmd 0x40)
>     [  140.523384] Freezing user space processes completed (elapsed
> 0.007 seconds)
>     [  140.527718] ata1.00: VERIFY failed (err_mask=0x4)

But that verify sent 10s earlier to spinup the drive failed... Hmmm... That is
not exactly what I was thinking of. While the race between scheduling suspend
while resume is still on-going does exist, it is likely not what is happening
here given the time interval with suspend entry. Need to dig further.
Do you perhaps have "Power-up in Standby" (PUIS) enabled on that drive ?

>     [  140.532541] OOM killer disabled.
>     [  140.537270] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x40)
>     [  140.542069] Freezing remaining freezable tasks
>     [  140.546784] ata1.00: revalidation failed (errno=-5)
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
>