
[0/1,SRU,OEM-5.14] Fix system hang during suspend and resume with atlantic nic

Message ID 20220609055451.1805026-1-koba.ko@canonical.com

Message

Koba Ko June 9, 2022, 5:54 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1978058

[Impact]
The system hangs during suspend and resume.

[Fix]
On resume, this regression has the same impact that I saw on thaw:
the kernel hangs, and nothing can be done except rebooting via SysRq.

Fixes regression in commit cbe6c3a8f8f4 ("net: atlantic: invert deep
par in pm functions, preventing null derefs"), where I disabled deep
pm resets in suspend and resume, trying to make sense of the
atl_resume_common() deep parameter in the first place.

It turns out that atlantic always has to deep reset on pm
operations. Even though I expected that and tested resume, I screwed
up by kexec-rebooting into an unpatched kernel, thus missing the
breakage.

This fixup obsoletes the deep parameter of atl_resume_common, but I
leave the cleanup for the maintainers to post to mainline.

Suspend and hibernation were successfully tested by the reporters.
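
The rule described above (every pm operation performs a deep reset) can be
illustrated with a small user-space model. This is only a sketch under
stated assumptions, not the driver code: struct model_nic,
model_pm_common(), model_pm_suspend() and model_pm_resume() are
hypothetical names, and the hang is modelled as an error return of -1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device state; not the driver's real struct. */
struct model_nic {
    bool running;          /* true while the interface is up */
    int deep_reset_count;  /* number of PM operations run with deep == true */
};

/*
 * Assumption taken from the commit message: a PM operation without a deep
 * reset leaves the device unusable. The model reports that as -1 instead
 * of hanging.
 */
static int model_pm_common(struct model_nic *nic, bool deep, bool run_after)
{
    assert(nic != NULL);
    if (!deep)
        return -1;              /* models the hang seen on resume and thaw */
    nic->deep_reset_count++;
    nic->running = run_after;
    return 0;
}

/* Callbacks as they might look after the fix: deep is always true. */
static int model_pm_suspend(struct model_nic *nic)
{
    return model_pm_common(nic, true, false);
}

static int model_pm_resume(struct model_nic *nic)
{
    return model_pm_common(nic, true, true);
}
```

In this model, calling model_pm_suspend() and then model_pm_resume() on a
running device returns 0 both times and leaves the device running, while
any call that passes deep == false fails.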

[Test Case]
1. Suspend the machine.
2. Wake up the machine and check that the system still works.

[Where problems could occur]
Low. The change touches only two lines in
drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c.

Manuel Ullmann (1):
  net: atlantic: always deep reset on pm op, fixing up my null deref
    regression

 drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Timo Aaltonen June 9, 2022, 7:54 a.m. UTC | #1
This came via stable backports.
Koba Ko June 9, 2022, 7:59 a.m. UTC | #2
On Thu, Jun 9, 2022 at 3:55 PM Timo Aaltonen <tjaalton@ubuntu.com> wrote:
> this came via stable backports
I mentioned this.
How should I handle it if it came via stable backports?
Timo Aaltonen June 9, 2022, 8:08 a.m. UTC | #3
Koba Ko wrote on 9.6.2022 at 10.59:
> On Thu, Jun 9, 2022 at 3:55 PM Timo Aaltonen <tjaalton@ubuntu.com> wrote:
>> this came via stable backports
> I mentioned this.
> how could i manage if it came via stable backports?

I meant that this commit is already applied in my local tree as part of
bug 1977835, which covers the stable backports from 5.15.38..5.15.45 :)
Koba Ko June 9, 2022, 8:51 a.m. UTC | #4
On Thu, Jun 9, 2022 at 4:11 PM Timo Aaltonen <tjaalton@ubuntu.com> wrote:
> I meant that this commit is already applied on my local tree as part of
> bug 1977835 which is for stable backports from 5.15.38..5.15.45 :)
Thanks