[0/3,SRU,ARTFUL] Fix deadlock on task switches with new microcode

Message ID 1522907497-14743-1-git-send-email-tyhicks@canonical.com

Message

Tyler Hicks April 5, 2018, 5:51 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1759920

[Impact]

Some systems experience kernel lockups after updating to the latest
intel-microcode package or when receiving updated microcode from a BIOS update.

In many cases, the lockups occur before users can reach the login screen, which
makes them very difficult to debug or work around.

[Test Case]

The most reliable test case currently known is to install the sssd package.
Lockups may occur during package installation (disable IBPB by writing 0 to
/proc/sys/kernel/ibpb_enabled to prevent this from happening). A lockup will
most likely occur just after booting the system up as the lock screen is
displayed.
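
As an illustration of the workaround in the parenthetical above, here is a
minimal sketch of writing 0 to the sysctl file before installing sssd. The
helper name set_ibpb() is hypothetical, the file only exists on kernels that
expose ibpb_enabled, and writing to the real /proc path requires root.

```python
from pathlib import Path

# Path taken from the test case text. It is only present on kernels that
# expose the ibpb_enabled sysctl, so its existence is an assumption here.
IBPB_SYSCTL = Path("/proc/sys/kernel/ibpb_enabled")


def set_ibpb(value: int, path: Path = IBPB_SYSCTL) -> str:
    """Write `value` to the sysctl file and return the value read back.

    Hypothetical helper for illustration only; not part of any package.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"{path} is not present on this kernel")
    # Writing to the real /proc file needs root privileges.
    path.write_text(f"{value}\n")
    return path.read_text().strip()


# Example use (as root), before installing sssd:
#     set_ibpb(0)
```

The setting is not persistent, so after a reboot IBPB is back at its default
and the lockup at the lock screen can still be reproduced.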

[Regression Potential]

The fix is in the task switching code of the kernel, so the complexity of the
change is relatively high.

[Other Information]

The third patch fixes what I think was an incomplete backport of 72be211ba.
That commit added the initialize_tlbstate_and_flush() function but then never
added any callers of that function.

I was hopeful that the third patch would fix a resume from hibernation/sleep
bug (LP: #1748393) but one tester reported that it did not have an effect.

Tyler

Comments

Stefan Bader April 5, 2018, 7:04 a.m. UTC | #1
On 05.04.2018 07:51, Tyler Hicks wrote:
> [Other Information]
> 
> The third patch fixes what I think was an incomplete backport of 72be211ba.
> That commit added the initialize_tlbstate_and_flush() function but then never
> added any callers of that function.
> 
> I was hopeful that the third patch would fix a resume from hibernation/sleep
> bug (LP: #1748393) but one tester reported that it did not have an effect.

IMO the third patch in this set does not belong in this submission. Patches with
different bug references / CVE numbers should always be submitted as their own
thread.

About 1-2, the revert should include the same CVE (or BugLink if there were one)
as the follow-up replacement. Only then will both be grouped together in the
changelog. So in this case both patches should actually carry both. This (adding
the CVE line to the revert and the BugLink to the replacement) can be done when
we apply it, so there is no need to re-send unless I missed something else.

-Stefan
Kleber Sacilotto de Souza April 5, 2018, 10:54 a.m. UTC | #2
With the same comments as Stefan:

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
Kleber Sacilotto de Souza April 5, 2018, 12:22 p.m. UTC | #3
Applied patches 1/3 and 2/3 to the artful/master-next branch, adding the CVE
reference to patch 1/3 and the BugLink to patch 2/3.

Thanks,
Kleber
Stefan Bader April 5, 2018, 12:41 p.m. UTC | #4
On 05.04.2018 07:51, Tyler Hicks wrote:
> [Other Information]
> 
> The third patch fixes what I think was an incomplete backport of 72be211ba.
> That commit added the initialize_tlbstate_and_flush() function but then never
> added any callers of that function.
> 
> I was hopeful that the third patch would fix a resume from hibernation/sleep
> bug (LP: #1748393) but one tester reported that it did not have an effect.

After some more discussion, in which it was pointed out that 2/3 adds some init
code to the function that currently has no callers (and also that 3/3 is related
to both CVEs), I applied 3/3 to master-next as well.

-Stefan
