
[Xenial,SRU,CVE-2018-20784,0/1] fix infinite loop

Message ID 20190927185450.29493-1-connor.kuehl@canonical.com

Message

Connor Kuehl Sept. 27, 2019, 6:54 p.m. UTC
https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html

From the link above:

	"In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
	cfs_rq's, which allows attackers to cause a denial of service (infinite
	loop in update_blocked_averages) or possibly have unspecified other impact
	by inducing a high load."

Note, this fix reverts another patch that was specifically SRU'd in to
Xenial: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896

In the hopes of avoiding a trade of 1 regression for another, I did a bit of an
A/B test to see if I could experience any blatant issues.

I booted Xenial in a 64 bit VM twice. The first time was without this
CVE backport applied. The second time was with it applied. I ran the
reproducer in both cases and experienced the same CPU utilization (both
cores I allocated to my VM were at 100%) and in both cases I experienced
stable memory pressure. They would both hover around 120MB +/- 3-5MB.

The primary difference between the two runs was where I'd watch the
cfs_rqs:

WITHOUT the CVE backport: the cfs_rqs fluctuated between 13-18

WITH the CVE backport: the cfs_rqs started around 65, then floated down
to 61.
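
For anyone wanting to repeat this comparison, the following is a minimal sketch of one way to count cfs_rqs. It assumes the counts come from /proc/sched_debug and that each cfs_rq is introduced by a header line of the form "cfs_rq[<cpu>]:<cgroup path>"; both the data source and the helper name count_cfs_rqs are assumptions for illustration, not a description of how the numbers above were collected.

```python
import re
from collections import Counter

# Assumed header format for each cfs_rq in a sched_debug dump,
# e.g. "cfs_rq[0]:/user.slice". Verify against the kernel under test.
CFS_RQ_HEADER = re.compile(r"^cfs_rq\[(\d+)\]:", re.MULTILINE)


def count_cfs_rqs(sched_debug_text):
    """Return (total, per_cpu) where per_cpu maps CPU number to cfs_rq count."""
    per_cpu = Counter(
        int(match.group(1)) for match in CFS_RQ_HEADER.finditer(sched_debug_text)
    )
    return sum(per_cpu.values()), per_cpu


# Example use on a live system (path is an assumption; it may differ
# between kernel versions):
#   with open("/proc/sched_debug") as f:
#       total, per_cpu = count_cfs_rqs(f.read())
```

Sampling this once a second while the reproducer runs would show whether the total stays flat or keeps growing.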

If there are more tests that anyone would like to see performed before
we settle on a decision for this backport, please let me know. I'm happy
to do it.

- Connor

Linus Torvalds (1):
  sched/fair: Fix infinite loop in update_blocked_averages() by
    reverting a9e7f6544b9c

 kernel/sched/fair.c | 44 ++++++++++----------------------------------
 1 file changed, 10 insertions(+), 34 deletions(-)

Comments

Kamal Mostafa Sept. 30, 2019, 3:34 p.m. UTC | #1
Port looks good to me.

Acked-by: Kamal Mostafa <kamal@canonical.com>

 -Kamal

Tyler Hicks Sept. 30, 2019, 3:39 p.m. UTC | #2
On 2019-09-27 11:54:49, Connor Kuehl wrote:
> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
> 
> From the link above:
> 
> 	"In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> 	cfs_rq's, which allows attackers to cause a denial of service (infinite
> 	loop in update_blocked_averages) or possibly have unspecified other impact
> 	by inducing a high load."
> 
> Note, this fix reverts another patch that was specifically SRU'd in to
> Xenial: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896

Let's skip this one for SRU cycle 2019.09.30 since I think we need to
think a little more about reverting something that was specifically
SRU'ed.

Tyler

Andrea Righi Sept. 30, 2019, 3:46 p.m. UTC | #3
On Fri, Sep 27, 2019 at 11:54:49AM -0700, Connor Kuehl wrote:
> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
> 
> From the link above:
> 
> 	"In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> 	cfs_rq's, which allows attackers to cause a denial of service (infinite
> 	loop in update_blocked_averages) or possibly have unspecified other impact
> 	by inducing a high load."
> 
> Note, this fix reverts another patch that was specifically SRU'd in to
> Xenial: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
> 
> In the hopes of avoiding a trade of 1 regression for another, I did a bit of an
> A/B test to see if I could experience any blatant issues.
> 
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB +/- 3-5MB.
> 
> The primary difference between the two runs was where I'd watch the
> cfs_rqs:
> 
> WITHOUT the CVE backport: the cfs_rqs fluctuated between 13-18
> 
> WITH the CVE backport: the cfs_rqs started around 65, then floated down
> to 61.
> 
> If there are more tests that anyone would like to see performed before
> we settle on a decision for this backport, please let me know. I'm happy
> to do it.
> 
> - Connor
> 
> Linus Torvalds (1):
>   sched/fair: Fix infinite loop in update_blocked_averages() by
>     reverting a9e7f6544b9c
> 
>  kernel/sched/fair.c | 44 ++++++++++----------------------------------
>  1 file changed, 10 insertions(+), 34 deletions(-)

It seems reasonable to apply this and the backport looks good.

Acked-by: Andrea Righi <andrea.righi@canonical.com>
Tyler Hicks Oct. 11, 2019, 4:02 p.m. UTC | #4
[+Gavin since he SRU'ed the fix for 1747896. Let's keep him cc'ed on any
 resubmissions.]

On 2019-09-27 11:54:49, Connor Kuehl wrote:
> https://people.canonical.com/~ubuntu-security/cve/2018/CVE-2018-20784.html
> 
> From the link above:
> 
> 	"In the Linux kernel before 4.20.2, kernel/sched/fair.c mishandles leaf
> 	cfs_rq's, which allows attackers to cause a denial of service (infinite
> 	loop in update_blocked_averages) or possibly have unspecified other impact
> 	by inducing a high load."
> 
> Note, this fix reverts another patch that was specifically SRU'd in to
> Xenial: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1747896
> 
> In the hopes of avoiding a trade of 1 regression for another, I did a bit of an
> A/B test to see if I could experience any blatant issues.
> 
> I booted Xenial in a 64 bit VM twice. The first time was without this
> CVE backport applied. The second time was with it applied. I ran the
> reproducer in both cases and experienced the same CPU utilization (both
> cores I allocated to my VM were at 100%) and in both cases I experienced
> stable memory pressure. They would both hover around 120MB +/- 3-5MB.
> 
> The primary difference between the two runs was where I'd watch the
> cfs_rqs:
> 
> WITHOUT the CVE backport: the cfs_rqs fluctuated between 13-18
> 
> WITH the CVE backport: the cfs_rqs started around 65, then floated down
> to 61.

This seems to indicate that a portion of the [Impact] section from bug
1747896 is reintroduced. Specifically, "Also, the OOM happens because of
the decayed cfs_rqs are not released."

I didn't look too closely but I suspect that we need the equivalent of
this patch sequence to fix CVE-2018-20784 without reintroducing bug
1747896:

 039ae8bcf7a5 sched/fair: Fix O(nr_cgroups) in the load balancing path
 31bc6aeaab1d sched/fair: Optimize update_blocked_averages()
 f6783319737f sched/fair: Fix insertion in rq->leaf_cfs_rq_list
 c40f7d74c741 sched/fair: Fix infinite loop in update_blocked_averages() by reverting a9e7f6544b9c
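
A minimal sketch of how one might check which commits of such a sequence a tree already carries. It assumes the backports record their upstream origin with a "(cherry picked from commit <sha>)" or "(backported from commit <sha>)" line in the commit message; the helper name missing_upstream_commits is hypothetical.

```python
import re

# Provenance lines assumed to be present in backported commit messages.
PROVENANCE = re.compile(
    r"\((?:cherry picked|backported) from commit ([0-9a-f]{7,40})\)"
)


def missing_upstream_commits(git_log_text, wanted):
    """Return the entries of `wanted` with no provenance line in the log text.

    `wanted` holds (possibly abbreviated) upstream SHAs; a match is a
    prefix match in either direction.
    """
    found = PROVENANCE.findall(git_log_text)
    return [
        sha
        for sha in wanted
        if not any(f.startswith(sha) or sha.startswith(f) for f in found)
    ]
```

The log text could come from `git log --format=%B` over the range of interest, with the four abbreviated SHAs above passed as `wanted`. A commit that was backported without a provenance line would be reported as missing, so treat the result as a hint only.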

Tyler

Gavin Guo Oct. 26, 2019, 9:09 a.m. UTC | #5
Hi Connor/Tyler,

Thank you for notifying me. We just had a case related to
CVE-2018-20784.

I reviewed the patch set and found v4.15 also reverts the patch
a9e7f6544b9c:

cf740dd81381 sched/fair: Fix infinite loop in update_blocked_averages() by
reverting a9e7f6544b9c

$ git describe --contains cf740dd81381
Ubuntu-4.15.0-59.66~3542

and didn't include the ones just backported for Xenial:

v5.1-rc2:
039ae8bcf7a5 sched/fair: Fix O(nr_cgroups) in the load balancing path
31bc6aeaab1d sched/fair: Optimize update_blocked_averages()
f6783319737f sched/fair: Fix insertion in rq->leaf_cfs_rq_list
5d299eabea5a sched/fair: Add tmp_alone_branch assertion

So, that means the regression from LP 1747896 is happening on the v4.15
Bionic kernel.

Would it be possible to apply this patch set to Bionic as well? That
would make sure v4.15 is correct.

The following patch isn't needed for Bionic v4.15 as it was merged in
v4.10-rc1.

9c2791f936ef sched/fair: Fix hierarchical order in rq->leaf_cfs_rq_list
$ git describe --contains 9c2791f936ef5fd04a118b5c284f2c9a95f4a647
v4.10-rc1~189^2~27
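
As a small illustration of reading such output: the part before the first "~" or "^" is the tag that contains the commit, and the rest is the ancestry path back from that tag. A minimal sketch, with the helper name containing_tag being hypothetical:

```python
import re


def containing_tag(describe_output):
    """Strip the ~N / ^N ancestry suffix from `git describe --contains` output."""
    name = describe_output.strip()
    return re.split(r"[~^]", name, maxsplit=1)[0]


# containing_tag("v4.10-rc1~189^2~27")       -> "v4.10-rc1"
# containing_tag("Ubuntu-4.15.0-59.66~3542") -> "Ubuntu-4.15.0-59.66"
```

This assumes the tag name itself contains no "~" or "^", which git does not allow in ref names.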

