mbox series

[SRU,Focal,0/3] LP: #1996678 - containerd regressions

Message ID 20221123174836.1973558-1-cascardo@canonical.com
Headers show
Series LP: #1996678 - containerd regressions | expand

Message

Thadeu Lima de Souza Cascardo Nov. 23, 2022, 5:48 p.m. UTC
[Impact]
Multiple users and partners have reported containerd/k8s/runc failures.
These manifest as containers being in the unknown state and failures
to communicate with containerd (response timeouts).

[Fixes]
Some reports attribute the failure to epoll, in particular commit a16ceb139610.
Others have attribute containerd failures to splice commit 97ef77c52b78.

The epoll commit was applied upstream after many other fixes were applied to
epoll. Those other fixes prevent race conditions between epoll_pwait timing out
and wakeups being missed. a16ceb139610 upstream would make such races produce
extra events, not miss events. But on current 5.4 kernel, specially without
commit 289caf5d8f6c, this would result in events more likely being missed.

Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and not
reverting a16ceb139610, the number of responses timeouts compared to either
Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610) go
down.

Also, 5.15 carry some epoll kselftests and at least epoll61 test passes once
those fixes are applied, and no other regressions in that test suite are found.
That particular test is demonstrating the described race condition from commit
289caf5d8f6c.

97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y as it
depended on a lot of other changes that have not been backported or targeted to
5.4. In particular, the reasoning upstream was NFS regressions.

In light of all this, even though these fixes may not be necessary to fix the
reported containerd bugs, they should go into all 5.4 kernels anyway as they
fix other real bugs.

[URGENCY]
Applying these fixes as soon as possible is in line with a strategy of getting
fixes out so they can be tested and verified, particularly for those users who
require offical Ubuntu builds. In case these cause more harm or do not fix the
regressions, they can be reverted and better fixes applied instead.


Roman Penyaev (1):
  epoll: call final ep_events_available() check under the lock

Sasha Levin (1):
  Revert "fs: check FMODE_LSEEK to control internal pipe splicing"

Soheil Hassas Yeganeh (1):
  epoll: check for events when removing a timed out thread from the wait
    queue

 fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
 fs/splice.c    | 10 +++++---
 2 files changed, 48 insertions(+), 31 deletions(-)

Comments

Luke Nowakowski-Krijger Nov. 23, 2022, 7:25 p.m. UTC | #1
Acked-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger@canonical.com>

On Wed, Nov 23, 2022 at 9:50 AM Thadeu Lima de Souza Cascardo <
cascardo@canonical.com> wrote:

> [Impact]
> Multiple users and partners have reported containerd/k8s/runc failures.
> These manifest as containers being in the unknown state and failures
> to communicate with containerd (response timeouts).
>
> [Fixes]
> Some reports attribute the failure to epoll, in particular commit
> a16ceb139610.
> Others have attribute containerd failures to splice commit 97ef77c52b78.
>
> The epoll commit was applied upstream after many other fixes were applied
> to
> epoll. Those other fixes prevent race conditions between epoll_pwait
> timing out
> and wakeups being missed. a16ceb139610 upstream would make such races
> produce
> extra events, not miss events. But on current 5.4 kernel, specially without
> commit 289caf5d8f6c, this would result in events more likely being missed.
>
> Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and
> not
> reverting a16ceb139610, the number of responses timeouts compared to either
> Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610)
> go
> down.
>
> Also, 5.15 carry some epoll kselftests and at least epoll61 test passes
> once
> those fixes are applied, and no other regressions in that test suite are
> found.
> That particular test is demonstrating the described race condition from
> commit
> 289caf5d8f6c.
>
> 97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y
> as it
> depended on a lot of other changes that have not been backported or
> targeted to
> 5.4. In particular, the reasoning upstream was NFS regressions.
>
> In light of all this, even though these fixes may not be necessary to fix
> the
> reported containerd bugs, they should go into all 5.4 kernels anyway as
> they
> fix other real bugs.
>
> [URGENCY]
> Applying these fixes as soon as possible is in line with a strategy of
> getting
> fixes out so they can be tested and verified, particularly for those users
> who
> require offical Ubuntu builds. In case these cause more harm or do not fix
> the
> regressions, they can be reverted and better fixes applied instead.
>
>
> Roman Penyaev (1):
>   epoll: call final ep_events_available() check under the lock
>
> Sasha Levin (1):
>   Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
>
> Soheil Hassas Yeganeh (1):
>   epoll: check for events when removing a timed out thread from the wait
>     queue
>
>  fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
>  fs/splice.c    | 10 +++++---
>  2 files changed, 48 insertions(+), 31 deletions(-)
>
> --
> 2.34.1
>
>
> --
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
>
Khalid Elmously Nov. 23, 2022, 7:29 p.m. UTC | #2
On 2022-11-23 14:48:33 , Thadeu Lima de Souza Cascardo wrote:
> [Impact]
> Multiple users and partners have reported containerd/k8s/runc failures.
> These manifest as containers being in the unknown state and failures
> to communicate with containerd (response timeouts).
> 
> [Fixes]
> Some reports attribute the failure to epoll, in particular commit a16ceb139610.
> Others have attribute containerd failures to splice commit 97ef77c52b78.
> 
> The epoll commit was applied upstream after many other fixes were applied to
> epoll. Those other fixes prevent race conditions between epoll_pwait timing out
> and wakeups being missed. a16ceb139610 upstream would make such races produce
> extra events, not miss events. But on current 5.4 kernel, specially without
> commit 289caf5d8f6c, this would result in events more likely being missed.
> 
> Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and not
> reverting a16ceb139610, the number of responses timeouts compared to either
> Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610) go
> down.
> 
> Also, 5.15 carry some epoll kselftests and at least epoll61 test passes once
> those fixes are applied, and no other regressions in that test suite are found.
> That particular test is demonstrating the described race condition from commit
> 289caf5d8f6c.
> 
> 97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y as it
> depended on a lot of other changes that have not been backported or targeted to
> 5.4. In particular, the reasoning upstream was NFS regressions.
> 
> In light of all this, even though these fixes may not be necessary to fix the
> reported containerd bugs, they should go into all 5.4 kernels anyway as they
> fix other real bugs.
> 
> [URGENCY]
> Applying these fixes as soon as possible is in line with a strategy of getting
> fixes out so they can be tested and verified, particularly for those users who
> require offical Ubuntu builds. In case these cause more harm or do not fix the
> regressions, they can be reverted and better fixes applied instead.
> 
> 
> Roman Penyaev (1):
>   epoll: call final ep_events_available() check under the lock
> 
> Sasha Levin (1):
>   Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
> 
> Soheil Hassas Yeganeh (1):
>   epoll: check for events when removing a timed out thread from the wait
>     queue
> 
>  fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
>  fs/splice.c    | 10 +++++---
>  2 files changed, 48 insertions(+), 31 deletions(-)
> 
> -- 
> 2.34.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Khalid Elmously Nov. 23, 2022, 7:31 p.m. UTC | #3
Adding my signoff properly:

Acked-by: Khalid Elmously <khalid.elmously@canonical.com>


On 2022-11-23 14:29:49 , Khaled Elmously wrote:
> On 2022-11-23 14:48:33 , Thadeu Lima de Souza Cascardo wrote:
> > [Impact]
> > Multiple users and partners have reported containerd/k8s/runc failures.
> > These manifest as containers being in the unknown state and failures
> > to communicate with containerd (response timeouts).
> > 
> > [Fixes]
> > Some reports attribute the failure to epoll, in particular commit a16ceb139610.
> > Others have attribute containerd failures to splice commit 97ef77c52b78.
> > 
> > The epoll commit was applied upstream after many other fixes were applied to
> > epoll. Those other fixes prevent race conditions between epoll_pwait timing out
> > and wakeups being missed. a16ceb139610 upstream would make such races produce
> > extra events, not miss events. But on current 5.4 kernel, specially without
> > commit 289caf5d8f6c, this would result in events more likely being missed.
> > 
> > Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and not
> > reverting a16ceb139610, the number of responses timeouts compared to either
> > Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610) go
> > down.
> > 
> > Also, 5.15 carry some epoll kselftests and at least epoll61 test passes once
> > those fixes are applied, and no other regressions in that test suite are found.
> > That particular test is demonstrating the described race condition from commit
> > 289caf5d8f6c.
> > 
> > 97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y as it
> > depended on a lot of other changes that have not been backported or targeted to
> > 5.4. In particular, the reasoning upstream was NFS regressions.
> > 
> > In light of all this, even though these fixes may not be necessary to fix the
> > reported containerd bugs, they should go into all 5.4 kernels anyway as they
> > fix other real bugs.
> > 
> > [URGENCY]
> > Applying these fixes as soon as possible is in line with a strategy of getting
> > fixes out so they can be tested and verified, particularly for those users who
> > require offical Ubuntu builds. In case these cause more harm or do not fix the
> > regressions, they can be reverted and better fixes applied instead.
> > 
> > 
> > Roman Penyaev (1):
> >   epoll: call final ep_events_available() check under the lock
> > 
> > Sasha Levin (1):
> >   Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
> > 
> > Soheil Hassas Yeganeh (1):
> >   epoll: check for events when removing a timed out thread from the wait
> >     queue
> > 
> >  fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
> >  fs/splice.c    | 10 +++++---
> >  2 files changed, 48 insertions(+), 31 deletions(-)
> > 
> > -- 
> > 2.34.1
> > 
> > 
> > -- 
> > kernel-team mailing list
> > kernel-team@lists.ubuntu.com
> > https://lists.ubuntu.com/mailman/listinfo/kernel-team
Kamal Mostafa Nov. 23, 2022, 7:41 p.m. UTC | #4
Acked-by: Kamal Mostafa <kamal@canonical.com>

LGTM.

 -Kamal

On Wed, Nov 23, 2022 at 9:50 AM Thadeu Lima de Souza Cascardo <
cascardo@canonical.com> wrote:

> [Impact]
> Multiple users and partners have reported containerd/k8s/runc failures.
> These manifest as containers being in the unknown state and failures
> to communicate with containerd (response timeouts).
>
> [Fixes]
> Some reports attribute the failure to epoll, in particular commit
> a16ceb139610.
> Others have attribute containerd failures to splice commit 97ef77c52b78.
>
> The epoll commit was applied upstream after many other fixes were applied
> to
> epoll. Those other fixes prevent race conditions between epoll_pwait
> timing out
> and wakeups being missed. a16ceb139610 upstream would make such races
> produce
> extra events, not miss events. But on current 5.4 kernel, specially without
> commit 289caf5d8f6c, this would result in events more likely being missed.
>
> Some tests have shown that when applying 65759097d804 and 289caf5d8f6c and
> not
> reverting a16ceb139610, the number of responses timeouts compared to either
> Ubuntu-5.4.0-131 (without a16ceb139610) and 5.4.0-132 (with a16ceb139610)
> go
> down.
>
> Also, 5.15 carry some epoll kselftests and at least epoll61 test passes
> once
> those fixes are applied, and no other regressions in that test suite are
> found.
> That particular test is demonstrating the described race condition from
> commit
> 289caf5d8f6c.
>
> 97ef77c52b78, on the other hand, has also been reverted on upstream 5.4.y
> as it
> depended on a lot of other changes that have not been backported or
> targeted to
> 5.4. In particular, the reasoning upstream was NFS regressions.
>
> In light of all this, even though these fixes may not be necessary to fix
> the
> reported containerd bugs, they should go into all 5.4 kernels anyway as
> they
> fix other real bugs.
>
> [URGENCY]
> Applying these fixes as soon as possible is in line with a strategy of
> getting
> fixes out so they can be tested and verified, particularly for those users
> who
> require offical Ubuntu builds. In case these cause more harm or do not fix
> the
> regressions, they can be reverted and better fixes applied instead.
>
>
> Roman Penyaev (1):
>   epoll: call final ep_events_available() check under the lock
>
> Sasha Levin (1):
>   Revert "fs: check FMODE_LSEEK to control internal pipe splicing"
>
> Soheil Hassas Yeganeh (1):
>   epoll: check for events when removing a timed out thread from the wait
>     queue
>
>  fs/eventpoll.c | 69 ++++++++++++++++++++++++++++++--------------------
>  fs/splice.c    | 10 +++++---
>  2 files changed, 48 insertions(+), 31 deletions(-)
>
> --
> 2.34.1
>
>
> --
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
>
Thadeu Lima de Souza Cascardo Nov. 23, 2022, 8:03 p.m. UTC | #5
Applied to an Ubuntu kernel coming up soon!

Thanks.
Cascardo.