[v3,bpf-next,0/4] cgroup bpf auto-detachment

Message ID 20190523194532.2376233-1-guro@fb.com

Message

Roman Gushchin May 23, 2019, 7:45 p.m. UTC
This patchset implements a cgroup bpf auto-detachment functionality:
bpf programs are detached as soon as possible after removal of the
cgroup, without waiting for the release of all associated resources.

Patches 2 and 3 are required to implement a corresponding kselftest
in patch 4.

v3:
  1) some minor changes and typo fixes

v2:
  1) removed a bogus check in patch 4
  2) moved buf[len] = 0 in patch 2


Roman Gushchin (4):
  bpf: decouple the lifetime of cgroup_bpf from cgroup itself
  selftests/bpf: convert test_cgrp2_attach2 example into kselftest
  selftests/bpf: enable all available cgroup v2 controllers
  selftests/bpf: add auto-detach test

 include/linux/bpf-cgroup.h                    |   8 +-
 include/linux/cgroup.h                        |  18 +++
 kernel/bpf/cgroup.c                           |  25 ++-
 kernel/cgroup/cgroup.c                        |  11 +-
 samples/bpf/Makefile                          |   2 -
 tools/testing/selftests/bpf/Makefile          |   4 +-
 tools/testing/selftests/bpf/cgroup_helpers.c  |  57 +++++++
 .../selftests/bpf/test_cgroup_attach.c        | 146 ++++++++++++++++--
 8 files changed, 243 insertions(+), 28 deletions(-)
 rename samples/bpf/test_cgrp2_attach2.c => tools/testing/selftests/bpf/test_cgroup_attach.c (79%)

Comments

Alexei Starovoitov May 24, 2019, 9:03 p.m. UTC | #1
On Thu, May 23, 2019 at 12:45:28PM -0700, Roman Gushchin wrote:
> This patchset implements a cgroup bpf auto-detachment functionality:
> bpf programs are detached as soon as possible after removal of the
> cgroup, without waiting for the release of all associated resources.

The idea looks great, but doesn't quite work:

$ ./test_cgroup_attach
#override:PASS
[   66.475219] BUG: sleeping function called from invalid context at ../include/linux/percpu-rwsem.h:34
[   66.476095] in_atomic(): 1, irqs_disabled(): 0, pid: 21, name: ksoftirqd/2
[   66.476706] CPU: 2 PID: 21 Comm: ksoftirqd/2 Not tainted 5.2.0-rc1-00211-g1861420d0162 #1564
[   66.477595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
[   66.478360] Call Trace:
[   66.478591]  dump_stack+0x5b/0x8b
[   66.478892]  ___might_sleep+0x22f/0x290
[   66.479230]  cpus_read_lock+0x18/0x50
[   66.479550]  static_key_slow_dec+0x41/0x70
[   66.479914]  cgroup_bpf_release+0x1a6/0x400
[   66.480285]  percpu_ref_switch_to_atomic_rcu+0x203/0x330
[   66.480754]  rcu_core+0x475/0xcc0
[   66.481047]  ? switch_mm_irqs_off+0x684/0xa40
[   66.481422]  ? rcu_note_context_switch+0x260/0x260
[   66.481842]  __do_softirq+0x1cf/0x5ff
[   66.482174]  ? takeover_tasklets+0x5f0/0x5f0
[   66.482542]  ? smpboot_thread_fn+0xab/0x780
[   66.482911]  run_ksoftirqd+0x1a/0x40
[   66.483225]  smpboot_thread_fn+0x3ad/0x780
[   66.483583]  ? sort_range+0x20/0x20
[   66.483894]  ? __kthread_parkme+0xb0/0x190
[   66.484253]  ? sort_range+0x20/0x20
[   66.484562]  ? sort_range+0x20/0x20
[   66.484878]  kthread+0x2e2/0x3e0
[   66.485166]  ? kthread_create_worker_on_cpu+0xb0/0xb0
[   66.485620]  ret_from_fork+0x1f/0x30

Same test runs fine before the patches.

Roman Gushchin May 24, 2019, 9:40 p.m. UTC | #2
On Fri, May 24, 2019 at 02:03:23PM -0700, Alexei Starovoitov wrote:
> On Thu, May 23, 2019 at 12:45:28PM -0700, Roman Gushchin wrote:
> > This patchset implements a cgroup bpf auto-detachment functionality:
> > bpf programs are detached as soon as possible after removal of the
> > cgroup, without waiting for the release of all associated resources.
> 
> The idea looks great, but doesn't quite work:
> 
> $ ./test_cgroup_attach
> #override:PASS
> [   66.475219] BUG: sleeping function called from invalid context at ../include/linux/percpu-rwsem.h:34
> [   66.476095] in_atomic(): 1, irqs_disabled(): 0, pid: 21, name: ksoftirqd/2
> [   66.476706] CPU: 2 PID: 21 Comm: ksoftirqd/2 Not tainted 5.2.0-rc1-00211-g1861420d0162 #1564
> [   66.477595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.el7 04/01/2014
> [   66.478360] Call Trace:
> [   66.478591]  dump_stack+0x5b/0x8b
> [   66.478892]  ___might_sleep+0x22f/0x290
> [   66.479230]  cpus_read_lock+0x18/0x50
> [   66.479550]  static_key_slow_dec+0x41/0x70
> [   66.479914]  cgroup_bpf_release+0x1a6/0x400
> [   66.480285]  percpu_ref_switch_to_atomic_rcu+0x203/0x330
> [   66.480754]  rcu_core+0x475/0xcc0
> [   66.481047]  ? switch_mm_irqs_off+0x684/0xa40
> [   66.481422]  ? rcu_note_context_switch+0x260/0x260
> [   66.481842]  __do_softirq+0x1cf/0x5ff
> [   66.482174]  ? takeover_tasklets+0x5f0/0x5f0
> [   66.482542]  ? smpboot_thread_fn+0xab/0x780
> [   66.482911]  run_ksoftirqd+0x1a/0x40
> [   66.483225]  smpboot_thread_fn+0x3ad/0x780
> [   66.483583]  ? sort_range+0x20/0x20
> [   66.483894]  ? __kthread_parkme+0xb0/0x190
> [   66.484253]  ? sort_range+0x20/0x20
> [   66.484562]  ? sort_range+0x20/0x20
> [   66.484878]  kthread+0x2e2/0x3e0
> [   66.485166]  ? kthread_create_worker_on_cpu+0xb0/0xb0
> [   66.485620]  ret_from_fork+0x1f/0x30
> 
> Same test runs fine before the patches.
> 

Ouch, static_branch_dec() might block, so it's not possible to call it from
a percpu ref counter release callback. It's not what I expected, tbh.

Good catch, thanks!
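
The trace above shows the percpu ref release callback running from RCU
softirq processing (in_atomic(): 1), where a function that may sleep must
not be called. One common way around this kind of problem is to have the
callback only queue a work item and to do the blocking teardown later in
process context. The following is a minimal userspace sketch of that
pattern, not the code from this series: every name in it (model_might_sleep,
model_queue_work, release_cb_deferred and so on) is hypothetical and merely
stands in for the corresponding kernel concept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

static bool in_atomic_ctx;        /* stands in for in_atomic() */
static int sleep_violations;      /* counts "sleeping in invalid context" hits */
static int static_key_count = 1;  /* stands in for the static key's count */

/* Models might_sleep(): record a violation instead of printing a BUG. */
static void model_might_sleep(void)
{
	if (in_atomic_ctx)
		sleep_violations++;
}

/* Models static_branch_dec(): it may take a sleeping lock first. */
static void model_static_branch_dec(void)
{
	model_might_sleep();
	static_key_count--;
}

/* A tiny single-threaded stand-in for a workqueue. */
struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

static struct work *pending;

static void model_queue_work(struct work *w)
{
	assert(w->fn != NULL);
	w->next = pending;
	pending = w;
}

/* Runs queued work in "process context", where sleeping is allowed. */
static void model_run_workqueue(void)
{
	in_atomic_ctx = false;
	while (pending) {
		struct work *w = pending;

		pending = w->next;
		w->fn(w);
	}
}

static void release_work_fn(struct work *w)
{
	(void)w;
	model_static_branch_dec();
}

static struct work release_work = { .fn = release_work_fn, .next = NULL };

/* Problematic variant: blocking teardown directly in the release callback. */
static void release_cb_direct(void)
{
	model_static_branch_dec();
}

/* Deferred variant: the callback only queues work and returns. */
static void release_cb_deferred(void)
{
	model_queue_work(&release_work);
}

/* Models the release callback being invoked from softirq (atomic) context. */
static void model_atomic_callback(void (*cb)(void))
{
	in_atomic_ctx = true;
	cb();
	in_atomic_ctx = false;
}

static void model_reset(void)
{
	in_atomic_ctx = false;
	sleep_violations = 0;
	static_key_count = 1;
	pending = NULL;
}

/* Returns the number of violations seen with the direct variant. */
int demo_direct(void)
{
	model_reset();
	model_atomic_callback(release_cb_direct);
	return sleep_violations;
}

/* Returns the number of violations seen with the deferred variant. */
int demo_deferred(void)
{
	model_reset();
	model_atomic_callback(release_cb_deferred);
	model_run_workqueue();
	return sleep_violations;
}
```

In this model demo_direct() records one violation, matching the situation in
the report, while demo_deferred() records none and still drops the key count
to zero once the queued work has run. Whether a workqueue is the right tool
for the real patch is a separate question; the sketch only illustrates the
context constraint discussed in this thread.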