Message ID | 20230428084922.9834-1-liwang@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [RFC] tst_cgroup: Avoid mixing mounts V1 and V2 simultaneously | expand |
Hi! > There is a tiny problem with the test logic of this tst_cgroup library, > that it potentially mixes Cgroup V1 and V2 together to be mounted at > the same time. The scenario happens once people just requests CTRL_BASE > (or a V2 controller not enabled) on a only V1-mounted system. > > Cgroup community always objected to enabling Cgroup like that (V1&V2), > which may bring unexpected issues along the way. > > So this patch cancels LTP mount V1&V2 simultaneously even if there is > no overlap in specific controller files. Isn't this the point of the library, to be able to use mixed V1 and V2 setup? As far as I understand it the only limitation is that we can bind a controllter to either V1 or V2 but not both. Also as far as I can tell, there is pleny of distributions out there at the moment where the default is split between V1 and V2 both mounted at the same time.
On Fri, Apr 28, 2023 at 6:24 PM Cyril Hrubis <chrubis@suse.cz> wrote: > Hi! > > There is a tiny problem with the test logic of this tst_cgroup library, > > that it potentially mixes Cgroup V1 and V2 together to be mounted at > > the same time. The scenario happens once people just requests CTRL_BASE > > (or a V2 controller not enabled) on a only V1-mounted system. > > > > Cgroup community always objected to enabling Cgroup like that (V1&V2), > > which may bring unexpected issues along the way. > > > > So this patch cancels LTP mount V1&V2 simultaneously even if there is > > no overlap in specific controller files. > > Isn't this the point of the library, to be able to use mixed V1 and V2 > setup? As far as I understand it the only limitation is that we can bind > a controllter to either V1 or V2 but not both. > That's the original design. We tried to keep flexible but ignored one exception V1 mounts all controllers and V2 only basic mount. (No controllers conflict in this mounting). From my observation, if a system(e.g. RHEL8) only announces Cgroup V1 support but does not guarantee V2 to be used. A test required 'CTRL_BASE' could mount V2 success but that V2 is only part work and test will get TBROK. We are unable to say this situation is a bug. So I suppose here mixed V1 and V2 might not be a good design unless we find a way to detect if the system fully supports both versions. But I guess this is difficult to check. > > Also as far as I can tell, there is pleny of distributions out there at > the moment where the default is split between V1 and V2 both mounted at > the same time. > Yes, as long as a distribution announces support for both Cgroup. This design makes sense for it. But I'm afraid rare people mixed using both at the same time(and that is also not recommended).
Hi! > That's the original design. We tried to keep flexible but ignored > one exception V1 mounts all controllers and V2 only basic mount. > (No controllers conflict in this mounting). > > From my observation, if a system(e.g. RHEL8) only announces > Cgroup V1 support but does not guarantee V2 to be used. > A test required 'CTRL_BASE' could mount V2 success but > that V2 is only part work and test will get TBROK. > We are unable to say this situation is a bug. So the V2 does not actually work unless there is at least one controller enabled? That sounds like a bug to me, my system actually uses v1 controllers and unified hierarchy at the same time. The unified hierarchy is used to group deamon processes and kill them with the cgropu.kill if needed. What exactly happens on your system?
On Fri, Apr 28, 2023 at 9:12 PM Cyril Hrubis <chrubis@suse.cz> wrote: > Hi! > > That's the original design. We tried to keep flexible but ignored > > one exception V1 mounts all controllers and V2 only basic mount. > > (No controllers conflict in this mounting). > > > > From my observation, if a system(e.g. RHEL8) only announces > > Cgroup V1 support but does not guarantee V2 to be used. > > A test required 'CTRL_BASE' could mount V2 success but > > that V2 is only part work and test will get TBROK. > > We are unable to say this situation is a bug. > > So the V2 does not actually work unless there is at least one controller > enabled? That sounds like a bug to me, my system actually uses v1 > controllers and unified hierarchy at the same time. The unified > hierarchy is used to group deamon processes and kill them with the > cgropu.kill if needed. Hmm, let me investigate... > > What exactly happens on your system? > When mounting V1 with all controllers and only the basic of V2, there lack of some controller files (cgroup.kill) in V2. # mount | grep cgroup tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,seclabel,size=1813984k,nr_inodes=453496,mode=755) cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd) cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,perf_event) cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,devices) cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,net_cls,net_prio) cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,pids) cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,rdma) cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpu,cpuacct) cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,hugetlb) cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,cpuset) cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,freezer) cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,memory) cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,seclabel,blkio) cgroup2 on /tmp/cgroup_unified type cgroup2 (rw,relatime,seclabel,memory_recursiveprot) # cd /tmp/cgroup_unified/ # ls cgroup.controllers cgroup.max.depth cgroup.max.descendants cgroup.procs cgroup.stat cgroup.subtree_control cgroup.threads cpu.pressure cpu.stat io.pressure ltp memory.pressure # cd ltp/ # ls cgroup.controllers cgroup.events cgroup.freeze cgroup.max.depth cgroup.max.descendants cgroup.procs cgroup.stat cgroup.subtree_control cgroup.threads cgroup.type cpu.pressure cpu.stat drain io.pressure memory.pressure test-41592
Hi Cyril, On Fri, Apr 28, 2023 at 9:30 PM Li Wang <liwang@redhat.com> wrote: > > > On Fri, Apr 28, 2023 at 9:12 PM Cyril Hrubis <chrubis@suse.cz> wrote: > >> Hi! >> > That's the original design. We tried to keep flexible but ignored >> > one exception V1 mounts all controllers and V2 only basic mount. >> > (No controllers conflict in this mounting). >> > >> > From my observation, if a system(e.g. RHEL8) only announces >> > Cgroup V1 support but does not guarantee V2 to be used. >> > A test required 'CTRL_BASE' could mount V2 success but >> > that V2 is only part work and test will get TBROK. >> > We are unable to say this situation is a bug. >> >> So the V2 does not actually work unless there is at least one controller >> enabled? That sounds like a bug to me, my system actually uses v1 >> controllers and unified hierarchy at the same time. The unified >> hierarchy is used to group deamon processes and kill them with the >> cgropu.kill if needed. > > > Hmm, let me investigate... > After talking with our Cgroup colleagues. I've been told that on my RHEL8, the 'cgroup.kill' is too new to have. The default Cgroup is V1 and if we want V2 singly, it could be explicitly enabled via "systemd.unified_cgroup_hierarchy=1" kernel parameters. But so far V2 is not supporting such a new feature. That's why I couldn't find 'cgroup.kill' after mounting V2 on RHEL8. And the mainline kernel merges this feature since v5.14. $ git describe --contains 661ee6280931548f v5.14-rc1~108^2~8 So, the problem is not caused by mixing V1 and V2. I misunderstood it before. Please ignore my patch, the problem can be resolved by: + if (!SAFE_CG_HAS(cg_child_test_simple, "cgroup.kill")) { + cg_child_test_simple = tst_cg_group_rm(cg_child_test_simple); + tst_brk(TCONF, "cgroup.kill is not supported on your distribution"); + } + to skip the test gracefully if there is no 'cgroup.kill' interface. (thus GAO Wei needs to rebase his patch again)
diff --git a/lib/tst_cgroup.c b/lib/tst_cgroup.c index d08a31651..bceb10d1d 100644 --- a/lib/tst_cgroup.c +++ b/lib/tst_cgroup.c @@ -786,6 +786,10 @@ static void cgroup_copy_cpuset(const struct cgroup_root *const root) * If we can't mount V2 or the controller is not on V2, then we try * mounting it on its own V1 tree. * + * The worth mentioning here is that once there detect V1 mounted, + * we skip mounting V2 because LTP doesn't allow V1 and V2 mixes to + * be tested in any scenario, and vice versa. + * * Once we have mounted the controller somehow, we create a hierarchy * of cgroups. If we are on V2 we first need to enable the controller * for all children of root. Then we create hierarchy described in @@ -828,13 +832,14 @@ void tst_cg_require(const char *const ctrl_name, if (ctrl->ctrl_root) goto mkdirs; - if (!cgroup_v2_mounted() && options->needs_ver != TST_CG_V1) + if (!cgroup_v2_mounted() && options->needs_ver != TST_CG_V1 + && !cgroup_v1_mounted()) cgroup_mount_v2(); if (ctrl->ctrl_root) goto mkdirs; - if (options->needs_ver != TST_CG_V2) + if (options->needs_ver != TST_CG_V2 && !cgroup_v2_mounted()) cgroup_mount_v1(ctrl); if (base)
There is a tiny problem with the test logic of this tst_cgroup library, that it potentially mixes Cgroup V1 and V2 together to be mounted at the same time. The scenario happens once people just requests CTRL_BASE (or a V2 controller not enabled) on a only V1-mounted system. Cgroup community always objected to enabling Cgroup like that (V1&V2), which may bring unexpected issues along the way. So this patch cancels LTP mount V1&V2 simultaneously even if there is no overlap in specific controller files. Note: This patch is a follow-up on GAO Wei's work about adding CTRL_BASE. So we need to apply after that patchset. Signed-off-by: Li Wang <liwang@redhat.com> Cc: Richard Palethorpe <rpalethorpe@suse.de> Cc: Wei Gao <wegao@suse.com> --- lib/tst_cgroup.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)