[{"id":1711398,"web_url":"http://patchwork.ozlabs.org/comment/1711398/","msgid":"<1499409574.19784.26.camel@abdul.in.ibm.com>","date":"2017-07-07T06:39:34","subject":"Re: [next-20170609] Oops while running CPU off-on\n\t(cpuset.c/cpuset_can_attach)","submitter":{"id":69191,"url":"http://patchwork.ozlabs.org/api/people/69191/","name":"Abdul Haleem","email":"abdhalee@linux.vnet.ibm.com"},"content":"On Wed, 2017-07-05 at 11:28 -0400, Tejun Heo wrote:\n> Hello, Abdul.\n> \n> Thanks for the debug info.  Can you please see whether the following\n> patch fixes the issue?  \n\nIt is my pleasure and yes the patch fixes the problem.\n\n> If the problem is too difficult to reproduce\n\nThe problem was reproducible all the time. \n\nWith the patch fix, I tried multiple times and long runs of cpu off-on\ncycles but no Oops is seen.\n\nThank you for spending your valuable time on fixing this issue.\n\nReported-and-tested-by : Abdul Haleem <abdhalee@linux.vnet.ibm.com>\n\n> to confirm the fix by seeing whether it no longer triggers, please let\n> me know.  We can instead apply a patch which triggers WARN on the\n> failing condition to confirm the diagnosis.\n> \n> Thanks.\n> \n> diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h\n> index 793565c05742..8b4c3c2f2509 100644\n> --- a/kernel/cgroup/cgroup-internal.h\n> +++ b/kernel/cgroup/cgroup-internal.h\n> @@ -33,6 +33,9 @@ struct cgroup_taskset {\n>  \tstruct list_head\tsrc_csets;\n>  \tstruct list_head\tdst_csets;\n> \n> +\t/* the number of tasks in the set */\n> +\tint\t\t\tnr_tasks;\n> +\n>  \t/* the subsys currently being processed */\n>  \tint\t\t\tssid;\n> \n> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c\n> index dbfd7028b1c6..e3c4152741a3 100644\n> --- a/kernel/cgroup/cgroup.c\n> +++ b/kernel/cgroup/cgroup.c\n> @@ -1954,6 +1954,8 @@ static void cgroup_migrate_add_task(struct task_struct *task,\n>  \tif (!cset->mg_src_cgrp)\n>  \t\treturn;\n> \n> +\tmgctx->tset.nr_tasks++;\n> +\n>  \tlist_move_tail(&task->cg_list, &cset->mg_tasks);\n>  \tif (list_empty(&cset->mg_node))\n>  \t\tlist_add_tail(&cset->mg_node,\n> @@ -2047,16 +2049,18 @@ static int cgroup_migrate_execute(struct cgroup_mgctx *mgctx)\n>  \t\treturn 0;\n> \n>  \t/* check that we can legitimately attach to the cgroup */\n> -\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> -\t\tif (ss->can_attach) {\n> -\t\t\ttset->ssid = ssid;\n> -\t\t\tret = ss->can_attach(tset);\n> -\t\t\tif (ret) {\n> -\t\t\t\tfailed_ssid = ssid;\n> -\t\t\t\tgoto out_cancel_attach;\n> +\tif (tset->nr_tasks) {\n> +\t\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> +\t\t\tif (ss->can_attach) {\n> +\t\t\t\ttset->ssid = ssid;\n> +\t\t\t\tret = ss->can_attach(tset);\n> +\t\t\t\tif (ret) {\n> +\t\t\t\t\tfailed_ssid = ssid;\n> +\t\t\t\t\tgoto out_cancel_attach;\n> +\t\t\t\t}\n>  \t\t\t}\n> -\t\t}\n> -\t} while_each_subsys_mask();\n> +\t\t} while_each_subsys_mask();\n> +\t}\n> \n>  \t/*\n>  \t * Now that we're guaranteed success, proceed to move all tasks to\n> @@ -2085,25 +2089,29 @@ static int cgroup_migrate_execute(struct cgroup_mgctx *mgctx)\n>  \t */\n>  \ttset->csets = &tset->dst_csets;\n> \n> -\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> -\t\tif (ss->attach) {\n> -\t\t\ttset->ssid = ssid;\n> -\t\t\tss->attach(tset);\n> -\t\t}\n> -\t} while_each_subsys_mask();\n> +\tif (tset->nr_tasks) {\n> +\t\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> +\t\t\tif (ss->attach) {\n> +\t\t\t\ttset->ssid = ssid;\n> +\t\t\t\tss->attach(tset);\n> +\t\t\t}\n> +\t\t} while_each_subsys_mask();\n> +\t}\n> \n>  \tret = 0;\n>  \tgoto out_release_tset;\n> \n>  out_cancel_attach:\n> -\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> -\t\tif (ssid == failed_ssid)\n> -\t\t\tbreak;\n> -\t\tif (ss->cancel_attach) {\n> -\t\t\ttset->ssid = ssid;\n> -\t\t\tss->cancel_attach(tset);\n> -\t\t}\n> -\t} while_each_subsys_mask();\n> +\tif (tset->nr_tasks) {\n> +\t\tdo_each_subsys_mask(ss, ssid, mgctx->ss_mask) {\n> +\t\t\tif (ssid == failed_ssid)\n> +\t\t\t\tbreak;\n> +\t\t\tif (ss->cancel_attach) {\n> +\t\t\t\ttset->ssid = ssid;\n> +\t\t\t\tss->cancel_attach(tset);\n> +\t\t\t}\n> +\t\t} while_each_subsys_mask();\n> +\t}\n>  out_release_tset:\n>  \tspin_lock_irq(&css_set_lock);\n>  \tlist_splice_init(&tset->dst_csets, &tset->src_csets);\n>","headers":{"Return-Path":"<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>","X-Original-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@ozlabs.org","linuxppc-dev@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3x3lPC6K2sz9s3T\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  7 Jul 2017 16:41:03 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3x3lPC5YHSzDr9b\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  7 Jul 2017 16:41:03 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n\t[148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3x3lMm4R5xzDqj8\n\tfor <linuxppc-dev@lists.ozlabs.org>;\n\tFri,  7 Jul 2017 16:39:48 +1000 (AEST)","from pps.filterd (m0098420.ppops.net [127.0.0.1])\n\tby mx0b-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id\n\tv676cbg2076078\n\tfor <linuxppc-dev@lists.ozlabs.org>; Fri, 7 Jul 2017 02:39:45 -0400","from e23smtp03.au.ibm.com (e23smtp03.au.ibm.com [202.81.31.145])\n\tby mx0b-001b2d01.pphosted.com with ESMTP id 2bhwbynh9v-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <linuxppc-dev@lists.ozlabs.org>; Fri, 07 Jul 2017 02:39:44 -0400","from localhost\n\tby e23smtp03.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! Violators will be prosecuted\n\tfor <linuxppc-dev@lists.ozlabs.org> from\n\t<abdhalee@linux.vnet.ibm.com>; Fri, 7 Jul 2017 16:39:41 +1000","from d23relay09.au.ibm.com (202.81.31.228)\n\tby e23smtp03.au.ibm.com (202.81.31.209) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tFri, 7 Jul 2017 16:39:39 +1000","from d23av03.au.ibm.com (d23av03.au.ibm.com [9.190.234.97])\n\tby d23relay09.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n\tv676ddVD64684130\n\tfor <linuxppc-dev@lists.ozlabs.org>; Fri, 7 Jul 2017 16:39:39 +1000","from d23av03.au.ibm.com (localhost [127.0.0.1])\n\tby d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv676dUNL027915\n\tfor <linuxppc-dev@lists.ozlabs.org>; Fri, 7 Jul 2017 16:39:31 +1000","from [9.84.230.205] ([9.84.230.205])\n\tby d23av03.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\tv676dQqH027786; Fri, 7 Jul 2017 16:39:27 +1000"],"Subject":"Re: [next-20170609] Oops while running CPU off-on\n\t(cpuset.c/cpuset_can_attach)","From":"Abdul Haleem <abdhalee@linux.vnet.ibm.com>","To":"Tejun Heo <tj@kernel.org>","Date":"Fri, 07 Jul 2017 12:09:34 +0530","In-Reply-To":"<20170705152855.GD19330@htj.duckdns.org>","References":"<1497266622.15415.39.camel@abdul.in.ibm.com>\n\t<20170627153608.GD2289@htj.duckdns.org>\n\t<1499092582.10651.15.camel@abdul.in.ibm.com>\n\t<20170705152855.GD19330@htj.duckdns.org>","Content-Type":"text/plain; charset=\"UTF-8\"","X-Mailer":"Evolution 3.10.4-0ubuntu1 ","Mime-Version":"1.0","Content-Transfer-Encoding":"7bit","X-TM-AS-MML":"disable","x-cbid":"17070706-0008-0000-0000-0000014DE688","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17070706-0009-0000-0000-0000097E1DA9","Message-Id":"<1499409574.19784.26.camel@abdul.in.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-07-07_04:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000\n\tdefinitions=main-1707070108","X-BeenThere":"linuxppc-dev@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Linux on PowerPC Developers Mail List\n\t<linuxppc-dev.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/linuxppc-dev/>","List-Post":"<mailto:linuxppc-dev@lists.ozlabs.org>","List-Help":"<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/linuxppc-dev>,\n\t<mailto:linuxppc-dev-request@lists.ozlabs.org?subject=subscribe>","Cc":"sachinp <sachinp@linux.vnet.ibm.com>,\n\tStephen Rothwell <sfr@canb.auug.org.au>, ego <ego@linux.vnet.ibm.com>,\n\tlinux-kernel <linux-kernel@vger.kernel.org>,\n\tLi Zefan <lizefan@huawei.com>, \n\tlinuxppc-dev <linuxppc-dev@lists.ozlabs.org>,\n\tIngo Molnar <mingo@kernel.org>","Errors-To":"linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org","Sender":"\"Linuxppc-dev\"\n\t<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>"}}]