From patchwork Thu Feb 8 08:08:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?aGFpYmluemhhbmco5byg5rW35paMKQ==?=
X-Patchwork-Id: 870879
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 3zcfMG0G6tz9s7F; Fri, 9 Feb 2018 00:35:06 +1100 (AEDT)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ejmLi-0004oB-4K; Thu, 08 Feb 2018 13:34:50 +0000
Received: from mail4.tencent.com ([183.57.53.109]) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ejhGA-0007CT-Ry for kernel-team@lists.ubuntu.com; Thu, 08 Feb 2018 08:08:47 +0000
Received: from EXHUB-SZMail01.tencent.com (unknown [10.14.6.21]) by mail4.tencent.com (Postfix) with ESMTP id 52B6D501C8 for ; Thu, 8 Feb 2018 16:08:14 +0800 (CST)
Received: from EXMBX-SZMAIL006.tencent.com ([fe80::79d2:3448:1596:46df]) by EXHUB-SZMail01.tencent.com ([::1]) with mapi id 14.03.0123.003; Thu, 8 Feb 2018 16:08:14 +0800
From: =?gb2312?b?aGFpYmluemhhbmco1cW6o7HzKQ==?=
To: "kernel-team@lists.ubuntu.com"
Subject: cgroup: remove cgroup directory leading kernel crash in kill_css
Thread-Topic: cgroup: remove cgroup directory leading kernel crash in kill_css
Thread-Index: AdOgsic4OinlPxLpQGWI8SAi6hbapg==
Date: Thu, 8 Feb 2018 08:08:12 +0000
Message-ID: <88D661ADF6AFBF42B2AB88D8E7682B0901FBDC04@EXMBX-SZMAIL006.tencent.com>
Accept-Language: zh-CN, en-US
Content-Language: zh-CN
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.19.90.112]
MIME-Version: 1.0
X-Mailman-Approved-At: Thu, 08 Feb 2018 13:34:46 +0000
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

Hi,

We received a report from a customer that a CVM (cloud virtual machine) crashed while kubelet was updating a container service on Ubuntu Xenial. The panic log is shown below.

We found a patch in Linux mainline (commit 33c35aa4817864e056fd772230b0c6b552e36ea2) that does fix this bug, but ubuntu-xenial.git has not merged it yet. Is there a plan to merge it?

----------------------panic log-----------------------------
[2018-02-02 10:21:48][4397731.721563] BUG: unable to handle kernel paging request at 000000010000005c
[2018-02-02 10:40:50][4397731.722666] IP: css_clear_dir+0x5/0x70
[2018-02-02 10:40:50][4397731.723261] PGD a12b067
[2018-02-02 10:40:50][4397731.723261] PUD 0
[2018-02-02 10:40:50][4397731.723628]
[2018-02-02 10:40:50][4397731.724004] Oops: 0000 [#1] SMP
[2018-02-02 10:40:50][4397731.724004] Modules linked in: xt_statistic nf_conntrack_netlink ebt_ip ebtable_filter ebtables veth xt_set ip_set_hash_net ip_set nfnetlink xt_nat xt_recent xt_mark ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_comment ipt_MASQUERADE nf_nat_masquerade_ipv4 xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack br_netfilter bridge stp llc aufs ppdev sb_edac edac_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev input_leds serio_raw parport_pc parport i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1
raid0 multipath
[2018-02-02 10:40:50][4397731.724004] linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64 crypto_simd cryptd glue_helper psmouse virtio_blk virtio_net drm pata_acpi floppy
[2018-02-02 10:40:50][4397731.724004] CPU: 0 PID: 23347 Comm: kubelet Not tainted 4.10.0-32-generic #36~16.04.1-Ubuntu
[2018-02-02 10:40:50][4397731.724004] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[2018-02-02 10:40:50][4397731.724004] task: ffff92abde590000 task.stack: ffffbaa94165c000
[2018-02-02 10:40:50][4397731.724004] RIP: 0010:css_clear_dir+0x5/0x70
[2018-02-02 10:40:50][4397731.724004] RSP: 0018:ffffbaa94165fe10 EFLAGS: 00010206
[2018-02-02 10:40:50][4397731.724004] RAX: 000047fd40005d7b RBX: 00000000ffffffe8 RCX: ffff92abffc0fcec
[2018-02-02 10:40:50][4397731.724004] RDX: ffffffff9b070800 RSI: 0000000000000206 RDI: 00000000ffffffe8
[2018-02-02 10:40:50][4397731.724004] RBP: ffffbaa94165fe20 R08: 00000000c8b18701 R09: 0000000180220017
[2018-02-02 10:40:50][4397731.724004] R10: ffff92abc8b187f8 R11: ffff92abf7751d00 R12: ffff92abd5601000
[2018-02-02 10:40:50][4397731.724004] R13: 0000000000000000 R14: ffff92abd5601150 R15: 0000000000000000
[2018-02-02 10:40:50][4397731.724004] FS: 00007f6f92ffd700(0000) GS:ffff92abffc00000(0000) knlGS:0000000000000000
[2018-02-02 10:40:50][4397731.724004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[2018-02-02 10:40:50][4397731.724004] CR2: 000000010000005c CR3: 00000000280cb000 CR4: 00000000000406f0
[2018-02-02 10:40:50][4397731.724004] Call Trace:
[2018-02-02 10:40:50][4397731.724004] ?
kill_css+0x12/0x60
[2018-02-02 10:40:50][4397731.724004] cgroup_destroy_locked+0xa5/0xf0
[2018-02-02 10:40:50][4397731.724004] cgroup_rmdir+0x2c/0x90
[2018-02-02 10:40:50][4397731.724004] kernfs_iop_rmdir+0x4d/0x80
[2018-02-02 10:40:50][4397731.724004] vfs_rmdir+0xb4/0x130
[2018-02-02 10:40:50][4397731.724004] do_rmdir+0x1c7/0x1e0
[2018-02-02 10:40:50][4397731.724004] SyS_unlinkat+0x22/0x30
[2018-02-02 10:40:50][4397731.724004] entry_SYSCALL_64_fastpath+0x1e/0xad
[2018-02-02 10:40:50][4397731.724004] RIP: 0033:0x481bd4
[2018-02-02 10:40:50][4397731.724004] RSP: 002b:000000c422893af0 EFLAGS: 00000246 ORIG_RAX: 0000000000000107
[2018-02-02 10:40:50][4397731.724004] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000481bd4
[2018-02-02 10:40:50][4397731.724004] RDX: 0000000000000200 RSI: 000000c421c7ef00 RDI: ffffffffffffff9c
[2018-02-02 10:40:50][4397731.724004] RBP: 000000c422893bc0 R08: 0000000000000000 R09: 0000000000000000
[2018-02-02 10:40:50][4397731.724004] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000000000ce
[2018-02-02 10:40:50][4397731.724004] R13: 00000000ffffffee R14: 0000000000001740 R15: 0000000000000055
[2018-02-02 10:40:50][4397731.724004] Code: fd ff ff 85 c0 41 89 c6 0f 84 5b fd ff ff eb 83 4d 89 fc e9 0f ff ff ff e8 d9 37 f6 ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <8b> 47 74 a8 08 74 5d 55 83 e0 f7 48 89 e5 41 55 41 54 53 89 47
[2018-02-02 10:40:50][4397731.724004] RIP: css_clear_dir+0x5/0x70 RSP: ffffbaa94165fe10
[2018-02-02 10:40:50][4397731.724004] CR2: 000000010000005c

----------------------patch in linux.git----------------------------
commit 33c35aa4817864e056fd772230b0c6b552e36ea2
Author: Waiman Long
Date: Mon May 15 09:34:06 2017 -0400

cgroup: Prevent kill_css() from being called more than once

The kill_css() function may be called more than once under the condition that the css was killed but not physically removed yet followed by the removal of the cgroup that is hosting the css.
This patch prevents any harm from being done when that happens.

Signed-off-by: Waiman Long
Signed-off-by: Tejun Heo
Cc: stable@vger.kernel.org # v4.5+

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index c3c9a0e1b3c9..8d4e85eae42c 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4265,6 +4265,11 @@ static void kill_css(struct cgroup_subsys_state *css)
 {
 	lockdep_assert_held(&cgroup_mutex);
 
+	if (css->flags & CSS_DYING)
+		return;
+
+	css->flags |= CSS_DYING;
+
 	/*
 	 * This must happen before css is disassociated with its cgroup.
 	 * See seq_css() for details.