From patchwork Sat Mar 12 14:09:14 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gavin Guo X-Patchwork-Id: 596656 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 3qMm9L3pT0z9sds; Sun, 13 Mar 2016 01:09:42 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical-com.20150623.gappssmtp.com header.i=@canonical-com.20150623.gappssmtp.com header.b=SArb82Dv; dkim-atps=neutral Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1aekEd-0000fE-G9; Sat, 12 Mar 2016 14:09:39 +0000 Received: from mail-qg0-f50.google.com ([209.85.192.50]) by huckleberry.canonical.com with esmtps (TLS1.0:RSA_ARCFOUR_SHA1:16) (Exim 4.76) (envelope-from ) id 1aekEW-0000c6-AW for kernel-team@lists.ubuntu.com; Sat, 12 Mar 2016 14:09:32 +0000 Received: by mail-qg0-f50.google.com with SMTP id u110so120796766qge.3 for ; Sat, 12 Mar 2016 06:09:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical-com.20150623.gappssmtp.com; s=20150623; h=from:to:subject:date:message-id:in-reply-to:references; bh=U3QIL/E9gYuuCxDPlpHNdgWT2gMDv6EL9OQ5+Huiuak=; b=SArb82Dv26DuHwJK/WuRaZzaC2d3WXk5W609emZJKXkLm3WxPVAB2kyC4MOKqHki6e 19a5s5nHvwqqojh1I+new+YFvN0CzgfPMw+dx7vUgNf2oHsMKu66WWTiZufxV7sFDRMD 74tA6balm2nk+XPknI1f/Gv27lpmctIBV3dvM85ZA6z4Ial+4VlgUqWX2jOayN2eklep 3Hw67kj8v0Vioxz5aDbhIbt2pFsQcrBGkCd7tpCKMRUEsNAfF1fWx6bOhO4QxWI2/wzD crDzy1JqmEnK2pB3gJ7NpcyL1UF+gZN9YlkQp91XkOMw2Mglz9a2OWVxE1FPz214TLWN EqtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=U3QIL/E9gYuuCxDPlpHNdgWT2gMDv6EL9OQ5+Huiuak=; b=feE7KA0+EgQpM/rBimvIsgJWCttp/N5PLBOkLTjd1EgQFOvx+wUHHx3Fo99xozdzEw CjraC+k3KmOjzLsfu0owDBCWlH74trpESvbJAa+7jH+GB4JCLMZl+5IaxydZ53ikHIR1 UsoKSnJXTC3p4Lg6f1VwzKM/2xinz6sJnlfSYqwNMhylWC06htZb2Gft92P6/APtNJe0 FMNuHkJvbO/nPTuNnEsRF3wPzUglt/9IpZ31W8FH3hW9RQZz6zof5hl7wr6Vod+OjAGZ E4SMCRZ7vOtNSyzq5I5WF0N3+qoqUsSeN0FjMT9d6p5bef3AtowJL1ZeFeg8n2fHDEk7 rI7A== X-Gm-Message-State: AD7BkJIMDpA/1J8KoZSP04hMVVgRnch8EIvwwebcdjxBU/KdGLIm5YawBPrEFSEWdsMkY1DZ X-Received: by 10.140.241.74 with SMTP id m71mr20106417qhc.36.1457791771344; Sat, 12 Mar 2016 06:09:31 -0800 (PST) Received: from localhost.localdomain (114-35-245-81.HINET-IP.hinet.net. [114.35.245.81]) by smtp.gmail.com with ESMTPSA id z110sm6131857qgd.45.2016.03.12.06.09.29 for (version=TLS1_1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 12 Mar 2016 06:09:30 -0800 (PST) From: Gavin Guo To: kernel-team@lists.ubuntu.com Subject: [SRU][Utopic][PATCH 2/2] sched/numa: Fix use-after-free bug in the task_numa_compare Date: Sat, 12 Mar 2016 22:09:14 +0800 Message-Id: <1457791755-7266-7-git-send-email-gavin.guo@canonical.com> X-Mailer: git-send-email 2.0.0 In-Reply-To: <1457791755-7266-1-git-send-email-gavin.guo@canonical.com> References: <1457791755-7266-1-git-send-email-gavin.guo@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com BugLink: https://bugs.launchpad.net/bugs/1527643 The following message can be observed on the Ubuntu v3.13.0-65 with KASan backported: ================================================================== BUG: KASan: use after free in task_numa_find_cpu+0x64c/0x890 at addr ffff880dd393ecd8 Read of size 8 by task qemu-system-x86/3998900 ============================================================================= BUG kmalloc-128 (Tainted: G B ): kasan: bad access detected ----------------------------------------------------------------------------- INFO: Allocated in task_numa_fault+0xc1b/0xed0 age=41980 cpu=18 pid=3998890 __slab_alloc+0x4f8/0x560 __kmalloc+0x1eb/0x280 task_numa_fault+0xc1b/0xed0 do_numa_page+0x192/0x200 handle_mm_fault+0x808/0x1160 __do_page_fault+0x218/0x750 do_page_fault+0x1a/0x70 page_fault+0x28/0x30 SyS_poll+0x66/0x1a0 system_call_fastpath+0x1a/0x1f INFO: Freed in task_numa_free+0x1d2/0x200 age=62 cpu=18 pid=0 __slab_free+0x2ab/0x3f0 kfree+0x161/0x170 task_numa_free+0x1d2/0x200 finish_task_switch+0x1d2/0x210 __schedule+0x5d4/0xc60 schedule_preempt_disabled+0x40/0xc0 cpu_startup_entry+0x2da/0x340 start_secondary+0x28f/0x360 Call Trace: [] dump_stack+0x45/0x56 [] print_trailer+0xfd/0x170 [] object_err+0x36/0x40 [] kasan_report_error+0x1e9/0x3a0 [] kasan_report+0x40/0x50 [] ? task_numa_find_cpu+0x64c/0x890 [] __asan_load8+0x69/0xa0 [] ? find_next_bit+0xd8/0x120 [] task_numa_find_cpu+0x64c/0x890 [] task_numa_migrate+0x4ac/0x7b0 [] numa_migrate_preferred+0xb3/0xc0 [] task_numa_fault+0xb88/0xed0 [] do_numa_page+0x192/0x200 [] handle_mm_fault+0x808/0x1160 [] ? sched_clock_cpu+0x10d/0x160 [] ? native_load_tls+0x82/0xa0 [] __do_page_fault+0x218/0x750 [] ? hrtimer_try_to_cancel+0x76/0x160 [] ? schedule_hrtimeout_range_clock.part.24+0xf7/0x1c0 [] do_page_fault+0x1a/0x70 [] page_fault+0x28/0x30 [] ? do_sys_poll+0x1c4/0x6d0 [] ? enqueue_task_fair+0x4b6/0xaa0 [] ? sched_clock+0x9/0x10 [] ? resched_task+0x7a/0xc0 [] ? check_preempt_curr+0xb3/0x130 [] ? poll_select_copy_remaining+0x170/0x170 [] ? wake_up_state+0x10/0x20 [] ? drop_futex_key_refs.isra.14+0x1f/0x90 [] ? futex_requeue+0x3de/0xba0 [] ? do_futex+0xbe/0x8f0 [] ? read_tsc+0x9/0x20 [] ? ktime_get_ts+0x12d/0x170 [] ? timespec_add_safe+0x59/0xe0 [] SyS_poll+0x66/0x1a0 [] system_call_fastpath+0x1a/0x1f As commit 1effd9f19324 ("sched/numa: Fix unsafe get_task_struct() in task_numa_assign()") points out, the rcu_read_lock() cannot protect the task_struct from being freed in the finish_task_switch(). And the bug happens in the process of calculation of imp which requires the access of p->numa_faults being freed in the following path: do_exit() current->flags |= PF_EXITING; release_task() ~~delayed_put_task_struct()~~ schedule() ... ... rq->curr = next; context_switch() finish_task_switch() put_task_struct() __put_task_struct() task_numa_free() The fix here to get_task_struct() early before end of dst_rq->lock to protect the calculation process and also put_task_struct() in the corresponding point if finally the dst_rq->curr somehow cannot be assigned. Additional credit to Liang Chen who helped fix the error logic and add the put_task_struct() to the place it missed. Signed-off-by: Gavin Guo Signed-off-by: Peter Zijlstra (Intel) Cc: Andrea Arcangeli Cc: Andrew Morton Cc: Hugh Dickins Cc: Linus Torvalds Cc: Mel Gorman Cc: Peter Zijlstra Cc: Rik van Riel Cc: Thomas Gleixner Cc: jay.vosburgh@canonical.com Cc: liang.chen@canonical.com Link: http://lkml.kernel.org/r/1453264618-17645-1-git-send-email-gavin.guo@canonical.com Signed-off-by: Ingo Molnar (backported from commit 1dff76b92f69051e579bdc131e01500da9fa2a91) Signed-off-by: Gavin Guo Conflicts: kernel/sched/fair.c --- kernel/sched/fair.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index ca3829734199..be29cccf6189 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1088,8 +1088,6 @@ static void task_numa_assign(struct task_numa_env *env, { if (env->best_task) put_task_struct(env->best_task); - if (p) - get_task_struct(p); env->best_task = p; env->best_imp = imp; @@ -1140,20 +1138,30 @@ static void task_numa_compare(struct task_numa_env *env, long orig_dst_load, dst_load; long load; long imp = (groupimp > 0) ? groupimp : taskimp; + bool assigned = false; rcu_read_lock(); raw_spin_lock_irq(&dst_rq->lock); cur = dst_rq->curr; /* - * No need to move the exiting task, and this ensures that ->curr - * wasn't reaped and thus get_task_struct() in task_numa_assign() - * is safe under RCU read lock. - * Note that rcu_read_lock() itself can't protect from the final - * put_task_struct() after the last schedule(). + * No need to move the exiting task or idle task. */ if ((cur->flags & PF_EXITING) || is_idle_task(cur)) cur = NULL; + else { + /* + * The task_struct must be protected here to protect the + * p->numa_faults access in the task_weight since the + * numa_faults could already be freed in the following path: + * finish_task_switch() + * --> put_task_struct() + * --> __put_task_struct() + * --> task_numa_free() + */ + get_task_struct(cur); + } + raw_spin_unlock_irq(&dst_rq->lock); /* @@ -1240,9 +1248,16 @@ balance: goto unlock; assign: + assigned = true; task_numa_assign(env, cur, imp); unlock: rcu_read_unlock(); + /* + * The dst_rq->curr isn't assigned. The protection for task_struct is + * finished. + */ + if (cur && !assigned) + put_task_struct(cur); } static void task_numa_find_cpu(struct task_numa_env *env,