From patchwork Thu Mar 21 23:48:36 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1060605
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [B][PATCH 2/2] stop_machine: Atomically queue and wake stopper threads
Date: Thu, 21 Mar 2019 20:48:36 -0300
Message-Id: <20190321234836.11774-3-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190321234836.11774-1-mfo@canonical.com>
References: <20190321234836.11774-1-mfo@canonical.com>
List-Id: Kernel team discussions

From: Prasad Sodagudi

BugLink: https://bugs.launchpad.net/bugs/1821259

When cpu_stop_queue_work() releases the lock for the stopper
thread that was queued into its wake queue, preemption is
enabled, which leads to the following deadlock:

CPU0                              CPU1
sched_setaffinity(0, ...)
__set_cpus_allowed_ptr()
stop_one_cpu(0, ...)              stop_two_cpus(0, 1, ...)
cpu_stop_queue_work(0, ...)       cpu_stop_queue_two_works(0, ..., 1, ...)

-grabs lock for migration/0-
                                  -spins with preemption disabled,
                                   waiting for migration/0's lock to be
                                   released-

-adds work items for migration/0
 and queues migration/0 to its
 wake_q-

-releases lock for migration/0
 and preemption is enabled-

-current thread is preempted,
 and __set_cpus_allowed_ptr
 has changed the thread's
 cpu allowed mask to CPU1 only-

                                  -acquires migration/0 and migration/1's
                                   locks-

                                  -adds work for migration/0 but does not
                                   add migration/0 to wake_q, since it is
                                   already in a wake_q-

                                  -adds work for migration/1 and adds
                                   migration/1 to its wake_q-

                                  -releases migration/0 and migration/1's
                                   locks, wakes migration/1, and enables
                                   preemption-

                                  -since migration/1 is requested to run,
                                   migration/1 begins to run and waits on
                                   migration/0, but migration/0 will never
                                   be able to run, since the thread that
                                   can wake it is affine to CPU1-

Disable preemption in cpu_stop_queue_work() before queueing works for
stopper threads, and queueing the stopper thread in the wake queue, to
ensure that the operation of queueing the works and waking the stopper
threads is atomic.

Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Signed-off-by: Prasad Sodagudi
Signed-off-by: Isaac J. Manjarres
Signed-off-by: Thomas Gleixner
Cc: peterz@infradead.org
Cc: matt@codeblueprint.co.uk
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1533329766-4856-1-git-send-email-isaacm@codeaurora.org
Co-Developed-by: Isaac J. Manjarres
(cherry picked from commit cfd355145c32bb7ccb65fccbe2d67280dc2119e1)
Signed-off-by: Mauricio Faria de Oliveira
---
 kernel/stop_machine.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e190d1ef3a23..69eb76daed34 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -81,6 +81,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
 	unsigned long flags;
 	bool enabled;
 
+	preempt_disable();
 	raw_spin_lock_irqsave(&stopper->lock, flags);
 	enabled = stopper->enabled;
 	if (enabled)
@@ -90,6 +91,7 @@ static bool cpu_stop_queue_work(unsigned int cpu, struct cpu_stop_work *work)
 	raw_spin_unlock_irqrestore(&stopper->lock, flags);
 
 	wake_up_q(&wakeq);
+	preempt_enable();
 
 	return enabled;
 }
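
The ordering that the two added calls enforce can be illustrated in isolation with a small userspace toy model. This is only a sketch, not kernel code: every toy_* helper and both queue_work_* functions are hypothetical stand-ins invented here, and they do nothing but track a counter. The one kernel behaviour assumed is that taking a raw spinlock disables preemption and releasing it re-enables preemption, which is what the changelog above describes.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace toy model, NOT kernel code. Every helper below is a
 * hypothetical stand-in that only tracks a counter.
 */
static int preempt_count;	/* > 0 means preemption is disabled */

static void toy_preempt_disable(void) { preempt_count++; }
static void toy_preempt_enable(void)  { preempt_count--; }

/* Assumed to mirror a raw spinlock: locking also disables preemption. */
static void toy_lock(void)   { toy_preempt_disable(); }
static void toy_unlock(void) { toy_preempt_enable(); }

/* Stand-in for wake_up_q(): report the preempt count at wakeup time. */
static int toy_wake_up_q(void) { return preempt_count; }

/* Pre-patch ordering: the wakeup runs after preemption is re-enabled. */
static int queue_work_unpatched(void)
{
	int count_at_wake;

	toy_lock();
	/* ... queue work, add stopper thread to the wake queue ... */
	toy_unlock();		/* count drops to 0: preemption window */
	count_at_wake = toy_wake_up_q();
	return count_at_wake;
}

/* Patched ordering: queueing and waking share one non-preemptible section. */
static int queue_work_patched(void)
{
	int count_at_wake;

	toy_preempt_disable();
	toy_lock();
	/* ... queue work, add stopper thread to the wake queue ... */
	toy_unlock();		/* count stays at 1: no window */
	count_at_wake = toy_wake_up_q();
	toy_preempt_enable();
	return count_at_wake;
}
```

In this model queue_work_unpatched() returns 0, meaning the task could be preempted (and migrated) between dropping the lock and issuing the wakeup, while queue_work_patched() returns 1, meaning the wakeup is issued before preemption becomes possible again.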