From patchwork Fri Jan 11 11:08:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 1023519 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 43bg9C2tVjz9sCh; Fri, 11 Jan 2019 22:09:03 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ghugL-0007M1-Np; Fri, 11 Jan 2019 11:08:57 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1ghugJ-0007L0-MM for kernel-team@lists.ubuntu.com; Fri, 11 Jan 2019 11:08:55 +0000 Received: from mail-qt1-f200.google.com ([209.85.160.200]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1ghugJ-0004wJ-9Z for kernel-team@lists.ubuntu.com; Fri, 11 Jan 2019 11:08:55 +0000 Received: by mail-qt1-f200.google.com with SMTP id d35so16276432qtd.20 for ; Fri, 11 Jan 2019 03:08:55 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=XJkUVmigLsFILupFd7WsePQFwNAlfbk0YQtWnCNsAI0=; b=iNvXFut5bzBnWeA2JbLN8cQPoRVIFYrYLpL71uBsktqf41LO8mq01Jd/jHc/shscOU 
efrmlYgSyapWJ+0/tsctmjirIG1xRirD2XohNun/TyQ7GrthEG8vxF50vxhD/aGcugbq MrPK07JZpVh0Iq4XHea1tXijIowjdON6CzkCdo+8FqiJLSKzbBEyKNhf1y4Hsd+634MU qyjaBl1EknC4kZ6mF4MxaOnmBwtR9wiNXr43qd6WpzmCizjIQrUNroXXLYHBPTHtO0uW 4vkj3lcB0oBDSUorbwYkrhplGh+nYPaMfZivYd18aCjAyQAdMgV8alFYjnzftABnwt2t 3fyA== X-Gm-Message-State: AJcUukdmfNpU/Ul6ir2l5B35WV38fzknfTDBYd1OSEsGHopDUUqu6DZ4 eRwb4+QNlUb8Ynt4fU8I9J1LijA5/cQdovGmEVGE+gDk13HZrlNP0BdumodJDRGiLf5FazEaOzr XL4dQ4G5c0tzC6Gap69h2wHjqfTukPA5TSPrA63wkmQ== X-Received: by 2002:aed:21d2:: with SMTP id m18mr13535973qtc.121.1547204933034; Fri, 11 Jan 2019 03:08:53 -0800 (PST) X-Google-Smtp-Source: ALg8bN7KBgIEnkhvOl1bH0NPHAfeD6ezzjbdKQeddrpCLeD8m1BZBMCzQRRBBixC3Ug2P4PcuguVjA== X-Received: by 2002:aed:21d2:: with SMTP id m18mr13535957qtc.121.1547204932681; Fri, 11 Jan 2019 03:08:52 -0800 (PST) Received: from localhost.localdomain ([177.181.227.3]) by smtp.gmail.com with ESMTPSA id x202sm35844345qka.67.2019.01.11.03.08.50 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 11 Jan 2019 03:08:52 -0800 (PST) From: Mauricio Faria de Oliveira To: kernel-team@lists.ubuntu.com Subject: [SRU C][PATCH v2 1/6] blk-wbt: Avoid lock contention and thundering herd issue in wbt_wait Date: Fri, 11 Jan 2019 09:08:38 -0200 Message-Id: <20190111110843.18042-2-mfo@canonical.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190111110843.18042-1-mfo@canonical.com> References: <20190111110843.18042-1-mfo@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Anchal Agarwal BugLink: https://bugs.launchpad.net/bugs/1810998 I am currently running a large bare metal instance (i3.metal) on EC2 with 72 cores, 512GB of RAM and NVME drives, with a 4.18 kernel. 
I have a workload that simulates a database workload and I am running
into lockup issues when writeback throttling is enabled, with the hung
task detector also kicking in.

Crash dumps show that most CPUs (up to 50 of them) are all trying to get
the wbt wait queue lock while trying to add themselves to it in
__wbt_wait (see stack traces below).

[ 0.948118] CPU: 45 PID: 0 Comm: swapper/45 Not tainted 4.14.51-62.38.amzn1.x86_64 #1
[ 0.948119] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017
[ 0.948120] task: ffff883f7878c000 task.stack: ffffc9000c69c000
[ 0.948124] RIP: 0010:native_queued_spin_lock_slowpath+0xf8/0x1a0
[ 0.948125] RSP: 0018:ffff883f7fcc3dc8 EFLAGS: 00000046
[ 0.948126] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7fce2a00
[ 0.948128] RDX: 000000000000001c RSI: 0000000000740001 RDI: ffff887f7709ca68
[ 0.948129] RBP: 0000000000000002 R08: 0000000000b80000 R09: 0000000000000000
[ 0.948130] R10: ffff883f7fcc3d78 R11: 000000000de27121 R12: 0000000000000002
[ 0.948131] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000
[ 0.948132] FS: 0000000000000000(0000) GS:ffff883f7fcc0000(0000) knlGS:0000000000000000
[ 0.948134] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.948135] CR2: 000000c424c77000 CR3: 0000000002010005 CR4: 00000000003606e0
[ 0.948136] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.948137] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.948138] Call Trace:
[ 0.948139]
[ 0.948142] do_raw_spin_lock+0xad/0xc0
[ 0.948145] _raw_spin_lock_irqsave+0x44/0x4b
[ 0.948149] ? __wake_up_common_lock+0x53/0x90
[ 0.948150] __wake_up_common_lock+0x53/0x90
[ 0.948155] wbt_done+0x7b/0xa0
[ 0.948158] blk_mq_free_request+0xb7/0x110
[ 0.948161] __blk_mq_complete_request+0xcb/0x140
[ 0.948166] nvme_process_cq+0xce/0x1a0 [nvme]
[ 0.948169] nvme_irq+0x23/0x50 [nvme]
[ 0.948173] __handle_irq_event_percpu+0x46/0x300
[ 0.948176] handle_irq_event_percpu+0x20/0x50
[ 0.948179] handle_irq_event+0x34/0x60
[ 0.948181] handle_edge_irq+0x77/0x190
[ 0.948185] handle_irq+0xaf/0x120
[ 0.948188] do_IRQ+0x53/0x110
[ 0.948191] common_interrupt+0x87/0x87
[ 0.948192]
....
[ 0.311136] CPU: 4 PID: 9737 Comm: run_linux_amd64 Not tainted 4.14.51-62.38.amzn1.x86_64 #1
[ 0.311137] Hardware name: Amazon EC2 i3.metal/Not Specified, BIOS 1.0 10/16/2017
[ 0.311138] task: ffff883f6e6a8000 task.stack: ffffc9000f1ec000
[ 0.311141] RIP: 0010:native_queued_spin_lock_slowpath+0xf5/0x1a0
[ 0.311142] RSP: 0018:ffffc9000f1efa28 EFLAGS: 00000046
[ 0.311144] RAX: 0000000000000000 RBX: ffff887f7709ca68 RCX: ffff883f7f722a00
[ 0.311145] RDX: 0000000000000035 RSI: 0000000000d80001 RDI: ffff887f7709ca68
[ 0.311146] RBP: 0000000000000202 R08: 0000000000140000 R09: 0000000000000000
[ 0.311147] R10: ffffc9000f1ef9d8 R11: 000000001a249fa0 R12: ffff887f7709ca68
[ 0.311148] R13: ffffc9000f1efad0 R14: 0000000000000000 R15: ffff887f7709ca00
[ 0.311149] FS: 000000c423f30090(0000) GS:ffff883f7f700000(0000) knlGS:0000000000000000
[ 0.311150] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.311151] CR2: 00007feefcea4000 CR3: 0000007f7016e001 CR4: 00000000003606e0
[ 0.311152] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 0.311153] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 0.311154] Call Trace:
[ 0.311157] do_raw_spin_lock+0xad/0xc0
[ 0.311160] _raw_spin_lock_irqsave+0x44/0x4b
[ 0.311162] ? prepare_to_wait_exclusive+0x28/0xb0
[ 0.311164] prepare_to_wait_exclusive+0x28/0xb0
[ 0.311167] wbt_wait+0x127/0x330
[ 0.311169] ? finish_wait+0x80/0x80
[ 0.311172] ? generic_make_request+0xda/0x3b0
[ 0.311174] blk_mq_make_request+0xd6/0x7b0
[ 0.311176] ? blk_queue_enter+0x24/0x260
[ 0.311178] ? generic_make_request+0xda/0x3b0
[ 0.311181] generic_make_request+0x10c/0x3b0
[ 0.311183] ? submit_bio+0x5c/0x110
[ 0.311185] submit_bio+0x5c/0x110
[ 0.311197] ? __ext4_journal_stop+0x36/0xa0 [ext4]
[ 0.311210] ext4_io_submit+0x48/0x60 [ext4]
[ 0.311222] ext4_writepages+0x810/0x11f0 [ext4]
[ 0.311229] ? do_writepages+0x3c/0xd0
[ 0.311239] ? ext4_mark_inode_dirty+0x260/0x260 [ext4]
[ 0.311240] do_writepages+0x3c/0xd0
[ 0.311243] ? _raw_spin_unlock+0x24/0x30
[ 0.311245] ? wbc_attach_and_unlock_inode+0x165/0x280
[ 0.311248] ? __filemap_fdatawrite_range+0xa3/0xe0
[ 0.311250] __filemap_fdatawrite_range+0xa3/0xe0
[ 0.311253] file_write_and_wait_range+0x34/0x90
[ 0.311264] ext4_sync_file+0x151/0x500 [ext4]
[ 0.311267] do_fsync+0x38/0x60
[ 0.311270] SyS_fsync+0xc/0x10
[ 0.311272] do_syscall_64+0x6f/0x170
[ 0.311274] entry_SYSCALL_64_after_hwframe+0x42/0xb7

In the original patch, wbt_done is waking up all the exclusive
processes in the wait queue, which can cause a thundering herd if there
is a large number of writer threads in the queue. The original
intention of the code seems to be to wake up one thread only; however,
it uses wake_up_all() in __wbt_done(), and then uses the following
check in __wbt_wait to have only one thread actually get out of the
wait loop:

    if (waitqueue_active(&rqw->wait) &&
        rqw->wait.head.next != &wait->entry)
            return false;

The problem with this is that the wait entry in wbt_wait is defined
with DEFINE_WAIT, which uses the autoremove wakeup function. That means
that the above check is invalid - the wait entry will have been removed
from the queue already by the time we hit the check in the loop.

Secondly, auto-removing the wait entries also means that the wait queue
essentially gets reordered "randomly" (e.g. threads re-add themselves
in the order they got to run after being woken up).
Additionally, new requests entering wbt_wait might overtake requests
that were queued earlier, because the wait queue will be (temporarily)
empty after the wake_up_all, so the waitqueue_active check will not
stop them. This can cause certain threads to starve under high load.

The fix is to leave the woken up requests in the queue and remove them
in finish_wait() once the current thread breaks out of the wait loop in
__wbt_wait. This will ensure new requests always end up at the back of
the queue, and they won't overtake requests that are already in the
wait queue. With that change, the loop in wbt_wait is also in line with
many other wait loops in the kernel. Waking up just one thread
drastically reduces lock contention, as does moving the wait queue
add/remove out of the loop.

A significant drop in lockdep's lock contention numbers is seen when
running the test application on the patched kernel.

Signed-off-by: Anchal Agarwal
Signed-off-by: Frank van der Linden
Signed-off-by: Jens Axboe
(backported from commit 2887e41b910bb14fd847cf01ab7a5993db989d88)
[mfo: backport:
 - s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 55 +++++++++++++++++++++----------------------------
 1 file changed, 24 insertions(+), 31 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 4f89b28fa652..5733d3ab8ed5 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -186,7 +186,7 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
 		int diff = limit - inflight;
 
 		if (!inflight || diff >= rwb->wb_background / 2)
-			wake_up_all(&rqw->wait);
+			wake_up(&rqw->wait);
 	}
 }
 
@@ -533,30 +533,6 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
 	return limit;
 }
 
-static inline bool may_queue(struct rq_wb *rwb, struct rq_wait *rqw,
-			     wait_queue_entry_t *wait, unsigned long rw)
-{
-	/*
-	 * inc it here even if disabled, since we'll dec it at completion.
-	 * this only happens if the task was sleeping in __wbt_wait(),
-	 * and someone turned it off at the same time.
-	 */
-	if (!rwb_enabled(rwb)) {
-		atomic_inc(&rqw->inflight);
-		return true;
-	}
-
-	/*
-	 * If the waitqueue is already active and we are not the next
-	 * in line to be woken up, wait for our turn.
-	 */
-	if (waitqueue_active(&rqw->wait) &&
-	    rqw->wait.head.next != &wait->entry)
-		return false;
-
-	return atomic_inc_below(&rqw->inflight, get_limit(rwb, rw));
-}
-
 /*
  * Block if we will exceed our limit, or if we are currently waiting for
  * the timer to kick off queuing again.
@@ -567,16 +543,32 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 	__acquires(lock)
 {
 	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
-	DEFINE_WAIT(wait);
+	DECLARE_WAITQUEUE(wait, current);
+
+	/*
+	* inc it here even if disabled, since we'll dec it at completion.
+	* this only happens if the task was sleeping in __wbt_wait(),
+	* and someone turned it off at the same time.
+	*/
+	if (!rwb_enabled(rwb)) {
+		atomic_inc(&rqw->inflight);
+		return;
+	}
 
-	if (may_queue(rwb, rqw, &wait, rw))
+	if (!waitqueue_active(&rqw->wait)
+		&& atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 		return;
 
+	add_wait_queue_exclusive(&rqw->wait, &wait);
 	do {
-		prepare_to_wait_exclusive(&rqw->wait, &wait,
-						TASK_UNINTERRUPTIBLE);
+		set_current_state(TASK_UNINTERRUPTIBLE);
+
+		if (!rwb_enabled(rwb)) {
+			atomic_inc(&rqw->inflight);
+			break;
+		}
 
-		if (may_queue(rwb, rqw, &wait, rw))
+		if (atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 			break;
 
 		if (lock) {
@@ -587,7 +579,8 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 			io_schedule();
 	} while (1);
 
-	finish_wait(&rqw->wait, &wait);
+	__set_current_state(TASK_RUNNING);
+	remove_wait_queue(&rqw->wait, &wait);
 }
 
 static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)

From patchwork Fri Jan 11 11:08:39 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1023518
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [SRU C][PATCH v2 2/6] blk-wbt: move disable check into get_limit()
Date: Fri, 11 Jan 2019 09:08:39 -0200
Message-Id: <20190111110843.18042-3-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190111110843.18042-1-mfo@canonical.com>
References: <20190111110843.18042-1-mfo@canonical.com>

From: Jens Axboe

BugLink: https://bugs.launchpad.net/bugs/1810998

Check it in one place, instead of in multiple places.

Tested-by: Anchal Agarwal
Signed-off-by: Jens Axboe
(backported from commit ffa358dcaae1f2f00926484e712e06daa8953cb4)
[mfo: backport:
 - blk-wbt.c:
   - hunk 2: s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/
   - hunk 3: s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 5733d3ab8ed5..84e5cefbb3bb 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -508,6 +508,13 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
 {
 	unsigned int limit;
 
+	/*
+	 * If we got disabled, just return UINT_MAX. This ensures that
+	 * we'll properly inc a new IO, and dec+wakeup at the end.
+	 */
+	if (!rwb_enabled(rwb))
+		return UINT_MAX;
+
 	if ((rw & REQ_OP_MASK) == REQ_OP_DISCARD)
 		return rwb->wb_background;
 
@@ -545,16 +552,6 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
 	DECLARE_WAITQUEUE(wait, current);
 
-	/*
-	* inc it here even if disabled, since we'll dec it at completion.
-	* this only happens if the task was sleeping in __wbt_wait(),
-	* and someone turned it off at the same time.
-	*/
-	if (!rwb_enabled(rwb)) {
-		atomic_inc(&rqw->inflight);
-		return;
-	}
-
 	if (!waitqueue_active(&rqw->wait)
 		&& atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 		return;
@@ -563,11 +560,6 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 	do {
 		set_current_state(TASK_UNINTERRUPTIBLE);
 
-		if (!rwb_enabled(rwb)) {
-			atomic_inc(&rqw->inflight);
-			break;
-		}
-
 		if (atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 			break;
 

From patchwork Fri Jan 11 11:08:40 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1023520
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [SRU C][PATCH v2 3/6] blk-wbt: use wq_has_sleeper() for wq active check
Date: Fri, 11 Jan 2019 09:08:40 -0200
Message-Id: <20190111110843.18042-4-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190111110843.18042-1-mfo@canonical.com>
References: <20190111110843.18042-1-mfo@canonical.com>

From: Jens Axboe

BugLink: https://bugs.launchpad.net/bugs/1810998

We need the memory barrier before checking the list head; use the
appropriate helper for this. The matching queue side memory barrier is
provided by set_current_state().

Tested-by: Anchal Agarwal
Signed-off-by: Jens Axboe
(backported from commit b78820937b4762b7d30b807d7156bec1d89e4dd3)
[mfo: backport:
 - hunk 3: s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 84e5cefbb3bb..08472c1a7858 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -139,7 +139,7 @@ static void rwb_wake_all(struct rq_wb *rwb)
 	for (i = 0; i < WBT_NUM_RWQ; i++) {
 		struct rq_wait *rqw = &rwb->rq_wait[i];
 
-		if (waitqueue_active(&rqw->wait))
+		if (wq_has_sleeper(&rqw->wait))
 			wake_up_all(&rqw->wait);
 	}
 }
@@ -182,7 +182,7 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
 	if (inflight && inflight >= limit)
 		return;
 
-	if (waitqueue_active(&rqw->wait)) {
+	if (wq_has_sleeper(&rqw->wait)) {
 		int diff = limit - inflight;
 
 		if (!inflight || diff >= rwb->wb_background / 2)
@@ -552,8 +552,8 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
 	DECLARE_WAITQUEUE(wait, current);
 
-	if (!waitqueue_active(&rqw->wait)
-		&& atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
+	if (!wq_has_sleeper(&rqw->wait) &&
+	    atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 		return;
 
 	add_wait_queue_exclusive(&rqw->wait, &wait);

From patchwork Fri Jan 11 11:08:41 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1023521
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [SRU C][PATCH v2 4/6] blk-wbt: fix has-sleeper queueing check
Date: Fri, 11 Jan 2019 09:08:41 -0200
Message-Id: <20190111110843.18042-5-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190111110843.18042-1-mfo@canonical.com>
References: <20190111110843.18042-1-mfo@canonical.com>

From: Jens Axboe

BugLink: https://bugs.launchpad.net/bugs/1810998

We need to do this inside the loop as well, or we can allow new IO to
supersede previous IO.

Tested-by: Anchal Agarwal
Signed-off-by: Jens Axboe
(backported from commit c45e6a037a536530bd25781ac7c989e52deb2a63)
[mfo: backport:
 - hunk 1: s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index 08472c1a7858..d4f7a1bc1056 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -551,16 +551,17 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 {
 	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
 	DECLARE_WAITQUEUE(wait, current);
+	bool has_sleeper;
 
-	if (!wq_has_sleeper(&rqw->wait) &&
-	    atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
+	has_sleeper = wq_has_sleeper(&rqw->wait);
+	if (!has_sleeper && atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 		return;
 
 	add_wait_queue_exclusive(&rqw->wait, &wait);
 	do {
 		set_current_state(TASK_UNINTERRUPTIBLE);
 
-		if (atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
+		if (!has_sleeper && atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 			break;
 
 		if (lock) {
@@ -569,6 +570,7 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 			spin_lock_irq(lock);
 		} else
 			io_schedule();
+		has_sleeper = false;
 	} while (1);
 
 	__set_current_state(TASK_RUNNING);

From patchwork Fri Jan 11 11:08:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1023522
From: Mauricio Faria de Oliveira
To: kernel-team@lists.ubuntu.com
Subject: [SRU C][PATCH v2 5/6] blk-wbt: abstract out end IO completion handler
Date: Fri, 11 Jan 2019 09:08:42 -0200
Message-Id: <20190111110843.18042-6-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190111110843.18042-1-mfo@canonical.com>
References: <20190111110843.18042-1-mfo@canonical.com>

From: Jens Axboe

BugLink: https://bugs.launchpad.net/bugs/1810998

Prep patch for calling the handler from a different context; no
functional changes in this patch.

Tested-by: Agarwal, Anchal
Signed-off-by: Jens Axboe
(backported from commit 061a5427530633de93ace4ef001b99961984af62)
[mfo: backport: __wbt_done():
 - keep signature (not static; parameters; no 'rqos')
 - remove the cast to 'rwb' from 'rqos' (it doesn't exist).]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index d4f7a1bc1056..fe20486bd9b4 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -144,15 +144,11 @@ static void rwb_wake_all(struct rq_wb *rwb)
 	}
 }
 
-void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
+static void wbt_rqw_done(struct rq_wb *rwb, struct rq_wait *rqw,
+			 enum wbt_flags wb_acct)
 {
-	struct rq_wait *rqw;
 	int inflight, limit;
 
-	if (!(wb_acct & WBT_TRACKED))
-		return;
-
-	rqw = get_rq_wait(rwb, wb_acct);
 	inflight = atomic_dec_return(&rqw->inflight);
 
 	/*
@@ -190,6 +186,17 @@ void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
 	}
 }
 
+void __wbt_done(struct rq_wb *rwb, enum wbt_flags wb_acct)
+{
+	struct rq_wait *rqw;
+
+	if (!(wb_acct & WBT_TRACKED))
+		return;
+
+	rqw = get_rq_wait(rwb, wb_acct);
+	wbt_rqw_done(rwb, rqw, wb_acct);
+}
+
 /*
  * Called on completion of a request. Note that it's also called when
  * a request is merged, when the request gets freed.
From patchwork Fri Jan 11 11:08:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mauricio Faria de Oliveira X-Patchwork-Id: 1023523 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=canonical.com Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 43bg9T2fCjz9sBQ; Fri, 11 Jan 2019 22:09:17 +1100 (AEDT) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1ghugZ-0007Z3-UQ; Fri, 11 Jan 2019 11:09:11 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtps (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:128) (Exim 4.86_2) (envelope-from ) id 1ghugX-0007Vh-8y for kernel-team@lists.ubuntu.com; Fri, 11 Jan 2019 11:09:09 +0000 Received: from mail-qt1-f200.google.com ([209.85.160.200]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1ghugV-0004xn-IV for kernel-team@lists.ubuntu.com; Fri, 11 Jan 2019 11:09:07 +0000 Received: by mail-qt1-f200.google.com with SMTP id p24so16281150qtl.2 for ; Fri, 11 Jan 2019 03:09:07 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to :references; bh=GVOmrPwO8WL645o9xrcZTWQyxZiEejVnM4fHxG6XxkU=; b=plnO4+lCq44HgdOjgdREVKdX5hKZmGpn3FTzCq7oADCFJNkgAt8QErkrm0SuyYIcDS 
wKNKjC4AWVZXhq+YCvTbV7P3BPqJa0qANu1T447X9lVe4mTh5X6Kxru3/2b2lyj+5mru eKDYBObnw4ScH5rtnZNv31EHPqYxu0icbgHOixXJX3XOswaHNHsrO/IXUvthqfrBwdCg TM8pMY51Jk59iLy2lsxTAzxFM6+8bN+xFh5an1i12kNjmrLcYaXZqN4CzXuPMTKW4n/p VsNOSNHoP4wI9HfXZr324lD1kC28RAQUbQ6tjpx286bkh3VA57wj6xJFaCnA3eqoGCY7 DaUw== X-Gm-Message-State: AJcUukc5gNp/9O38q8gSJbb2PDNYa1ggZjkm9CnyINL5RZLDHCIkfafv FJ45Iz5j/garczCdTm5fLQETmYIrEyCvi0NZe+EHAViIsLlM6SWxZ/9TW3F97ddfluJ948f8mOv Aev7qepMAqVmY8Z8dlfiGmqMynjtCZLR2AoyGqCfUIQ== X-Received: by 2002:a37:553:: with SMTP id 80mr13009954qkf.200.1547204946513; Fri, 11 Jan 2019 03:09:06 -0800 (PST) X-Google-Smtp-Source: ALg8bN6IsrR69q0BEV0gz9GgP1/YwyDEByF53loSgYP7PL1Q55+iqYOjw3XACWWLKBv+gMZZQbLUYw== X-Received: by 2002:a37:553:: with SMTP id 80mr13009942qkf.200.1547204946296; Fri, 11 Jan 2019 03:09:06 -0800 (PST) Received: from localhost.localdomain ([177.181.227.3]) by smtp.gmail.com with ESMTPSA id x202sm35844345qka.67.2019.01.11.03.09.03 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 11 Jan 2019 03:09:05 -0800 (PST) From: Mauricio Faria de Oliveira To: kernel-team@lists.ubuntu.com Subject: [SRU C][PATCH v2 6/6] blk-wbt: improve waking of tasks Date: Fri, 11 Jan 2019 09:08:43 -0200 Message-Id: <20190111110843.18042-7-mfo@canonical.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190111110843.18042-1-mfo@canonical.com> References: <20190111110843.18042-1-mfo@canonical.com> X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Jens Axboe BugLink: https://bugs.launchpad.net/bugs/1810998 We have two potential issues: 1) After commit 2887e41b910b, we only wake one process at the time when we finish an IO. We really want to wake up as many tasks as can queue IO. 
Before this commit, we woke up everyone, which could cause a
thundering herd issue.

2) A task can potentially consume two wakeups, causing us to (in
   practice) miss a wakeup.

Fix both by providing our own wakeup function, which stops
__wake_up_common() from waking up more tasks if we fail to get a
queueing token. With the strict ordering we have on the wait list, this
wakes the right tasks and the right amount of tasks.

Based on a patch from Jianchao Wang.

Tested-by: Agarwal, Anchal
Signed-off-by: Jens Axboe
(backported from commit 38cfb5a45ee013bfab5d1ae4c4738815e744b440)
[mfo: backport:
 - hunk 2: s/rq_wait_inc_below(data->rqw/atomic_inc_below(&data->rqw->inflight/
 - hunk 3: s/rq_wait_inc_below(rqw/atomic_inc_below(&rqw->inflight/]
Signed-off-by: Mauricio Faria de Oliveira
---
 block/blk-wbt.c | 63 +++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 7 deletions(-)

diff --git a/block/blk-wbt.c b/block/blk-wbt.c
index fe20486bd9b4..e9efcfc3a0d5 100644
--- a/block/blk-wbt.c
+++ b/block/blk-wbt.c
@@ -182,7 +182,7 @@ static void wbt_rqw_done(struct rq_wb *rwb, struct rq_wait *rqw,
 		int diff = limit - inflight;
 
 		if (!inflight || diff >= rwb->wb_background / 2)
-			wake_up(&rqw->wait);
+			wake_up_all(&rqw->wait);
 	}
 }
 
@@ -547,6 +547,34 @@ static inline unsigned int get_limit(struct rq_wb *rwb, unsigned long rw)
 	return limit;
 }
 
+struct wbt_wait_data {
+	struct wait_queue_entry wq;
+	struct task_struct *task;
+	struct rq_wb *rwb;
+	struct rq_wait *rqw;
+	unsigned long rw;
+	bool got_token;
+};
+
+static int wbt_wake_function(struct wait_queue_entry *curr, unsigned int mode,
+			     int wake_flags, void *key)
+{
+	struct wbt_wait_data *data = container_of(curr, struct wbt_wait_data,
+							wq);
+
+	/*
+	 * If we fail to get a budget, return -1 to interrupt the wake up
+	 * loop in __wake_up_common.
+	 */
+	if (!atomic_inc_below(&data->rqw->inflight, get_limit(data->rwb, data->rw)))
+		return -1;
+
+	data->got_token = true;
+	list_del_init(&curr->entry);
+	wake_up_process(data->task);
+	return 1;
+}
+
 /*
  * Block if we will exceed our limit, or if we are currently waiting for
  * the timer to kick off queuing again.
@@ -557,19 +585,40 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 	__acquires(lock)
 {
 	struct rq_wait *rqw = get_rq_wait(rwb, wb_acct);
-	DECLARE_WAITQUEUE(wait, current);
+	struct wbt_wait_data data = {
+		.wq = {
+			.func	= wbt_wake_function,
+			.entry	= LIST_HEAD_INIT(data.wq.entry),
+		},
+		.task = current,
+		.rwb = rwb,
+		.rqw = rqw,
+		.rw = rw,
+	};
 	bool has_sleeper;
 
 	has_sleeper = wq_has_sleeper(&rqw->wait);
 	if (!has_sleeper && atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
 		return;
 
-	add_wait_queue_exclusive(&rqw->wait, &wait);
+	prepare_to_wait_exclusive(&rqw->wait, &data.wq, TASK_UNINTERRUPTIBLE);
 	do {
-		set_current_state(TASK_UNINTERRUPTIBLE);
+		if (data.got_token)
+			break;
 
-		if (!has_sleeper && atomic_inc_below(&rqw->inflight, get_limit(rwb, rw)))
+		if (!has_sleeper &&
+		    atomic_inc_below(&rqw->inflight, get_limit(rwb, rw))) {
+			finish_wait(&rqw->wait, &data.wq);
+
+			/*
+			 * We raced with wbt_wake_function() getting a token,
+			 * which means we now have two. Put our local token
+			 * and wake anyone else potentially waiting for one.
+			 */
+			if (data.got_token)
+				wbt_rqw_done(rwb, rqw, wb_acct);
 			break;
+		}
 
 		if (lock) {
 			spin_unlock_irq(lock);
@@ -577,11 +626,11 @@ static void __wbt_wait(struct rq_wb *rwb, enum wbt_flags wb_acct,
 			spin_lock_irq(lock);
 		} else
 			io_schedule();
+
 		has_sleeper = false;
 	} while (1);
 
-	__set_current_state(TASK_RUNNING);
-	remove_wait_queue(&rqw->wait, &wait);
+	finish_wait(&rqw->wait, &data.wq);
 }
 
 static inline bool wbt_should_throttle(struct rq_wb *rwb, struct bio *bio)
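
Illustration only, not part of the patch: the wakeup policy described in the commit message above can be modelled in userspace roughly as follows. This is a minimal sketch that assumes a plain array stands in for the ordered wait list and C11 atomics stand in for the kernel's atomic_t. The names inc_below(), struct waiter, wake_one() and wake_in_order() are hypothetical and do not exist in the kernel. The walk stops at the first waiter that cannot take a token, which is the role the -1 return plays for __wake_up_common() in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for atomic_inc_below(): increment *v
 * only if its current value is below 'below'. Returns true on success.
 */
static bool inc_below(atomic_int *v, int below)
{
	int cur = atomic_load(v);

	do {
		if (cur >= below)
			return false;
		/* On failure, 'cur' is reloaded and the limit is rechecked. */
	} while (!atomic_compare_exchange_weak(v, &cur, cur + 1));

	return true;
}

/* Hypothetical model of one queued waiter (cf. struct wbt_wait_data). */
struct waiter {
	atomic_int *inflight;	/* shared inflight counter */
	int limit;		/* queue depth limit for this waiter */
	bool got_token;		/* set once a token has been taken */
};

/* Model of the wake callback: take a token, or ask the caller to stop. */
static int wake_one(struct waiter *w)
{
	if (!inc_below(w->inflight, w->limit))
		return -1;	/* no budget left: interrupt the wake loop */

	w->got_token = true;
	return 1;
}

/*
 * Model of the wake loop: walk waiters in queue order and stop at the
 * first one that fails to get a token. Returns how many were woken.
 */
static size_t wake_in_order(struct waiter *ws, size_t n)
{
	size_t woken = 0;

	assert(ws != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (wake_one(&ws[i]) < 0)
			break;
		woken++;
	}

	return woken;
}
```

With one request already in flight and a limit of 3, calling wake_in_order() on four waiters hands out two tokens and leaves the remaining two waiters untouched, so only as many tasks are woken as can actually queue IO.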