From patchwork Fri Nov 2 09:00:05 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992208
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 1/6] xen/manage: keep track of the on-going suspend mode
Date: Fri, 2 Nov 2018 20:00:05 +1100
Message-Id: <20181102090010.2643-2-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>
From: Munehisa Kamata

BugLink: https://bugs.launchpad.net/bugs/1801305

To differentiate between Xen suspend, PM suspend and PM hibernation, keep track of the on-going suspend mode, mainly by using a new PM notifier. Since Xen suspend doesn't have a corresponding PM event, its main logic is modified to acquire pm_mutex and set the current mode.

Note that we may see a deadlock if PM suspend/hibernation is interrupted by Xen suspend. PM suspend/hibernation depends on the xenwatch thread to process xenbus state transactions, but in that scenario the thread will sleep waiting for pm_mutex, which is already held by the PM suspend/hibernation context. Even so, acquiring pm_mutex is still the right thing to do, and we would need to modify the Xen shutdown code to avoid the issue. This will be fixed by a separate patch.

Signed-off-by: Munehisa Kamata
Signed-off-by: Anchal Agarwal
Reviewed-by: Sebastian Biemueller
Reviewed-by: Munehisa Kamata
Reviewed-by: Eduardo Valentin
CR: https://cr.amazon.com/r/8273194/
(cherry-picked from 0013-xen-manage-keep-track-of-the-on-going-suspend-mode.patch in the AWS 4.14 kernel SRPM)
Signed-off-by: Daniel Axtens
---
 drivers/xen/manage.c | 58 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 58 insertions(+)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 8835065029d3..8f9ea87ba93e 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -39,6 +40,16 @@ enum shutdown_state {
 /* Ignore multiple shutdown requests. */
 static enum shutdown_state shutting_down = SHUTDOWN_INVALID;
 
+enum suspend_modes {
+	NO_SUSPEND = 0,
+	XEN_SUSPEND,
+	PM_SUSPEND,
+	PM_HIBERNATION,
+};
+
+/* Protected by pm_mutex */
+static enum suspend_modes suspend_mode = NO_SUSPEND;
+
 struct suspend_info {
 	int cancelled;
 };
@@ -98,6 +109,10 @@ static void do_suspend(void)
 	int err;
 	struct suspend_info si;
 
+	lock_system_sleep();
+
+	suspend_mode = XEN_SUSPEND;
+
 	shutting_down = SHUTDOWN_SUSPEND;
 
 	err = freeze_processes();
@@ -161,6 +176,10 @@ static void do_suspend(void)
 	thaw_processes();
 out:
 	shutting_down = SHUTDOWN_INVALID;
+
+	suspend_mode = NO_SUSPEND;
+
+	unlock_system_sleep();
 }
 #endif /* CONFIG_HIBERNATE_CALLBACKS */
@@ -372,3 +391,42 @@ int xen_setup_shutdown_event(void)
 EXPORT_SYMBOL_GPL(xen_setup_shutdown_event);
 
 subsys_initcall(xen_setup_shutdown_event);
+
+static int xen_pm_notifier(struct notifier_block *notifier,
+			   unsigned long pm_event, void *unused)
+{
+	switch (pm_event) {
+	case PM_SUSPEND_PREPARE:
+		suspend_mode = PM_SUSPEND;
+		break;
+	case PM_HIBERNATION_PREPARE:
+	case PM_RESTORE_PREPARE:
+		suspend_mode = PM_HIBERNATION;
+		break;
+	case PM_POST_SUSPEND:
+	case PM_POST_RESTORE:
+	case PM_POST_HIBERNATION:
+		/* Set back to the default */
+		suspend_mode = NO_SUSPEND;
+		break;
+	default:
+		pr_warn("Receive unknown PM event 0x%lx\n", pm_event);
+		return -EINVAL;
+	}
+
+	return 0;
+};
+
+static struct notifier_block xen_pm_notifier_block = {
+	.notifier_call = xen_pm_notifier
+};
+
+static int xen_setup_pm_notifier(void)
+{
+	if (!xen_hvm_domain())
+		return -ENODEV;
+
+	return register_pm_notifier(&xen_pm_notifier_block);
+}
+
+subsys_initcall(xen_setup_pm_notifier);
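For readers following along, the PM-notifier mechanism the patch relies on is the stock kernel API. Below is a minimal, self-contained sketch of the same registration pattern; the module and message names (demo_*) are invented for illustration and are not part of the patch:

	/* Hedged sketch: a standalone module using the same PM-notifier
	 * pattern as the patch above. demo_* names are illustrative. */
	#include <linux/module.h>
	#include <linux/suspend.h>

	static int demo_pm_notifier(struct notifier_block *nb,
				    unsigned long pm_event, void *unused)
	{
		switch (pm_event) {
		case PM_SUSPEND_PREPARE:	/* entering PM suspend */
		case PM_HIBERNATION_PREPARE:	/* entering hibernation */
		case PM_RESTORE_PREPARE:	/* resuming from an image */
			pr_info("demo: transition starting (0x%lx)\n", pm_event);
			break;
		case PM_POST_SUSPEND:
		case PM_POST_HIBERNATION:
		case PM_POST_RESTORE:
			pr_info("demo: transition finished (0x%lx)\n", pm_event);
			break;
		}
		return NOTIFY_OK;
	}

	static struct notifier_block demo_pm_nb = {
		.notifier_call = demo_pm_notifier,
	};

	static int __init demo_init(void)
	{
		return register_pm_notifier(&demo_pm_nb);
	}

	static void __exit demo_exit(void)
	{
		unregister_pm_notifier(&demo_pm_nb);
	}

	module_init(demo_init);
	module_exit(demo_exit);
	MODULE_LICENSE("GPL");

One stylistic note: notifier callbacks conventionally return NOTIFY_* values or notifier_from_errno(); returning a raw -EINVAL for unknown events, as the patch does, can decode oddly through notifier_to_errno(), which reviewers may want to double-check.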
From patchwork Fri Nov 2 09:00:06 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992209
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 2/6] xen/manage: introduce helper function to know the on-going suspend mode
Date: Fri, 2 Nov 2018 20:00:06 +1100
Message-Id: <20181102090010.2643-3-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>
From: Munehisa Kamata

BugLink: https://bugs.launchpad.net/bugs/1801305

Introduce simple functions which report the on-going suspend mode so that other Xen-related code can behave differently according to the current suspend mode.

Signed-off-by: Munehisa Kamata
Signed-off-by: Anchal Agarwal
Reviewed-by: Alakesh Haloi
Reviewed-by: Sebastian Biemueller
Reviewed-by: Munehisa Kamata
Reviewed-by: Eduardo Valentin
CR: https://cr.amazon.com/r/8273190/
(cherry-picked from 0014-xen-manage-introduce-helper-function-to-know-the-on-.patch in the AWS 4.14 kernel SRPM)
Signed-off-by: Daniel Axtens
---
 drivers/xen/manage.c  | 15 +++++++++++++++
 include/xen/xen-ops.h |  4 ++++
 2 files changed, 19 insertions(+)

diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 8f9ea87ba93e..326631d9b80a 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -50,6 +50,21 @@ enum suspend_modes {
 /* Protected by pm_mutex */
 static enum suspend_modes suspend_mode = NO_SUSPEND;
 
+bool xen_suspend_mode_is_xen_suspend(void)
+{
+	return suspend_mode == XEN_SUSPEND;
+}
+
+bool xen_suspend_mode_is_pm_suspend(void)
+{
+	return suspend_mode == PM_SUSPEND;
+}
+
+bool xen_suspend_mode_is_pm_hibernation(void)
+{
+	return suspend_mode == PM_HIBERNATION;
+}
+
 struct suspend_info {
 	int cancelled;
 };
diff --git a/include/xen/xen-ops.h b/include/xen/xen-ops.h
index fd23e42c6024..be78f6f29f32 100644
--- a/include/xen/xen-ops.h
+++ b/include/xen/xen-ops.h
@@ -39,6 +39,10 @@ u64 xen_steal_clock(int cpu);
 
 int xen_setup_shutdown_event(void);
 
+bool xen_suspend_mode_is_xen_suspend(void);
+bool xen_suspend_mode_is_pm_suspend(void);
+bool xen_suspend_mode_is_pm_hibernation(void);
+
 extern unsigned long *xen_contiguous_bitmap;
 
 #ifdef CONFIG_XEN_PV
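As a usage illustration (not part of the series), a hypothetical Xen frontend could branch on these helpers in its suspend path. Everything named example_* below is invented:

	/* Hedged sketch: a hypothetical consumer of the new mode helpers.
	 * example_* names are invented for illustration. */
	#include <xen/xenbus.h>
	#include <xen/xen-ops.h>

	static int example_quiesce_ring(struct xenbus_device *dev)
	{
		return 0;	/* stub: keep rings, just stop producing requests */
	}

	static int example_full_disconnect(struct xenbus_device *dev)
	{
		return 0;	/* stub: free rings/grants that won't survive an image */
	}

	static int example_frontend_suspend(struct xenbus_device *dev)
	{
		/* Xen suspend preserves guest memory; hibernation does not. */
		if (xen_suspend_mode_is_xen_suspend())
			return example_quiesce_ring(dev);

		return example_full_disconnect(dev);
	}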
From patchwork Fri Nov 2 09:00:07 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992210
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 3/6] xenbus: add freeze/thaw/restore callbacks support
Date: Fri, 2 Nov 2018 20:00:07 +1100
Message-Id: <20181102090010.2643-4-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>

From: Munehisa Kamata

BugLink: https://bugs.launchpad.net/bugs/1801305

Since commit b3e96c0c7562 ("xen: use freeze/restore/thaw PM events for suspend/resume/chkpt"), xenbus has used the PMSG_FREEZE, PMSG_THAW and PMSG_RESTORE events for Xen suspend. However, they are actually mapped to xenbus_dev_suspend(), xenbus_dev_cancel() and xenbus_dev_resume() respectively, and only suspend and resume callbacks are supported at the driver level. To support PM suspend and PM hibernation, modify the bus-level PM callbacks to invoke not only the device driver's suspend/resume but also freeze/thaw/restore.

Note that we'll use the freeze/restore callbacks even for PM suspend, where suspend/resume callbacks would normally be used, because the existing xenbus device drivers already have suspend/resume callbacks specifically designed for Xen suspend. This allows the device drivers to keep their existing callbacks without modification.
Signed-off-by: Munehisa Kamata
Signed-off-by: Anchal Agarwal
Reviewed-by: Munehisa Kamata
Reviewed-by: Eduardo Valentin
CR: https://cr.amazon.com/r/8273200/
(cherry-picked from 0015-xenbus-add-freeze-thaw-restore-callbacks-support.patch in the AWS 4.14 kernel SRPM)
Signed-off-by: Daniel Axtens
---
 drivers/xen/xenbus/xenbus_probe.c | 102 +++++++++++++++++++++++++-----
 include/xen/xenbus.h              |   3 +
 2 files changed, 89 insertions(+), 16 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index ec9eb4fba59c..95b0a6d0acce 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -49,6 +49,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -588,26 +589,47 @@ int xenbus_dev_suspend(struct device *dev)
 	struct xenbus_driver *drv;
 	struct xenbus_device *xdev
 		= container_of(dev, struct xenbus_device, dev);
+	int (*cb)(struct xenbus_device *) = NULL;
+	bool xen_suspend = xen_suspend_mode_is_xen_suspend();
 
 	DPRINTK("%s", xdev->nodename);
 
 	if (dev->driver == NULL)
 		return 0;
 	drv = to_xenbus_driver(dev->driver);
-	if (drv->suspend)
-		err = drv->suspend(xdev);
-	if (err)
-		pr_warn("suspend %s failed: %i\n", dev_name(dev), err);
+
+	if (xen_suspend)
+		cb = drv->suspend;
+	else
+		cb = drv->freeze;
+
+	if (cb)
+		err = cb(xdev);
+
+	if (err) {
+		pr_warn("%s %s failed: %i\n", xen_suspend ?
+			"suspend" : "freeze", dev_name(dev), err);
+		return err;
+	}
+
+	if (!xen_suspend) {
+		/* Forget otherend since this can become stale after restore */
+		free_otherend_watch(xdev);
+		free_otherend_details(xdev);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_suspend);
 
 int xenbus_dev_resume(struct device *dev)
 {
-	int err;
+	int err = 0;
 	struct xenbus_driver *drv;
 	struct xenbus_device *xdev
 		= container_of(dev, struct xenbus_device, dev);
+	int (*cb)(struct xenbus_device *) = NULL;
+	bool xen_suspend = xen_suspend_mode_is_xen_suspend();
 
 	DPRINTK("%s", xdev->nodename);
 
@@ -616,24 +638,34 @@ int xenbus_dev_resume(struct device *dev)
 	drv = to_xenbus_driver(dev->driver);
 	err = talk_to_otherend(xdev);
 	if (err) {
-		pr_warn("resume (talk_to_otherend) %s failed: %i\n",
+		pr_warn("%s (talk_to_otherend) %s failed: %i\n",
+			xen_suspend ? "resume" : "restore",
 			dev_name(dev), err);
 		return err;
 	}
 
-	xdev->state = XenbusStateInitialising;
+	if (xen_suspend)
+		xdev->state = XenbusStateInitialising;
 
-	if (drv->resume) {
-		err = drv->resume(xdev);
-		if (err) {
-			pr_warn("resume %s failed: %i\n", dev_name(dev), err);
-			return err;
-		}
+	if (xen_suspend)
+		cb = drv->resume;
+	else
+		cb = drv->restore;
+
+	if (cb)
+		err = cb(xdev);
+
+	if (err) {
+		pr_warn("%s %s failed: %i\n",
+			xen_suspend ? "resume" : "restore",
+			dev_name(dev), err);
+		return err;
 	}
 
 	err = watch_otherend(xdev);
 	if (err) {
-		pr_warn("resume (watch_otherend) %s failed: %d.\n",
+		pr_warn("%s (watch_otherend) %s failed: %d.\n",
+			xen_suspend ? "resume" : "restore",
 			dev_name(dev), err);
 		return err;
 	}
@@ -644,8 +676,46 @@ EXPORT_SYMBOL_GPL(xenbus_dev_resume);
 
 int xenbus_dev_cancel(struct device *dev)
 {
-	/* Do nothing */
-	DPRINTK("cancel");
+	int err = 0;
+	struct xenbus_driver *drv;
+	struct xenbus_device *xdev
+		= container_of(dev, struct xenbus_device, dev);
+	bool xen_suspend = xen_suspend_mode_is_xen_suspend();
+
+	if (xen_suspend) {
+		/* Do nothing */
+		DPRINTK("cancel");
+		return 0;
+	}
+
+	DPRINTK("%s", xdev->nodename);
+
+	if (dev->driver == NULL)
+		return 0;
+	drv = to_xenbus_driver(dev->driver);
+
+	err = talk_to_otherend(xdev);
+	if (err) {
+		pr_warn("thaw (talk_to_otherend) %s failed: %d.\n",
+			dev_name(dev), err);
+		return err;
+	}
+
+	if (drv->thaw) {
+		err = drv->thaw(xdev);
+		if (err) {
+			pr_warn("thaw %s failed: %i\n", dev_name(dev), err);
+			return err;
+		}
+	}
+
+	err = watch_otherend(xdev);
+	if (err) {
+		pr_warn("thaw (watch_otherend) %s failed: %d.\n",
+			dev_name(dev), err);
+		return err;
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(xenbus_dev_cancel);
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 869c816d5f8c..20261d5f4e78 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -100,6 +100,9 @@ struct xenbus_driver {
 	int (*remove)(struct xenbus_device *dev);
 	int (*suspend)(struct xenbus_device *dev);
 	int (*resume)(struct xenbus_device *dev);
+	int (*freeze)(struct xenbus_device *dev);
+	int (*thaw)(struct xenbus_device *dev);
+	int (*restore)(struct xenbus_device *dev);
 	int (*uevent)(struct xenbus_device *, struct kobj_uevent_env *);
 	struct device_driver driver;
 	int (*read_otherend_details)(struct xenbus_device *dev);
From patchwork Fri Nov 2 09:00:08 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992211
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 4/6] xen-blkfront: add callbacks for PM suspend and hibernation
Date: Fri, 2 Nov 2018 20:00:08 +1100
Message-Id: <20181102090010.2643-5-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>

From: Munehisa Kamata

BugLink: https://bugs.launchpad.net/bugs/1801305

Add freeze and restore callbacks for PM suspend and hibernation support. The freeze handler stops the block-layer queue and disconnects the frontend from the backend while freeing ring_info and associated resources. The restore handler re-allocates ring_info and re-connects to the backend, so the rest of the kernel can continue to use the block device transparently. Also, the handlers are used for both PM suspend and hibernation so that we can keep the existing suspend/resume callbacks for Xen suspend without modification.

If a backend doesn't have commit 12ea729645ac ("xen/blkback: unmap all persistent grants when frontend gets disconnected"), the frontend may see a massive number of grant-table warnings when freeing resources:

[   36.852659] deferring g.e. 0xf9 (pfn 0xffffffffffffffff)
[   36.855089] xen:grant_table: WARNING: g.e. 0x112 still in use!

In this case, persistent grants would need to be disabled.

Ensure there are no reqs/rsps in the rings before disconnecting. When disconnecting the frontend from the backend in blkfront_freeze(), there may still be unconsumed requests or responses in the rings, especially when the backend is backed by a network-based device. If the frontend gets disconnected with such reqs/rsps remaining, it can cause grant warnings and/or lose reqs/rsps by freeing pages afterwards. This can leave the resumed kernel in an unrecoverable state, such as an unexpected freeing of a grant page and/or a hung task due to the lost reqs or rsps. Therefore we have to ensure that there are no unconsumed requests or responses before disconnecting.
Actually, the frontend just needs to wait for some amount of time so that the backend can process the requests, post the responses and notify the frontend back. The timeout used here is based on a heuristic. If we somehow hit the timeout, it would mean something serious has happened in the backend; the frontend will just return an error to the PM core, and PM suspend/hibernation will be aborted. This may be something that should be fixed on the backend side, but a frontend-side fix is probably still worth doing so that it works with a broader range of backends.

Backport note: unlike the 4.9 kernel, blk-mq is the default for the 4.14 kernel, and the request-based mode code is not included in this frontend driver.

Signed-off-by: Munehisa Kamata
Signed-off-by: Anchal Agarwal
Reviewed-by: Munehisa Kamata
Reviewed-by: Eduardo Valentin
CR: https://cr.amazon.com/r/8297625/
(cherry-picked from 0018-xen-blkfront-add-callbacks-for-PM-suspend-and-hibern.patch in the AWS 4.14 kernel SRPM)
Signed-off-by: Daniel Axtens
---
 drivers/block/xen-blkfront.c | 164 +++++++++++++++++++++++++++++++++--
 1 file changed, 156 insertions(+), 8 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 7d23225f79ed..e410535540f4 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -46,6 +46,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
@@ -78,6 +80,8 @@ enum blkif_state {
 	BLKIF_STATE_DISCONNECTED,
 	BLKIF_STATE_CONNECTED,
 	BLKIF_STATE_SUSPENDED,
+	BLKIF_STATE_FREEZING,
+	BLKIF_STATE_FROZEN
 };
 
 struct grant {
@@ -217,6 +221,7 @@ struct blkfront_info
 	/* Save uncomplete reqs and bios for migration. */
 	struct list_head requests;
 	struct bio_list bio_list;
+	struct completion wait_backend_disconnected;
 };
 
 static unsigned int nr_minors;
@@ -263,6 +268,16 @@ static DEFINE_SPINLOCK(minor_lock);
 static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo);
 static void blkfront_gather_backend_features(struct blkfront_info *info);
 static int negotiate_mq(struct blkfront_info *info);
+static void __blkif_free(struct blkfront_info *info);
+
+static inline bool blkfront_ring_is_busy(struct blkif_front_ring *ring)
+{
+	if (RING_SIZE(ring) > RING_FREE_REQUESTS(ring) ||
+	    RING_HAS_UNCONSUMED_RESPONSES(ring))
+		return true;
+	else
+		return false;
+}
 
 static int get_id_from_freelist(struct blkfront_ring_info *rinfo)
 {
@@ -997,6 +1012,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
 	info->sector_size = sector_size;
 	info->physical_sector_size = physical_sector_size;
 	blkif_set_queue_limits(info);
+	init_completion(&info->wait_backend_disconnected);
 
 	return 0;
 }
@@ -1220,6 +1236,8 @@ static void xlvbd_release_gendisk(struct blkfront_info *info)
 /* Already hold rinfo->ring_lock. */
 static inline void kick_pending_request_queues_locked(struct blkfront_ring_info *rinfo)
 {
+	if (unlikely(rinfo->dev_info->connected == BLKIF_STATE_FREEZING))
+		return;
 	if (!RING_FULL(&rinfo->ring))
 		blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true);
 }
@@ -1343,8 +1361,6 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo)
 
 static void blkif_free(struct blkfront_info *info, int suspend)
 {
-	unsigned int i;
-
 	/* Prevent new requests being issued until we fix things up. */
 	info->connected = suspend ?
BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED; @@ -1352,6 +1368,13 @@ static void blkif_free(struct blkfront_info *info, int suspend) if (info->rq) blk_mq_stop_hw_queues(info->rq); + __blkif_free(info); +} + +static void __blkif_free(struct blkfront_info *info) +{ + unsigned int i; + for (i = 0; i < info->nr_rings; i++) blkif_free_ring(&info->rinfo[i]); @@ -1555,8 +1578,10 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id; struct blkfront_info *info = rinfo->dev_info; - if (unlikely(info->connected != BLKIF_STATE_CONNECTED)) - return IRQ_HANDLED; + if (unlikely(info->connected != BLKIF_STATE_CONNECTED)) { + if (info->connected != BLKIF_STATE_FREEZING) + return IRQ_HANDLED; + } spin_lock_irqsave(&rinfo->ring_lock, flags); again: @@ -2006,6 +2031,7 @@ static int blkif_recover(struct blkfront_info *info) struct bio *bio; unsigned int segs; + bool frozen = info->connected == BLKIF_STATE_FROZEN; blkfront_gather_backend_features(info); /* Reset limits changed by blk_mq_update_nr_hw_queues(). */ blkif_set_queue_limits(info); @@ -2032,6 +2058,9 @@ static int blkif_recover(struct blkfront_info *info) kick_pending_request_queues(rinfo); } + if (frozen) + return 0; + list_for_each_entry_safe(req, n, &info->requests, queuelist) { /* Requeue pending requests (flush or discard) */ list_del_init(&req->queuelist); @@ -2336,6 +2365,7 @@ static void blkfront_connect(struct blkfront_info *info) return; case BLKIF_STATE_SUSPENDED: + case BLKIF_STATE_FROZEN: /* * If we are recovering from suspension, we need to wait * for the backend to announce it's features before @@ -2453,13 +2483,38 @@ static void blkback_changed(struct xenbus_device *dev, break; case XenbusStateClosed: - if (dev->state == XenbusStateClosed) + if (dev->state == XenbusStateClosed) { + if (info->connected == BLKIF_STATE_FREEZING) { + __blkif_free(info); + info->connected = BLKIF_STATE_FROZEN; + complete(&info->wait_backend_disconnected); + break; + } + break; + } + + /* + * We may somehow receive backend's Closed again while thawing + * or restoring and it causes thawing or restoring to fail. + * Ignore such unexpected state anyway. 
+ */ + if (info->connected == BLKIF_STATE_FROZEN && + dev->state == XenbusStateInitialised) { + dev_dbg(&dev->dev, + "ignore the backend's Closed state: %s", + dev->nodename); + break; + } /* fall through */ case XenbusStateClosing: - if (info) - blkfront_closing(info); - break; + if (info) { + if (info->connected == BLKIF_STATE_FREEZING) + xenbus_frontend_closed(dev); + else + blkfront_closing(info); + } + break; } } @@ -2595,6 +2650,96 @@ static void blkif_release(struct gendisk *disk, fmode_t mode) mutex_unlock(&blkfront_mutex); } +static int blkfront_freeze(struct xenbus_device *dev) +{ + unsigned int i; + struct blkfront_info *info = dev_get_drvdata(&dev->dev); + struct blkfront_ring_info *rinfo; + struct blkif_front_ring *ring; + /* This would be reasonable timeout as used in xenbus_dev_shutdown() */ + unsigned int timeout = 5 * HZ; + int err = 0; + + info->connected = BLKIF_STATE_FREEZING; + + blk_mq_stop_hw_queues(info->rq); + + for (i = 0; i < info->nr_rings; i++) { + rinfo = &info->rinfo[i]; + + gnttab_cancel_free_callback(&rinfo->callback); + flush_work(&rinfo->work); + } + + for (i = 0; i < info->nr_rings; i++) { + spinlock_t *lock; + bool busy; + unsigned long req_timeout_ms = 25; + unsigned long ring_timeout; + + rinfo = &info->rinfo[i]; + ring = &rinfo->ring; + + lock = &rinfo->ring_lock; + + ring_timeout = jiffies + + msecs_to_jiffies(req_timeout_ms * RING_SIZE(ring)); + + do { + spin_lock_irq(lock); + busy = blkfront_ring_is_busy(ring); + spin_unlock_irq(lock); + + if (busy) + msleep(req_timeout_ms); + else + break; + } while (time_is_after_jiffies(ring_timeout)); + + /* Timed out */ + if (busy) { + xenbus_dev_error(dev, err, "the ring is still busy"); + info->connected = BLKIF_STATE_CONNECTED; + return -EBUSY; + } + } + + /* Kick the backend to disconnect */ + xenbus_switch_state(dev, XenbusStateClosing); + + /* + * We don't want to move forward before the frontend is diconnected + * from the backend cleanly. 
+	 */
+	timeout = wait_for_completion_timeout(&info->wait_backend_disconnected,
+					      timeout);
+	if (!timeout) {
+		err = -EBUSY;
+		xenbus_dev_error(dev, err, "Freezing timed out;"
+				 "the device may become inconsistent state");
+	}
+
+	return err;
+}
+
+static int blkfront_restore(struct xenbus_device *dev)
+{
+	struct blkfront_info *info = dev_get_drvdata(&dev->dev);
+	int err = 0;
+
+	err = negotiate_mq(info);
+	if (err)
+		goto out;
+
+	err = talk_to_blkback(dev, info);
+	if (err)
+		goto out;
+	blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings);
+
+out:
+	return err;
+}
+
 static const struct block_device_operations xlvbd_block_fops =
 {
 	.owner = THIS_MODULE,
@@ -2617,6 +2762,9 @@ static struct xenbus_driver blkfront_driver = {
 	.resume = blkfront_resume,
 	.otherend_changed = blkback_changed,
 	.is_ready = blkfront_is_ready,
+	.freeze = blkfront_freeze,
+	.thaw = blkfront_restore,
+	.restore = blkfront_restore
 };
 
 static int __init xlblk_init(void)
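The drain logic in blkfront_freeze() above boils down to a common idiom: poll a busy-check under the ring lock, sleep between polls, and give up after a heuristic per-ring timeout. A hedged, generic sketch of that idiom, with check_busy() standing in for blkfront_ring_is_busy():

	/* Hedged sketch of the poll-until-idle idiom used by the freeze
	 * handler. All names and the callback signature are illustrative. */
	#include <linux/delay.h>
	#include <linux/jiffies.h>
	#include <linux/spinlock.h>

	static int drain_ring(spinlock_t *lock, bool (*check_busy)(void *),
			      void *ring, unsigned long per_req_ms,
			      unsigned int ring_size)
	{
		unsigned long deadline =
			jiffies + msecs_to_jiffies(per_req_ms * ring_size);
		bool busy;

		do {
			spin_lock_irq(lock);
			busy = check_busy(ring);  /* inspect prod/cons indices */
			spin_unlock_irq(lock);

			if (!busy)
				return 0;	/* ring fully drained */
			msleep(per_req_ms);	/* give the backend time */
		} while (time_is_after_jiffies(deadline));

		return -EBUSY;			/* heuristic timeout hit */
	}

After the ring is idle, the driver switches the xenbus state to Closing and then blocks on wait_backend_disconnected, which blkback_changed() completes once the backend reaches Closed.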
From patchwork Fri Nov 2 09:00:09 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992212
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 5/6] xen-blkfront: resurrect request-based mode
Date: Fri, 2 Nov 2018 20:00:09 +1100
Message-Id: <20181102090010.2643-6-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>

From: Munehisa Kamata

BugLink: https://bugs.launchpad.net/bugs/1801305

This change resurrects the request-based mode which was completely dropped in commit 907c3eb18e0b ("xen-blkfront: convert to blk-mq APIs"). To avoid using a stale queue lock, resurrect the per-device (vbd) lock in blkfront_info, which is never freed during Xen suspend, and use it in request-based mode. This is basically the same as what the driver was doing until commit 11659569f720 ("xen/blkfront: split per device io_lock"). If the driver is in blk-mq mode, just use the lock(s) in blkfront_ring_info.

In commit b7420c1eaeac ("drivers/amazon: xen-blkfront: resurrect request-based mode"), we accidentally didn't bring over the piece of code which empties the request queue while saving bios. The logic was originally introduced in commit 402b27f9f2c2 ("xen-block: implement indirect descriptors"). It seems to be still required for request-based mode, so just do the same thing as before. Note that some suspend/resume logic was moved from blkif_recover() to blkfront_resume() in commit 7b427a59538a ("xen-blkfront: save uncompleted reqs in blkfront_resume()"), so add the logic to blkfront_resume().

Forward-port notes: as part of this forward port, we are no longer using the out-of-tree xen-blkfront. The request-based patch and its related per-device (vbd) lock have now been ported on top of the in-tree xen-blkfront. For reference:
4.9 CR for resurrecting request-based mode: https://cr.amazon.com/r/6834653/
4.9 CR for resurrecting the per-device (vbd) lock: https://cr.amazon.com/r/7475903/
4.9 CR for emptying the request queue while resuming: https://cr.amazon.com/r/7475918/
As part of the forward port, all three related patches above have been merged into a single commit.

In the 4.14.y kernel, we realized during forward-porting and testing that blk-mq stashes the error code for a request right after the request structure in memory. Care was taken not to reuse this piece of memory for stashing the error code in request mode, as doing so can cause memory corruption.

Hibernation: so as not to break git bisect and the hibernation feature, blkfront_freeze() and blkfront_resume() were modified as well to support request-based mode.
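For context (an editorial illustration, not part of the commit), the request-based dispatch that this patch resurrects follows the legacy block-layer pattern below, using the pre-blk-mq API that still existed in 4.14 and was later removed upstream. The example_* names are invented; the real implementation is do_blkif_request() in the diff that follows:

	/* Hedged sketch of the legacy request_fn pattern. The queue is
	 * created with blk_init_queue(fn, lock), which is why the patch
	 * resurrects the per-vbd info->io_lock. */
	#include <linux/blkdev.h>

	static bool example_ring_full(struct request_queue *q)
	{
		return false;	/* stub for the shared-ring fullness check */
	}

	static void example_request_fn(struct request_queue *q)
	{
		struct request *req;

		/* The block core calls this with the queue lock held. */
		while ((req = blk_peek_request(q)) != NULL) {
			if (example_ring_full(q)) {
				blk_stop_queue(q);  /* restart when ring drains */
				break;
			}

			blk_start_request(req);		/* dequeue the request */
			/* ...translate req into shared-ring entries here... */
			__blk_end_request_all(req, BLK_STS_OK);
		}
	}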
Reported-by: Imre Palik Reviewed-by: Eduardo Valentin Reviewed-by: Munehisa Kamata Reviewed-by: Anchal Agarwal Signed-off-by: Munehisa Kamata Signed-off-by: Vallish Vaidyeshwara CR: https://cr.amazon.com/r/8309443 (cherry-picked from 0026-xen-blkfront-resurrect-request-based-mode.patch in AWS 4.14 kernel SRPM) Signed-off-by: Daniel Axtens --- drivers/block/xen-blkfront.c | 334 ++++++++++++++++++++++++++++------- 1 file changed, 268 insertions(+), 66 deletions(-) diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index e410535540f4..a5d0266ee1ba 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -156,6 +156,15 @@ MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the #define BLK_MAX_RING_SIZE \ __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE * XENBUS_MAX_RING_GRANTS) +static unsigned int blkfront_use_blk_mq = 0; +module_param_named(use_blk_mq, blkfront_use_blk_mq, int, S_IRUGO); +MODULE_PARM_DESC(use_blk_mq, "Enable blk-mq (default is 0)"); + +/* + * Index to the first available ring. + */ +#define FIRST_RING_ID (0) + /* * ring-ref%u i=(-1UL) would take 11 characters + 'ring-ref' is 8, so 19 * characters are enough. Define to 20 to keep consistent with backend. @@ -194,6 +203,12 @@ struct blkfront_ring_info { */ struct blkfront_info { + /* + * Per vbd lock which protects an associated blkfront_ring_info if the + * driver is in request-based mode. Use this lock always instead of per + * ring lock in that mode. + */ + spinlock_t io_lock; struct mutex mutex; struct xenbus_device *xbdev; struct gendisk *gd; @@ -265,6 +280,19 @@ static DEFINE_SPINLOCK(minor_lock); #define GREFS(_psegs) ((_psegs) * GRANTS_PER_PSEG) +/* Macro to save error status */ +#define BLKIF_REQ_PUT_ERROR_STATUS(req, error, status) \ + do { \ + if (blkfront_use_blk_mq) \ + blkif_req(req)->error = status; \ + else \ + error = status; \ + } while (0) + +/* Macro to retrieve error status */ +#define BLKIF_REQ_GET_ERROR_STATUS(req, error) \ + ((blkfront_use_blk_mq) ? blkif_req(req)->error : error) + static int blkfront_setup_indirect(struct blkfront_ring_info *rinfo); static void blkfront_gather_backend_features(struct blkfront_info *info); static int negotiate_mq(struct blkfront_info *info); @@ -895,6 +923,62 @@ static inline bool blkif_request_flush_invalid(struct request *req, !info->feature_fua)); } +static inline void blkif_complete_request(struct request *req, int error) +{ + if (blkfront_use_blk_mq) + blk_mq_complete_request(req); + else + __blk_end_request_all(req, error); +} + +/* + * do_blkif_request + * read a block; request is in a request queue + */ +static void do_blkif_request(struct request_queue *rq) +{ + struct blkfront_info *info = NULL; + struct request *req; + int queued; + + pr_debug("Entered do_blkif_request\n"); + + queued = 0; + + while ((req = blk_peek_request(rq)) != NULL) { + info = req->rq_disk->private_data; + + if (RING_FULL(&info->rinfo[FIRST_RING_ID].ring)) + goto wait; + + blk_start_request(req); + + if (blkif_request_flush_invalid(req, info)) { + __blk_end_request_all(req, BLK_STS_NOTSUPP); + continue; + } + + pr_debug("do_blk req %p: cmd_flags %u, sec %lx, " + "(%u/%u) [%s]\n", + req, req->cmd_flags, (unsigned long)blk_rq_pos(req), + blk_rq_cur_sectors(req), blk_rq_sectors(req), + rq_data_dir(req) ? "write" : "read"); + + if (blkif_queue_request(req, &info->rinfo[FIRST_RING_ID])) { + blk_requeue_request(rq, req); +wait: + /* Avoid pointless unplugs. 
*/ + blk_stop_queue(rq); + break; + } + + queued++; + } + + if(queued != 0) + flush_requests(&info->rinfo[FIRST_RING_ID]); +} + static blk_status_t blkif_queue_rq(struct blk_mq_hw_ctx *hctx, const struct blk_mq_queue_data *qd) { @@ -980,30 +1064,37 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size, struct request_queue *rq; struct blkfront_info *info = gd->private_data; - memset(&info->tag_set, 0, sizeof(info->tag_set)); - info->tag_set.ops = &blkfront_mq_ops; - info->tag_set.nr_hw_queues = info->nr_rings; - if (HAS_EXTRA_REQ && info->max_indirect_segments == 0) { - /* - * When indirect descriptior is not supported, the I/O request - * will be split between multiple request in the ring. - * To avoid problems when sending the request, divide by - * 2 the depth of the queue. - */ - info->tag_set.queue_depth = BLK_RING_SIZE(info) / 2; - } else - info->tag_set.queue_depth = BLK_RING_SIZE(info); - info->tag_set.numa_node = NUMA_NO_NODE; - info->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE; - info->tag_set.cmd_size = sizeof(struct blkif_req); - info->tag_set.driver_data = info; - - if (blk_mq_alloc_tag_set(&info->tag_set)) - return -EINVAL; - rq = blk_mq_init_queue(&info->tag_set); - if (IS_ERR(rq)) { - blk_mq_free_tag_set(&info->tag_set); - return PTR_ERR(rq); + if (blkfront_use_blk_mq) { + memset(&info->tag_set, 0, sizeof(info->tag_set)); + info->tag_set.ops = &blkfront_mq_ops; + info->tag_set.nr_hw_queues = info->nr_rings; + if (HAS_EXTRA_REQ && info->max_indirect_segments == 0) { + /* + * When indirect descriptior is not supported, the I/O request + * will be split between multiple request in the ring. + * To avoid problems when sending the request, divide by + * 2 the depth of the queue. + */ + info->tag_set.queue_depth = BLK_RING_SIZE(info) / 2; + } else + info->tag_set.queue_depth = BLK_RING_SIZE(info); + info->tag_set.numa_node = NUMA_NO_NODE; + info->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE; + info->tag_set.cmd_size = sizeof(struct blkif_req); + info->tag_set.driver_data = info; + + if (blk_mq_alloc_tag_set(&info->tag_set)) + return -EINVAL; + rq = blk_mq_init_queue(&info->tag_set); + if (IS_ERR(rq)) { + blk_mq_free_tag_set(&info->tag_set); + return PTR_ERR(rq); + } + } else { + spin_lock_init(&info->io_lock); + rq = blk_init_queue(do_blkif_request, &info->io_lock); + if (IS_ERR(rq)) + return PTR_ERR(rq); } rq->queuedata = info; @@ -1202,21 +1293,29 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity, static void xlvbd_release_gendisk(struct blkfront_info *info) { unsigned int minor, nr_minors, i; + unsigned long flags; if (info->rq == NULL) return; /* No more blkif_request(). */ - blk_mq_stop_hw_queues(info->rq); + if (blkfront_use_blk_mq) { + blk_mq_stop_hw_queues(info->rq); - for (i = 0; i < info->nr_rings; i++) { - struct blkfront_ring_info *rinfo = &info->rinfo[i]; + for (i = 0; i < info->nr_rings; i++) { + struct blkfront_ring_info *rinfo = &info->rinfo[i]; - /* No more gnttab callback work. */ - gnttab_cancel_free_callback(&rinfo->callback); + /* No more gnttab callback work. */ + gnttab_cancel_free_callback(&rinfo->callback); - /* Flush gnttab callback work. Must be done with no locks held. */ - flush_work(&rinfo->work); + /* Flush gnttab callback work. Must be done with no locks held. 
*/ + flush_work(&rinfo->work); + } + } else { + spin_lock_irqsave(&info->io_lock, flags); + blk_stop_queue(info->rq); + gnttab_cancel_free_callback(&info->rinfo[FIRST_RING_ID].callback); + spin_unlock_irqrestore(&info->io_lock, flags); } del_gendisk(info->gd); @@ -1226,7 +1325,8 @@ static void xlvbd_release_gendisk(struct blkfront_info *info) xlbd_release_minors(minor, nr_minors); blk_cleanup_queue(info->rq); - blk_mq_free_tag_set(&info->tag_set); + if (blkfront_use_blk_mq) + blk_mq_free_tag_set(&info->tag_set); info->rq = NULL; put_disk(info->gd); @@ -1238,17 +1338,31 @@ static inline void kick_pending_request_queues_locked(struct blkfront_ring_info { if (unlikely(rinfo->dev_info->connected == BLKIF_STATE_FREEZING)) return; - if (!RING_FULL(&rinfo->ring)) + + if (RING_FULL(&rinfo->ring)) + return; + + if (blkfront_use_blk_mq) { blk_mq_start_stopped_hw_queues(rinfo->dev_info->rq, true); + } else { + /* Re-enable calldowns */ + blk_start_queue(rinfo->dev_info->rq); + /* Kick things off immediately */ + do_blkif_request(rinfo->dev_info->rq); + } } static void kick_pending_request_queues(struct blkfront_ring_info *rinfo) { unsigned long flags; + struct blkfront_info *info = rinfo->dev_info; + spinlock_t *lock; - spin_lock_irqsave(&rinfo->ring_lock, flags); + lock = blkfront_use_blk_mq ? &rinfo->ring_lock : &info->io_lock; + + spin_lock_irqsave(lock, flags); kick_pending_request_queues_locked(rinfo); - spin_unlock_irqrestore(&rinfo->ring_lock, flags); + spin_unlock_irqrestore(lock, flags); } static void blkif_restart_queue(struct work_struct *work) @@ -1259,6 +1373,7 @@ static void blkif_restart_queue(struct work_struct *work) kick_pending_request_queues(rinfo); } +/* Must be called with per vbd lock held if the frontend uses request-based */ static void blkif_free_ring(struct blkfront_ring_info *rinfo) { struct grant *persistent_gnt, *n; @@ -1341,6 +1456,9 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo) /* No more gnttab callback work. */ gnttab_cancel_free_callback(&rinfo->callback); + if (!blkfront_use_blk_mq) + spin_unlock_irq(&info->io_lock); + /* Flush gnttab callback work. Must be done with no locks held. */ flush_work(&rinfo->work); @@ -1362,11 +1480,18 @@ static void blkif_free_ring(struct blkfront_ring_info *rinfo) static void blkif_free(struct blkfront_info *info, int suspend) { /* Prevent new requests being issued until we fix things up. */ + if (!blkfront_use_blk_mq) + spin_lock_irq(&info->io_lock); + info->connected = suspend ? BLKIF_STATE_SUSPENDED : BLKIF_STATE_DISCONNECTED; /* No more blkif_request(). */ - if (info->rq) - blk_mq_stop_hw_queues(info->rq); + if (info->rq) { + if (blkfront_use_blk_mq) + blk_mq_stop_hw_queues(info->rq); + else + blk_stop_queue(info->rq); + } __blkif_free(info); } @@ -1577,13 +1702,17 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) unsigned long flags; struct blkfront_ring_info *rinfo = (struct blkfront_ring_info *)dev_id; struct blkfront_info *info = rinfo->dev_info; + spinlock_t *lock; + int error = BLK_STS_OK; if (unlikely(info->connected != BLKIF_STATE_CONNECTED)) { if (info->connected != BLKIF_STATE_FREEZING) return IRQ_HANDLED; } - spin_lock_irqsave(&rinfo->ring_lock, flags); + lock = blkfront_use_blk_mq ? &rinfo->ring_lock : &info->io_lock; + + spin_lock_irqsave(lock, flags); again: rp = rinfo->ring.sring->rsp_prod; rmb(); /* Ensure we see queued responses up to 'rp'. 
*/ @@ -1623,9 +1752,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) } if (bret->status == BLKIF_RSP_OKAY) - blkif_req(req)->error = BLK_STS_OK; + BLKIF_REQ_PUT_ERROR_STATUS(req, error, BLK_STS_OK); else - blkif_req(req)->error = BLK_STS_IOERR; + BLKIF_REQ_PUT_ERROR_STATUS(req, error, BLK_STS_IOERR); switch (bret->operation) { case BLKIF_OP_DISCARD: @@ -1633,7 +1762,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) struct request_queue *rq = info->rq; printk(KERN_WARNING "blkfront: %s: %s op failed\n", info->gd->disk_name, op_name(bret->operation)); - blkif_req(req)->error = BLK_STS_NOTSUPP; + BLKIF_REQ_PUT_ERROR_STATUS(req, error, BLK_STS_NOTSUPP); info->feature_discard = 0; info->feature_secdiscard = 0; queue_flag_clear(QUEUE_FLAG_DISCARD, rq); @@ -1645,17 +1774,19 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { printk(KERN_WARNING "blkfront: %s: %s op failed\n", info->gd->disk_name, op_name(bret->operation)); - blkif_req(req)->error = BLK_STS_NOTSUPP; + BLKIF_REQ_PUT_ERROR_STATUS(req, error, BLK_STS_NOTSUPP); } if (unlikely(bret->status == BLKIF_RSP_ERROR && rinfo->shadow[id].req.u.rw.nr_segments == 0)) { printk(KERN_WARNING "blkfront: %s: empty %s op failed\n", info->gd->disk_name, op_name(bret->operation)); - blkif_req(req)->error = BLK_STS_NOTSUPP; + BLKIF_REQ_PUT_ERROR_STATUS(req, error, BLK_STS_NOTSUPP); } - if (unlikely(blkif_req(req)->error)) { - if (blkif_req(req)->error == BLK_STS_NOTSUPP) - blkif_req(req)->error = BLK_STS_OK; + if (unlikely(BLKIF_REQ_GET_ERROR_STATUS(req, error))) { + if (BLKIF_REQ_GET_ERROR_STATUS(req, error) + == BLK_STS_NOTSUPP) + BLKIF_REQ_PUT_ERROR_STATUS(req, error, + BLK_STS_OK); info->feature_fua = 0; info->feature_flush = 0; xlvbd_flush(info); @@ -1672,7 +1803,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) BUG(); } - blk_mq_complete_request(req); + blkif_complete_request(req, BLKIF_REQ_GET_ERROR_STATUS(req, error)); } rinfo->ring.rsp_cons = i; @@ -1687,7 +1818,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) kick_pending_request_queues_locked(rinfo); - spin_unlock_irqrestore(&rinfo->ring_lock, flags); + spin_unlock_irqrestore(lock, flags); return IRQ_HANDLED; } @@ -1928,8 +2059,11 @@ static int negotiate_mq(struct blkfront_info *info) backend_max_queues = xenbus_read_unsigned(info->xbdev->otherend, "multi-queue-max-queues", 1); info->nr_rings = min(backend_max_queues, xen_blkif_max_queues); - /* We need at least one ring. */ - if (!info->nr_rings) + /* + * We need at least one ring. Also, do not allow to have multiple rings if blk-mq is + * not used. 
+ */ + if (!info->nr_rings || !blkfront_use_blk_mq) info->nr_rings = 1; info->rinfo = kzalloc(sizeof(struct blkfront_ring_info) * info->nr_rings, GFP_KERNEL); @@ -1946,7 +2080,8 @@ static int negotiate_mq(struct blkfront_info *info) INIT_LIST_HEAD(&rinfo->grants); rinfo->dev_info = info; INIT_WORK(&rinfo->work, blkif_restart_queue); - spin_lock_init(&rinfo->ring_lock); + if (blkfront_use_blk_mq) + spin_lock_init(&rinfo->ring_lock); } return 0; } @@ -2047,6 +2182,10 @@ static int blkif_recover(struct blkfront_info *info) } xenbus_switch_state(info->xbdev, XenbusStateConnected); + /* blk_requeue_request below must be called with queue lock held */ + if (!blkfront_use_blk_mq) + spin_lock_irq(&info->io_lock); + /* Now safe for us to use the shared ring */ info->connected = BLKIF_STATE_CONNECTED; @@ -2055,20 +2194,34 @@ static int blkif_recover(struct blkfront_info *info) rinfo = &info->rinfo[r_index]; /* Kick any other new requests queued since we resumed */ - kick_pending_request_queues(rinfo); + if (blkfront_use_blk_mq) + kick_pending_request_queues(rinfo); + else + kick_pending_request_queues_locked(rinfo); } - if (frozen) + if (frozen) { + if (!blkfront_use_blk_mq) + spin_unlock_irq(&info->io_lock); return 0; + } list_for_each_entry_safe(req, n, &info->requests, queuelist) { /* Requeue pending requests (flush or discard) */ list_del_init(&req->queuelist); BUG_ON(req->nr_phys_segments > segs); - blk_mq_requeue_request(req, false); + if (blkfront_use_blk_mq) + blk_mq_requeue_request(req, false); + else + blk_requeue_request(info->rq, req); + } + + if (blkfront_use_blk_mq) { + blk_mq_start_stopped_hw_queues(info->rq, true); + blk_mq_kick_requeue_list(info->rq); + } else { + spin_unlock_irq(&info->io_lock); } - blk_mq_start_stopped_hw_queues(info->rq, true); - blk_mq_kick_requeue_list(info->rq); while ((bio = bio_list_pop(&info->bio_list)) != NULL) { /* Traverse the list of pending bios and re-queue them */ @@ -2125,14 +2278,47 @@ static int blkfront_resume(struct xenbus_device *dev) merge_bio.tail = shadow[j].request->biotail; bio_list_merge(&info->bio_list, &merge_bio); shadow[j].request->bio = NULL; - blk_mq_end_request(shadow[j].request, BLK_STS_OK); + if (blkfront_use_blk_mq) + blk_mq_end_request(shadow[j].request, BLK_STS_OK); + else + blk_end_request_all(shadow[j].request, BLK_STS_OK); } } + if (!blkfront_use_blk_mq) { + struct request *req; + struct bio_list merge_bio; + + /* + * Empty the queue, this is important because we might have + * requests in the queue with more segments than what we + * can handle now. 
+ */ + spin_lock_irq(&info->io_lock); + while ((req = blk_fetch_request(info->rq)) != NULL) { + if (req_op(req) == REQ_OP_FLUSH || + req_op(req) == REQ_OP_DISCARD || + req_op(req) == REQ_OP_SECURE_ERASE || + req->cmd_flags & REQ_FUA) { + list_add(&req->queuelist, &info->requests); + continue; + } + merge_bio.head = req->bio; + merge_bio.tail = req->biotail; + bio_list_merge(&info->bio_list, &merge_bio); + req->bio = NULL; + if (req_op(req) == REQ_OP_FLUSH || + req->cmd_flags & REQ_FUA) + pr_alert("diskcache flush request found!\n"); + __blk_end_request_all(req, BLK_STS_OK); + } + spin_unlock_irq(&info->io_lock); + } + blkif_free(info, info->connected == BLKIF_STATE_CONNECTED); err = talk_to_blkback(dev, info); - if (!err) + if (!err && blkfront_use_blk_mq) blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings); /* @@ -2485,6 +2671,8 @@ static void blkback_changed(struct xenbus_device *dev, case XenbusStateClosed: if (dev->state == XenbusStateClosed) { if (info->connected == BLKIF_STATE_FREEZING) { + if (!blkfront_use_blk_mq) + spin_lock_irq(&info->io_lock); __blkif_free(info); info->connected = BLKIF_STATE_FROZEN; complete(&info->wait_backend_disconnected); @@ -2661,14 +2849,25 @@ static int blkfront_freeze(struct xenbus_device *dev) int err = 0; info->connected = BLKIF_STATE_FREEZING; + + if (blkfront_use_blk_mq) { + blk_mq_stop_hw_queues(info->rq); - blk_mq_stop_hw_queues(info->rq); - - for (i = 0; i < info->nr_rings; i++) { - rinfo = &info->rinfo[i]; + for (i = 0; i < info->nr_rings; i++) { + rinfo = &info->rinfo[i]; + + gnttab_cancel_free_callback(&rinfo->callback); + flush_work(&rinfo->work); + } + } else { + spin_lock_irq(&info->io_lock); + blk_stop_queue(info->rq); + gnttab_cancel_free_callback( + &info->rinfo[FIRST_RING_ID].callback); + spin_unlock_irq(&info->io_lock); - gnttab_cancel_free_callback(&rinfo->callback); - flush_work(&rinfo->work); + blk_sync_queue(info->rq); + flush_work(&info->rinfo[FIRST_RING_ID].work); } for (i = 0; i < info->nr_rings; i++) { @@ -2680,7 +2879,8 @@ static int blkfront_freeze(struct xenbus_device *dev) rinfo = &info->rinfo[i]; ring = &rinfo->ring; - lock = &rinfo->ring_lock; + lock = blkfront_use_blk_mq ? 
+			&rinfo->ring_lock : &info->io_lock;
 
 		ring_timeout = jiffies +
 			msecs_to_jiffies(req_timeout_ms * RING_SIZE(ring));
@@ -2734,7 +2934,9 @@ static int blkfront_restore(struct xenbus_device *dev)
 	err = talk_to_blkback(dev, info);
 	if (err)
 		goto out;
-	blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings);
+
+	if (blkfront_use_blk_mq)
+		blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings);
 
 out:
 	return err;
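A note on the pattern the patch above uses: every queue-management call
site selects the blk-mq or the legacy single-queue API based on the
module-scope blkfront_use_blk_mq flag, with the legacy path reintroducing
info->io_lock as the queue lock. Condensed into one helper, the freeze-side
quiesce looks roughly like this (a sketch only, not part of the patch; it
reuses the driver-internal names from the diff -- struct blkfront_info,
blkfront_use_blk_mq, FIRST_RING_ID -- and assumes the 4.14-era block-layer
API):

static void blkfront_quiesce(struct blkfront_info *info)
{
	unsigned int i;

	if (blkfront_use_blk_mq) {
		/* blk-mq path: stop all hardware queues, then drain
		 * the grant callback and restart work of each ring */
		blk_mq_stop_hw_queues(info->rq);
		for (i = 0; i < info->nr_rings; i++) {
			gnttab_cancel_free_callback(&info->rinfo[i].callback);
			flush_work(&info->rinfo[i].work);
		}
	} else {
		/* legacy path: a single request queue, serialized by
		 * info->io_lock, which blk_stop_queue() requires held */
		spin_lock_irq(&info->io_lock);
		blk_stop_queue(info->rq);
		gnttab_cancel_free_callback(&info->rinfo[FIRST_RING_ID].callback);
		spin_unlock_irq(&info->io_lock);
		blk_sync_queue(info->rq);
		flush_work(&info->rinfo[FIRST_RING_ID].work);
	}
}

The design choice is a flag checked at each call site rather than an ops
table; that keeps the diff mechanical and easy to audit against the
pre-blk-mq code, at the cost of some repetition.
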
From patchwork Fri Nov 2 09:00:10 2018
X-Patchwork-Submitter: Daniel Axtens
X-Patchwork-Id: 992213
From: Daniel Axtens
To: kernel-team@lists.canonical.com
Subject: [SRU bionic-aws][PATCH 6/6] xen-blkfront: Fixed blkfront_restore to
 remove a call to negotiate_mq
Date: Fri, 2 Nov 2018 20:00:10 +1100
Message-Id: <20181102090010.2643-7-daniel.axtens@canonical.com>
In-Reply-To: <20181102090010.2643-1-daniel.axtens@canonical.com>
References: <20181102090010.2643-1-daniel.axtens@canonical.com>
Sender: "kernel-team"

From: Anchal Agarwal

BugLink: https://bugs.launchpad.net/bugs/1801305

The code for the talk_to_blkback API changed in kernel 4.14.45 to include
a call to negotiate_mq. The subsequent call to negotiate_mq from
blkfront_restore then causes a kernel panic:

[   84.440105] Call Trace:
[   84.443707]  talk_to_blkback+0x6d/0x8b0 [xen_blkfront]
[   84.449147]  blkfront_restore+0x33/0x60 [xen_blkfront]
[   84.453336]  ? xenbus_read_otherend_details+0x50/0xb0
[   84.457804]  xenbus_dev_cancel+0x5f/0x160
[   84.463286]  ? xenbus_dev_resume+0x170/0x170
[   84.466891]  dpm_run_callback+0x3b/0x100
[   84.470516]  device_resume+0x10d/0x420
[   84.473844]  dpm_resume+0xfd/0x2f0
[   84.476984]  hibernation_snapshot+0x218/0x410
[   84.480794]  hibernate+0x14b/0x270
[   84.484030]  state_store+0x50/0x60
[   84.487443]  kernfs_fop_write+0x105/0x180
[   84.492695]  __vfs_write+0x36/0x160
[   84.496672]  ? __audit_syscall_entry+0xbc/0x110
[   84.502123]  vfs_write+0xad/0x1a0
[   84.506857]  SyS_write+0x52/0xc0
[   84.511420]  do_syscall_64+0x67/0x100
[   84.516365]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[   84.522571] RIP: 0033:0x7f44a03407e4
[   84.526210] RSP: 002b:00007ffd5e0ec3c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[   84.534041] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f44a03407e4
[   84.542571] RDX: 0000000000000004 RSI: 0000000001e94990 RDI: 0000000000000001
[   84.549142] RBP: 0000000001e94990 R08: 00007f44a060c8c0 R09: 00007f44a0c57740
[   84.554658] R10: 00007f44a03cd320 R11: 0000000000000246 R12: 0000000000000004
[   84.560411] R13: 0000000000000001 R14: 00007f44a060b760 R15: 0000000000000004
[   84.565744] Code: 39 ab e8 00 00 00 77 8a 31 c0 5b 5d c3 44 8b 05 50 57 00 00 45 85 c0 0f 84 2f ff ff ff 89 c0 48 69 f8 e0 40 01 00 e9 30 ff ff ff <0f> 0b 48 8b 7b 28 48 c7 c2 78 58 16 a0 be f4 ff ff ff e8 7e 37
[   84.580594] RIP: negotiate_mq+0x12b/0x150 [xen_blkfront] RSP: ffffc90000ebbc70

Signed-off-by: Anchal Agarwal
Reviewed-by: Frank van der Linden
Reviewed-by: Vallish Vaidyeshwara
(cherry-picked from 0035-xen-blkfront-Fixed-blkfront_restore-to-remove-a-call.patch
 in AWS 4.14 kernel SRPM)
Signed-off-by: Daniel Axtens
---
 drivers/block/xen-blkfront.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index a5d0266ee1ba..e11f12e046b2 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -2926,11 +2926,6 @@ static int blkfront_restore(struct xenbus_device *dev)
 {
 	struct blkfront_info *info = dev_get_drvdata(&dev->dev);
 	int err = 0;
-
-	err = negotiate_mq(info);
-	if (err)
-		goto out;
-
 	err = talk_to_blkback(dev, info);
 	if (err)
 		goto out;
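For context on the crash above: the <0f> 0b bytes at the faulting address
in the Code: line decode to ud2, the instruction BUG()/BUG_ON() traps on
in x86 -- consistent with an assertion in negotiate_mq() firing when it is
entered a second time for a device whose rings are already set up. With
the duplicate call gone, ring negotiation happens exactly once per
reconnect, inside talk_to_blkback(). The resulting function, assembled
from the hunk above plus the blkfront_use_blk_mq guard introduced earlier
in the series (a reconstruction for illustration, not an authoritative
copy of the tree):

static int blkfront_restore(struct xenbus_device *dev)
{
	struct blkfront_info *info = dev_get_drvdata(&dev->dev);
	int err = 0;

	/* talk_to_blkback() calls negotiate_mq() itself on 4.14.45+,
	 * so the rings are (re)negotiated exactly once here */
	err = talk_to_blkback(dev, info);
	if (err)
		goto out;

	/* multiqueue bookkeeping only applies on the blk-mq path */
	if (blkfront_use_blk_mq)
		blk_mq_update_nr_hw_queues(&info->tag_set, info->nr_rings);
out:
	return err;
}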