From patchwork Mon Feb 22 19:45:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 1443281 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=kH0LZIOG; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Dkt4d5t3qz9sVS for ; Tue, 23 Feb 2021 06:47:29 +1100 (AEDT) Received: from localhost ([::1]:53748 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEHB1-0005QQ-DG for incoming@patchwork.ozlabs.org; Mon, 22 Feb 2021 14:47:27 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59470) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lEH9R-0005OT-DG; Mon, 22 Feb 2021 14:45:51 -0500 Received: from mail-qv1-xf2d.google.com ([2607:f8b0:4864:20::f2d]:33825) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lEH9P-0004Dt-Fu; Mon, 22 Feb 2021 14:45:49 -0500 Received: by mail-qv1-xf2d.google.com with SMTP id dr7so6667241qvb.1; Mon, 22 Feb 2021 11:45:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=1Z4+eauGFF4oAqBj2z7fvfODM6J17DWz3NmCJRQnBYo=; b=kH0LZIOGNMSKUJY7IB8H/6MO/B4tFbJv2jDRxxXRa7z2DBmh9Btum2EAGc2yUl1YYZ gcIe42w5043o+d2+RlWG6MLzzQqZ1Wg1tBp+myIclM1fUm7FJW7dzRdUPf+jT4Ewvnxo KbyfLu/9IFbRWcbR2o92K7Y154AOzUOFN9OiAjCzJTvKgtHmLEtW3tWUArm1TneQcmC2 3otObPnoo4aDCsJRRPhVcSfXE+/bWH/p2zgaYbjPtOWTpo2dpxBg3Tfi8tfv1wvEKNN2 gJdOZf9QxeprwH9Q/7F9Ru5q7AFjDnaZSUSNnp/HRXA2/UUHtY9aaHBmCWFqPuf6YrLd G+Nw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=1Z4+eauGFF4oAqBj2z7fvfODM6J17DWz3NmCJRQnBYo=; b=Wr0hgk+OU8z/snbwsrnfdtql7k3lQJjOIv6Ei88BmArksmhNgh66EkaebOC9uCbCEa x8+plFNGlkDdl8FSIdtQqXh/Z9zMKeazl+6mjA3SY+xmZ5TMyPMitf0+INlqHyKNAlAL 4e7XHvaaVUtEFmi5NrNOdDnVnXHhr7T1exp/kaGOwA3CgO6tCPXbeAgRF133StKEuFRl bbtiFKizi7bX7LIRYq1j6ax9JdNm5S4IC/GxDw4UZaSo7zXLRx8+ZGDJpZW1niLVB7AQ t7H7yZxPvAH0cF89safnr6ePAXKRwouwBbD9ngCxOhUoW5FHR7GyU4QyBr5F5VsEajsH ZKSA== X-Gm-Message-State: AOAM53003HTrsC53FZG6HBkykteeQwzJokOlZNcN7E5OyBFzdkSTwMEL 70lYSWycwIttZ+uDYLQmSbnUyWrq+9RM+Q== X-Google-Smtp-Source: ABdhPJyPBPOVFNcFpnc+uX55/FNwWWvXJLCi09xJkxd+Rha+TTCQIYrNWVFXjqLzdUOOPgKwG2r5wA== X-Received: by 2002:a0c:bf12:: with SMTP id m18mr21984238qvi.40.1614023146008; Mon, 22 Feb 2021 11:45:46 -0800 (PST) Received: from rekt.ibmuc.com ([2804:431:c7c6:cd1c:d722:e26f:4e76:c5c1]) by smtp.gmail.com with ESMTPSA id 82sm13483178qkd.48.2021.02.22.11.45.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Feb 2021 11:45:45 -0800 (PST) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Subject: [PATCH v4 1/5] spapr_drc.c: use spapr_drc_release() in isolate_physical/set_unusable Date: Mon, 22 Feb 2021 16:45:27 -0300 Message-Id: <20210222194531.62717-2-danielhb413@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210222194531.62717-1-danielhb413@gmail.com> References: <20210222194531.62717-1-danielhb413@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2d; envelope-from=danielhb413@gmail.com; helo=mail-qv1-xf2d.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , qemu-ppc@nongnu.org, groug@kaod.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" When moving a physical DRC to "Available", drc_isolate_physical() will move the DRC state to STATE_PHYSICAL_POWERON and, if the DRC is marked for unplug, call spapr_drc_detach(). For physical DRCs, drck->empty_state is STATE_PHYSICAL_POWERON, meaning that we're sure that spapr_drc_detach() will end up calling spapr_drc_release() in the end. Likewise, for logical DRCs, drc_set_unusable will move the DRC to "Unusable" state, setting drc->state to STATE_LOGICAL_UNUSABLE, which is the drck->empty_state for logical DRCs. spapr_drc_detach() will call spapr_drc_release() in this case as well. In both scenarios, spapr_drc_detach() is being used as a spapr_drc_release(), wrapper, where we also set unplug_requested (which is already true, otherwise spapr_drc_detach() wouldn't be called in the first place) and check if drc->state == drck->empty_state, which we also know it's guaranteed to be true because we just set it. Just use spapr_drc_release() in these functions to be clear of our intentions in both these functions. Reviewed-by: Greg Kurz Reviewed-by: David Gibson Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr_drc.c | 32 ++++++++++++++++---------------- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index 84bd3c881f..555a25517d 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -50,6 +50,20 @@ uint32_t spapr_drc_index(SpaprDrc *drc) | (drc->id & DRC_INDEX_ID_MASK); } +static void spapr_drc_release(SpaprDrc *drc) +{ + SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); + + drck->release(drc->dev); + + drc->unplug_requested = false; + g_free(drc->fdt); + drc->fdt = NULL; + drc->fdt_start_offset = 0; + object_property_del(OBJECT(drc), "device"); + drc->dev = NULL; +} + static uint32_t drc_isolate_physical(SpaprDrc *drc) { switch (drc->state) { @@ -68,7 +82,7 @@ static uint32_t drc_isolate_physical(SpaprDrc *drc) if (drc->unplug_requested) { uint32_t drc_index = spapr_drc_index(drc); trace_spapr_drc_set_isolation_state_finalizing(drc_index); - spapr_drc_detach(drc); + spapr_drc_release(drc); } return RTAS_OUT_SUCCESS; @@ -209,7 +223,7 @@ static uint32_t drc_set_unusable(SpaprDrc *drc) if (drc->unplug_requested) { uint32_t drc_index = spapr_drc_index(drc); trace_spapr_drc_set_allocation_state_finalizing(drc_index); - spapr_drc_detach(drc); + spapr_drc_release(drc); } return RTAS_OUT_SUCCESS; @@ -372,20 +386,6 @@ void spapr_drc_attach(SpaprDrc *drc, DeviceState *d) NULL, 0); } -static void spapr_drc_release(SpaprDrc *drc) -{ - SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); - - drck->release(drc->dev); - - drc->unplug_requested = false; - g_free(drc->fdt); - drc->fdt = NULL; - drc->fdt_start_offset = 0; - object_property_del(OBJECT(drc), "device"); - drc->dev = NULL; -} - void spapr_drc_detach(SpaprDrc *drc) { SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); From patchwork Mon Feb 22 19:45:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 1443283 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=CC7ZfdDW; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Dkt4g3HRpz9sVr for ; Tue, 23 Feb 2021 06:47:31 +1100 (AEDT) Received: from localhost ([::1]:53952 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEHB3-0005Zk-BD for incoming@patchwork.ozlabs.org; Mon, 22 Feb 2021 14:47:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59530) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lEH9V-0005PI-3P; Mon, 22 Feb 2021 14:45:53 -0500 Received: from mail-qv1-xf2e.google.com ([2607:f8b0:4864:20::f2e]:42618) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lEH9S-0004FE-EP; Mon, 22 Feb 2021 14:45:52 -0500 Received: by mail-qv1-xf2e.google.com with SMTP id s10so6023778qvl.9; Mon, 22 Feb 2021 11:45:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LhJZuhNIKQBkPWV2v1rNqNCJqWM7L82y26Rps50tczY=; b=CC7ZfdDW9A+B3ojK2aP/6eHn6hYZxjlOlRjGUYdAEobyO+7saXbNmY9VTK/Ye+eJex nt8CQYwz/HAt4f+KN8o0PaxD3D2wThQ93EF570G/NXmSS5M+xx3OPl0IZ19TYQfTPr20 ZzTB9tTvTq7hc2VoOGUl9OyvaBIPVHFyN9wAcFoIerxchL2OnsFuNwspxSNSLtgxP9u+ tOhaQLkPCmktahuTrlSxPwcTA5lwp3w+8po0S2o0S0C723p95zYN9bcFStBhBBvQwLjt njKSmrgX4lZQCjUz4cG9cspmRZKpCHx7XhwGo+J6Ep0BP7UPyIUrj4Oj019awVq4P2sF BRtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LhJZuhNIKQBkPWV2v1rNqNCJqWM7L82y26Rps50tczY=; b=BXTg8FEsuRYKGQ4WUabXf1jYq970wWaIFlXcz7QVAOA0PwcT/Kc7uBSeSt/+qup/ei 5mk6Mfy2FWEeVdR+nyB0DZFzh0pWKD1dq1ZAIvh/mGNDV8nh6UTjF75Ufyyh1ZpOCavb G2a5+ET2Oe0x52R/rMz4+IDkKFFdhuV1vA4CMCmFm/ZEsooODgO0JcfY4NDRiBzoXYBp TNUyP6POlVDF0qWu/MGh8L8/unPihg3zZ1/dDxjWdRCH3HCV1qmwAVMJchrZv7zzzs83 0yq3vGxoV2W44d+whAYgA7pjdRx8vQAB3VMqK35Y3LudoUN6cK0va/3+WQZQusa0TUYL RkBw== X-Gm-Message-State: AOAM533bJWEWnCcfUFGEUvK4r5g17hOQRxW8sXR1ZfrPOTKNa9hqqNxi Cvw8JudnOUkR4ONjqyOF/ONEVRIM42aa9g== X-Google-Smtp-Source: ABdhPJwhG+nbOJds50O36Q/aeulAP96n41/2x7Omw/S6b2C1XQZtG3tmzAma/5phIyPIPTwpYJJ10Q== X-Received: by 2002:ad4:4721:: with SMTP id l1mr5246785qvz.7.1614023148106; Mon, 22 Feb 2021 11:45:48 -0800 (PST) Received: from rekt.ibmuc.com ([2804:431:c7c6:cd1c:d722:e26f:4e76:c5c1]) by smtp.gmail.com with ESMTPSA id 82sm13483178qkd.48.2021.02.22.11.45.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Feb 2021 11:45:47 -0800 (PST) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Subject: [PATCH v4 2/5] spapr: rename spapr_drc_detach() to spapr_drc_unplug_request() Date: Mon, 22 Feb 2021 16:45:28 -0300 Message-Id: <20210222194531.62717-3-danielhb413@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210222194531.62717-1-danielhb413@gmail.com> References: <20210222194531.62717-1-danielhb413@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2e; envelope-from=danielhb413@gmail.com; helo=mail-qv1-xf2e.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , qemu-ppc@nongnu.org, groug@kaod.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" spapr_drc_detach() is not the best name for what the function does. The function does not detach the DRC, it makes an uncommited attempt to do it. It'll mark the DRC as pending unplug, via the 'unplug_request' flag, and only if the DRC state is drck->empty_state it will detach the DRC, via spapr_drc_release(). This is a contrast with its pair spapr_drc_attach(), where the function is indeed creating the DRC QOM object. If you know what spapr_drc_attach() does, you can be misled into thinking that spapr_drc_detach() is removing the DRC from QEMU internal state, which isn't true. The current role of this function is better described as a request for detach, since there's no guarantee that we're going to detach the DRC in the end. Rename the function to spapr_drc_unplug_request to reflect what is is doing. The initial idea was to change the name to spapr_drc_detach_request(), and later on change the unplug_request flag to detach_request. However, unplug_request is a migratable boolean for a long time now and renaming it is not worth the trouble. spapr_drc_unplug_request() setting drc->unplug_request is more natural than spapr_drc_detach_request setting drc->unplug_request. Reviewed-by: Greg Kurz Reviewed-by: David Gibson Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr.c | 6 +++--- hw/ppc/spapr_drc.c | 4 ++-- hw/ppc/spapr_pci.c | 4 ++-- hw/ppc/trace-events | 2 +- include/hw/ppc/spapr_drc.h | 2 +- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index 85fe65f894..b066df68cb 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3654,7 +3654,7 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, addr / SPAPR_MEMORY_BLOCK_SIZE); g_assert(drc); - spapr_drc_detach(drc); + spapr_drc_unplug_request(drc); addr += SPAPR_MEMORY_BLOCK_SIZE; } @@ -3722,7 +3722,7 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev, g_assert(drc); if (!spapr_drc_unplug_requested(drc)) { - spapr_drc_detach(drc); + spapr_drc_unplug_request(drc); spapr_hotplug_req_remove_by_index(drc); } } @@ -3985,7 +3985,7 @@ static void spapr_phb_unplug_request(HotplugHandler *hotplug_dev, assert(drc); if (!spapr_drc_unplug_requested(drc)) { - spapr_drc_detach(drc); + spapr_drc_unplug_request(drc); spapr_hotplug_req_remove_by_index(drc); } } diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index 555a25517d..67041fb212 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -386,11 +386,11 @@ void spapr_drc_attach(SpaprDrc *drc, DeviceState *d) NULL, 0); } -void spapr_drc_detach(SpaprDrc *drc) +void spapr_drc_unplug_request(SpaprDrc *drc) { SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); - trace_spapr_drc_detach(spapr_drc_index(drc)); + trace_spapr_drc_unplug_request(spapr_drc_index(drc)); g_assert(drc->dev); diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c index f1c7479816..b00e9609ae 100644 --- a/hw/ppc/spapr_pci.c +++ b/hw/ppc/spapr_pci.c @@ -1723,12 +1723,12 @@ static void spapr_pci_unplug_request(HotplugHandler *plug_handler, * functions, even if their unplug weren't requested * beforehand. */ - spapr_drc_detach(func_drc); + spapr_drc_unplug_request(func_drc); } } } - spapr_drc_detach(drc); + spapr_drc_unplug_request(drc); /* if this isn't func 0, defer unplug event. otherwise signal removal * for all present functions diff --git a/hw/ppc/trace-events b/hw/ppc/trace-events index 1e91984526..b4bbfbb013 100644 --- a/hw/ppc/trace-events +++ b/hw/ppc/trace-events @@ -50,7 +50,7 @@ spapr_drc_set_allocation_state(uint32_t index, int state) "drc: 0x%"PRIx32", sta spapr_drc_set_allocation_state_finalizing(uint32_t index) "drc: 0x%"PRIx32 spapr_drc_set_configured(uint32_t index) "drc: 0x%"PRIx32 spapr_drc_attach(uint32_t index) "drc: 0x%"PRIx32 -spapr_drc_detach(uint32_t index) "drc: 0x%"PRIx32 +spapr_drc_unplug_request(uint32_t index) "drc: 0x%"PRIx32 spapr_drc_awaiting_quiesce(uint32_t index) "drc: 0x%"PRIx32 spapr_drc_reset(uint32_t index) "drc: 0x%"PRIx32 spapr_drc_realize(uint32_t index) "drc: 0x%"PRIx32 diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h index 8982927d5c..02a63b3666 100644 --- a/include/hw/ppc/spapr_drc.h +++ b/include/hw/ppc/spapr_drc.h @@ -243,7 +243,7 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask); * beforehand (eg. check drc->dev at pre-plug). */ void spapr_drc_attach(SpaprDrc *drc, DeviceState *d); -void spapr_drc_detach(SpaprDrc *drc); +void spapr_drc_unplug_request(SpaprDrc *drc); /* * Reset all DRCs, causing pending hot-plug/unplug requests to complete. From patchwork Mon Feb 22 19:45:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 1443286 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=Q/hRiJIT; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Dkt7P4yc1z9sVF for ; Tue, 23 Feb 2021 06:49:52 +1100 (AEDT) Received: from localhost ([::1]:34328 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEHDJ-0000g7-Hd for incoming@patchwork.ozlabs.org; Mon, 22 Feb 2021 14:49:49 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59570) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lEH9X-0005QJ-93; Mon, 22 Feb 2021 14:45:56 -0500 Received: from mail-qv1-xf31.google.com ([2607:f8b0:4864:20::f31]:40361) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lEH9U-0004Gh-DS; Mon, 22 Feb 2021 14:45:54 -0500 Received: by mail-qv1-xf31.google.com with SMTP id s3so4191550qvn.7; Mon, 22 Feb 2021 11:45:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=XF0OUqJe9T0qDCkk+/FrLa/tWDB4tdwV5moqL2LYy1E=; b=Q/hRiJIT5vTySl+ON3YbhNId77hlsXPKJCS+lZ5rS2nnNxpch7Dp52X//nZ9clbaqY ldG9Q3/LBHEnrGYxJg6KUsI5SUAhE9DVl/vB7U0IFgZyiO7IH0fMDagbNawRJPdIo+6G 6/vUBUt+69Lfeq4y2KAGxdnLvhd52m5/tEpbwfrCnkr1H+kqtNkb/9sXA5yqMyLuAruK tCLt/u6nsVoh2ksEq+ozLy+418rwkbfIcSEZPMDWg9x7MU3UUUcFYJ5a43cJzucau9Bl d7NlNJPzFn62bkvdeDyurE1fXKk2ns/sRWamVcGzUcDPF1nV2TN3zJCO1UYscVYJ2IAg oILw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=XF0OUqJe9T0qDCkk+/FrLa/tWDB4tdwV5moqL2LYy1E=; b=kExwuKwF16IzatuB9qOpNHwq0ClZB4LGc8XArYc3Sga1cSgFKb5fcv8kjIk19qpJTF UnuDJMm9NoFpmrcgDp/7HfxgbWtyoTNq3vZnVwa8kqumvJMQvGQCcVvDnzgcDLVIETt6 Y86wkW3mra+cviwBBs1MW5ei6Izg+TNw8fzJmx3HJufvtqapA7RGMjivkRLObh3KutHw 2qV8TmwCIWFnVUUWvZY688En8+IUpmEyirYHz5ej9+RaTfe2IZzb1X+dDKm4z6gHhgAn OPs7AP2t0VVcqyDEM4ZV/M6mQ1859IgTMj5DT11XiUlhglGhxBj/I5vOYCz+z6w3Lluy ++Bg== X-Gm-Message-State: AOAM531KzOudjmE6LuTh+b3gU16kDHWiAs4GELwDP/ymsQKQcSXnRWBl MThXWDO3lu7ycjePD4CngyyxB726DKSd1A== X-Google-Smtp-Source: ABdhPJyCTmBODM704kqe3vfLxoIk32v8p+Y/Rw8PHN5EAQn9JMy1Yv3wJJuE/8UO2nksJeel++mbSw== X-Received: by 2002:a0c:e58e:: with SMTP id t14mr14574806qvm.28.1614023150752; Mon, 22 Feb 2021 11:45:50 -0800 (PST) Received: from rekt.ibmuc.com ([2804:431:c7c6:cd1c:d722:e26f:4e76:c5c1]) by smtp.gmail.com with ESMTPSA id 82sm13483178qkd.48.2021.02.22.11.45.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Feb 2021 11:45:50 -0800 (PST) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Subject: [PATCH v4 3/5] spapr_drc.c: introduce unplug_timeout_timer Date: Mon, 22 Feb 2021 16:45:29 -0300 Message-Id: <20210222194531.62717-4-danielhb413@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210222194531.62717-1-danielhb413@gmail.com> References: <20210222194531.62717-1-danielhb413@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f31; envelope-from=danielhb413@gmail.com; helo=mail-qv1-xf31.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , qemu-ppc@nongnu.org, groug@kaod.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The LoPAR spec provides no way for the guest kernel to report failure of hotplug/hotunplug events. This wouldn't be bad if those operations were granted to always succeed, but that's far for the reality. What ends up happening is that, in the case of a failed hotunplug, regardless of whether it was a QEMU error or a guest misbehavior, the pSeries machine is retaining the unplug state of the device in the running guest. This state is cleanup in machine reset, where it is assumed that this state represents a device that is pending unplug, and the device is hotunpluged from the board. Until the reset occurs, any hotunplug operation of the same device is forbid because there is a pending unplug state. This behavior has at least one undesirable side effect. A long standing pending unplug state is, more often than not, the result of a hotunplug error. The user had to dealt with it, since retrying to unplug the device is noy allowed, and then in the machine reset we're removing the device from the guest. This means that we're failing the user twice - failed to hotunplug when asked, then hotunplugged without notice. Solutions to this problem range between trying to predict when the hotunplug will fail and forbid the operation from the QEMU layer, from opening up the IRQ queue to allow for multiple hotunplug attempts, from telling the users to 'reboot the machine if something goes wrong'. The first solution is flawed because we can't fully predict guest behavior from QEMU, the second solution is a trial and error remediation that counts on a hope that the unplug will eventually succeed, and the third is ... well. This patch introduces a crude, but effective solution to hotunplug errors in the pSeries machine. For each unplug done, we'll timeout after some time. If a certain amount of time passes, we'll cleanup the hotunplug state from the machine. During the timeout period, any unplug operations in the same device will still be blocked. After that, we'll assume that the guest failed the operation, and allow the user to try again. If the timeout is too short we'll prevent legitimate hotunplug situations to occur, so we'll need to overestimate the regular time an unplug operation takes to succeed to account that. The true solution for the hotunplug errors in the pSeries machines is a PAPR change to allow for the guest to warn the platform about it. For now, the work done in this timeout design can be used for the new PAPR 'abort hcall' in the future, given that for both cases we'll need code to cleanup the existing unplug states of the DRCs. At this moment we're adding the basic wiring of the timer into the DRC. Next patch will use the timer to timeout failed CPU hotunplugs. Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr_drc.c | 40 ++++++++++++++++++++++++++++++++++++++ include/hw/ppc/spapr_drc.h | 4 ++++ 2 files changed, 44 insertions(+) diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index 67041fb212..27adbc5c30 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -57,6 +57,8 @@ static void spapr_drc_release(SpaprDrc *drc) drck->release(drc->dev); drc->unplug_requested = false; + timer_del(drc->unplug_timeout_timer); + g_free(drc->fdt); drc->fdt = NULL; drc->fdt_start_offset = 0; @@ -370,6 +372,17 @@ static void prop_get_fdt(Object *obj, Visitor *v, const char *name, } while (fdt_depth != 0); } +static void spapr_drc_start_unplug_timeout_timer(SpaprDrc *drc) +{ + SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); + + if (drck->unplug_timeout_seconds != 0) { + timer_mod(drc->unplug_timeout_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + + drck->unplug_timeout_seconds * 1000); + } +} + void spapr_drc_attach(SpaprDrc *drc, DeviceState *d) { trace_spapr_drc_attach(spapr_drc_index(drc)); @@ -475,11 +488,23 @@ static bool spapr_drc_needed(void *opaque) spapr_drc_unplug_requested(drc); } +static int spapr_drc_post_load(void *opaque, int version_id) +{ + SpaprDrc *drc = opaque; + + if (drc->unplug_requested) { + spapr_drc_start_unplug_timeout_timer(drc); + } + + return 0; +} + static const VMStateDescription vmstate_spapr_drc = { .name = "spapr_drc", .version_id = 1, .minimum_version_id = 1, .needed = spapr_drc_needed, + .post_load = spapr_drc_post_load, .fields = (VMStateField []) { VMSTATE_UINT32(state, SpaprDrc), VMSTATE_END_OF_LIST() @@ -490,6 +515,15 @@ static const VMStateDescription vmstate_spapr_drc = { } }; +static void drc_unplug_timeout_cb(void *opaque) +{ + SpaprDrc *drc = opaque; + + if (drc->unplug_requested) { + drc->unplug_requested = false; + } +} + static void drc_realize(DeviceState *d, Error **errp) { SpaprDrc *drc = SPAPR_DR_CONNECTOR(d); @@ -512,6 +546,11 @@ static void drc_realize(DeviceState *d, Error **errp) object_property_add_alias(root_container, link_name, drc->owner, child_name); g_free(link_name); + + drc->unplug_timeout_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, + drc_unplug_timeout_cb, + drc); + vmstate_register(VMSTATE_IF(drc), spapr_drc_index(drc), &vmstate_spapr_drc, drc); trace_spapr_drc_realize_complete(spapr_drc_index(drc)); @@ -529,6 +568,7 @@ static void drc_unrealize(DeviceState *d) name = g_strdup_printf("%x", spapr_drc_index(drc)); object_property_del(root_container, name); g_free(name); + timer_free(drc->unplug_timeout_timer); } SpaprDrc *spapr_dr_connector_new(Object *owner, const char *type, diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h index 02a63b3666..38ec4c8091 100644 --- a/include/hw/ppc/spapr_drc.h +++ b/include/hw/ppc/spapr_drc.h @@ -187,6 +187,8 @@ typedef struct SpaprDrc { bool unplug_requested; void *fdt; int fdt_start_offset; + + QEMUTimer *unplug_timeout_timer; } SpaprDrc; struct SpaprMachineState; @@ -209,6 +211,8 @@ typedef struct SpaprDrcClass { int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr, void *fdt, int *fdt_start_offset, Error **errp); + + int unplug_timeout_seconds; } SpaprDrcClass; typedef struct SpaprDrcPhysical { From patchwork Mon Feb 22 19:45:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 1443284 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=caUN5WpP; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Dkt4p2STVz9sCD for ; Tue, 23 Feb 2021 06:47:38 +1100 (AEDT) Received: from localhost ([::1]:54666 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEHBA-0005s5-Ao for incoming@patchwork.ozlabs.org; Mon, 22 Feb 2021 14:47:36 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59624) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lEH9b-0005Re-C7; Mon, 22 Feb 2021 14:45:59 -0500 Received: from mail-qk1-x736.google.com ([2607:f8b0:4864:20::736]:45428) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lEH9W-0004Ht-S2; Mon, 22 Feb 2021 14:45:59 -0500 Received: by mail-qk1-x736.google.com with SMTP id z128so9337276qkc.12; Mon, 22 Feb 2021 11:45:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=E3TJ00vly281Tr/2zGxsxCXK2O76iZf35kjarKc3T7c=; b=caUN5WpPELbmZmd0fKKYYcI8kB/lta537CYHOQPsAfnjL6Ju8CWV4AcZ5TR0cOKV3b cZmMI8O1UUL2HWDj2psEZ7Dyhpb8rlr/+Wh1pZU1JFsUKRU8Kns7w3yp7TyzXNAcrA9A ZkJTkQmVezekQahFrGV0iqu+95Zpu3gGizGW56IDEt2DvjCZnwbDgcLKBJ7TFwEF9bSb Y0CPGCUifDQtqiWBva33NR+oRb7pmhxy57tl/vAC2I9IOw1f92xW08FpWxKsQ8rNXEGg XfdYoK32/I+BchwgcwpPUibtlknOEe34qXTyZG1GYO70pYDtYxCAWwrueXCCJ3xd5maZ RQgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=E3TJ00vly281Tr/2zGxsxCXK2O76iZf35kjarKc3T7c=; b=GZB8ZwrlVXsEBQebhnWlsLSTM8aGnWJiOQiqFBPdATG2Xigca/36jlzGTkeTmiokmU vJdi3kNxv7HfWXLGFl3B8qo6zjU4TrVIqK9J3T+Ini1VKZJhSBriWOBiNBiq4CohP9Ou +mjZLb0xlGurO63qs7LyXSP0nAwItsGgfVjrfj3Q5QGYP0BSPEOb1SoEswBUrTNgWTlq XlG/7KlYD6yMkMl8Y2tfBfcG4CcUIDH8eGB6kzd/idLQ5pwSTqFMGvNrMf8L0DmTt0fE bhniwUnsUtNEjCiq1bbuYn1wufq8Ve80Rho3ICyyAt4CTXgdOvrIyPXjrHS1GEkDlfBY fbXA== X-Gm-Message-State: AOAM532SxWv+95MqsjHQoVJr+bk1/Qd7u95QpnNKG1dO9j/PVu5wh9xf 04KtNU0kmLrs54l432si8YcurZ1rFNGKzg== X-Google-Smtp-Source: ABdhPJx3FN6x3tvQqZ0Ne5pjKzqqRdUQ3Fvq1deDF/N0Lto+Q7Vqmcy7nbOHyif0PAPHDIc5WY7WlQ== X-Received: by 2002:ae9:e208:: with SMTP id c8mr21855597qkc.414.1614023153363; Mon, 22 Feb 2021 11:45:53 -0800 (PST) Received: from rekt.ibmuc.com ([2804:431:c7c6:cd1c:d722:e26f:4e76:c5c1]) by smtp.gmail.com with ESMTPSA id 82sm13483178qkd.48.2021.02.22.11.45.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Feb 2021 11:45:52 -0800 (PST) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Subject: [PATCH v4 4/5] spapr_drc.c: add hotunplug timeout for CPUs Date: Mon, 22 Feb 2021 16:45:30 -0300 Message-Id: <20210222194531.62717-5-danielhb413@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210222194531.62717-1-danielhb413@gmail.com> References: <20210222194531.62717-1-danielhb413@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::736; envelope-from=danielhb413@gmail.com; helo=mail-qk1-x736.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Xujun Ma , Daniel Henrique Barboza , qemu-ppc@nongnu.org, groug@kaod.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" There is a reliable way to make a CPU hotunplug fail in the pseries machine. Hotplug a CPU A, then offline all other CPUs inside the guest but A. When trying to hotunplug A the guest kernel will refuse to do it, because A is now the last online CPU of the guest. PAPR has no 'error callback' in this situation to report back to the platform, so the guest kernel will deny the unplug in silent and QEMU will never know what happened. The unplug pending state of A will remain until the guest is shutdown or rebooted. Previous attempts of fixing it (see [1] and [2]) were aimed at trying to mitigate the effects of the problem. In [1] we were trying to guess which guest CPUs were online to forbid hotunplug of the last online CPU in the QEMU layer, avoiding the scenario described above because QEMU is now failing in behalf of the guest. This is not robust because the last online CPU of the guest can change while we're in the middle of the unplug process, and our initial assumptions are now invalid. In [2] we were accepting that our unplug process is uncertain and the user should be allowed to spam the IRQ hotunplug queue of the guest in case the CPU hotunplug fails. This patch presents another alternative, using the timeout infrastructure introduced in the previous patch. CPU hotunplugs in the pSeries machine will now timeout after 15 seconds. This is a long time for a single CPU unplug to occur, regardless of guest load - although the user is *strongly* encouraged to *not* hotunplug devices from a guest under high load - and we can be sure that something went wrong if it takes longer than that for the guest to release the CPU (the same can't be said about memory hotunplug - more on that in the next patch). Timing out the unplug operation will reset the unplug state of the CPU and allow the user to try it again, regardless of the error situation that prevented the hotunplug to occur. Of all the not so pretty fixes/mitigations for CPU hotunplug errors in pSeries, timing out the operation is an admission that we have no control in the process, and must assume the worst case if the operation doesn't succeed in a sensible time frame. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html [2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html Reported-by: Xujun Ma Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414 Reviewed-by: David Gibson Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr.c | 4 ++++ hw/ppc/spapr_drc.c | 13 +++++++++++++ include/hw/ppc/spapr_drc.h | 1 + 3 files changed, 18 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index b066df68cb..ecce8abf14 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3724,6 +3724,10 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev, if (!spapr_drc_unplug_requested(drc)) { spapr_drc_unplug_request(drc); spapr_hotplug_req_remove_by_index(drc); + } else { + error_setg(errp, "core-id %d unplug is still pending, %d seconds " + "timeout remaining", + cc->core_id, spapr_drc_unplug_timeout_remaining_sec(drc)); } } diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index 27adbc5c30..fd2e45640f 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -409,6 +409,8 @@ void spapr_drc_unplug_request(SpaprDrc *drc) drc->unplug_requested = true; + spapr_drc_start_unplug_timeout_timer(drc); + if (drc->state != drck->empty_state) { trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc)); return; @@ -417,6 +419,16 @@ void spapr_drc_unplug_request(SpaprDrc *drc) spapr_drc_release(drc); } +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc) +{ + if (drc->unplug_requested && timer_pending(drc->unplug_timeout_timer)) { + return (qemu_timeout_ns_to_ms(drc->unplug_timeout_timer->expire_time) - + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)) / 1000; + } + + return 0; +} + bool spapr_drc_reset(SpaprDrc *drc) { SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); @@ -710,6 +722,7 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data) drck->drc_name_prefix = "CPU "; drck->release = spapr_core_release; drck->dt_populate = spapr_core_dt_populate; + drck->unplug_timeout_seconds = 15; } static void spapr_drc_pci_class_init(ObjectClass *k, void *data) diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h index 38ec4c8091..26599c385a 100644 --- a/include/hw/ppc/spapr_drc.h +++ b/include/hw/ppc/spapr_drc.h @@ -248,6 +248,7 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask); */ void spapr_drc_attach(SpaprDrc *drc, DeviceState *d); void spapr_drc_unplug_request(SpaprDrc *drc); +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc); /* * Reset all DRCs, causing pending hot-plug/unplug requests to complete. From patchwork Mon Feb 22 19:45:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 1443287 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=UNQKeQnc; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Dkt8G61dsz9sCD for ; Tue, 23 Feb 2021 06:50:38 +1100 (AEDT) Received: from localhost ([::1]:36334 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1lEHE4-0001Va-PE for incoming@patchwork.ozlabs.org; Mon, 22 Feb 2021 14:50:36 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:59650) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1lEH9j-0005TB-Qa; Mon, 22 Feb 2021 14:46:11 -0500 Received: from mail-qk1-x72b.google.com ([2607:f8b0:4864:20::72b]:35397) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1lEH9Z-0004J9-3r; Mon, 22 Feb 2021 14:46:07 -0500 Received: by mail-qk1-x72b.google.com with SMTP id x14so13913034qkm.2; Mon, 22 Feb 2021 11:45:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KCtQ4y+iavav+nWvV+OQ6yDFQ/vwhqZ0qt2JWOmpstQ=; b=UNQKeQncY4UuykxaYx3c/nh22ApFMbtVx5Tu4z7lw3HH4Zv4I4OPLbmugp9O7g2t+T jYS86Io88RW95+9LHlu+spBEFUwYTf5eFJ9u70/kKnbdZ2aO48hf0t8ITFffeIZ80B6F FMYrjKJ/p1QjPY4fojgBzMn//Ash1FegZhG38Fc1ygNf1p+4/pDJinB/9pb/dIKIimHs MdidIqM6EaJULWyTk6DWSFAfXRw9gc4fc78Op0yvI2POZpLTb+rFECwOmYds7cRlhnk0 7RiuPl8R4yQg+NEvDcVQlQGLAvjfoDfhCP7vKMk4e34kCulUlBhu//vEmKYe0KTURvLA rXGQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KCtQ4y+iavav+nWvV+OQ6yDFQ/vwhqZ0qt2JWOmpstQ=; b=kMSppUzgkdIB0AkOxPBrU5XO7Q5ft/XLFU4/sHPKecxYaHIV1NYSZ2kvaCIfmm/iHz xZ/1zxqHv/+/7cZZlkJGtkFXjMxV/uhRAfIgCUc1oTB939vvF8eSn+2ja222l+OYxpHz Cw/rwiCl7kYM5o+D0W2dHMSFKD9aP6UtZLo+EZwZTMnXDD/7epkxZ2/UCSGDF+euCagA DxH3wipL6gmiy9UKsgDTkJJbPYygRyBr0ggx1BXwGyAb9Ds5RD7atJ9UghkV9u/p+5G3 zSWvp4xvjV6h9pxAEr8WmjFujCUjdYPp4Dy4cIRuxtmMUV9DhBj81NwngIJFn9p7iUZO 9/rQ== X-Gm-Message-State: AOAM530T9oz02Off4vwC+fltkibVIW8F2RPMLTDQTJFBAX4rNTKU5mBo S07q287/24eYCIAhqF95GyUHOabl6NVuvg== X-Google-Smtp-Source: ABdhPJyhcP3EVCsLyQOk7abyX9I3L0TQmcwt157JSckj1H6ILAqZvMwO0k77OvxaCNJR0udLuKpr8A== X-Received: by 2002:a05:620a:21d8:: with SMTP id h24mr4085216qka.464.1614023155547; Mon, 22 Feb 2021 11:45:55 -0800 (PST) Received: from rekt.ibmuc.com ([2804:431:c7c6:cd1c:d722:e26f:4e76:c5c1]) by smtp.gmail.com with ESMTPSA id 82sm13483178qkd.48.2021.02.22.11.45.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 22 Feb 2021 11:45:55 -0800 (PST) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Subject: [PATCH v4 5/5] spapr_drc.c: use DRC reconfiguration to cleanup DIMM unplug state Date: Mon, 22 Feb 2021 16:45:31 -0300 Message-Id: <20210222194531.62717-6-danielhb413@gmail.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20210222194531.62717-1-danielhb413@gmail.com> References: <20210222194531.62717-1-danielhb413@gmail.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72b; envelope-from=danielhb413@gmail.com; helo=mail-qk1-x72b.google.com X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Daniel Henrique Barboza , qemu-ppc@nongnu.org, groug@kaod.org, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Handling errors in memory hotunplug in the pSeries machine is more complex than any other device type, because there are all the complications that other devices has, and more. For instance, determining a timeout for a DIMM hotunplug must consider if it's a Hash-MMU or a Radix-MMU guest, because Hash guests takes longer to hotunplug DIMMs. The size of the DIMM is also a factor, given that longer DIMMs naturally takes longer to be hotunplugged from the kernel. And there's also the guest memory usage to be considered: if there's a process that is consuming memory that would be lost by the DIMM unplug, the kernel will postpone the unplug process until the process finishes, and then initiate the regular hotunplug process. The first two considerations are manageable, but the last one is a deal breaker. There is no sane way for the pSeries machine to determine the memory load in the guest when attempting a DIMM hotunplug - and even if there was a way, the guest can start using all the RAM in the middle of the unplug process and invalidate our previous assumptions - and in result we can't even begin to calculate a timeout for the operation. This means that we can't implement a viable timeout mechanism for memory unplug in pSeries. Going back to why we would consider an unplug timeout, the reason is that we can't know if the kernel is giving up the unplug. Turns out that, sometimes, we can. Consider a failed memory hotunplug attempt where the kernel will error out with the following message: 'pseries-hotplug-mem: Memory indexed-count-remove failed, adding any removed LMBs' This happens when there is a LMB that the kernel gave up in removing, and the LMBs previously marked for removal are now being added back. This happens in the pseries kernel in [1], dlpar_memory_remove_by_ic() into dlpar_add_lmb(), and after that update_lmb_associativity_index(). In this function, the kernel is configuring the LMB DRC connector again. Note that this is a valid usage in LOPAR, as stated in section "ibm,configure-connector RTAS Call": 'A subsequent sequence of calls to ibm,configure-connector with the same entry from the “ibm,drc-indexes” or “ibm,drc-info” property will restart the configuration of devices which were not completely configured.' We can use this kernel behavior in our favor. If a DRC connector reconfiguration for a LMB that we marked as unplug pending happens, this indicates that the kernel changed its mind about the unplug and is reasserting that it will keep using all the LMBs of the DIMM. In this case, it's safe to assume that the whole DIMM device unplug was cancelled. This patch hops into rtas_ibm_configure_connector() and, in the scenario described above, clear the unplug state for the DIMM device. This will not solve all the problems we still have with memory unplug, but it will cover this case where the kernel reconfigures LMBs after a failed unplug. We are a bit more resilient, without using an unreliable timeout, and we didn't make the remaining error cases any worse. [1] arch/powerpc/platforms/pseries/hotplug-memory.c Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr.c | 43 ++++++++++++++++++++++++++++++++++++++++++ hw/ppc/spapr_drc.c | 10 ++++++++++ include/hw/ppc/spapr.h | 2 ++ 3 files changed, 55 insertions(+) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index ecce8abf14..6eaddb12cb 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3575,6 +3575,49 @@ static SpaprDimmState *spapr_recover_pending_dimm_state(SpaprMachineState *ms, return spapr_pending_dimm_unplugs_add(ms, avail_lmbs, dimm); } +void spapr_clear_pending_dimm_unplug_state(SpaprMachineState *spapr, + DeviceState *dev) +{ + SpaprDimmState *ds; + PCDIMMDevice *dimm; + SpaprDrc *drc; + uint32_t nr_lmbs; + uint64_t size, addr_start, addr; + int i; + + if (!dev) { + return; + } + + dimm = PC_DIMM(dev); + ds = spapr_pending_dimm_unplugs_find(spapr, dimm); + + /* + * 'ds == NULL' would mean that the DIMM doesn't have a pending + * unplug state, but one of its DRC is marked as unplug_requested. + * This is bad and weird enough to g_assert() out. + */ + g_assert(ds); + + spapr_pending_dimm_unplugs_remove(spapr, ds); + + size = memory_device_get_region_size(MEMORY_DEVICE(dimm), &error_abort); + nr_lmbs = size / SPAPR_MEMORY_BLOCK_SIZE; + + addr_start = object_property_get_uint(OBJECT(dimm), PC_DIMM_ADDR_PROP, + &error_abort); + + addr = addr_start; + for (i = 0; i < nr_lmbs; i++) { + drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB, + addr / SPAPR_MEMORY_BLOCK_SIZE); + g_assert(drc); + + drc->unplug_requested = false; + addr += SPAPR_MEMORY_BLOCK_SIZE; + } +} + /* Callback to be called during DRC release. */ void spapr_lmb_release(DeviceState *dev) { diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index fd2e45640f..8c4997d795 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -1230,6 +1230,16 @@ static void rtas_ibm_configure_connector(PowerPCCPU *cpu, drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc); + /* + * This indicates that the kernel is reconfiguring a LMB due to + * a failed hotunplug. Clear the pending unplug state for the whole + * DIMM. + */ + if (spapr_drc_type(drc) == SPAPR_DR_CONNECTOR_TYPE_LMB && + drc->unplug_requested) { + spapr_clear_pending_dimm_unplug_state(spapr, drc->dev); + } + if (!drc->fdt) { void *fdt; int fdt_size; diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index ccbeeca1de..d6edeaaaff 100644 --- a/include/hw/ppc/spapr.h +++ b/include/hw/ppc/spapr.h @@ -847,6 +847,8 @@ int spapr_hpt_shift_for_ramsize(uint64_t ramsize); int spapr_reallocate_hpt(SpaprMachineState *spapr, int shift, Error **errp); void spapr_clear_pending_events(SpaprMachineState *spapr); void spapr_clear_pending_hotplug_events(SpaprMachineState *spapr); +void spapr_clear_pending_dimm_unplug_state(SpaprMachineState *spapr, + DeviceState *dev); int spapr_max_server_number(SpaprMachineState *spapr); void spapr_store_hpte(PowerPCCPU *cpu, hwaddr ptex, uint64_t pte0, uint64_t pte1);