From patchwork Tue Mar 12 00:30:31 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Stewart Smith
X-Patchwork-Id: 1054998
From: Stewart Smith
To: skiboot@lists.ozlabs.org
Date: Tue, 12 Mar 2019 11:30:31 +1100
Message-Id: <20190312003031.21367-1-stewart@linux.ibm.com>
X-Mailer: git-send-email 2.20.1
Subject: [Skiboot] [RFC PATCH] direct-controls: Use P8 sequence from workbook
List-Id: Mailing list for skiboot development
Cc: Stewart Smith

From: Stewart Smith

When attempting to do SRESET under load, our existing implementation
fell down and would fail to SRESET some CPUs, causing the fast-reboot
to fail.

This is reproduced by running the (new) op-test test
testcases.OpTestFastReboot.FastRebootHostStressTorture - and especially
so if you up the `stress` invocation to be 120 seconds.

See PR for op-test:
https://github.com/open-power/op-test-framework/pull/437

Unfortunately, this patch itself isn't the whole story. It makes things
*better* but there's still a window. Mind you, with this patch I've seen
it pass 24 fast-reboots, which is around 22 more than previously.

The same problem has been observed with pdbg, see discussion there
https://lists.ozlabs.org/pipermail/pdbg/2019-March/001076.html

Signed-off-by: Stewart Smith
---
 core/direct-controls.c | 44 ++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/core/direct-controls.c b/core/direct-controls.c
index 1d0f6818e2c8..8783a83a25ac 100644
--- a/core/direct-controls.c
+++ b/core/direct-controls.c
@@ -1,4 +1,4 @@
-/* Copyright 2017 IBM Corp.
+/* Copyright 2017-2019 IBM Corp.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
@@ -59,6 +59,12 @@ static void mambo_stop_cpu(struct cpu_thread *cpu)
 #define P8_DIRECT_CTL_STOP		PPC_BIT(63)
 #define P8_DIRECT_CTL_PRENAP		PPC_BIT(47)
 #define P8_DIRECT_CTL_SRESET		PPC_BIT(60)
+#define P8_EX_TCTL_RAS_STATUS(t)	(0x10013002 + (t) * 0x10)
+#define P8_RAS_STATUS_SRQ_EMPTY		PPC_BIT(8)
+#define P8_RAS_STATUS_LSU_QUIESCED	PPC_BIT(9)
+#define P8_RAS_STATUS_INST_COMPLETE	PPC_BIT(12)
+#define P8_RAS_STATUS_THREAD_ACTIVE	PPC_BIT(48)
+#define P8_RAS_STATUS_TS_QUIESCE	PPC_BIT(49)
 
 static int p8_core_set_special_wakeup(struct cpu_thread *cpu)
 {
@@ -224,6 +230,9 @@ static int p8_stop_thread(struct cpu_thread *cpu)
 	uint32_t chip_id = pir_to_chip_id(cpu->pir);
 	uint32_t thread_id = pir_to_thread_id(cpu->pir);
 	uint32_t xscom_addr;
+	uint64_t val;
+	int i;
+	int rc;
 
 	xscom_addr = XSCOM_ADDR_P8_EX(core_id,
 				      P8_EX_TCTL_DIRECT_CONTROLS(thread_id));
@@ -235,7 +244,38 @@ static int p8_stop_thread(struct cpu_thread *cpu)
 		return OPAL_HARDWARE;
 	}
 
-	return OPAL_SUCCESS;
+	xscom_addr = XSCOM_ADDR_P8_EX(core_id,
+				      P8_EX_TCTL_RAS_STATUS(thread_id));
+	for (i=0; i < 1000; i++) {
+		rc = xscom_read(chip_id, xscom_addr, &val);
+		if (rc) {
+			prlog(PR_ERR, "Could not check state of thread "
+			      "%u:%u:%u\n", chip_id, core_id, thread_id);
+			return OPAL_HARDWARE;
+		}
+		prlog(PR_INFO, "RAS_STATUS for %u:%u:%u = 0x%llx\n",
+		      chip_id, core_id, thread_id, val);
+		if (val & P8_RAS_STATUS_INST_COMPLETE)
+			break;
+		time_wait_ms(10);
+	}
+	if (!(val & P8_RAS_STATUS_INST_COMPLETE))
+		return OPAL_HARDWARE;
+
+	for (i=0; i < 1000; i++) {
+		rc = xscom_read(chip_id, xscom_addr, &val);
+		if (rc) {
+			prlog(PR_ERR, "Could not check state of thread "
+			      "%u:%u:%u\n", chip_id, core_id, thread_id);
+			return OPAL_HARDWARE;
+		}
+		prlog(PR_INFO, "RAS_STATUS for %u:%u:%u = 0x%llx\n",
+		      chip_id, core_id, thread_id, val);
+		if (val & P8_RAS_STATUS_TS_QUIESCE)
+			return OPAL_SUCCESS;
+		time_wait_ms(10);
+	}
+	return OPAL_HARDWARE;
 }
 
 static int p8_sreset_thread(struct cpu_thread *cpu)
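
The two loops added to p8_stop_thread() have the same shape: read
RAS_STATUS, stop when a given bit is set, otherwise wait 10ms and retry,
up to 1000 times (so roughly 10 seconds per loop in the worst case). As
an illustration only, that shape could be factored into one helper. The
sketch below is standalone and is not skiboot code: every sketch_* name,
the callback types, and the mock register are hypothetical stand-ins for
xscom_read() and time_wait_ms(), and SKETCH_PPC_BIT assumes the same
MSB-first bit numbering as PPC_BIT().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match PPC_BIT(): bit 0 is the most significant bit. */
#define SKETCH_PPC_BIT(bit)	(0x8000000000000000ULL >> (bit))
#define SKETCH_INST_COMPLETE	SKETCH_PPC_BIT(12)
#define SKETCH_TS_QUIESCE	SKETCH_PPC_BIT(49)

enum sketch_rc { SKETCH_OK = 0, SKETCH_READ_ERR = -1, SKETCH_TIMEOUT = -2 };

/* Hypothetical register accessor: returns 0 on success. */
typedef int (*sketch_read_fn)(void *ctx, uint64_t *val);
/* Hypothetical delay hook, standing in for a fixed wait between polls. */
typedef void (*sketch_wait_fn)(void *ctx);

/* Poll until any bit in mask is set, a read fails, or max_polls is hit. */
static int sketch_poll_bits(sketch_read_fn rd, sketch_wait_fn wait,
			    void *ctx, uint64_t mask, int max_polls)
{
	uint64_t val;
	int i;

	assert(rd != NULL);
	for (i = 0; i < max_polls; i++) {
		if (rd(ctx, &val))
			return SKETCH_READ_ERR;
		if (val & mask)
			return SKETCH_OK;
		if (wait)
			wait(ctx);
	}
	return SKETCH_TIMEOUT;
}

/* Mock status register: each bit appears after a set number of reads. */
struct sketch_mock {
	int reads;
	int complete_after;
	int quiesce_after;
};

static int sketch_mock_read(void *ctx, uint64_t *val)
{
	struct sketch_mock *m = ctx;

	m->reads++;
	*val = 0;
	if (m->reads >= m->complete_after)
		*val |= SKETCH_INST_COMPLETE;
	if (m->reads >= m->quiesce_after)
		*val |= SKETCH_TS_QUIESCE;
	return 0;
}

/* Same two-phase wait as the patch: INST_COMPLETE first, then TS_QUIESCE. */
static int sketch_stop_thread(struct sketch_mock *m, int max_polls)
{
	int rc;

	rc = sketch_poll_bits(sketch_mock_read, NULL, m,
			      SKETCH_INST_COMPLETE, max_polls);
	if (rc)
		return rc;
	return sketch_poll_bits(sketch_mock_read, NULL, m,
				SKETCH_TS_QUIESCE, max_polls);
}
```

Returning a distinct timeout code, rather than folding everything into
one error, would let a caller tell "xscom read failed" apart from
"thread never quiesced", which may help when chasing the remaining
window described above.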