From patchwork Fri Mar 16 02:40:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaibhav Jain
X-Patchwork-Id: 886598
From: Vaibhav Jain
To: skiboot@lists.ozlabs.org, Christophe Lombard, Frederic Barrat
Date: Fri, 16 Mar 2018 08:10:58 +0530
X-Mailer: git-send-email 2.14.3
Message-Id: <20180316024058.19016-1-vaibhav@linux.vnet.ibm.com>
Subject: [Skiboot] [PATCH v2] capi: Poll Err/Status register during CAPP
 recovery
List-Id: Mailing list for skiboot development
Cc: Philippe Bergheaud, Andrew Donnellan

This patch updates do_capp_recovery_scoms() to poll the CAPP Err/Status
control register, check whether CAPP-Recovery completed or failed based
on bits 1, 5 and 9, and then proceed with the CAPP-Recovery scoms only
if recovery completed successfully. This prevents cases where we bring
up the PCIe link while the recovery sequencer on the CAPP is still busy
casting out cache lines.

If CAPP-Recovery did not complete successfully, an error is returned
from do_capp_recovery_scoms(), asking phb4_creset() to keep the phb4
fenced and mark it as broken.

The loop that polls the Err/Status register also logs an error on the
PHB when it continues for more than 168ms, which is the maximum time to
failure for CAPP-Recovery.

Signed-off-by: Vaibhav Jain
---
Changelog:

v2 -> Added an extra check for Bit(0) in Err/Status reg at the beginning
      to check if recovery mode was entered.
      [Christophe]
---
 hw/phb4.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 67 insertions(+), 17 deletions(-)

diff --git a/hw/phb4.c b/hw/phb4.c
index 47175df2..515b43e3 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2857,25 +2857,74 @@ static int64_t load_capp_ucode(struct phb4 *p)
 	return rc;
 }
 
-static void do_capp_recovery_scoms(struct phb4 *p)
+static int do_capp_recovery_scoms(struct phb4 *p)
 {
-	uint64_t reg;
-	uint32_t offset;
+	uint64_t rc = 0, reg, end;
+	uint64_t offset = PHB4_CAPP_REG_OFFSET(p);
 
-	PHBDBG(p, "Doing CAPP recovery scoms\n");
-
-	offset = PHB4_CAPP_REG_OFFSET(p);
-	/* disable snoops */
-	xscom_write(p->chip_id, SNOOP_CAPI_CONFIG + offset, 0);
-	load_capp_ucode(p);
-	/* clear err rpt reg*/
-	xscom_write(p->chip_id, CAPP_ERR_RPT_CLR + offset, 0);
-	/* clear capp fir */
-	xscom_write(p->chip_id, CAPP_FIR + offset, 0);
 
+	/* Get the status of CAPP recovery */
 	xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
-	reg &= ~(PPC_BIT(0) | PPC_BIT(1));
-	xscom_write(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, reg);
+
+	/* No recovery in progress ignore */
+	if ((reg & PPC_BIT(0)) == 0) {
+		PHBDBG(p, "CAPP: No recovery in progress\n");
+		return 0;
+	}
+
+	PHBDBG(p, "CAPP: Waiting for recovery to complete\n");
+	/* recovery timer failure period 168ms */
+	end = mftb() + msecs_to_tb(168);
+	while ((reg & (PPC_BIT(1) | PPC_BIT(5) | PPC_BIT(9))) == 0) {
+
+		time_wait_ms(5);
+		xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
+
+		if (end && tb_compare(mftb(), end) == TB_AAFTERB) {
+			PHBERR(p, "CAPP: Capp recovery Timed-out.\n");
+			end = 0;
+		}
+	}
+
+	/* Check if the recovery failed or passed */
+	if (reg & PPC_BIT(1)) {
+		PHBDBG(p, "Doing CAPP recovery scoms\n");
+		/* disable snoops */
+		xscom_write(p->chip_id, SNOOP_CAPI_CONFIG + offset, 0);
+		load_capp_ucode(p);
+
+		/* clear err rpt reg*/
+		xscom_write(p->chip_id, CAPP_ERR_RPT_CLR + offset, 0);
+
+		/* clear capp fir */
+		xscom_write(p->chip_id, CAPP_FIR + offset, 0);
+
+		/* Just reset Bit-0,1 and don't touch any other bit */
+		xscom_read(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, &reg);
+		reg &= ~(PPC_BIT(0) | PPC_BIT(1));
+		xscom_write(p->chip_id, CAPP_ERR_STATUS_CTRL + offset, reg);
+
+		PHBDBG(p, "CAPP recovery complete\n");
+	} else {
+
+		/* We will checkstop here due to FIR ACTION for
+		 * failed recovery. So this message would never be logged.
+		 * But if we still enter here then return an error forcing a
+		 * fence of the PHB.
+		 */
+		if (reg & PPC_BIT(5))
+			PHBERR(p, "CAPP: Capp recovery Failed\n");
+		else if (reg & PPC_BIT(9))
+			PHBERR(p, "CAPP: Capp recovery hang detected\n");
+		else
+			PHBERR(p, "CAPP: Unknown recovery failure\n");
+
+		PHBDBG(p, "CAPP: Err/Status-reg=0x%016llx\n", reg);
+		rc = OPAL_HARDWARE;
+	}
+
+out:
+	return rc;
 }
 
 static int64_t phb4_creset(struct pci_slot *slot)
@@ -2934,8 +2983,9 @@ static int64_t phb4_creset(struct pci_slot *slot)
 		PHBDBG(p, "CRESET: No pending transactions\n");
 
 		/* capp recovery */
-		if (p->flags & PHB4_CAPP_RECOVERY)
-			do_capp_recovery_scoms(p);
+		if (p->flags & PHB4_CAPP_RECOVERY &&
+		    do_capp_recovery_scoms(p))
+			goto error;
 
 		/* Clear errors in PFIR and NFIR */
 		xscom_write(p->chip_id, p->pci_stk_xscom + 0x1,