From patchwork Thu May 23 12:21:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frederic Barrat
X-Patchwork-Id: 1104076
From: Frederic Barrat
To: skiboot@lists.ozlabs.org, ajd@linux.ibm.com, arbab@linux.ibm.com,
 aik@ozlabs.ru
Cc: clombard@linux.ibm.com
Date: Thu, 23 May 2019 14:21:35 +0200
Message-Id: <20190523122135.3757-1-fbarrat@linux.ibm.com>
X-Mailer: git-send-email 2.21.0
Subject: [Skiboot] [PATCH] opal/hmi: Report NPU2 checkstop reason
List-Id: Mailing list for skiboot development

The NPU2 is currently not passing any information to linux to explain
the cause of an HMI. NPU2 has three Fault Isolation Registers and over
30 of those FIR bits are configured to raise an HMI by default. We
won't be able to fit all possible state in the 32-bit xstop_reason
field of the HMI event, but we can still try to encode up to 4 HMI
reasons.

Signed-off-by: Frederic Barrat
Reviewed-by: Andrew Donnellan
---
 core/hmi.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/core/hmi.c b/core/hmi.c
index d97f3fc0..3b2860f8 100644
--- a/core/hmi.c
+++ b/core/hmi.c
@@ -576,6 +576,46 @@ static bool phb_is_npu2(struct dt_node *dn)
 		dt_node_is_compatible(dn, "ibm,power9-npu-opencapi-pciex"));
 }
 
+static void add_npu2_xstop_reason(uint32_t *xstop_reason, uint8_t reason)
+{
+	int i, reason_count;
+	uint8_t *ptr;
+
+	reason_count = sizeof(*xstop_reason) / sizeof(reason);
+	ptr = (uint8_t *) xstop_reason;
+	for (i = 0; i < reason_count; i++) {
+		if (*ptr == 0) {
+			*ptr = reason;
+			break;
+		}
+		ptr++;
+	}
+}
+
+static void encode_npu2_xstop_reason(uint32_t *xstop_reason,
+				     uint64_t fir, int fir_number)
+{
+	int bit;
+	uint8_t reason;
+
+	/*
+	 * There are three 64-bit FIRs but the xstop reason field of
+	 * the hmi event is only 32-bit. Encode which FIR bit is set as:
+	 * - 2 bits for the FIR number
+	 * - 6 bits for the bit number (0 -> 63)
+	 *
+	 * So we could even encode up to 4 reasons for the HMI, if
+	 * that can ever happen
+	 */
+	while (fir) {
+		bit = ilog2(fir);
+		reason = fir_number << 6;
+		reason |= (63 - bit); // IBM numbering
+		add_npu2_xstop_reason(xstop_reason, reason);
+		fir ^= 1ULL << bit;
+	}
+}
+
 static void find_npu2_checkstop_reason(int flat_chip_id,
 				      struct OpalHMIEvent *hmi_evt,
 				      uint64_t *out_flags)
@@ -592,6 +632,7 @@ static void find_npu2_checkstop_reason(int flat_chip_id,
 	uint64_t npu2_fir_action0_addr;
 	uint64_t npu2_fir_action1_addr;
 	uint64_t fatal_errors;
+	uint32_t xstop_reason = 0;
 	int total_errors = 0;
 	const char *loc;
 
@@ -635,6 +676,8 @@ static void find_npu2_checkstop_reason(int flat_chip_id,
 		prlog(PR_ERR, "NPU: [Loc: %s] P:%d ACTION0 0x%016llx, ACTION1 0x%016llx\n",
 				loc, flat_chip_id, npu2_fir_action0, npu2_fir_action1);
 		total_errors++;
+
+		encode_npu2_xstop_reason(&xstop_reason, fatal_errors, i);
 	}
 
 	/* Can't do a fence yet, we are just logging fir information for now */
@@ -667,6 +710,7 @@ static void find_npu2_checkstop_reason(int flat_chip_id,
 	hmi_evt->severity = OpalHMI_SEV_WARNING;
 	hmi_evt->type = OpalHMI_ERROR_MALFUNC_ALERT;
 	hmi_evt->u.xstop_error.xstop_type = CHECKSTOP_TYPE_NPU;
+	hmi_evt->u.xstop_error.xstop_reason = xstop_reason;
 	hmi_evt->u.xstop_error.u.chip_id = flat_chip_id;
 
 	/* Marking the event as recoverable so that we don't crash */
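
For readers who want to check the encoding by hand, here is a minimal
standalone sketch of the scheme described in the comment above (2 bits
of FIR number, 6 bits of IBM-numbered bit position, up to 4 reasons
packed one per byte), together with a possible decoder. It is not part
of the patch. All names below (highest_bit, pack_reason, unpack_reason,
add_reason, encode_fir, decode_reasons) are hypothetical and exist only
for this illustration; highest_bit stands in for skiboot's ilog2(), and
the decoder assumes it reads the bytes in the same in-memory order the
encoder wrote them, which sidesteps any endianness conversion a real
consumer might need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for ilog2(): index of the highest set bit, LSB = 0. */
static int highest_bit(uint64_t v)
{
	int bit = 0;

	assert(v != 0);
	while (v >>= 1)
		bit++;
	return bit;
}

/* Pack one reason byte: FIR number in the top 2 bits, IBM bit number below. */
static uint8_t pack_reason(int fir_number, int lsb0_bit)
{
	return (uint8_t)((fir_number << 6) | (63 - lsb0_bit));
}

/* Split a reason byte back into FIR number and IBM-numbered bit (0..63). */
static void unpack_reason(uint8_t reason, int *fir_number, int *ibm_bit)
{
	*fir_number = reason >> 6;
	*ibm_bit = reason & 0x3f;
}

/* Store the reason in the first byte that is still zero, if there is one. */
static void add_reason(uint32_t *xstop_reason, uint8_t reason)
{
	uint8_t bytes[sizeof(*xstop_reason)];
	size_t i;

	memcpy(bytes, xstop_reason, sizeof(bytes));
	for (i = 0; i < sizeof(bytes); i++) {
		if (bytes[i] == 0) {
			bytes[i] = reason;
			break;
		}
	}
	memcpy(xstop_reason, bytes, sizeof(bytes));
}

/* Encode every set bit of one FIR, highest bit first, as in the patch. */
static void encode_fir(uint32_t *xstop_reason, uint64_t fir, int fir_number)
{
	while (fir) {
		int bit = highest_bit(fir);

		add_reason(xstop_reason, pack_reason(fir_number, bit));
		fir ^= 1ULL << bit;
	}
}

/* Collect the non-zero reason bytes in memory order; returns their count. */
static int decode_reasons(uint32_t xstop_reason, uint8_t out[4])
{
	uint8_t bytes[sizeof(xstop_reason)];
	size_t i;
	int n = 0;

	memcpy(bytes, &xstop_reason, sizeof(bytes));
	for (i = 0; i < sizeof(bytes) && bytes[i] != 0; i++)
		out[n++] = bytes[i];
	return n;
}
```

With this sketch, FIR 1 with IBM bit 0 set packs to 0x40, and FIR 2
with IBM bit 63 set packs to 0xbf. Two properties of the scheme show up
when experimenting: a fifth reason is silently dropped, and if the FIR
numbering starts at 0, then FIR 0 with IBM bit 0 packs to 0x00, which a
decoder like the one above cannot tell apart from an empty slot.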