From patchwork Wed Oct 7 12:09:41 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mahesh J Salgaonkar
X-Patchwork-Id: 1377983
From: Mahesh Salgaonkar
To: skiboot list
Date: Wed, 07 Oct 2020 17:39:41 +0530
Message-ID: <160207258130.2097386.7392819145419047545.stgit@jupiter>
In-Reply-To: <160207247879.2097386.9393389763183654717.stgit@jupiter>
References: <160207247879.2097386.9393389763183654717.stgit@jupiter>
User-Agent: StGit/0.21
Subject: [Skiboot] [PATCH v2 08/10] opal/eeh: Add PHB diag data in error log for PE errors.
List-Id: Mailing list for skiboot development
Cc: Vasant Hegde

Gather PHB status (diag data) for the corresponding frozen PE and add it to
the errorlog.

Since we query the PHB status during the freeze_status/next_error OPAL
calls, we can't use get_diag_data2(), which clears the errors. Instead,
introduce a new phb_ops callback that collects only the PHB status needed
for the errorlog. This ensures Linux also gets proper diag data when it
calls get_diag_data2().

Signed-off-by: Mahesh Salgaonkar
---
 core/pci-opal.c | 21 +++++++++++++++++++--
 hw/phb3.c       | 25 +++++++++++++++++++++----
 hw/phb4.c       | 24 ++++++++++++++++++++----
 include/pci.h   |  2 ++
 4 files changed, 62 insertions(+), 10 deletions(-)

diff --git a/core/pci-opal.c b/core/pci-opal.c
index 333682a0b..34de05ddf 100644
--- a/core/pci-opal.c
+++ b/core/pci-opal.c
@@ -127,6 +127,9 @@ static void send_phb_freeze_event(struct phb *phb, void *diag_buffer)
 static void send_phb_pe_freeze_event(struct phb *phb, uint64_t pe_number)
 {
 	struct errorlog *buf;
+	void *diag_buffer;
+	uint32_t len;
+	int rc;
 
 	buf = opal_elog_create(&e_info(OPAL_RC_PCI_PHB_FREEZE), 0);
 	if (!buf) {
@@ -137,11 +140,25 @@ static void send_phb_pe_freeze_event(struct phb *phb, uint64_t pe_number)
 	log_append_msg(buf, "PHB#%x PE#%lld Freeze/Fence detected!\n",
 			phb->opal_id, pe_number);
 
-	/* TODO: Get PHB status data */
+	/* Get PHB status data */
+	len = dt_prop_get_u32(phb->dt_node, "ibm,phb-diag-data-size");
+	diag_buffer = zalloc(len);
+	if (diag_buffer) {
+		rc = phb->ops->get_phb_status(phb, diag_buffer, len);
+		if (rc != OPAL_SUCCESS) {
+			prerror("Failed to gather phb diag data\n");
+			free(diag_buffer);
+			diag_buffer = NULL;
+		}
+	} else
+		prerror("Failed to allocate size for phb diag data\n");
+
 	/* TODO: Add location info of slot with forzen PEs */
 
-	send_eeh_serviceable_event(phb, buf, NULL);
+	send_eeh_serviceable_event(phb, buf, diag_buffer);
 	bitmap_set_bit(*phb->pe_freeze_reported, pe_number);
+	if (diag_buffer)
+		free(diag_buffer);
 }
 
 static int64_t opal_pci_config_read_half_word_be(uint64_t phb_id,
diff --git a/hw/phb3.c b/hw/phb3.c
index 5465b62ae..1bd753bae 100644
--- a/hw/phb3.c
+++ b/hw/phb3.c
@@ -3457,13 +3457,13 @@ static int64_t phb3_err_inject(struct phb *phb, uint64_t pe_number,
 	return handler(p, pe_number, addr, mask, is_write);
 }
 
-static int64_t phb3_get_diag_data(struct phb *phb,
+/* Get phb diag data only. Do not clear the phb pending errors. */
+static int64_t phb3_get_phb_status(struct phb *phb,
 				  void *diag_buffer,
 				  uint64_t diag_buffer_len)
 {
 	struct phb3 *p = phb_to_phb3(phb);
 	struct OpalIoPhb3ErrorData *data = diag_buffer;
-	bool fenced;
 
 	if (diag_buffer_len < sizeof(struct OpalIoPhb3ErrorData))
 		return OPAL_PARAMETER;
@@ -3474,10 +3474,26 @@ static int64_t phb3_get_diag_data(struct phb *phb,
 	 * Dummy check for fence so that phb3_read_phb_status knows
 	 * whether to use ASB or AIB
 	 */
-	fenced = phb3_fenced(p);
+	phb3_fenced(p);
 	phb3_read_phb_status(p, data);
 
-	if (!fenced)
+	return OPAL_SUCCESS;
+}
+
+/* Get phb diag data and clear the phb pending errors. */
+static int64_t phb3_get_diag_data(struct phb *phb,
+				  void *diag_buffer,
+				  uint64_t diag_buffer_len)
+{
+	int64_t rc;
+	struct phb3 *p = phb_to_phb3(phb);
+	struct OpalIoPhb3ErrorData *data = diag_buffer;
+
+	rc = phb3_get_phb_status(phb, diag_buffer, diag_buffer_len);
+	if (rc != OPAL_SUCCESS)
+		return rc;
+
+	if (!(p->flags & PHB3_AIB_FENCED))
 		phb3_eeh_dump_regs(p, data);
 
 	/*
@@ -3873,6 +3889,7 @@ static const struct phb_ops phb3_ops = {
 	.next_error		= phb3_eeh_next_error,
 	.err_inject		= phb3_err_inject,
 	.get_diag_data2		= phb3_get_diag_data,
+	.get_phb_status		= phb3_get_phb_status,
 	.set_capi_mode		= phb3_set_capi_mode,
 	.set_capp_recovery	= phb3_set_capp_recovery,
 };
diff --git a/hw/phb4.c b/hw/phb4.c
index cd50361fc..a088a640e 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -4091,11 +4091,11 @@ static int64_t phb4_err_inject(struct phb *phb, uint64_t pe_number,
 	return handler(p, pe_number, addr, mask, is_write);
 }
 
-static int64_t phb4_get_diag_data(struct phb *phb,
+/* Get phb diag data only. Do not clear the phb pending errors. */
+static int64_t phb4_get_phb_status(struct phb *phb,
 				  void *diag_buffer,
 				  uint64_t diag_buffer_len)
 {
-	bool fenced;
 	struct phb4 *p = phb_to_phb4(phb);
 	struct OpalIoPhb4ErrorData *data = diag_buffer;
 
@@ -4108,10 +4108,25 @@ static int64_t phb4_get_diag_data(struct phb *phb,
 	 * Dummy check for fence so that phb4_read_phb_status knows
 	 * whether to use ASB or AIB
 	 */
-	fenced = phb4_fenced(p);
+	phb4_fenced(p);
 	phb4_read_phb_status(p, data);
 
-	if (!fenced)
+	return OPAL_SUCCESS;
+}
+
+/* Get phb diag data and clear the phb pending errors. */
+static int64_t phb4_get_diag_data(struct phb *phb,
+				  void *diag_buffer,
+				  uint64_t diag_buffer_len)
+{
+	int64_t rc;
+	struct phb4 *p = phb_to_phb4(phb);
+
+	rc = phb4_get_phb_status(phb, diag_buffer, diag_buffer_len);
+	if (rc != OPAL_SUCCESS)
+		return rc;
+
+	if (!(p->flags & PHB4_AIB_FENCED))
 		phb4_eeh_dump_regs(p);
 
 	/*
@@ -4938,6 +4953,7 @@ static const struct phb_ops phb4_ops = {
 	.next_error		= phb4_eeh_next_error,
 	.err_inject		= phb4_err_inject,
 	.get_diag_data2		= phb4_get_diag_data,
+	.get_phb_status		= phb4_get_phb_status,
 	.tce_kill		= phb4_tce_kill,
 	.set_capi_mode		= phb4_set_capi_mode,
 	.set_p2p		= phb4_set_p2p,
diff --git a/include/pci.h b/include/pci.h
index d8f712e72..8da57a954 100644
--- a/include/pci.h
+++ b/include/pci.h
@@ -254,6 +254,8 @@ struct phb_ops {
 				uint64_t mask);
 	int64_t (*get_diag_data2)(struct phb *phb, void *diag_buffer,
 				  uint64_t diag_buffer_len);
+	int64_t (*get_phb_status)(struct phb *phb, void *diag_buffer,
+				  uint64_t diag_buffer_len);
 	int64_t (*next_error)(struct phb *phb, uint64_t *first_frozen_pe,
 			      uint16_t *pci_error_type, uint16_t *severity);
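
Note for readers: the split this patch makes between a read-only status
callback and a read-and-clear callback can be sketched in isolation as
below. This is only an illustration under stated assumptions, not skiboot
code: struct demo_phb, struct demo_diag, the DEMO_* return codes and both
demo_* functions are hypothetical stand-ins, and the unconditional clear
is a simplification of whatever get_diag_data2() clears on real hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical return codes; stand-ins for OPAL_SUCCESS/OPAL_PARAMETER. */
#define DEMO_SUCCESS	0
#define DEMO_PARAMETER	(-1)

/* Hypothetical stand-in for the per-PHB diag data structure. */
struct demo_diag {
	uint64_t error_status;
};

/* Hypothetical stand-in for a PHB with latched (pending) error bits. */
struct demo_phb {
	uint64_t pending_errors;
};

/* Read-only snapshot: copy the status out, leave pending errors latched. */
static int64_t demo_get_phb_status(struct demo_phb *phb, void *buf,
				   uint64_t len)
{
	struct demo_diag *data = buf;

	assert(phb);
	if (!buf || len < sizeof(*data))
		return DEMO_PARAMETER;

	memset(data, 0, sizeof(*data));
	data->error_status = phb->pending_errors;
	return DEMO_SUCCESS;
}

/* Snapshot, then clear: built on top of the read-only helper. */
static int64_t demo_get_diag_data(struct demo_phb *phb, void *buf,
				  uint64_t len)
{
	int64_t rc = demo_get_phb_status(phb, buf, len);

	/* Return early on failure only; success (0) must fall through. */
	if (rc != DEMO_SUCCESS)
		return rc;

	/* Simplified: real code clears errors only under some conditions. */
	phb->pending_errors = 0;
	return DEMO_SUCCESS;
}
```

With this shape, an error-logging path can call the read-only helper any
number of times, and a later call to the clearing variant still sees the
same status before the pending errors are reset.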