From patchwork Thu Jun 7 17:28:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 926480 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 411t735smZz9s1b for ; Fri, 8 Jun 2018 03:38:23 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 411t734jN9zF34t for ; Fri, 8 Jun 2018 03:38:23 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 411swM2F0GzF34W for ; Fri, 8 Jun 2018 03:29:07 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2]) by bilbo.ozlabs.org (Postfix) with ESMTP id 411swL6pPXz8tQK for ; Fri, 8 Jun 2018 03:29:06 +1000 (AEST) Received: by ozlabs.org (Postfix) id 411swL6bY2z9s2t; Fri, 8 Jun 2018 03:29:06 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 411swL3gMqz9s1b for ; Fri, 8 Jun 2018 03:29:06 +1000 (AEST) Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w57HJJWp126955 for ; Thu, 7 Jun 2018 13:29:04 -0400 Received: from e06smtp02.uk.ibm.com (e06smtp02.uk.ibm.com [195.75.94.98]) by mx0a-001b2d01.pphosted.com with ESMTP id 2jf89pjr9a-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Thu, 07 Jun 2018 13:29:03 -0400 Received: from localhost by e06smtp02.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Thu, 7 Jun 2018 18:29:00 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp02.uk.ibm.com (192.168.101.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Thu, 7 Jun 2018 18:28:58 +0100 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w57HSvvJ34799830 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 7 Jun 2018 17:28:57 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 226094C040; Thu, 7 Jun 2018 18:20:16 +0100 (BST) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0A1FE4C04A; Thu, 7 Jun 2018 18:20:15 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.93.2]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTP; Thu, 7 Jun 2018 18:20:14 +0100 (BST) Subject: [v3 PATCH 4/5] powerpc/pseries: Dump and flush SLB contents on SLB MCE errors. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Thu, 07 Jun 2018 22:58:55 +0530 In-Reply-To: <152839244928.25118.15100234720683911223.stgit@jupiter.in.ibm.com> References: <152839244928.25118.15100234720683911223.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18060717-0008-0000-0000-000002454286 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18060717-0009-0000-0000-000021AB550C Message-Id: <152839253238.25118.3114450844744290470.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-06-07_06:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=743 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1805220000 definitions=main-1806070189 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Dufour , "Aneesh Kumar K.V" , Nicholas Piggin Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar If we get a machine check exceptions due to SLB errors then dump the current SLB contents which will be very much helpful in debugging the root cause of SLB errors. On pseries, as of today system crashes on SLB errors. These are soft errors and can be fixed by flushing the SLBs so the kernel can continue to function instead of system crash. This patch fixes that also. With this patch the console will log SLB contents like below on SLB MCE errors: [ 822.711728] slb contents: [ 822.711730] 00 c000000008000000 400ea1b217000500 [ 822.711731] 1T ESID= c00000 VSID= ea1b217 LLP:100 [ 822.711732] 01 d000000008000000 400d43642f000510 [ 822.711733] 1T ESID= d00000 VSID= d43642f LLP:110 [ 822.711734] 09 f000000008000000 400a86c85f000500 [ 822.711736] 1T ESID= f00000 VSID= a86c85f LLP:100 [ 822.711737] 10 00007f0008000000 400d1f26e3000d90 [ 822.711738] 1T ESID= 7f VSID= d1f26e3 LLP:110 [ 822.711739] 11 0000000018000000 000e3615f520fd90 [ 822.711740] 256M ESID= 1 VSID= e3615f520f LLP:110 [ 822.711740] 12 d000000008000000 400d43642f000510 [ 822.711741] 1T ESID= d00000 VSID= d43642f LLP:110 [ 822.711742] 13 d000000008000000 400d43642f000510 [ 822.711743] 1T ESID= d00000 VSID= d43642f LLP:110 Suggested-by: Aneesh Kumar K.V Suggested-by: Michael Ellerman Signed-off-by: Mahesh Salgaonkar Reviewed-by: Nicholas Piggin --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 + arch/powerpc/mm/slb.c | 35 +++++++++++++++++++++++++ arch/powerpc/platforms/pseries/ras.c | 29 ++++++++++++++++++++- 3 files changed, 64 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index 50ed64fba4ae..c0da68927235 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -487,6 +487,7 @@ extern void hpte_init_native(void); extern void slb_initialize(void); extern void slb_flush_and_rebolt(void); +extern void slb_dump_contents(void); extern void slb_vmalloc_update(void); extern void slb_set_size(u16 size); diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c index 66577cc66dc9..799aa117cec3 100644 --- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -145,6 +145,41 @@ void slb_flush_and_rebolt(void) get_paca()->slb_cache_ptr = 0; } +void slb_dump_contents(void) +{ + int i; + unsigned long e, v; + unsigned long llp; + + pr_err("slb contents:\n"); + for (i = 0; i < mmu_slb_size; i++) { + asm volatile("slbmfee %0,%1" : "=r" (e) : "r" (i)); + asm volatile("slbmfev %0,%1" : "=r" (v) : "r" (i)); + + if (!e && !v) + continue; + + pr_err("%02d %016lx %016lx", i, e, v); + + if (!(e & SLB_ESID_V)) { + pr_err("\n"); + continue; + } + llp = v & SLB_VSID_LLP; + if (v & SLB_VSID_B_1T) { + pr_err(" 1T ESID=%9lx VSID=%13lx LLP:%3lx\n", + GET_ESID_1T(e), + (v & ~SLB_VSID_B) >> SLB_VSID_SHIFT_1T, + llp); + } else { + pr_err(" 256M ESID=%9lx VSID=%13lx LLP:%3lx\n", + GET_ESID(e), + (v & ~SLB_VSID_B) >> SLB_VSID_SHIFT, + llp); + } + } +} + void slb_vmalloc_update(void) { unsigned long vflags; diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 2edc673be137..e56759d92356 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -422,6 +422,31 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +static int mce_handle_error(struct rtas_error_log *errp) +{ + struct pseries_errorlog *pseries_log; + struct pseries_mc_errorlog *mce_log; + int disposition = rtas_error_disposition(errp); + uint8_t error_type; + + pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE); + if (pseries_log == NULL) + goto out; + + mce_log = (struct pseries_mc_errorlog *)pseries_log->data; + error_type = rtas_mc_error_type(mce_log); + + if ((disposition == RTAS_DISP_NOT_RECOVERED) && + (error_type == PSERIES_MC_ERROR_TYPE_SLB)) { + slb_dump_contents(); + slb_flush_and_rebolt(); + disposition = RTAS_DISP_FULLY_RECOVERED; + } + +out: + return disposition; +} + /* * See if we can recover from a machine check exception. * This is only called on power4 (or above) and only via @@ -434,7 +459,9 @@ int pSeries_system_reset_exception(struct pt_regs *regs) static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err) { int recovered = 0; - int disposition = rtas_error_disposition(err); + int disposition; + + disposition = mce_handle_error(err); if (!(regs->msr & MSR_RI)) { /* If MSR_RI isn't set, we cannot recover */