From patchwork Tue Aug 7 14:16:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954539 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGvZ05jYz9s3x for ; Wed, 8 Aug 2018 00:23:06 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lGvY5rv3zDqZw for ; Wed, 8 Aug 2018 00:23:05 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGln2F57zDqZm for ; Wed, 8 Aug 2018 00:16:21 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGln1Hwfz8t21 for ; Wed, 8 Aug 2018 00:16:21 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGln107Zz9s8T; Wed, 8 Aug 2018 00:16:21 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGlm3llcz9s4s for ; Wed, 8 Aug 2018 00:16:20 +1000 (AEST) Received: from pps.filterd (m0098416.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFecv030541 for ; Tue, 7 Aug 2018 10:16:18 -0400 Received: from e06smtp02.uk.ibm.com (e06smtp02.uk.ibm.com [195.75.94.98]) by mx0b-001b2d01.pphosted.com with ESMTP id 2kqb3xne9h-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:16:18 -0400 Received: from localhost by e06smtp02.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:16:16 +0100 Received: from b06cxnps4074.portsmouth.uk.ibm.com (9.149.109.196) by e06smtp02.uk.ibm.com (192.168.101.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:16:14 +0100 Received: from d06av24.portsmouth.uk.ibm.com (mk.ibm.com [9.149.105.60]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EGDaE33030394 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:16:13 GMT Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3E14D42041; Tue, 7 Aug 2018 17:16:22 +0100 (BST) Received: from d06av24.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 115D64204C; Tue, 7 Aug 2018 17:16:21 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av24.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:16:20 +0100 (BST) Subject: [PATCH v7 1/9] powerpc/pseries: Avoid using the size greater than RTAS_ERROR_LOG_MAX. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:46:11 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0008-0000-0000-0000025E7126 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0009-0000-0000-000021C67739 Message-Id: <153365136196.14256.9927143064547021494.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar The global mce data buffer that used to copy rtas error log is of 2048 (RTAS_ERROR_LOG_MAX) bytes in size. Before the copy we read extended_log_length from rtas error log header, then use max of extended_log_length and RTAS_ERROR_LOG_MAX as a size of data to be copied. Ideally the platform (phyp) will never send extended error log with size > 2048. But if that happens, then we have a risk of buffer overrun and corruption. Fix this by using min_t instead. Fixes: d368514c3097 ("powerpc: Fix corruption when grabbing FWNMI data") Reported-by: Michal Suchanek Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/platforms/pseries/ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 5e1ef9150182..ef104144d4bc 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -371,7 +371,7 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) int len, error_log_length; error_log_length = 8 + rtas_error_extended_log_length(h); - len = max_t(int, error_log_length, RTAS_ERROR_LOG_MAX); + len = min_t(int, error_log_length, RTAS_ERROR_LOG_MAX); memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX); memcpy(global_mce_data_buf, h, len); errhdr = (struct rtas_error_log *)global_mce_data_buf; From patchwork Tue Aug 7 14:16:26 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954541 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGyr5zfGz9s4s for ; Wed, 8 Aug 2018 00:25:56 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lGyr2DSwzDqMl for ; Wed, 8 Aug 2018 00:25:56 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGm75cmWzDqCK for ; Wed, 8 Aug 2018 00:16:39 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGm72ZKhz8t21 for ; Wed, 8 Aug 2018 00:16:39 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGm71kVkz9s5K; Wed, 8 Aug 2018 00:16:39 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGm65bLzz9s3x for ; Wed, 8 Aug 2018 00:16:38 +1000 (AEST) Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFZ0o114612 for ; Tue, 7 Aug 2018 10:16:36 -0400 Received: from e06smtp05.uk.ibm.com (e06smtp05.uk.ibm.com [195.75.94.101]) by mx0a-001b2d01.pphosted.com with ESMTP id 2kqbqe44qc-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:16:36 -0400 Received: from localhost by e06smtp05.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:16:33 +0100 Received: from b06cxnps4076.portsmouth.uk.ibm.com (9.149.109.198) by e06smtp05.uk.ibm.com (192.168.101.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:16:29 +0100 Received: from d06av25.portsmouth.uk.ibm.com (d06av25.portsmouth.uk.ibm.com [9.149.105.61]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EGSKv30933200 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:16:28 GMT Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3D3B411C04A; Tue, 7 Aug 2018 17:16:36 +0100 (BST) Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id CA3FE11C052; Tue, 7 Aug 2018 17:16:34 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av25.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:16:34 +0100 (BST) Subject: [PATCH v7 2/9] powerpc/pseries: Defer the logging of rtas error to irq work queue. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:46:26 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0020-0000-0000-000002B3705F X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0021-0000-0000-0000210077AC Message-Id: <153365137867.14256.4323351777611805097.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , stable@vger.kernel.org, Laurent Dufour , "Aneesh Kumar K.V" Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar rtas_log_buf is a buffer to hold RTAS event data that are communicated to kernel by hypervisor. This buffer is then used to pass RTAS event data to user through proc fs. This buffer is allocated from vmalloc (non-linear mapping) area. On Machine check interrupt, register r3 points to RTAS extended event log passed by hypervisor that contains the MCE event. The pseries machine check handler then logs this error into rtas_log_buf. The rtas_log_buf is a vmalloc-ed (non-linear) buffer we end up taking up a page fault (vector 0x300) while accessing it. Since machine check interrupt handler runs in NMI context we can not afford to take any page fault. Page faults are not honored in NMI context and causes kernel panic. Apart from that, as Nick pointed out, pSeries_log_error() also takes a spin_lock while logging error which is not safe in NMI context. It may endup in deadlock if we get another MCE before releasing the lock. Fix this by deferring the logging of rtas error to irq work queue. Current implementation uses two different buffers to hold rtas error log depending on whether extended log is provided or not. This makes bit difficult to identify which buffer has valid data that needs to logged later in irq work. Simplify this using single buffer, one per paca, and copy rtas log to it irrespective of whether extended log is provided or not. Allocate this buffer below RMA region so that it can be accessed in real mode mce handler. Fixes: b96672dd840f ("powerpc: Machine check interrupt is a non-maskable interrupt") Cc: stable@vger.kernel.org Reviewed-by: Nicholas Piggin Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/include/asm/paca.h | 3 ++ arch/powerpc/platforms/pseries/ras.c | 47 ++++++++++++++++++++++---------- arch/powerpc/platforms/pseries/setup.c | 16 +++++++++++ 3 files changed, 51 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index 6d34bd71139d..7f22929ce915 100644 --- a/arch/powerpc/include/asm/paca.h +++ b/arch/powerpc/include/asm/paca.h @@ -252,6 +252,9 @@ struct paca_struct { void *rfi_flush_fallback_area; u64 l1d_flush_size; #endif +#ifdef CONFIG_PPC_PSERIES + u8 *mce_data_buf; /* buffer to hold per cpu rtas errlog */ +#endif /* CONFIG_PPC_PSERIES */ } ____cacheline_aligned; extern void copy_mm_to_paca(struct mm_struct *mm); diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index ef104144d4bc..14a46b07ab2f 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -22,6 +22,7 @@ #include #include #include +#include #include #include @@ -32,11 +33,13 @@ static unsigned char ras_log_buf[RTAS_ERROR_LOG_MAX]; static DEFINE_SPINLOCK(ras_log_buf_lock); -static char global_mce_data_buf[RTAS_ERROR_LOG_MAX]; -static DEFINE_PER_CPU(__u64, mce_data_buf); - static int ras_check_exception_token; +static void mce_process_errlog_event(struct irq_work *work); +static struct irq_work mce_errlog_process_work = { + .func = mce_process_errlog_event, +}; + #define EPOW_SENSOR_TOKEN 9 #define EPOW_SENSOR_INDEX 0 @@ -330,16 +333,20 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \ (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16)))) +static inline struct rtas_error_log *fwnmi_get_errlog(void) +{ + return (struct rtas_error_log *)local_paca->mce_data_buf; +} + /* * Get the error information for errors coming through the * FWNMI vectors. The pt_regs' r3 will be updated to reflect * the actual r3 if possible, and a ptr to the error log entry * will be returned if found. * - * If the RTAS error is not of the extended type, then we put it in a per - * cpu 64bit buffer. If it is the extended type we use global_mce_data_buf. + * Use one buffer mce_data_buf per cpu to store RTAS error. * - * The global_mce_data_buf does not have any locks or protection around it, + * The mce_data_buf does not have any locks or protection around it, * if a second machine check comes in, or a system reset is done * before we have logged the error, then we will get corruption in the * error log. This is preferable over holding off on calling @@ -349,7 +356,7 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id) static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) { unsigned long *savep; - struct rtas_error_log *h, *errhdr = NULL; + struct rtas_error_log *h; /* Mask top two bits */ regs->gpr[3] &= ~(0x3UL << 62); @@ -362,22 +369,20 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) savep = __va(regs->gpr[3]); regs->gpr[3] = savep[0]; /* restore original r3 */ - /* If it isn't an extended log we can use the per cpu 64bit buffer */ h = (struct rtas_error_log *)&savep[1]; + /* Use the per cpu buffer from paca to store rtas error log */ + memset(local_paca->mce_data_buf, 0, RTAS_ERROR_LOG_MAX); if (!rtas_error_extended(h)) { - memcpy(this_cpu_ptr(&mce_data_buf), h, sizeof(__u64)); - errhdr = (struct rtas_error_log *)this_cpu_ptr(&mce_data_buf); + memcpy(local_paca->mce_data_buf, h, sizeof(__u64)); } else { int len, error_log_length; error_log_length = 8 + rtas_error_extended_log_length(h); len = min_t(int, error_log_length, RTAS_ERROR_LOG_MAX); - memset(global_mce_data_buf, 0, RTAS_ERROR_LOG_MAX); - memcpy(global_mce_data_buf, h, len); - errhdr = (struct rtas_error_log *)global_mce_data_buf; + memcpy(local_paca->mce_data_buf, h, len); } - return errhdr; + return (struct rtas_error_log *)local_paca->mce_data_buf; } /* Call this when done with the data returned by FWNMI_get_errinfo. @@ -422,6 +427,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +/* + * Process MCE rtas errlog event. + */ +static void mce_process_errlog_event(struct irq_work *work) +{ + struct rtas_error_log *err; + + err = fwnmi_get_errlog(); + log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); +} + /* * See if we can recover from a machine check exception. * This is only called on power4 (or above) and only via @@ -466,7 +482,8 @@ static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err) recovered = 1; } - log_error((char *)err, ERR_TYPE_RTAS_LOG, 0); + /* Queue irq work to log this rtas event later. */ + irq_work_queue(&mce_errlog_process_work); return recovered; } diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 139f0af6c3d9..b42087cd8c6b 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -101,6 +102,9 @@ static void pSeries_show_cpuinfo(struct seq_file *m) static void __init fwnmi_init(void) { unsigned long system_reset_addr, machine_check_addr; + u8 *mce_data_buf; + unsigned int i; + int nr_cpus = num_possible_cpus(); int ibm_nmi_register = rtas_token("ibm,nmi-register"); if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE) @@ -114,6 +118,18 @@ static void __init fwnmi_init(void) if (0 == rtas_call(ibm_nmi_register, 2, 1, NULL, system_reset_addr, machine_check_addr)) fwnmi_active = 1; + + /* + * Allocate a chunk for per cpu buffer to hold rtas errorlog. + * It will be used in real mode mce handler, hence it needs to be + * below RMA. + */ + mce_data_buf = __va(memblock_alloc_base(RTAS_ERROR_LOG_MAX * nr_cpus, + RTAS_ERROR_LOG_MAX, ppc64_rma_size)); + for_each_possible_cpu(i) { + paca_ptrs[i]->mce_data_buf = mce_data_buf + + (RTAS_ERROR_LOG_MAX * i); + } } static void pseries_8259_cascade(struct irq_desc *desc) From patchwork Tue Aug 7 14:16:46 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954543 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lH2K4rsTz9s4s for ; Wed, 8 Aug 2018 00:28:57 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lH2K3SH0zDqZw for ; Wed, 8 Aug 2018 00:28:57 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGmV25qSzDqGD for ; Wed, 8 Aug 2018 00:16:58 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGmV0q2Cz8t21 for ; Wed, 8 Aug 2018 00:16:58 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGmT6dtVz9s3x; Wed, 8 Aug 2018 00:16:57 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGmT1wtRz9s4s for ; Wed, 8 Aug 2018 00:16:57 +1000 (AEST) Received: from pps.filterd (m0098417.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EF5FX095829 for ; Tue, 7 Aug 2018 10:16:55 -0400 Received: from e06smtp02.uk.ibm.com (e06smtp02.uk.ibm.com [195.75.94.98]) by mx0a-001b2d01.pphosted.com with ESMTP id 2kqbf1mq89-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:16:54 -0400 Received: from localhost by e06smtp02.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:16:52 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp02.uk.ibm.com (192.168.101.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:16:49 +0100 Received: from d06av25.portsmouth.uk.ibm.com (d06av25.portsmouth.uk.ibm.com [9.149.105.61]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EGmC732833682 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:16:48 GMT Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 40DB111C04A; Tue, 7 Aug 2018 17:16:56 +0100 (BST) Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id D299B11C070; Tue, 7 Aug 2018 17:16:54 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av25.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:16:54 +0100 (BST) Subject: [PATCH v7 3/9] powerpc/pseries: Fix endainness while restoring of r3 in MCE handler. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:46:46 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0008-0000-0000-0000025E7144 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0009-0000-0000-000021C67758 Message-Id: <153365139399.14256.13555279013971561179.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , stable@vger.kernel.org, Laurent Dufour , "Aneesh Kumar K.V" Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar During Machine Check interrupt on pseries platform, register r3 points RTAS extended event log passed by hypervisor. Since hypervisor uses r3 to pass pointer to rtas log, it stores the original r3 value at the start of the memory (first 8 bytes) pointed by r3. Since hypervisor stores this info and rtas log is in BE format, linux should make sure to restore r3 value in correct endian format. Without this patch when MCE handler, after recovery, returns to code that that caused the MCE may end up with Data SLB access interrupt for invalid address followed by kernel panic or hang. [ 62.878965] Severe Machine check interrupt [Recovered] [ 62.878968] NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel] [ 62.878969] Initiator: CPU [ 62.878970] Error type: SLB [Multihit] [ 62.878971] Effective address: d00000000ca70000 cpu 0xa: Vector: 380 (Data SLB Access) at [c0000000fc7775b0] pc: c0000000009694c0: vsnprintf+0x80/0x480 lr: c0000000009698e0: vscnprintf+0x20/0x60 sp: c0000000fc777830 msr: 8000000002009033 dar: a803a30c000000d0 current = 0xc00000000bc9ef00 paca = 0xc00000001eca5c00 softe: 3 irq_happened: 0x01 pid = 8860, comm = insmod [c0000000fc7778b0] c0000000009698e0 vscnprintf+0x20/0x60 [c0000000fc7778e0] c00000000016b6c4 vprintk_emit+0xb4/0x4b0 [c0000000fc777960] c00000000016d40c vprintk_func+0x5c/0xd0 [c0000000fc777980] c00000000016cbb4 printk+0x38/0x4c [c0000000fc7779a0] d00000000ca301c0 init_module+0x1c0/0x338 [bork_kernel] [c0000000fc777a40] c00000000000d9c4 do_one_initcall+0x54/0x230 [c0000000fc777b00] c0000000001b3b74 do_init_module+0x8c/0x248 [c0000000fc777b90] c0000000001b2478 load_module+0x12b8/0x15b0 [c0000000fc777d30] c0000000001b29e8 sys_finit_module+0xa8/0x110 [c0000000fc777e30] c00000000000b204 system_call+0x58/0x6c --- Exception: c00 (System Call) at 00007fff8bda0644 SP (7fffdfbfe980) is in userspace This patch fixes this issue. Fixes: a08a53ea4c97 ("powerpc/le: Enable RTAS events support") Cc: stable@vger.kernel.org Reviewed-by: Nicholas Piggin Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/platforms/pseries/ras.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 14a46b07ab2f..851ce326874a 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -367,7 +367,7 @@ static struct rtas_error_log *fwnmi_get_errinfo(struct pt_regs *regs) } savep = __va(regs->gpr[3]); - regs->gpr[3] = savep[0]; /* restore original r3 */ + regs->gpr[3] = be64_to_cpu(savep[0]); /* restore original r3 */ h = (struct rtas_error_log *)&savep[1]; /* Use the per cpu buffer from paca to store rtas error log */ From patchwork Tue Aug 7 14:16:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954544 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lH4y5G6sz9s3x for ; Wed, 8 Aug 2018 00:31:14 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lH4y48qRzDqZC for ; Wed, 8 Aug 2018 00:31:14 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGml426yzDqgw for ; Wed, 8 Aug 2018 00:17:11 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGml2T1Bz8vB4 for ; Wed, 8 Aug 2018 00:17:11 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGml1Y7Dz9s5K; Wed, 8 Aug 2018 00:17:11 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGmk5fd8z9s4s for ; Wed, 8 Aug 2018 00:17:10 +1000 (AEST) Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFVXB114401 for ; Tue, 7 Aug 2018 10:17:09 -0400 Received: from e06smtp05.uk.ibm.com (e06smtp05.uk.ibm.com [195.75.94.101]) by mx0a-001b2d01.pphosted.com with ESMTP id 2kqbqe45n4-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:17:09 -0400 Received: from localhost by e06smtp05.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:17:01 +0100 Received: from b06cxnps4076.portsmouth.uk.ibm.com (9.149.109.198) by e06smtp05.uk.ibm.com (192.168.101.135) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:16:59 +0100 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EGwiY42008740 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:16:58 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4DE00AE05D; Tue, 7 Aug 2018 17:16:52 +0100 (BST) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 23ECAAE04D; Tue, 7 Aug 2018 17:16:51 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:16:50 +0100 (BST) Subject: [PATCH v7 4/9] powerpc/pseries: Define MCE error event section. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:46:56 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0020-0000-0000-000002B37071 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0021-0000-0000-0000210077C0 Message-Id: <153365141396.14256.7147876868875454254.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar On pseries, the machine check error details are part of RTAS extended event log passed under Machine check exception section. This patch adds the definition of rtas MCE event section and related helper functions. Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/include/asm/rtas.h | 111 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 111 insertions(+) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index 71e393c46a49..adc677c5e3a4 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -185,6 +185,13 @@ static inline uint8_t rtas_error_disposition(const struct rtas_error_log *elog) return (elog->byte1 & 0x18) >> 3; } +static inline +void rtas_set_disposition_recovered(struct rtas_error_log *elog) +{ + elog->byte1 &= ~0x18; + elog->byte1 |= (RTAS_DISP_FULLY_RECOVERED << 3); +} + static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog) { return (elog->byte1 & 0x04) >> 2; @@ -275,6 +282,7 @@ inline uint32_t rtas_ext_event_company_id(struct rtas_ext_event_log_v6 *ext_log) #define PSERIES_ELOG_SECT_ID_CALL_HOME (('C' << 8) | 'H') #define PSERIES_ELOG_SECT_ID_USER_DEF (('U' << 8) | 'D') #define PSERIES_ELOG_SECT_ID_HOTPLUG (('H' << 8) | 'P') +#define PSERIES_ELOG_SECT_ID_MCE (('M' << 8) | 'C') /* Vendor specific Platform Event Log Format, Version 6, section header */ struct pseries_errorlog { @@ -326,6 +334,109 @@ struct pseries_hp_errorlog { #define PSERIES_HP_ELOG_ID_DRC_COUNT 3 #define PSERIES_HP_ELOG_ID_DRC_IC 4 +/* RTAS pseries MCE errorlog section */ +#pragma pack(push, 1) +struct pseries_mc_errorlog { + __be32 fru_id; + __be32 proc_id; + uint8_t error_type; + union { + struct { + uint8_t ue_err_type; + /* XXXXXXXX + * X 1: Permanent or Transient UE. + * X 1: Effective address provided. + * X 1: Logical address provided. + * XX 2: Reserved. + * XXX 3: Type of UE error. + */ + uint8_t reserved_1[6]; + __be64 effective_address; + __be64 logical_address; + } ue_error; + struct { + uint8_t soft_err_type; + /* XXXXXXXX + * X 1: Effective address provided. + * XXXXX 5: Reserved. + * XX 2: Type of SLB/ERAT/TLB error. + */ + uint8_t reserved_1[6]; + __be64 effective_address; + uint8_t reserved_2[8]; + } soft_error; + } u; +}; +#pragma pack(pop) + +/* RTAS pseries MCE error types */ +#define PSERIES_MC_ERROR_TYPE_UE 0x00 +#define PSERIES_MC_ERROR_TYPE_SLB 0x01 +#define PSERIES_MC_ERROR_TYPE_ERAT 0x02 +#define PSERIES_MC_ERROR_TYPE_TLB 0x04 +#define PSERIES_MC_ERROR_TYPE_D_CACHE 0x05 +#define PSERIES_MC_ERROR_TYPE_I_CACHE 0x07 + +/* RTAS pseries MCE error sub types */ +#define PSERIES_MC_ERROR_UE_INDETERMINATE 0 +#define PSERIES_MC_ERROR_UE_IFETCH 1 +#define PSERIES_MC_ERROR_UE_PAGE_TABLE_WALK_IFETCH 2 +#define PSERIES_MC_ERROR_UE_LOAD_STORE 3 +#define PSERIES_MC_ERROR_UE_PAGE_TABLE_WALK_LOAD_STORE 4 + +#define PSERIES_MC_ERROR_SLB_PARITY 0 +#define PSERIES_MC_ERROR_SLB_MULTIHIT 1 +#define PSERIES_MC_ERROR_SLB_INDETERMINATE 2 + +#define PSERIES_MC_ERROR_ERAT_PARITY 1 +#define PSERIES_MC_ERROR_ERAT_MULTIHIT 2 +#define PSERIES_MC_ERROR_ERAT_INDETERMINATE 3 + +#define PSERIES_MC_ERROR_TLB_PARITY 1 +#define PSERIES_MC_ERROR_TLB_MULTIHIT 2 +#define PSERIES_MC_ERROR_TLB_INDETERMINATE 3 + +static inline uint8_t rtas_mc_error_type(const struct pseries_mc_errorlog *mlog) +{ + return mlog->error_type; +} + +static inline uint8_t rtas_mc_error_sub_type( + const struct pseries_mc_errorlog *mlog) +{ + switch (mlog->error_type) { + case PSERIES_MC_ERROR_TYPE_UE: + return (mlog->u.ue_error.ue_err_type & 0x07); + case PSERIES_MC_ERROR_TYPE_SLB: + case PSERIES_MC_ERROR_TYPE_ERAT: + case PSERIES_MC_ERROR_TYPE_TLB: + return (mlog->u.soft_error.soft_err_type & 0x03); + default: + return 0; + } +} + +static inline uint64_t rtas_mc_get_effective_addr( + const struct pseries_mc_errorlog *mlog) +{ + uint64_t addr = 0; + + switch (mlog->error_type) { + case PSERIES_MC_ERROR_TYPE_UE: + if (mlog->u.ue_error.ue_err_type & 0x40) + addr = mlog->u.ue_error.effective_address; + break; + case PSERIES_MC_ERROR_TYPE_SLB: + case PSERIES_MC_ERROR_TYPE_ERAT: + case PSERIES_MC_ERROR_TYPE_TLB: + if (mlog->u.soft_error.soft_err_type & 0x80) + addr = mlog->u.soft_error.effective_address; + default: + break; + } + return be64_to_cpu(addr); +} + struct pseries_errorlog *get_pseries_errorlog(struct rtas_error_log *log, uint16_t section_id); From patchwork Tue Aug 7 14:17:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954545 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lH7r5rDBz9ryt for ; Wed, 8 Aug 2018 00:33:44 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lH7r4ZhTzDqk8 for ; Wed, 8 Aug 2018 00:33:44 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGn05SkbzDqGD for ; Wed, 8 Aug 2018 00:17:24 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGn04gs2z8vB4 for ; Wed, 8 Aug 2018 00:17:24 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGn03YF8z9s5K; Wed, 8 Aug 2018 00:17:24 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGn00MTcz9s3x for ; Wed, 8 Aug 2018 00:17:23 +1000 (AEST) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFC31011349 for ; Tue, 7 Aug 2018 10:17:22 -0400 Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100]) by mx0a-001b2d01.pphosted.com with ESMTP id 2kqay6ee2x-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:17:22 -0400 Received: from localhost by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:17:19 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp04.uk.ibm.com (192.168.101.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:17:17 +0100 Received: from d06av25.portsmouth.uk.ibm.com (d06av25.portsmouth.uk.ibm.com [9.149.105.61]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EHG3r35520722 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:17:16 GMT Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id A3EB711C076; Tue, 7 Aug 2018 17:17:23 +0100 (BST) Received: from d06av25.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 5947A11C069; Tue, 7 Aug 2018 17:17:22 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av25.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:17:22 +0100 (BST) Subject: [PATCH v7 5/9] powerpc/pseries: flush SLB contents on SLB MCE errors. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:47:14 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0016-0000-0000-000001F37191 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0017-0000-0000-0000324978EA Message-Id: <153365142349.14256.9954484737438718329.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar On pseries, as of today system crashes if we get a machine check exceptions due to SLB errors. These are soft errors and can be fixed by flushing the SLBs so the kernel can continue to function instead of system crash. We do this in real mode before turning on MMU. Otherwise we would run into nested machine checks. This patch now fetches the rtas error log in real mode and flushes the SLBs on SLB errors. Signed-off-by: Mahesh Salgaonkar Signed-off-by: Michal Suchanek --- Changes in V7: - Fold Michal's patch into this patch. - Handle MSR_RI=0 and evil context case in MC handler. --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1 arch/powerpc/include/asm/machdep.h | 1 arch/powerpc/kernel/exceptions-64s.S | 112 +++++++++++++++++++++++++ arch/powerpc/kernel/mce.c | 15 +++ arch/powerpc/mm/slb.c | 6 + arch/powerpc/platforms/powernv/setup.c | 11 ++ arch/powerpc/platforms/pseries/pseries.h | 1 arch/powerpc/platforms/pseries/ras.c | 51 +++++++++++ arch/powerpc/platforms/pseries/setup.c | 1 9 files changed, 195 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index 50ed64fba4ae..cc00a7088cf3 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -487,6 +487,7 @@ extern void hpte_init_native(void); extern void slb_initialize(void); extern void slb_flush_and_rebolt(void); +extern void slb_flush_and_rebolt_realmode(void); extern void slb_vmalloc_update(void); extern void slb_set_size(u16 size); diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h index a47de82fb8e2..b4831f1338db 100644 --- a/arch/powerpc/include/asm/machdep.h +++ b/arch/powerpc/include/asm/machdep.h @@ -108,6 +108,7 @@ struct machdep_calls { /* Early exception handlers called in realmode */ int (*hmi_exception_early)(struct pt_regs *regs); + long (*machine_check_early)(struct pt_regs *regs); /* Called during machine check exception to retrive fixup address. */ bool (*mce_check_early_recovery)(struct pt_regs *regs); diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 285c6465324a..cb06f219570a 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -332,6 +332,9 @@ TRAMP_REAL_BEGIN(machine_check_pSeries) machine_check_fwnmi: SET_SCRATCH0(r13) /* save r13 */ EXCEPTION_PROLOG_0(PACA_EXMC) +BEGIN_FTR_SECTION + b machine_check_pSeries_early +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) machine_check_pSeries_0: EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200) /* @@ -343,6 +346,90 @@ machine_check_pSeries_0: TRAMP_KVM_SKIP(PACA_EXMC, 0x200) +TRAMP_REAL_BEGIN(machine_check_pSeries_early) +BEGIN_FTR_SECTION + EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200) + mr r10,r1 /* Save r1 */ + ld r1,PACAMCEMERGSP(r13) /* Use MC emergency stack */ + subi r1,r1,INT_FRAME_SIZE /* alloc stack frame */ + mfspr r11,SPRN_SRR0 /* Save SRR0 */ + mfspr r12,SPRN_SRR1 /* Save SRR1 */ + EXCEPTION_PROLOG_COMMON_1() + EXCEPTION_PROLOG_COMMON_2(PACA_EXMC) + EXCEPTION_PROLOG_COMMON_3(0x200) + addi r3,r1,STACK_FRAME_OVERHEAD + BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */ + ld r12,_MSR(r1) + andi. r11,r12,MSR_PR /* See if coming from user. */ + bne 2f /* continue in V mode if we are. */ + + /* + * At this point we are not sure about what context we come from. + * We may be in the middle of swithing stack. r1 may not be valid. + * Hence stay on emergency stack, call machine_check_exception and + * return from the interrupt. + * But before that, check if this is an un-recoverable exception. + * If yes, then stay on emergency stack and panic. + */ + andi. r11,r12,MSR_RI + bne 1f + + /* + * Check if we have successfully handled/recovered from error, if not + * then stay on emergency stack and panic. + */ + cmpdi r3,0 /* see if we handled MCE successfully */ + bne 1f /* if handled then return from interrupt */ + + LOAD_HANDLER(r10,unrecover_mce) + mtspr SPRN_SRR0,r10 + ld r10,PACAKMSR(r13) + /* + * We are going down. But there are chances that we might get hit by + * another MCE during panic path and we may run into unstable state + * with no way out. Hence, turn ME bit off while going down, so that + * when another MCE is hit during panic path, hypervisor will + * power cycle the lpar, instead of getting into MCE loop. + */ + li r3,MSR_ME + andc r10,r10,r3 /* Turn off MSR_ME */ + mtspr SPRN_SRR1,r10 + RFI_TO_KERNEL + b . + + /* Stay on emergency stack and return from interrupt. */ +1: LOAD_HANDLER(r10,mce_return) + mtspr SPRN_SRR0,r10 + ld r10,PACAKMSR(r13) + mtspr SPRN_SRR1,r10 + RFI_TO_KERNEL + b . + + /* Move original SRR0 and SRR1 into the respective regs */ +2: ld r9,_MSR(r1) + mtspr SPRN_SRR1,r9 + ld r3,_NIP(r1) + mtspr SPRN_SRR0,r3 + ld r9,_CTR(r1) + mtctr r9 + ld r9,_XER(r1) + mtxer r9 + ld r9,_LINK(r1) + mtlr r9 + REST_GPR(0, r1) + REST_8GPRS(2, r1) + REST_GPR(10, r1) + ld r11,_CCR(r1) + mtcr r11 + REST_GPR(11, r1) + REST_2GPRS(12, r1) + /* restore original r1. */ + ld r1,GPR1(r1) + SET_SCRATCH0(r13) /* save r13 */ + EXCEPTION_PROLOG_0(PACA_EXMC) + b machine_check_pSeries_0 +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) + EXC_COMMON_BEGIN(machine_check_common) /* * Machine check is different because we use a different @@ -536,6 +623,31 @@ EXC_COMMON_BEGIN(unrecover_mce) bl unrecoverable_exception b 1b +EXC_COMMON_BEGIN(mce_return) + /* Invoke machine_check_exception to print MCE event and return. */ + addi r3,r1,STACK_FRAME_OVERHEAD + bl machine_check_exception + ld r9,_MSR(r1) + mtspr SPRN_SRR1,r9 + ld r3,_NIP(r1) + mtspr SPRN_SRR0,r3 + ld r9,_CTR(r1) + mtctr r9 + ld r9,_XER(r1) + mtxer r9 + ld r9,_LINK(r1) + mtlr r9 + REST_GPR(0, r1) + REST_8GPRS(2, r1) + REST_GPR(10, r1) + ld r11,_CCR(r1) + mtcr r11 + REST_GPR(11, r1) + REST_2GPRS(12, r1) + /* restore original r1. */ + ld r1,GPR1(r1) + RFI_TO_KERNEL + b . EXC_REAL(data_access, 0x300, 0x80) EXC_VIRT(data_access, 0x4300, 0x80, 0x300) diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index efdd16a79075..ae17d8aa60c4 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -488,10 +488,19 @@ long machine_check_early(struct pt_regs *regs) { long handled = 0; - __this_cpu_inc(irq_stat.mce_exceptions); + /* + * For pSeries we count mce when we go into virtual mode machine + * check handler. Hence skip it. Also, We can't access per cpu + * variables in real mode for LPAR. + */ + if (early_cpu_has_feature(CPU_FTR_HVMODE)) + __this_cpu_inc(irq_stat.mce_exceptions); - if (cur_cpu_spec && cur_cpu_spec->machine_check_early) - handled = cur_cpu_spec->machine_check_early(regs); + /* + * See if platform is capable of handling machine check. + */ + if (ppc_md.machine_check_early) + handled = ppc_md.machine_check_early(regs); return handled; } diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c index cb796724a6fc..e89f675f1b5e 100644 --- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -145,6 +145,12 @@ void slb_flush_and_rebolt(void) get_paca()->slb_cache_ptr = 0; } +void slb_flush_and_rebolt_realmode(void) +{ + __slb_flush_and_rebolt(); + get_paca()->slb_cache_ptr = 0; +} + void slb_vmalloc_update(void) { unsigned long vflags; diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index f96df0a25d05..b74c93bc2e55 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -431,6 +431,16 @@ static unsigned long pnv_get_proc_freq(unsigned int cpu) return ret_freq; } +static long pnv_machine_check_early(struct pt_regs *regs) +{ + long handled = 0; + + if (cur_cpu_spec && cur_cpu_spec->machine_check_early) + handled = cur_cpu_spec->machine_check_early(regs); + + return handled; +} + define_machine(powernv) { .name = "PowerNV", .probe = pnv_probe, @@ -442,6 +452,7 @@ define_machine(powernv) { .machine_shutdown = pnv_shutdown, .power_save = NULL, .calibrate_decr = generic_calibrate_decr, + .machine_check_early = pnv_machine_check_early, #ifdef CONFIG_KEXEC_CORE .kexec_cpu_down = pnv_kexec_cpu_down, #endif diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h index 60db2ee511fb..ec2a5f61d4a4 100644 --- a/arch/powerpc/platforms/pseries/pseries.h +++ b/arch/powerpc/platforms/pseries/pseries.h @@ -24,6 +24,7 @@ struct pt_regs; extern int pSeries_system_reset_exception(struct pt_regs *regs); extern int pSeries_machine_check_exception(struct pt_regs *regs); +extern long pSeries_machine_check_realmode(struct pt_regs *regs); #ifdef CONFIG_SMP extern void smp_init_pseries(void); diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 851ce326874a..e4420f7c8fda 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -427,6 +427,35 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +static int mce_handle_error(struct rtas_error_log *errp) +{ + struct pseries_errorlog *pseries_log; + struct pseries_mc_errorlog *mce_log; + int disposition = rtas_error_disposition(errp); + uint8_t error_type; + + if (!rtas_error_extended(errp)) + goto out; + + pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE); + if (pseries_log == NULL) + goto out; + + mce_log = (struct pseries_mc_errorlog *)pseries_log->data; + error_type = rtas_mc_error_type(mce_log); + + if ((disposition == RTAS_DISP_NOT_RECOVERED) && + (error_type == PSERIES_MC_ERROR_TYPE_SLB)) { + /* Store the old slb content someplace. */ + slb_flush_and_rebolt_realmode(); + disposition = RTAS_DISP_FULLY_RECOVERED; + rtas_set_disposition_recovered(errp); + } + +out: + return disposition; +} + /* * Process MCE rtas errlog event. */ @@ -503,11 +532,31 @@ int pSeries_machine_check_exception(struct pt_regs *regs) struct rtas_error_log *errp; if (fwnmi_active) { - errp = fwnmi_get_errinfo(regs); fwnmi_release_errinfo(); + errp = fwnmi_get_errlog(); if (errp && recover_mce(regs, errp)) return 1; } return 0; } + +long pSeries_machine_check_realmode(struct pt_regs *regs) +{ + struct rtas_error_log *errp; + int disposition; + + if (fwnmi_active) { + errp = fwnmi_get_errinfo(regs); + /* + * Call to fwnmi_release_errinfo() in real mode causes kernel + * to panic. Hence we will call it as soon as we go into + * virtual mode. + */ + disposition = mce_handle_error(errp); + if (disposition == RTAS_DISP_FULLY_RECOVERED) + return 1; + } + + return 0; +} diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index b42087cd8c6b..7a9421d089d8 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -1000,6 +1000,7 @@ define_machine(pseries) { .calibrate_decr = generic_calibrate_decr, .progress = rtas_progress, .system_reset_exception = pSeries_system_reset_exception, + .machine_check_early = pSeries_machine_check_realmode, .machine_check_exception = pSeries_machine_check_exception, #ifdef CONFIG_KEXEC_CORE .machine_kexec = pSeries_machine_kexec, From patchwork Tue Aug 7 14:17:27 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954548 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lHBq1GxNz9ryt for ; Wed, 8 Aug 2018 00:36:19 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lHBq03nszDqZw for ; Wed, 8 Aug 2018 00:36:19 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGnG6JchzDqYL for ; Wed, 8 Aug 2018 00:17:38 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGnG5dvCz8t21 for ; Wed, 8 Aug 2018 00:17:38 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGnG5Pfyz9s5K; Wed, 8 Aug 2018 00:17:38 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGnG0yh2z9s4s for ; Wed, 8 Aug 2018 00:17:37 +1000 (AEST) Received: from pps.filterd (m0098420.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFLhM016123 for ; Tue, 7 Aug 2018 10:17:35 -0400 Received: from e06smtp01.uk.ibm.com (e06smtp01.uk.ibm.com [195.75.94.97]) by mx0b-001b2d01.pphosted.com with ESMTP id 2kqd0jg5cb-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:17:34 -0400 Received: from localhost by e06smtp01.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:17:33 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp01.uk.ibm.com (192.168.101.131) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:17:30 +0100 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EHTOS24772752 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:17:29 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 101C85204E; Tue, 7 Aug 2018 17:17:37 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id C0F4A5205F; Tue, 7 Aug 2018 17:17:35 +0100 (BST) Subject: [PATCH v7 6/9] powerpc/pseries: Display machine check error details. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:47:27 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-4275-0000-0000-000002A6719C X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-4276-0000-0000-000037AF775C Message-Id: <153365144136.14256.10463423296756798652.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar Extract the MCE error details from RTAS extended log and display it to console. With this patch you should now see mce logs like below: [ 142.371818] Severe Machine check interrupt [Recovered] [ 142.371822] NIP [d00000000ca301b8]: init_module+0x1b8/0x338 [bork_kernel] [ 142.371822] Initiator: CPU [ 142.371823] Error type: SLB [Multihit] [ 142.371824] Effective address: d00000000ca70000 Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/include/asm/rtas.h | 5 + arch/powerpc/platforms/pseries/ras.c | 132 ++++++++++++++++++++++++++++++++++ 2 files changed, 137 insertions(+) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index adc677c5e3a4..9b3c6e06dad1 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -197,6 +197,11 @@ static inline uint8_t rtas_error_extended(const struct rtas_error_log *elog) return (elog->byte1 & 0x04) >> 2; } +static inline uint8_t rtas_error_initiator(const struct rtas_error_log *elog) +{ + return (elog->byte2 & 0xf0) >> 4; +} + #define rtas_error_type(x) ((x)->byte3) static inline diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index e4420f7c8fda..656b35a42d93 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -427,6 +427,135 @@ int pSeries_system_reset_exception(struct pt_regs *regs) return 0; /* need to perform reset */ } +#define VAL_TO_STRING(ar, val) ((val < ARRAY_SIZE(ar)) ? ar[val] : "Unknown") + +static void pseries_print_mce_info(struct pt_regs *regs, + struct rtas_error_log *errp) +{ + const char *level, *sevstr; + struct pseries_errorlog *pseries_log; + struct pseries_mc_errorlog *mce_log; + uint8_t error_type, err_sub_type; + uint64_t addr; + uint8_t initiator = rtas_error_initiator(errp); + int disposition = rtas_error_disposition(errp); + + static const char * const initiators[] = { + "Unknown", + "CPU", + "PCI", + "ISA", + "Memory", + "Power Mgmt", + }; + static const char * const mc_err_types[] = { + "UE", + "SLB", + "ERAT", + "TLB", + "D-Cache", + "Unknown", + "I-Cache", + }; + static const char * const mc_ue_types[] = { + "Indeterminate", + "Instruction fetch", + "Page table walk ifetch", + "Load/Store", + "Page table walk Load/Store", + }; + + /* SLB sub errors valid values are 0x0, 0x1, 0x2 */ + static const char * const mc_slb_types[] = { + "Parity", + "Multihit", + "Indeterminate", + }; + + /* TLB and ERAT sub errors valid values are 0x1, 0x2, 0x3 */ + static const char * const mc_soft_types[] = { + "Unknown", + "Parity", + "Multihit", + "Indeterminate", + }; + + if (!rtas_error_extended(errp)) { + pr_err("Machine check interrupt: Missing extended error log\n"); + return; + } + + pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE); + if (pseries_log == NULL) + return; + + mce_log = (struct pseries_mc_errorlog *)pseries_log->data; + + error_type = rtas_mc_error_type(mce_log); + err_sub_type = rtas_mc_error_sub_type(mce_log); + + switch (rtas_error_severity(errp)) { + case RTAS_SEVERITY_NO_ERROR: + level = KERN_INFO; + sevstr = "Harmless"; + break; + case RTAS_SEVERITY_WARNING: + level = KERN_WARNING; + sevstr = ""; + break; + case RTAS_SEVERITY_ERROR: + case RTAS_SEVERITY_ERROR_SYNC: + level = KERN_ERR; + sevstr = "Severe"; + break; + case RTAS_SEVERITY_FATAL: + default: + level = KERN_ERR; + sevstr = "Fatal"; + break; + } + + printk("%s%s Machine check interrupt [%s]\n", level, sevstr, + disposition == RTAS_DISP_FULLY_RECOVERED ? + "Recovered" : "Not recovered"); + if (user_mode(regs)) { + printk("%s NIP: [%016lx] PID: %d Comm: %s\n", level, + regs->nip, current->pid, current->comm); + } else { + printk("%s NIP [%016lx]: %pS\n", level, regs->nip, + (void *)regs->nip); + } + printk("%s Initiator: %s\n", level, + VAL_TO_STRING(initiators, initiator)); + + switch (error_type) { + case PSERIES_MC_ERROR_TYPE_UE: + printk("%s Error type: %s [%s]\n", level, + VAL_TO_STRING(mc_err_types, error_type), + VAL_TO_STRING(mc_ue_types, err_sub_type)); + break; + case PSERIES_MC_ERROR_TYPE_SLB: + printk("%s Error type: %s [%s]\n", level, + VAL_TO_STRING(mc_err_types, error_type), + VAL_TO_STRING(mc_slb_types, err_sub_type)); + break; + case PSERIES_MC_ERROR_TYPE_ERAT: + case PSERIES_MC_ERROR_TYPE_TLB: + printk("%s Error type: %s [%s]\n", level, + VAL_TO_STRING(mc_err_types, error_type), + VAL_TO_STRING(mc_soft_types, err_sub_type)); + break; + default: + printk("%s Error type: %s\n", level, + VAL_TO_STRING(mc_err_types, error_type)); + break; + } + + addr = rtas_mc_get_effective_addr(mce_log); + if (addr) + printk("%s Effective address: %016llx\n", level, addr); +} + static int mce_handle_error(struct rtas_error_log *errp) { struct pseries_errorlog *pseries_log; @@ -481,8 +610,11 @@ static int recover_mce(struct pt_regs *regs, struct rtas_error_log *err) int recovered = 0; int disposition = rtas_error_disposition(err); + pseries_print_mce_info(regs, err); + if (!(regs->msr & MSR_RI)) { /* If MSR_RI isn't set, we cannot recover */ + pr_err("Machine check interrupt unrecoverable: MSR(RI=0)\n"); recovered = 0; } else if (disposition == RTAS_DISP_FULLY_RECOVERED) { From patchwork Tue Aug 7 14:17:39 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954549 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lHGR2RpHz9ryt for ; Wed, 8 Aug 2018 00:39:27 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lHGQ4s5YzDqBx for ; Wed, 8 Aug 2018 00:39:26 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGnW3cBvzDqfy for ; Wed, 8 Aug 2018 00:17:51 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGnW2WDKz8vB4 for ; Wed, 8 Aug 2018 00:17:51 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGnW28Fdz9s4s; Wed, 8 Aug 2018 00:17:51 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGnV5g0mz9s3x for ; Wed, 8 Aug 2018 00:17:50 +1000 (AEST) Received: from pps.filterd (m0098409.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EFVeN114386 for ; Tue, 7 Aug 2018 10:17:49 -0400 Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100]) by mx0a-001b2d01.pphosted.com with ESMTP id 2kqbqe4702-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:17:48 -0400 Received: from localhost by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:17:46 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp04.uk.ibm.com (192.168.101.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:17:42 +0100 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EHfF744040304 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:17:42 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id F07F8AE04D; Tue, 7 Aug 2018 17:17:35 +0100 (BST) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B4584AE05D; Tue, 7 Aug 2018 17:17:34 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:17:34 +0100 (BST) Subject: [PATCH v7 7/9] powerpc/pseries: Dump the SLB contents on SLB MCE errors. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:47:39 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0016-0000-0000-000001F371A6 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0017-0000-0000-0000324978FE Message-Id: <153365145460.14256.11932687379471923123.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=818 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar If we get a machine check exceptions due to SLB errors then dump the current SLB contents which will be very much helpful in debugging the root cause of SLB errors. Introduce an exclusive buffer per cpu to hold faulty SLB entries. In real mode mce handler saves the old SLB contents into this buffer accessible through paca and print it out later in virtual mode. With this patch the console will log SLB contents like below on SLB MCE errors: [ 507.297236] SLB contents of cpu 0x1 [ 507.297237] Last SLB entry inserted at slot 16 [ 507.297238] 00 c000000008000000 400ea1b217000500 [ 507.297239] 1T ESID= c00000 VSID= ea1b217 LLP:100 [ 507.297240] 01 d000000008000000 400d43642f000510 [ 507.297242] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297243] 11 f000000008000000 400a86c85f000500 [ 507.297244] 1T ESID= f00000 VSID= a86c85f LLP:100 [ 507.297245] 12 00007f0008000000 4008119624000d90 [ 507.297246] 1T ESID= 7f VSID= 8119624 LLP:110 [ 507.297247] 13 0000000018000000 00092885f5150d90 [ 507.297247] 256M ESID= 1 VSID= 92885f5150 LLP:110 [ 507.297248] 14 0000010008000000 4009e7cb50000d90 [ 507.297249] 1T ESID= 1 VSID= 9e7cb50 LLP:110 [ 507.297250] 15 d000000008000000 400d43642f000510 [ 507.297251] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297252] 16 d000000008000000 400d43642f000510 [ 507.297253] 1T ESID= d00000 VSID= d43642f LLP:110 [ 507.297253] ---------------------------------- [ 507.297254] SLB cache ptr value = 3 [ 507.297254] Valid SLB cache entries: [ 507.297255] 00 EA[0-35]= 7f000 [ 507.297256] 01 EA[0-35]= 1 [ 507.297257] 02 EA[0-35]= 1000 [ 507.297257] Rest of SLB cache entries: [ 507.297258] 03 EA[0-35]= 7f000 [ 507.297258] 04 EA[0-35]= 1 [ 507.297259] 05 EA[0-35]= 1000 [ 507.297260] 06 EA[0-35]= 12 [ 507.297260] 07 EA[0-35]= 7f000 Suggested-by: Aneesh Kumar K.V Suggested-by: Michael Ellerman Signed-off-by: Mahesh Salgaonkar --- Changes in V7: - Print slb cache ptr value and slb cache data --- arch/powerpc/include/asm/book3s/64/mmu-hash.h | 7 ++ arch/powerpc/include/asm/paca.h | 4 + arch/powerpc/mm/slb.c | 73 +++++++++++++++++++++++++ arch/powerpc/platforms/pseries/ras.c | 10 +++ arch/powerpc/platforms/pseries/setup.c | 10 +++ 5 files changed, 103 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h index cc00a7088cf3..5a3fe282076d 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h @@ -485,9 +485,16 @@ static inline void hpte_init_pseries(void) { } extern void hpte_init_native(void); +struct slb_entry { + u64 esid; + u64 vsid; +}; + extern void slb_initialize(void); extern void slb_flush_and_rebolt(void); extern void slb_flush_and_rebolt_realmode(void); +extern void slb_save_contents(struct slb_entry *slb_ptr); +extern void slb_dump_contents(struct slb_entry *slb_ptr); extern void slb_vmalloc_update(void); extern void slb_set_size(u16 size); diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h index 7f22929ce915..233d25ff6f64 100644 --- a/arch/powerpc/include/asm/paca.h +++ b/arch/powerpc/include/asm/paca.h @@ -254,6 +254,10 @@ struct paca_struct { #endif #ifdef CONFIG_PPC_PSERIES u8 *mce_data_buf; /* buffer to hold per cpu rtas errlog */ + + /* Capture SLB related old contents in MCE handler. */ + struct slb_entry *mce_faulty_slbs; + u16 slb_save_cache_ptr; #endif /* CONFIG_PPC_PSERIES */ } ____cacheline_aligned; diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c index e89f675f1b5e..16a53689ffd4 100644 --- a/arch/powerpc/mm/slb.c +++ b/arch/powerpc/mm/slb.c @@ -151,6 +151,79 @@ void slb_flush_and_rebolt_realmode(void) get_paca()->slb_cache_ptr = 0; } +void slb_save_contents(struct slb_entry *slb_ptr) +{ + int i; + unsigned long e, v; + + /* Save slb_cache_ptr value. */ + get_paca()->slb_save_cache_ptr = get_paca()->slb_cache_ptr; + + if (!slb_ptr) + return; + + for (i = 0; i < mmu_slb_size; i++) { + asm volatile("slbmfee %0,%1" : "=r" (e) : "r" (i)); + asm volatile("slbmfev %0,%1" : "=r" (v) : "r" (i)); + slb_ptr->esid = e; + slb_ptr->vsid = v; + slb_ptr++; + } +} + +void slb_dump_contents(struct slb_entry *slb_ptr) +{ + int i, n; + unsigned long e, v; + unsigned long llp; + + if (!slb_ptr) + return; + + pr_err("SLB contents of cpu 0x%x\n", smp_processor_id()); + pr_err("Last SLB entry inserted at slot %lld\n", get_paca()->stab_rr); + + for (i = 0; i < mmu_slb_size; i++) { + e = slb_ptr->esid; + v = slb_ptr->vsid; + slb_ptr++; + + if (!e && !v) + continue; + + pr_err("%02d %016lx %016lx\n", i, e, v); + + if (!(e & SLB_ESID_V)) { + pr_err("\n"); + continue; + } + llp = v & SLB_VSID_LLP; + if (v & SLB_VSID_B_1T) { + pr_err(" 1T ESID=%9lx VSID=%13lx LLP:%3lx\n", + GET_ESID_1T(e), + (v & ~SLB_VSID_B) >> SLB_VSID_SHIFT_1T, + llp); + } else { + pr_err(" 256M ESID=%9lx VSID=%13lx LLP:%3lx\n", + GET_ESID(e), + (v & ~SLB_VSID_B) >> SLB_VSID_SHIFT, + llp); + } + } + pr_err("----------------------------------\n"); + + /* Dump slb cache entires as well. */ + pr_err("SLB cache ptr value = %d\n", get_paca()->slb_save_cache_ptr); + pr_err("Valid SLB cache entries:\n"); + n = min_t(int, get_paca()->slb_save_cache_ptr, SLB_CACHE_ENTRIES); + for (i = 0; i < n; i++) + pr_err("%02d EA[0-35]=%9x\n", i, get_paca()->slb_cache[i]); + pr_err("Rest of SLB cache entries:\n"); + for (i = n; i < SLB_CACHE_ENTRIES; i++) + pr_err("%02d EA[0-35]=%9x\n", i, get_paca()->slb_cache[i]); + +} + void slb_vmalloc_update(void) { unsigned long vflags; diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c index 656b35a42d93..117ca2ff5456 100644 --- a/arch/powerpc/platforms/pseries/ras.c +++ b/arch/powerpc/platforms/pseries/ras.c @@ -515,6 +515,10 @@ static void pseries_print_mce_info(struct pt_regs *regs, break; } + /* Display faulty slb contents for SLB errors. */ + if (error_type == PSERIES_MC_ERROR_TYPE_SLB) + slb_dump_contents(local_paca->mce_faulty_slbs); + printk("%s%s Machine check interrupt [%s]\n", level, sevstr, disposition == RTAS_DISP_FULLY_RECOVERED ? "Recovered" : "Not recovered"); @@ -575,7 +579,11 @@ static int mce_handle_error(struct rtas_error_log *errp) if ((disposition == RTAS_DISP_NOT_RECOVERED) && (error_type == PSERIES_MC_ERROR_TYPE_SLB)) { - /* Store the old slb content someplace. */ + /* + * Store the old slb content in paca before flushing. Print + * this when we go to virtual mode. + */ + slb_save_contents(local_paca->mce_faulty_slbs); slb_flush_and_rebolt_realmode(); disposition = RTAS_DISP_FULLY_RECOVERED; rtas_set_disposition_recovered(errp); diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 7a9421d089d8..53aee58a928b 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -105,6 +105,9 @@ static void __init fwnmi_init(void) u8 *mce_data_buf; unsigned int i; int nr_cpus = num_possible_cpus(); + struct slb_entry *slb_ptr; + size_t size; + int ibm_nmi_register = rtas_token("ibm,nmi-register"); if (ibm_nmi_register == RTAS_UNKNOWN_SERVICE) @@ -130,6 +133,13 @@ static void __init fwnmi_init(void) paca_ptrs[i]->mce_data_buf = mce_data_buf + (RTAS_ERROR_LOG_MAX * i); } + + /* Allocate per cpu slb area to save old slb contents during MCE */ + size = sizeof(struct slb_entry) * mmu_slb_size * nr_cpus; + slb_ptr = __va(memblock_alloc_base(size, sizeof(struct slb_entry), + ppc64_rma_size)); + for_each_possible_cpu(i) + paca_ptrs[i]->mce_faulty_slbs = slb_ptr + (mmu_slb_size * i); } static void pseries_8259_cascade(struct irq_desc *desc) From patchwork Tue Aug 7 14:17:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954550 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lHNJ4SK1z9rvt for ; Wed, 8 Aug 2018 00:44:32 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lHNJ3GvBzDqGD for ; Wed, 8 Aug 2018 00:44:32 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGng2w9rzDqMh for ; Wed, 8 Aug 2018 00:17:59 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (ozlabs.org [IPv6:2401:3900:2:1::2]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGng24Bxz8t21 for ; Wed, 8 Aug 2018 00:17:59 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGng172kz9s4s; Wed, 8 Aug 2018 00:17:59 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGnf433Sz9s3x for ; Wed, 8 Aug 2018 00:17:58 +1000 (AEST) Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EGsT7025876 for ; Tue, 7 Aug 2018 10:17:56 -0400 Received: from e06smtp04.uk.ibm.com (e06smtp04.uk.ibm.com [195.75.94.100]) by mx0b-001b2d01.pphosted.com with ESMTP id 2kqcpb94e2-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:17:56 -0400 Received: from localhost by e06smtp04.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:17:54 +0100 Received: from b06cxnps3074.portsmouth.uk.ibm.com (9.149.109.194) by e06smtp04.uk.ibm.com (192.168.101.134) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:17:52 +0100 Received: from d06av21.portsmouth.uk.ibm.com (d06av21.portsmouth.uk.ibm.com [9.149.105.232]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EHp9f19267730 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:17:51 GMT Received: from d06av21.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3CB125206B; Tue, 7 Aug 2018 17:17:59 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av21.portsmouth.uk.ibm.com (Postfix) with ESMTP id 1647E52059; Tue, 7 Aug 2018 17:17:57 +0100 (BST) Subject: [PATCH v7 8/9] powerpc/mce: Add sysctl control for recovery action on MCE. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:47:49 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0016-0000-0000-000001F371AC X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0017-0000-0000-000032497905 Message-Id: <153365146712.14256.11869543914717297278.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=568 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar Introduce recovery action for recovered memory errors (MCEs). There are soft memory errors like SLB Multihit, which can be a result of a bad hardware OR software BUG. Kernel can easily recover from these soft errors by flushing SLB contents. After the recovery kernel can still continue to function without any issue. But in some scenario's we may keep getting these soft errors until the root cause is fixed. To be able to analyze and find the root cause, best way is to gather enough data and system state at the time of MCE. Hence this patch introduces a sysctl knob where user can decide either to continue after recovery or panic the kernel to capture the dump. This will allow one to configure a kernel to capture a dump on MCE and then toggle back to recovery while dump is being analyzed. Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/include/asm/mce.h | 2 + arch/powerpc/kernel/mce.c | 58 ++++++++++++++++++++++++++++++++ arch/powerpc/kernel/traps.c | 3 +- arch/powerpc/platforms/powernv/setup.c | 4 ++ 4 files changed, 66 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h index 3a1226e9b465..d46e1903878d 100644 --- a/arch/powerpc/include/asm/mce.h +++ b/arch/powerpc/include/asm/mce.h @@ -202,6 +202,8 @@ struct mce_error_info { #define MCE_EVENT_RELEASE true #define MCE_EVENT_DONTRELEASE false +extern int recover_on_mce; + extern void save_mce_event(struct pt_regs *regs, long handled, struct mce_error_info *mce_err, uint64_t nip, uint64_t addr, uint64_t phys_addr); diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index ae17d8aa60c4..5e2ab5cade81 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -631,3 +632,60 @@ long hmi_exception_realmode(struct pt_regs *regs) return 1; } + +/* + * Recovery action for recovered memory errors. + * + * There are soft memory errors like SLB Multihit, which can be a result of + * a bad hardware OR software BUG. Kernel can easily recover from these + * soft errors by flushing SLB contents. After the recovery kernel can + * still continue to function without any issue. But in some scenario's we + * may keep getting these soft errors until the root cause is fixed. To be + * able to analyze and find the root cause, best way is to gather enough + * data and system state at the time of MCE. Introduce a sysctl knob where + * user can decide either to continue after recovery or panic the kernel + * to capture the dump. This will allow one to configure a kernel to capture + * dump on MCE and then toggle back to recovery while dump is being analyzed. + * + * recover_on_mce == 0 + * panic/crash the kernel to trigger dump capture. + * + * recover_on_mce == 1 + * continue after MCE recovery. (no panic) + */ +int recover_on_mce; + +#ifdef CONFIG_SYSCTL +/* + * Register the sysctl to define memory error recovery action. + */ +static struct ctl_table machine_check_ctl_table[] = { + { + .procname = "recover_on_mce", + .data = &recover_on_mce, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, + {} +}; + +static struct ctl_table machine_check_sysctl_root[] = { + { + .procname = "kernel", + .mode = 0555, + .child = machine_check_ctl_table, + }, + {} +}; + +static int __init register_machine_check_sysctl(void) +{ + register_sysctl_table(machine_check_sysctl_root); + + return 0; +} +__initcall(register_machine_check_sysctl); +#endif /* CONFIG_SYSCTL */ + +core_param(recover_on_mce, recover_on_mce, int, 0644); diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c index 0e17dcb48720..246477c790e8 100644 --- a/arch/powerpc/kernel/traps.c +++ b/arch/powerpc/kernel/traps.c @@ -70,6 +70,7 @@ #include #include #include +#include #if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE) int (*__debugger)(struct pt_regs *regs) __read_mostly; @@ -727,7 +728,7 @@ void machine_check_exception(struct pt_regs *regs) else if (cur_cpu_spec->machine_check) recover = cur_cpu_spec->machine_check(regs); - if (recover > 0) + if ((recover > 0) && recover_on_mce) goto bail; if (debugger_fault_handler(regs)) diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c index b74c93bc2e55..d13278029a94 100644 --- a/arch/powerpc/platforms/powernv/setup.c +++ b/arch/powerpc/platforms/powernv/setup.c @@ -39,6 +39,7 @@ #include #include #include +#include #include "powernv.h" @@ -147,6 +148,9 @@ static void __init pnv_setup_arch(void) /* Enable NAP mode */ powersave_nap = 1; + /* Recovery action on recovered MCE. By default enable it on PowerNV */ + recover_on_mce = 1; + /* XXX PMCS */ } From patchwork Tue Aug 7 14:18:04 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mahesh J Salgaonkar X-Patchwork-Id: 954551 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lHRN6lRjz9rvt for ; Wed, 8 Aug 2018 00:47:12 +1000 (AEST) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 41lHRN4vzhzDqfq for ; Wed, 8 Aug 2018 00:47:12 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) (using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 41lGny24wFzDqZB for ; Wed, 8 Aug 2018 00:18:14 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from ozlabs.org (bilbo.ozlabs.org [203.11.71.1]) by bilbo.ozlabs.org (Postfix) with ESMTP id 41lGny0frZz8t21 for ; Wed, 8 Aug 2018 00:18:14 +1000 (AEST) Received: by ozlabs.org (Postfix) id 41lGny03X9z9s4s; Wed, 8 Aug 2018 00:18:14 +1000 (AEST) Delivered-To: linuxppc-dev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.158.5; helo=mx0a-001b2d01.pphosted.com; envelope-from=mahesh@linux.vnet.ibm.com; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 41lGnx3P8rz9s3x for ; Wed, 8 Aug 2018 00:18:13 +1000 (AEST) Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.22/8.16.0.22) with SMTP id w77EF4eZ110966 for ; Tue, 7 Aug 2018 10:18:11 -0400 Received: from e06smtp03.uk.ibm.com (e06smtp03.uk.ibm.com [195.75.94.99]) by mx0b-001b2d01.pphosted.com with ESMTP id 2kqanny0gq-1 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=NOT) for ; Tue, 07 Aug 2018 10:18:10 -0400 Received: from localhost by e06smtp03.uk.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Tue, 7 Aug 2018 15:18:09 +0100 Received: from b06cxnps3075.portsmouth.uk.ibm.com (9.149.109.195) by e06smtp03.uk.ibm.com (192.168.101.133) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Tue, 7 Aug 2018 15:18:07 +0100 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps3075.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id w77EI6Fv29950116 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Tue, 7 Aug 2018 14:18:06 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 45D2C4C05C; Tue, 7 Aug 2018 17:18:15 +0100 (BST) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 0B1744C059; Tue, 7 Aug 2018 17:18:14 +0100 (BST) Received: from jupiter.in.ibm.com (unknown [9.85.75.157]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 7 Aug 2018 17:18:13 +0100 (BST) Subject: [PATCH v7 9/9] powernv/pseries: consolidate code for mce early handling. From: Mahesh J Salgaonkar To: linuxppc-dev Date: Tue, 07 Aug 2018 19:48:04 +0530 In-Reply-To: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> References: <153365127532.14256.1965469477086140841.stgit@jupiter.in.ibm.com> User-Agent: StGit/unknown-version MIME-Version: 1.0 X-TM-AS-GCONF: 00 x-cbid: 18080714-0012-0000-0000-0000029670C6 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 18080714-0013-0000-0000-000020C976E9 Message-Id: <153365147682.14256.7710319162347916694.stgit@jupiter.in.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:, , definitions=2018-08-07_05:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1807170000 definitions=main-1808070145 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.27 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Suchanek , Ananth Narayan , Nicholas Piggin , "Aneesh Kumar K.V" , Laurent Dufour Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Mahesh Salgaonkar Now that other platforms also implements real mode mce handler, lets consolidate the code by sharing existing powernv machine check early code. Rename machine_check_powernv_early to machine_check_common_early and reuse the code. Signed-off-by: Mahesh Salgaonkar --- arch/powerpc/kernel/exceptions-64s.S | 138 +++++++--------------------------- 1 file changed, 28 insertions(+), 110 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index cb06f219570a..2f85a7baf026 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -243,14 +243,13 @@ EXC_REAL_BEGIN(machine_check, 0x200, 0x100) SET_SCRATCH0(r13) /* save r13 */ EXCEPTION_PROLOG_0(PACA_EXMC) BEGIN_FTR_SECTION - b machine_check_powernv_early + b machine_check_common_early FTR_SECTION_ELSE b machine_check_pSeries_0 ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) EXC_REAL_END(machine_check, 0x200, 0x100) EXC_VIRT_NONE(0x4200, 0x100) -TRAMP_REAL_BEGIN(machine_check_powernv_early) -BEGIN_FTR_SECTION +TRAMP_REAL_BEGIN(machine_check_common_early) EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200) /* * Register contents: @@ -306,7 +305,9 @@ BEGIN_FTR_SECTION /* Save r9 through r13 from EXMC save area to stack frame. */ EXCEPTION_PROLOG_COMMON_2(PACA_EXMC) mfmsr r11 /* get MSR value */ +BEGIN_FTR_SECTION ori r11,r11,MSR_ME /* turn on ME bit */ +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) ori r11,r11,MSR_RI /* turn on RI bit */ LOAD_HANDLER(r12, machine_check_handle_early) 1: mtspr SPRN_SRR0,r12 @@ -325,7 +326,6 @@ BEGIN_FTR_SECTION andc r11,r11,r10 /* Turn off MSR_ME */ b 1b b . /* prevent speculative execution */ -END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) TRAMP_REAL_BEGIN(machine_check_pSeries) .globl machine_check_fwnmi @@ -333,7 +333,7 @@ machine_check_fwnmi: SET_SCRATCH0(r13) /* save r13 */ EXCEPTION_PROLOG_0(PACA_EXMC) BEGIN_FTR_SECTION - b machine_check_pSeries_early + b machine_check_common_early END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) machine_check_pSeries_0: EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200) @@ -346,90 +346,6 @@ machine_check_pSeries_0: TRAMP_KVM_SKIP(PACA_EXMC, 0x200) -TRAMP_REAL_BEGIN(machine_check_pSeries_early) -BEGIN_FTR_SECTION - EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200) - mr r10,r1 /* Save r1 */ - ld r1,PACAMCEMERGSP(r13) /* Use MC emergency stack */ - subi r1,r1,INT_FRAME_SIZE /* alloc stack frame */ - mfspr r11,SPRN_SRR0 /* Save SRR0 */ - mfspr r12,SPRN_SRR1 /* Save SRR1 */ - EXCEPTION_PROLOG_COMMON_1() - EXCEPTION_PROLOG_COMMON_2(PACA_EXMC) - EXCEPTION_PROLOG_COMMON_3(0x200) - addi r3,r1,STACK_FRAME_OVERHEAD - BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */ - ld r12,_MSR(r1) - andi. r11,r12,MSR_PR /* See if coming from user. */ - bne 2f /* continue in V mode if we are. */ - - /* - * At this point we are not sure about what context we come from. - * We may be in the middle of swithing stack. r1 may not be valid. - * Hence stay on emergency stack, call machine_check_exception and - * return from the interrupt. - * But before that, check if this is an un-recoverable exception. - * If yes, then stay on emergency stack and panic. - */ - andi. r11,r12,MSR_RI - bne 1f - - /* - * Check if we have successfully handled/recovered from error, if not - * then stay on emergency stack and panic. - */ - cmpdi r3,0 /* see if we handled MCE successfully */ - bne 1f /* if handled then return from interrupt */ - - LOAD_HANDLER(r10,unrecover_mce) - mtspr SPRN_SRR0,r10 - ld r10,PACAKMSR(r13) - /* - * We are going down. But there are chances that we might get hit by - * another MCE during panic path and we may run into unstable state - * with no way out. Hence, turn ME bit off while going down, so that - * when another MCE is hit during panic path, hypervisor will - * power cycle the lpar, instead of getting into MCE loop. - */ - li r3,MSR_ME - andc r10,r10,r3 /* Turn off MSR_ME */ - mtspr SPRN_SRR1,r10 - RFI_TO_KERNEL - b . - - /* Stay on emergency stack and return from interrupt. */ -1: LOAD_HANDLER(r10,mce_return) - mtspr SPRN_SRR0,r10 - ld r10,PACAKMSR(r13) - mtspr SPRN_SRR1,r10 - RFI_TO_KERNEL - b . - - /* Move original SRR0 and SRR1 into the respective regs */ -2: ld r9,_MSR(r1) - mtspr SPRN_SRR1,r9 - ld r3,_NIP(r1) - mtspr SPRN_SRR0,r3 - ld r9,_CTR(r1) - mtctr r9 - ld r9,_XER(r1) - mtxer r9 - ld r9,_LINK(r1) - mtlr r9 - REST_GPR(0, r1) - REST_8GPRS(2, r1) - REST_GPR(10, r1) - ld r11,_CCR(r1) - mtcr r11 - REST_GPR(11, r1) - REST_2GPRS(12, r1) - /* restore original r1. */ - ld r1,GPR1(r1) - SET_SCRATCH0(r13) /* save r13 */ - EXCEPTION_PROLOG_0(PACA_EXMC) - b machine_check_pSeries_0 -END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) - EXC_COMMON_BEGIN(machine_check_common) /* * Machine check is different because we use a different @@ -528,6 +444,9 @@ EXC_COMMON_BEGIN(machine_check_handle_early) bl machine_check_early std r3,RESULT(r1) /* Save result */ ld r12,_MSR(r1) +BEGIN_FTR_SECTION + b 4f +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE) #ifdef CONFIG_PPC_P7_NAP /* @@ -551,10 +470,11 @@ EXC_COMMON_BEGIN(machine_check_handle_early) */ rldicl. r11,r12,4,63 /* See if MC hit while in HV mode. */ beq 5f - andi. r11,r12,MSR_PR /* See if coming from user. */ +4: andi. r11,r12,MSR_PR /* See if coming from user. */ bne 9f /* continue in V mode if we are. */ 5: +BEGIN_FTR_SECTION #ifdef CONFIG_KVM_BOOK3S_64_HANDLER /* * We are coming from kernel context. Check if we are coming from @@ -565,6 +485,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early) cmpwi r11,0 /* Check if coming from guest */ bne 9f /* continue if we are. */ #endif +END_FTR_SECTION_IFSET(CPU_FTR_HVMODE) /* * At this point we are not sure about what context we come from. * Queue up the MCE event and return from the interrupt. @@ -598,6 +519,7 @@ EXC_COMMON_BEGIN(machine_check_handle_early) cmpdi r3,0 /* see if we handled MCE successfully */ beq 1b /* if !handled then panic */ +BEGIN_FTR_SECTION /* * Return from MC interrupt. * Queue up the MCE event so that we can log it later, while @@ -606,10 +528,24 @@ EXC_COMMON_BEGIN(machine_check_handle_early) bl machine_check_queue_event MACHINE_CHECK_HANDLER_WINDUP RFI_TO_USER_OR_KERNEL +FTR_SECTION_ELSE + /* + * pSeries: Return from MC interrupt. Before that stay on emergency + * stack and call machine_check_exception to log the MCE event. + */ + LOAD_HANDLER(r10,mce_return) + mtspr SPRN_SRR0,r10 + ld r10,PACAKMSR(r13) + mtspr SPRN_SRR1,r10 + RFI_TO_KERNEL + b . +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE) 9: /* Deliver the machine check to host kernel in V mode. */ MACHINE_CHECK_HANDLER_WINDUP - b machine_check_pSeries + SET_SCRATCH0(r13) /* save r13 */ + EXCEPTION_PROLOG_0(PACA_EXMC) + b machine_check_pSeries_0 EXC_COMMON_BEGIN(unrecover_mce) /* Invoke machine_check_exception to print MCE event and panic. */ @@ -627,25 +563,7 @@ EXC_COMMON_BEGIN(mce_return) /* Invoke machine_check_exception to print MCE event and return. */ addi r3,r1,STACK_FRAME_OVERHEAD bl machine_check_exception - ld r9,_MSR(r1) - mtspr SPRN_SRR1,r9 - ld r3,_NIP(r1) - mtspr SPRN_SRR0,r3 - ld r9,_CTR(r1) - mtctr r9 - ld r9,_XER(r1) - mtxer r9 - ld r9,_LINK(r1) - mtlr r9 - REST_GPR(0, r1) - REST_8GPRS(2, r1) - REST_GPR(10, r1) - ld r11,_CCR(r1) - mtcr r11 - REST_GPR(11, r1) - REST_2GPRS(12, r1) - /* restore original r1. */ - ld r1,GPR1(r1) + MACHINE_CHECK_HANDLER_WINDUP RFI_TO_KERNEL b .