From patchwork Wed Sep 13 06:10:47 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Balbir Singh
X-Patchwork-Id: 813172
From: Balbir Singh
To: mpe@ellerman.id.au, npiggin@gmail.com, mahesh@linux.vnet.ibm.com
Subject: [PATCH v2 3/5] powerpc/mce: Hookup derror (load/store) UE errors
Date: Wed, 13 Sep 2017 16:10:47 +1000
Message-Id: <20170913061049.13256-4-bsingharora@gmail.com>
X-Mailer: git-send-email 2.9.5
In-Reply-To:
 <20170913061049.13256-1-bsingharora@gmail.com>
References: <20170913061049.13256-1-bsingharora@gmail.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: linuxppc-dev@lists.ozlabs.org

Extract physical_address for UE errors by walking the page tables
for the mm and address at the NIP, to extract the instruction.
Then use the instruction to find the effective address via
analyse_instr().

We might have page table walking races, but we expect them to
be rare; the physical address extraction is best effort. The idea
is to then hook up this infrastructure to memory failure eventually.

Signed-off-by: Balbir Singh
---
 arch/powerpc/include/asm/mce.h  |  2 +-
 arch/powerpc/kernel/mce.c       |  6 +++-
 arch/powerpc/kernel/mce_power.c | 80 ++++++++++++++++++++++++++++++++++++++---
 3 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 75292c7..3a1226e 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -204,7 +204,7 @@ struct mce_error_info {
 
 extern void save_mce_event(struct pt_regs *regs, long handled,
 			   struct mce_error_info *mce_err, uint64_t nip,
-			   uint64_t addr);
+			   uint64_t addr, uint64_t phys_addr);
 extern int get_mce_event(struct machine_check_event *mce, bool release);
 extern void release_mce_event(void);
 extern void machine_check_queue_event(void);
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index f351adf..c7acc33 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -82,7 +82,7 @@ static void mce_set_error_info(struct machine_check_event *mce,
  */
 void save_mce_event(struct pt_regs *regs, long handled,
 		    struct mce_error_info *mce_err,
-		    uint64_t nip, uint64_t addr)
+		    uint64_t nip, uint64_t addr, uint64_t phys_addr)
 {
 	int index = __this_cpu_inc_return(mce_nest_count) - 1;
 	struct machine_check_event *mce = this_cpu_ptr(&mce_event[index]);
@@ -140,6 +140,10 @@ void save_mce_event(struct pt_regs *regs, long handled,
 	} else if (mce->error_type == MCE_ERROR_TYPE_UE) {
 		mce->u.ue_error.effective_address_provided = true;
 		mce->u.ue_error.effective_address = addr;
+		if (phys_addr != ULONG_MAX) {
+			mce->u.ue_error.physical_address_provided = true;
+			mce->u.ue_error.physical_address = phys_addr;
+		}
 	}
 	return;
 }
diff --git a/arch/powerpc/kernel/mce_power.c b/arch/powerpc/kernel/mce_power.c
index b76ca19..2dfbbe0 100644
--- a/arch/powerpc/kernel/mce_power.c
+++ b/arch/powerpc/kernel/mce_power.c
@@ -27,6 +27,29 @@
 #include
 #include
 #include
+#include
+#include
+#include
+
+/*
+ * Convert an address related to an mm to a PFN. NOTE: we are in real
+ * mode, we could potentially race with page table updates.
+ */
+static unsigned long addr_to_pfn(struct mm_struct *mm, unsigned long addr)
+{
+	pte_t *ptep;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	if (mm == current->mm)
+		ptep = find_current_mm_pte(mm->pgd, addr, NULL, NULL);
+	else
+		ptep = find_init_mm_pte(addr, NULL);
+	local_irq_restore(flags);
+	if (!ptep || pte_special(*ptep))
+		return ULONG_MAX;
+	return pte_pfn(*ptep);
+}
 
 static void flush_tlb_206(unsigned int num_sets, unsigned int action)
 {
@@ -421,6 +444,48 @@ static const struct mce_derror_table mce_p9_derror_table[] = {
   MCE_INITIATOR_CPU, MCE_SEV_ERROR_SYNC, },
 { 0, false, 0, 0, 0, 0 } };
 
+static int mce_find_instr_ea_and_pfn(struct pt_regs *regs, uint64_t *addr,
+					uint64_t *phys_addr)
+{
+	/*
+	 * Carefully look at the NIP to determine
+	 * the instruction to analyse. Reading the NIP
+	 * in real-mode is tricky and can lead to recursive
+	 * faults
+	 */
+	int instr;
+	struct mm_struct *mm;
+	unsigned long nip = regs->nip;
+	unsigned long pfn, instr_addr;
+	struct instruction_op op;
+	struct pt_regs tmp = *regs;
+
+	if (user_mode(regs))
+		mm = current->mm;
+	else
+		mm = &init_mm;
+
+	pfn = addr_to_pfn(mm, nip);
+	if (pfn != ULONG_MAX) {
+		instr_addr = (pfn << PAGE_SHIFT) + (nip & ~PAGE_MASK);
+		instr = *(unsigned int *)(instr_addr);
+		if (!analyse_instr(&op, &tmp, instr)) {
+			pfn = addr_to_pfn(mm, op.ea);
+			*addr = op.ea;
+			*phys_addr = pfn;
+			return 0;
+		}
+		/*
+		 * analyse_instr() might fail if the instruction
+		 * is not a load/store, although this is unexpected
+		 * for load/store errors or if we got the NIP
+		 * wrong
+		 */
+	}
+	*addr = 0;
+	return -1;
+}
+
 static int mce_handle_ierror(struct pt_regs *regs,
 		const struct mce_ierror_table table[],
 		struct mce_error_info *mce_err, uint64_t *addr)
@@ -489,7 +554,8 @@ static int mce_handle_ierror(struct pt_regs *regs,
 
 static int mce_handle_derror(struct pt_regs *regs,
 		const struct mce_derror_table table[],
-		struct mce_error_info *mce_err, uint64_t *addr)
+		struct mce_error_info *mce_err, uint64_t *addr,
+		uint64_t *phys_addr)
 {
 	uint64_t dsisr = regs->dsisr;
 	int handled = 0;
@@ -555,7 +621,10 @@ static int mce_handle_derror(struct pt_regs *regs,
 		mce_err->initiator = table[i].initiator;
 		if (table[i].dar_valid)
 			*addr = regs->dar;
-
+		else if (mce_err->severity == MCE_SEV_ERROR_SYNC &&
+				table[i].error_type == MCE_ERROR_TYPE_UE) {
+			mce_find_instr_ea_and_pfn(regs, addr, phys_addr);
+		}
 		found = 1;
 	}
 
@@ -592,19 +661,20 @@ static long mce_handle_error(struct pt_regs *regs,
 		const struct mce_ierror_table itable[])
 {
 	struct mce_error_info mce_err = { 0 };
-	uint64_t addr;
+	uint64_t addr, phys_addr;
 	uint64_t srr1 = regs->msr;
 	long handled;
 
 	if (SRR1_MC_LOADSTORE(srr1))
-		handled = mce_handle_derror(regs, dtable, &mce_err, &addr);
+		handled = mce_handle_derror(regs, dtable, &mce_err, &addr,
+				&phys_addr);
 	else
 		handled = mce_handle_ierror(regs, itable, &mce_err, &addr);
 
 	if (!handled && mce_err.error_type == MCE_ERROR_TYPE_UE)
 		handled = mce_handle_ue_error(regs);
 
-	save_mce_event(regs, handled, &mce_err, regs->nip, addr);
+	save_mce_event(regs, handled, &mce_err, regs->nip, addr, phys_addr);
 
 	return handled;
 }
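
For illustration only, the address arithmetic used in mce_find_instr_ea_and_pfn() can be sketched in plain userspace C. The patch builds instr_addr from a PFN plus the in-page offset of the NIP, and uses ULONG_MAX as the "no translation" sentinel. It stores the PFN returned for op.ea in *phys_addr as is. The sketch below shows how a byte address would be formed from such a PFN. All names here (SKETCH_PAGE_SHIFT, SKETCH_NO_PFN, sketch_pfn_to_phys) are hypothetical, the 64K page size is an assumption, and this is not kernel code.

```c
#include <stdint.h>

/* Assumption: 64K pages; the real kernel uses PAGE_SHIFT/PAGE_MASK. */
#define SKETCH_PAGE_SHIFT 16
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical sentinel, standing in for the patch's ULONG_MAX. */
#define SKETCH_NO_PFN UINT64_MAX

/*
 * Hypothetical helper: combine a page frame number with the in-page
 * offset of a virtual address, mirroring
 *   (pfn << PAGE_SHIFT) + (nip & ~PAGE_MASK)
 * from the patch. A failed lookup is passed through unchanged so the
 * caller can still test for the sentinel.
 */
static uint64_t sketch_pfn_to_phys(uint64_t pfn, uint64_t vaddr)
{
	if (pfn == SKETCH_NO_PFN)
		return SKETCH_NO_PFN;
	return (pfn << SKETCH_PAGE_SHIFT) + (vaddr & ~SKETCH_PAGE_MASK);
}
```

Passing the sentinel through before shifting matters: shifting ULONG_MAX by the page shift would yield a value that no longer compares equal to the sentinel, so a check such as the one in save_mce_event() would not catch the failed lookup.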