[{"id":1701644,"web_url":"http://patchwork.ozlabs.org/comment/1701644/","msgid":"<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","list_archive_url":null,"date":"2017-06-27T05:16:34","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":1436,"url":"http://patchwork.ozlabs.org/api/people/1436/","name":"Mahesh J Salgaonkar","email":"mahesh@linux.vnet.ibm.com"},"content":"On 06/23/2017 05:41 PM, Nicholas Piggin wrote:\n> It has been observed the xscom bit in HMER gets stuck (as-yet\n\nWe see that stuck because opal never clears it after scom read/write.\nThe bit is cleared just before the next scom read/write. I am not sure\nwhat it was left uncleared until next scom read/write kicks in.\n\n> unkonwn root cause -- HMEER should disable those exceptions).\n> This causes HMIs to be continually taken.\n> \n> HMI: Received HMI interrupt: HMER = 0x0040000000000000\n> \n> Add some attempt to handle this by clearing the HMER and HMEER.\n> \n> Try to clear HMER for other unknown HMIs (alternative is to not\n> recover).\n\nI think we should be just ok with clearing out and masking them again.\n\n> \n> There seems to be no point in continually taking an HMI that will\n> never be handled. By not handling it we already implicitly are\n> trying to \"continue\" without solving anything aren't we?\n\nWe do handle the ones that could cause harm to system functioning. Rest\nwe mask it. Other than xscom related bits we also mask bit 6, 16 and 17\nwhich does not look harmful. 
I think we should just mask them again in\nHMEER if we get HMIs for the bits that we already masked.\n\n> \n> ---\n>  core/hmi.c          | 26 ++++++++++++++++++++++++++\n>  hw/xscom.c          |  5 +----\n>  include/processor.h |  7 +++++++\n>  3 files changed, 34 insertions(+), 4 deletions(-)\n> \n> diff --git a/core/hmi.c b/core/hmi.c\n> index 84f2c2d6..7ab5810d 100644\n> --- a/core/hmi.c\n> +++ b/core/hmi.c\n> @@ -823,6 +823,32 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)\n>  \t\t}\n>  \t}\n> \n> +\tif (hmer & SPR_HMER_XSCOM_MASK) {\n> +\t\thmer &= ~SPR_HMER_XSCOM_MASK;\n> +\t\tif (hmi_evt) {\n> +\t\t\thmi_evt->severity = OpalHMI_SEV_NO_ERROR;\n> +\t\t\thmi_evt->type = OpalHMI_ERROR_XSCOM_DONE;\n> +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> +\t\t}\n> +\t\tsync();\n> +\t\tmtspr(SPR_HMEER, mfspr(SPR_HMEER) & ~(SPR_HMER_XSCOM_FAIL |\n> +\t\t\t\t\t\t\tSPR_HMER_XSCOM_DONE))\n> +\t\tisync();\n> +\n> +\t\tprlog(PR_DEBUG, \"HMI: Unexpected XSCOM (clearing).\\n\");\n> +\t}\n> +\n> +\tif (hmer) {\n> +\t\thmer = 0;\n> +\t\tif (hmi_evt) {\n> +\t\t\thmi_evt->severity = OpalHMI_SEV_WARNING;\n> +\t\t\thmi_evt->type = 0; /* Anything sane we can put here? 
*/\n> +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> +\t\t}\n\nThis one is also unexpected, should we clear and mask this as well ?\nOtherwise we would keep getting this HMI and warnings would flood host\nkernel.\n\nThanks,\n-Mahesh.","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wxZ0q4G3Fz9s71\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 15:16:59 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wxZ0q3XcZzDr38\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 15:16:59 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com\n\t[148.163.158.5])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wxZ0h2FJ9zDqjC\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 15:16:51 +1000 (AEST)","from pps.filterd (m0098421.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id\n\tv5R5FUdw105774\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 01:16:48 -0400","from e23smtp05.au.ibm.com (e23smtp05.au.ibm.com [202.81.31.147])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2bb0c3yhs0-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 01:16:48 -0400","from localhost\n\tby e23smtp05.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! 
Violators will be prosecuted\n\tfor <skiboot@lists.ozlabs.org> from <mahesh@linux.vnet.ibm.com>;\n\tTue, 27 Jun 2017 15:16:46 +1000","from d23relay06.au.ibm.com (202.81.31.225)\n\tby e23smtp05.au.ibm.com (202.81.31.211) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tTue, 27 Jun 2017 15:16:43 +1000","from d23av04.au.ibm.com (d23av04.au.ibm.com [9.190.235.139])\n\tby d23relay06.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n\tv5R5Ghvu2818374\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 15:16:43 +1000","from d23av04.au.ibm.com (localhost [127.0.0.1])\n\tby d23av04.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv5R5GfYN015978\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 15:16:41 +1000","from [9.109.222.37] ([9.109.222.37])\n\tby d23av04.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\tv5R5Gccp015873; Tue, 27 Jun 2017 15:16:39 +1000"],"To":"Nicholas Piggin <npiggin@gmail.com>, skiboot@lists.ozlabs.org","References":"<20170623121101.30781-1-npiggin@gmail.com>","From":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>","Date":"Tue, 27 Jun 2017 10:46:34 +0530","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.0","MIME-Version":"1.0","In-Reply-To":"<20170623121101.30781-1-npiggin@gmail.com>","Content-Language":"en-MW","X-TM-AS-MML":"disable","x-cbid":"17062705-0016-0000-0000-000002547294","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17062705-0017-0000-0000-000006D43045","Message-Id":"<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-06-27_02:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 
engine=8.0.1-1703280000\n\tdefinitions=main-1706270084","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development <skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, aksadiga@in.ibm.com","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}},{"id":1701654,"web_url":"http://patchwork.ozlabs.org/comment/1701654/","msgid":"<20170627153912.69b128cf@roar.ozlabs.ibm.com>","list_archive_url":null,"date":"2017-06-27T05:39:12","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":69518,"url":"http://patchwork.ozlabs.org/api/people/69518/","name":"Nicholas Piggin","email":"npiggin@gmail.com"},"content":"On Tue, 27 Jun 2017 10:46:34 +0530\nMahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:\n\n> On 06/23/2017 05:41 PM, Nicholas Piggin wrote:\n> > It has been observed the xscom bit in HMER gets stuck (as-yet  \n> \n> We see that stuck because opal never clears it after scom read/write.\n> The bit is cleared just before the next scom read/write. 
I am not sure\n> what it was left uncleared until next scom read/write kicks in.\n\nRight, but did we work out why it's taking a HMI or getting enabled\nin the HMEER?\n\n> > unkonwn root cause -- HMEER should disable those exceptions).\n> > This causes HMIs to be continually taken.\n> > \n> > HMI: Received HMI interrupt: HMER = 0x0040000000000000\n> > \n> > Add some attempt to handle this by clearing the HMER and HMEER.\n> > \n> > Try to clear HMER for other unknown HMIs (alternative is to not\n> > recover).  \n> \n> I think we should be just ok with clearing out and masking them again.\n> \n> > \n> > There seems to be no point in continually taking an HMI that will\n> > never be handled. By not handling it we already implicitly are\n> > trying to \"continue\" without solving anything aren't we?  \n> \n> We do handle the ones that could cause harm to system functioning. Rest\n> we mask it. Other than xscom related bits we also mask bit 6, 16 and 17\n> which does not look harmful. I think we should just mask them again in\n> HMEER if we get HMIs for the bits that we already masked.\n\nOkay. My thinking is that if there was a hw or sw error that causes\nsome unknown HME, would it be better to stop and crash ASAP?\n\nEither way seems better than our current approach of trying to\ncontinue without masking. 
So it's your call.\n\n> \n> > \n> > ---\n> >  core/hmi.c          | 26 ++++++++++++++++++++++++++\n> >  hw/xscom.c          |  5 +----\n> >  include/processor.h |  7 +++++++\n> >  3 files changed, 34 insertions(+), 4 deletions(-)\n> > \n> > diff --git a/core/hmi.c b/core/hmi.c\n> > index 84f2c2d6..7ab5810d 100644\n> > --- a/core/hmi.c\n> > +++ b/core/hmi.c\n> > @@ -823,6 +823,32 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)\n> >  \t\t}\n> >  \t}\n> > \n> > +\tif (hmer & SPR_HMER_XSCOM_MASK) {\n> > +\t\thmer &= ~SPR_HMER_XSCOM_MASK;\n> > +\t\tif (hmi_evt) {\n> > +\t\t\thmi_evt->severity = OpalHMI_SEV_NO_ERROR;\n> > +\t\t\thmi_evt->type = OpalHMI_ERROR_XSCOM_DONE;\n> > +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> > +\t\t}\n> > +\t\tsync();\n> > +\t\tmtspr(SPR_HMEER, mfspr(SPR_HMEER) & ~(SPR_HMER_XSCOM_FAIL |\n> > +\t\t\t\t\t\t\tSPR_HMER_XSCOM_DONE))\n> > +\t\tisync();\n> > +\n> > +\t\tprlog(PR_DEBUG, \"HMI: Unexpected XSCOM (clearing).\\n\");\n> > +\t}\n> > +\n> > +\tif (hmer) {\n> > +\t\thmer = 0;\n> > +\t\tif (hmi_evt) {\n> > +\t\t\thmi_evt->severity = OpalHMI_SEV_WARNING;\n> > +\t\t\thmi_evt->type = 0; /* Anything sane we can put here? */\n> > +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> > +\t\t}  \n> \n> This one is also unexpected, should we clear and mask this as well ?\n\nProbably yes. Either that or set recover = 0. 
What do you think?\n\nThanks,\nNick","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wxZVw5DhBz9s71\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 15:39:36 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wxZVw48gszDr1f\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 15:39:36 +1000 (AEST)","from mail-pg0-x241.google.com (mail-pg0-x241.google.com\n\t[IPv6:2607:f8b0:400e:c05::241])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wxZVq1ChVzDr0g\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 15:39:30 +1000 (AEST)","by mail-pg0-x241.google.com with SMTP id u36so2951500pgn.3\n\tfor <skiboot@lists.ozlabs.org>; Mon, 26 Jun 2017 22:39:30 -0700 (PDT)","from roar.ozlabs.ibm.com (59-102-83-48.tpgi.com.au. 
[59.102.83.48])\n\tby smtp.gmail.com with ESMTPSA id\n\tg79sm3306619pfg.121.2017.06.26.22.39.24\n\t(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);\n\tMon, 26 Jun 2017 22:39:26 -0700 (PDT)"],"Authentication-Results":["ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"dXjYw7Nd\"; dkim-atps=neutral","lists.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"dXjYw7Nd\"; dkim-atps=neutral","lists.ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"dXjYw7Nd\"; dkim-atps=neutral"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;\n\th=date:from:to:cc:subject:message-id:in-reply-to:references\n\t:organization:mime-version:content-transfer-encoding;\n\tbh=mYiK6XphF3rSjuhEX5quwJOGIx8JNy/GZuyYFaIFpPw=;\n\tb=dXjYw7Nd3Q4/rP5UJaVkl0UQs65n4ZJRLYsFelS6SqDq1VfDzwsZO5y3PpoC+r2N3Z\n\tSlCoZ4oQGTc/LXysfU66F3AJn2llGV9ZpudQC0il5t3Jj+56/jygmmCLCESQFsUacNRg\n\tDtbVe5u2liaLZSdYYEXYJ/S9Lndg9tBzT89COuylgGc7DX6nE7at1jpdzyTfJYHorLjn\n\tkKiFZyRfPrZqCMEZMYhjTrbFn/c7zOyHuyv9p+Sn8aOiRRS2StVBD4oWK2/d+mt9N800\n\tVuL4lh+BZ6O5Qo7mITh3Rbr3ssYHrSjOM9/iCf/q1ay7yXM2wqxYCH3HmG4CIm0d4huB\n\t3bHw==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to\n\t:references:organization:mime-version:content-transfer-encoding;\n\tbh=mYiK6XphF3rSjuhEX5quwJOGIx8JNy/GZuyYFaIFpPw=;\n\tb=ADG13S8EyapmlNy//0Us3wJHGok6Ij/VAdQPcvSFIMNvhEK8wtxxmdjgtOEFhir8o4\n\t6fx9NLvRQHDqSJoHHghxKE8zyiz37oWQv6LV1s5MulVY0F2snQEwP7n0z2PYDi/Jrm0s\n\tDUGRP/MHnt8GTdvRsCJ9XiJ0ldiLqSNvMEsCOYZRmUkpMnhhXDgyhdM+0zfRKbYnip6+\n\tkXq+sfn1HdNseWPOvdbMrLLb/Muu1sMhTG4h/H1eJEXlX2y1ZqgL6UsICPdapRxm9Aot\n\t90d86Otd1L1D1eKI+yzpo+K8a5VYqZTVZ4h0VxAkSLHr6cKvM67e15Do7o2453UU5eKw\n\t6PYg==","X-Gm-Message-State":"AKS2vOwa+26e/RGjLx2hIwRgg3LXxtDwxU4PBAsgJQ0gXO1441Xcr+ay\n\tHIfQNbuhubm6jg==","X-Received":"by 10.101.77.74 with SMTP id j10mr3569341pgt.43.1498541968008;\n\tMon, 26 Jun 2017 22:39:28 -0700 (PDT)","Date":"Tue, 27 Jun 2017 15:39:12 +1000","From":"Nicholas Piggin <npiggin@gmail.com>","To":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>","Message-ID":"<20170627153912.69b128cf@roar.ozlabs.ibm.com>","In-Reply-To":"<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","References":"<20170623121101.30781-1-npiggin@gmail.com>\n\t<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","Organization":"IBM","X-Mailer":"Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu)","MIME-Version":"1.0","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development 
<skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, skiboot@lists.ozlabs.org,\n\taksadiga@in.ibm.com","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}},{"id":1702074,"web_url":"http://patchwork.ozlabs.org/comment/1702074/","msgid":"<1498566765.3651.31.camel@kernel.crashing.org>","list_archive_url":null,"date":"2017-06-27T12:32:45","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":38,"url":"http://patchwork.ozlabs.org/api/people/38/","name":"Benjamin Herrenschmidt","email":"benh@kernel.crashing.org"},"content":"On Tue, 2017-06-27 at 10:46 +0530, Mahesh Jagannath Salgaonkar wrote:\n> On 06/23/2017 05:41 PM, Nicholas Piggin wrote:\n> > It has been observed the xscom bit in HMER gets stuck (as-yet\n> \n> We see that stuck because opal never clears it after scom read/write.\n> The bit is cleared just before the next scom read/write. I am not sure\n> what it was left uncleared until next scom read/write kicks in.\n\nBecause we don't care ? 
It should not be enabled in HMEER...\n> \n> > unkonwn root cause -- HMEER should disable those exceptions).\n> > This causes HMIs to be continually taken.\n> > \n> > HMI: Received HMI interrupt: HMER = 0x0040000000000000\n> > \n> > Add some attempt to handle this by clearing the HMER and HMEER.\n> > \n> > Try to clear HMER for other unknown HMIs (alternative is to not\n> > recover).\n> \n> I think we should be just ok with clearing out and masking them again.\n\nRight but we need to understand why we are taking the HMI in the first\nplace since it's not enabled in HMEER unless something's wrong there.\nIs that reproduceable ?\n\n> > \n> > There seems to be no point in continually taking an HMI that will\n> > never be handled. By not handling it we already implicitly are\n> > trying to \"continue\" without solving anything aren't we?\n> \n> We do handle the ones that could cause harm to system functioning. Rest\n> we mask it. Other than xscom related bits we also mask bit 6, 16 and 17\n> which does not look harmful. 
I think we should just mask them again in\n> HMEER if we get HMIs for the bits that we already masked.\n\nBen.\n\n> > \n> > ---\n> >  core/hmi.c          | 26 ++++++++++++++++++++++++++\n> >  hw/xscom.c          |  5 +----\n> >  include/processor.h |  7 +++++++\n> >  3 files changed, 34 insertions(+), 4 deletions(-)\n> > \n> > diff --git a/core/hmi.c b/core/hmi.c\n> > index 84f2c2d6..7ab5810d 100644\n> > --- a/core/hmi.c\n> > +++ b/core/hmi.c\n> > @@ -823,6 +823,32 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)\n> >  \t\t}\n> >  \t}\n> > \n> > +\tif (hmer & SPR_HMER_XSCOM_MASK) {\n> > +\t\thmer &= ~SPR_HMER_XSCOM_MASK;\n> > +\t\tif (hmi_evt) {\n> > +\t\t\thmi_evt->severity = OpalHMI_SEV_NO_ERROR;\n> > +\t\t\thmi_evt->type = OpalHMI_ERROR_XSCOM_DONE;\n> > +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> > +\t\t}\n> > +\t\tsync();\n> > +\t\tmtspr(SPR_HMEER, mfspr(SPR_HMEER) & ~(SPR_HMER_XSCOM_FAIL |\n> > +\t\t\t\t\t\t\tSPR_HMER_XSCOM_DONE))\n> > +\t\tisync();\n> > +\n> > +\t\tprlog(PR_DEBUG, \"HMI: Unexpected XSCOM (clearing).\\n\");\n> > +\t}\n> > +\n> > +\tif (hmer) {\n> > +\t\thmer = 0;\n> > +\t\tif (hmi_evt) {\n> > +\t\t\thmi_evt->severity = OpalHMI_SEV_WARNING;\n> > +\t\t\thmi_evt->type = 0; /* Anything sane we can put here? 
*/\n> > +\t\t\tqueue_hmi_event(hmi_evt, recover);\n> > +\t\t}\n> \n> This one is also unexpected, should we clear and mask this as well ?\n> Otherwise we would keep getting this HMI and warnings would flood host\n> kernel.\n> \n> Thanks,\n> -Mahesh.","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wxlhl2qBzz9s2s\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 22:33:43 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wxlhl1tp9zDr5n\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue, 27 Jun 2017 22:33:43 +1000 (AEST)","from gate.crashing.org (gate.crashing.org [63.228.1.57])\n\t(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\n\t(No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wxlgq2Y6DzDr4B\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 22:32:55 +1000 (AEST)","from localhost.localdomain (localhost.localdomain [127.0.0.1])\n\tby gate.crashing.org (8.14.1/8.13.8) with ESMTP id v5RCWi5X018579;\n\tTue, 27 Jun 2017 07:32:44 -0500"],"Message-ID":"<1498566765.3651.31.camel@kernel.crashing.org>","From":"Benjamin Herrenschmidt <benh@kernel.crashing.org>","To":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>, Nicholas Piggin\n\t<npiggin@gmail.com>, skiboot@lists.ozlabs.org","Date":"Tue, 27 Jun 2017 07:32:45 
-0500","In-Reply-To":"<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","References":"<20170623121101.30781-1-npiggin@gmail.com>\n\t<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>","X-Mailer":"Evolution 3.22.6 (3.22.6-2.fc25) ","Mime-Version":"1.0","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development <skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, aksadiga@in.ibm.com","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}},{"id":1702842,"web_url":"http://patchwork.ozlabs.org/comment/1702842/","msgid":"<f7d7d224-cda2-1754-4506-2a77293fdc67@linux.vnet.ibm.com>","list_archive_url":null,"date":"2017-06-28T03:30:05","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":1436,"url":"http://patchwork.ozlabs.org/api/people/1436/","name":"Mahesh J Salgaonkar","email":"mahesh@linux.vnet.ibm.com"},"content":"On 06/27/2017 06:02 PM, Benjamin Herrenschmidt wrote:\n> On Tue, 2017-06-27 at 10:46 +0530, Mahesh Jagannath Salgaonkar wrote:\n>> On 06/23/2017 05:41 PM, Nicholas Piggin wrote:\n>>> It has been observed the xscom bit in HMER gets stuck (as-yet\n>>\n>> We see that stuck because opal never 
clears it after scom read/write.\n>> The bit is cleared just before the next scom read/write. I am not sure\n>> what it was left uncleared until next scom read/write kicks in.\n> \n> Because we don't care ? \n\nlooking at the code it looks like we didn't care. I sent out a patch\nthat clears them once scom operation is complete.\n\n> It should not be enabled in HMEER...\n\nYes, we don't enable them in HMEER.\n\n>>\n>>> unkonwn root cause -- HMEER should disable those exceptions).\n>>> This causes HMIs to be continually taken.\n>>>\n>>> HMI: Received HMI interrupt: HMER = 0x0040000000000000\n>>>\n>>> Add some attempt to handle this by clearing the HMER and HMEER.\n>>>\n>>> Try to clear HMER for other unknown HMIs (alternative is to not\n>>> recover).\n>>\n>> I think we should be just ok with clearing out and masking them again.\n> \n> Right but we need to understand why we are taking the HMI in the first\n> place since it's not enabled in HMEER unless something's wrong there.\n> Is that reproduceable ?\n\nWe did debug it yesterday and found the reason. Akshay sent out a patch\nthat fixes the issue. http://patchwork.ozlabs.org/patch/781434/\n\nThanks,\n-Mahesh.\n\n> \n>>>\n>>> There seems to be no point in continually taking an HMI that will\n>>> never be handled. By not handling it we already implicitly are\n>>> trying to \"continue\" without solving anything aren't we?\n>>\n>> We do handle the ones that could cause harm to system functioning. Rest\n>> we mask it. Other than xscom related bits we also mask bit 6, 16 and 17\n>> which does not look harmful. 
I think we should just mask them again in\n>> HMEER if we get HMIs for the bits that we already masked.\n> \n> Ben.\n> \n>>>\n>>> ---\n>>>  core/hmi.c          | 26 ++++++++++++++++++++++++++\n>>>  hw/xscom.c          |  5 +----\n>>>  include/processor.h |  7 +++++++\n>>>  3 files changed, 34 insertions(+), 4 deletions(-)\n>>>\n>>> diff --git a/core/hmi.c b/core/hmi.c\n>>> index 84f2c2d6..7ab5810d 100644\n>>> --- a/core/hmi.c\n>>> +++ b/core/hmi.c\n>>> @@ -823,6 +823,32 @@ int handle_hmi_exception(uint64_t hmer, struct OpalHMIEvent *hmi_evt)\n>>>  \t\t}\n>>>  \t}\n>>>\n>>> +\tif (hmer & SPR_HMER_XSCOM_MASK) {\n>>> +\t\thmer &= ~SPR_HMER_XSCOM_MASK;\n>>> +\t\tif (hmi_evt) {\n>>> +\t\t\thmi_evt->severity = OpalHMI_SEV_NO_ERROR;\n>>> +\t\t\thmi_evt->type = OpalHMI_ERROR_XSCOM_DONE;\n>>> +\t\t\tqueue_hmi_event(hmi_evt, recover);\n>>> +\t\t}\n>>> +\t\tsync();\n>>> +\t\tmtspr(SPR_HMEER, mfspr(SPR_HMEER) & ~(SPR_HMER_XSCOM_FAIL |\n>>> +\t\t\t\t\t\t\tSPR_HMER_XSCOM_DONE))\n>>> +\t\tisync();\n>>> +\n>>> +\t\tprlog(PR_DEBUG, \"HMI: Unexpected XSCOM (clearing).\\n\");\n>>> +\t}\n>>> +\n>>> +\tif (hmer) {\n>>> +\t\thmer = 0;\n>>> +\t\tif (hmi_evt) {\n>>> +\t\t\thmi_evt->severity = OpalHMI_SEV_WARNING;\n>>> +\t\t\thmi_evt->type = 0; /* Anything sane we can put here? 
*/\n>>> +\t\t\tqueue_hmi_event(hmi_evt, recover);\n>>> +\t\t}\n>>\n>> This one is also unexpected, should we clear and mask this as well ?\n>> Otherwise we would keep getting this HMI and warnings would flood host\n>> kernel.\n>>\n>> Thanks,\n>> -Mahesh.\n>","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wy7bR1HVbz9s2s\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 13:30:27 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wy7bQ6zq9zDr3c\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 13:30:26 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wy7bK5PZczDq8c\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 13:30:21 +1000 (AEST)","from pps.filterd (m0098394.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id\n\tv5S3Srr7020740\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 23:30:19 -0400","from e23smtp02.au.ibm.com (e23smtp02.au.ibm.com [202.81.31.144])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2bbreu4vg7-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 23:30:19 -0400","from localhost\n\tby e23smtp02.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! 
Violators will be prosecuted\n\tfor <skiboot@lists.ozlabs.org> from <mahesh@linux.vnet.ibm.com>;\n\tWed, 28 Jun 2017 13:30:17 +1000","from d23relay07.au.ibm.com (202.81.31.226)\n\tby e23smtp02.au.ibm.com (202.81.31.208) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tWed, 28 Jun 2017 13:30:14 +1000","from d23av06.au.ibm.com (d23av06.au.ibm.com [9.190.235.151])\n\tby d23relay07.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n\tv5S3UEwD7537072\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 13:30:14 +1000","from d23av06.au.ibm.com (localhost [127.0.0.1])\n\tby d23av06.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv5S3UDlq008146\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 13:30:14 +1000","from [9.195.44.180] ([9.195.44.180])\n\tby d23av06.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\tv5S3UBuU008095; Wed, 28 Jun 2017 13:30:12 +1000"],"To":"Benjamin Herrenschmidt <benh@kernel.crashing.org>,\n\tNicholas Piggin <npiggin@gmail.com>, skiboot@lists.ozlabs.org","References":"<20170623121101.30781-1-npiggin@gmail.com>\n\t<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>\n\t<1498566765.3651.31.camel@kernel.crashing.org>","From":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>","Date":"Wed, 28 Jun 2017 09:00:05 +0530","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.0","MIME-Version":"1.0","In-Reply-To":"<1498566765.3651.31.camel@kernel.crashing.org>","Content-Language":"en-MW","X-TM-AS-MML":"disable","x-cbid":"17062803-0004-0000-0000-0000021F2863","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17062803-0005-0000-0000-00005E02FE83","Message-Id":"<f7d7d224-cda2-1754-4506-2a77293fdc67@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-06-28_01:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound score=0\n\tspamscore=0 
suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000\n\tdefinitions=main-1706280056","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development <skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, aksadiga@in.ibm.com","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}},{"id":1702868,"web_url":"http://patchwork.ozlabs.org/comment/1702868/","msgid":"<20170628144156.6e71c220@roar.ozlabs.ibm.com>","list_archive_url":null,"date":"2017-06-28T04:41:56","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":69518,"url":"http://patchwork.ozlabs.org/api/people/69518/","name":"Nicholas Piggin","email":"npiggin@gmail.com"},"content":"On Wed, 28 Jun 2017 09:00:05 +0530\nMahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:\n\n> On 06/27/2017 06:02 PM, Benjamin Herrenschmidt wrote:\n> > On Tue, 2017-06-27 at 10:46 +0530, Mahesh Jagannath Salgaonkar wrote:  \n> >> On 06/23/2017 05:41 PM, Nicholas Piggin wrote:  \n> >>> It has been observed the xscom bit in HMER gets stuck (as-yet  \n> >>\n> >> We see that stuck because opal 
never clears it after scom read/write.\n> >> The bit is cleared just before the next scom read/write. I am not sure\n> >> what it was left uncleared until next scom read/write kicks in.  \n> > \n> > Because we don't care ?   \n> \n> looking at the code it looks like we didn't care. I sent out a patch\n> that clears them once scom operation is complete.\n> \n> > It should not be enabled in HMEER...  \n> \n> Yes, we don't enable them in HMEER.\n> \n> >>  \n> >>> unkonwn root cause -- HMEER should disable those exceptions).\n> >>> This causes HMIs to be continually taken.\n> >>>\n> >>> HMI: Received HMI interrupt: HMER = 0x0040000000000000\n> >>>\n> >>> Add some attempt to handle this by clearing the HMER and HMEER.\n> >>>\n> >>> Try to clear HMER for other unknown HMIs (alternative is to not\n> >>> recover).  \n> >>\n> >> I think we should be just ok with clearing out and masking them again.  \n> > \n> > Right but we need to understand why we are taking the HMI in the first\n> > place since it's not enabled in HMEER unless something's wrong there.\n> > Is that reproduceable ?  \n> \n> We did debug it yesterday and found the reason. Akshay sent out a patch\n> that fixes the issue. http://patchwork.ozlabs.org/patch/781434/\n\nGiven that this bug was caused by Linux, and not due to an actual\nHMI (and therefore would not be fixed by clearing the HMER/HMEER\nbits), I wonder if this patch is still warranted. 
HMEER could be\nmessed up somehow, so maybe a simplified version that just notes\nthe unexpected HMI and masks out HMEER.\n\nAny opinions?\n\nThanks,\nNick","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wy9BL3zN0z9s5L\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 14:42:18 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wy9BL316kzDr3h\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 14:42:18 +1000 (AEST)","from mail-pf0-x244.google.com (mail-pf0-x244.google.com\n\t[IPv6:2607:f8b0:400e:c00::244])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wy9B948zHzDr3R\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 14:42:09 +1000 (AEST)","by mail-pf0-x244.google.com with SMTP id s66so7387169pfs.2\n\tfor <skiboot@lists.ozlabs.org>; Tue, 27 Jun 2017 21:42:09 -0700 (PDT)","from roar.ozlabs.ibm.com (59-102-83-48.tpgi.com.au. 
[59.102.83.48])\n\tby smtp.gmail.com with ESMTPSA id\n\tt67sm1514444pfj.98.2017.06.27.21.42.04\n\t(version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256);\n\tTue, 27 Jun 2017 21:42:06 -0700 (PDT)"],"Authentication-Results":["ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"SjExRSJH\"; dkim-atps=neutral","lists.ozlabs.org;\n\tdkim=fail reason=\"signature verification failed\" (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"SjExRSJH\"; dkim-atps=neutral","lists.ozlabs.org; dkim=pass (2048-bit key;\n\tunprotected) header.d=gmail.com header.i=@gmail.com\n\theader.b=\"SjExRSJH\"; dkim-atps=neutral"],"DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;\n\th=date:from:to:cc:subject:message-id:in-reply-to:references\n\t:organization:mime-version:content-transfer-encoding;\n\tbh=8JF27ATN4uzCk7H2nOhQIGvw5sHEh7MSyvZVaUIqRZM=;\n\tb=SjExRSJHYW1W16eau13FpFVWErP7pBbf0GwWknsSnGeRySXzHyQe3NI8LI+6CmOGG1\n\tHDw1O1W9/m1UII6AaHckQDCaRosWYq86gwTQ24Y9J6/e2lNyRjgp2ABZ0Hy7Kltnt1a+\n\tQPxlLdbVTBKH4lm4efRSE+I7tumThN43hczKY2h0VEFmK4A2o4qHRTWezsFx5TyWxH7q\n\tInA+dce+lPSA4UF9vS58STL/v+y1bgN4na2/gUL0/AEC32CxHUH0iWPyYhJCGJIto4aa\n\tqzqesMmAKteVze3K0q+JzadYDcE22mfU+mC0U4k22KaI5HEGjExCP/PN1C16zaIuiPn3\n\tlGGg==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; 
s=20161025;\n\th=x-gm-message-state:date:from:to:cc:subject:message-id:in-reply-to\n\t:references:organization:mime-version:content-transfer-encoding;\n\tbh=8JF27ATN4uzCk7H2nOhQIGvw5sHEh7MSyvZVaUIqRZM=;\n\tb=hox4dFuOmDld+eCIr9/8LrdMuvJY2iCkBkY8X6CpuQ4gQjTI0LNpbr8OlHPkF46FBd\n\tjYfElQC/SrNf8e+GrfLaKyi5FGii5Rc3FPWkhP3BUaF0oaGN//R8WzX6fqsdBWTJ+3PT\n\tvwsf5AZGB98dZFD6PWuYNe3ianBJAPaQT53IAoT0k825QEl1j5KuLuJiGY1J+axa19k2\n\t/ZPugHxXQZfqM+ymgWpWDudGMxlfZBsUhuvBOFjBUp/CNzdPb6kl1ifXEktVTS8nKkYn\n\tIA/n6Rx4aVMp+IdUb9QI9q8O1cpkQ1CkhDWGiKcWJzdv3c+eAyc1VtHrGFVgEDV4NgLC\n\taJOA==","X-Gm-Message-State":"AKS2vOz6Ae0Yxh92tSOfMuNc1O1NdzT+FgZY29jImsmZyrhvHLJCgkU3\n\tK3oN+jdo1UBxHA==","X-Received":"by 10.99.143.21 with SMTP id n21mr4436230pgd.145.1498624927657; \n\tTue, 27 Jun 2017 21:42:07 -0700 (PDT)","Date":"Wed, 28 Jun 2017 14:41:56 +1000","From":"Nicholas Piggin <npiggin@gmail.com>","To":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>","Message-ID":"<20170628144156.6e71c220@roar.ozlabs.ibm.com>","In-Reply-To":"<f7d7d224-cda2-1754-4506-2a77293fdc67@linux.vnet.ibm.com>","References":"<20170623121101.30781-1-npiggin@gmail.com>\n\t<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>\n\t<1498566765.3651.31.camel@kernel.crashing.org>\n\t<f7d7d224-cda2-1754-4506-2a77293fdc67@linux.vnet.ibm.com>","Organization":"IBM","X-Mailer":"Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu)","MIME-Version":"1.0","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development 
<skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, aksadiga@in.ibm.com,\n\tskiboot@lists.ozlabs.org","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}},{"id":1702904,"web_url":"http://patchwork.ozlabs.org/comment/1702904/","msgid":"<1005db51-d5ba-abbf-2f2d-68604ecb83ff@linux.vnet.ibm.com>","list_archive_url":null,"date":"2017-06-28T06:32:55","subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","submitter":{"id":1436,"url":"http://patchwork.ozlabs.org/api/people/1436/","name":"Mahesh J Salgaonkar","email":"mahesh@linux.vnet.ibm.com"},"content":"On 06/28/2017 10:11 AM, Nicholas Piggin wrote:\n> On Wed, 28 Jun 2017 09:00:05 +0530\n> Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:\n> \n>> On 06/27/2017 06:02 PM, Benjamin Herrenschmidt wrote:\n>>> On Tue, 2017-06-27 at 10:46 +0530, Mahesh Jagannath Salgaonkar wrote:  \n>>>> On 06/23/2017 05:41 PM, Nicholas Piggin wrote:  \n>>>>> It has been observed the xscom bit in HMER gets stuck (as-yet  \n>>>>\n>>>> We see that stuck because opal never clears it after scom read/write.\n>>>> The bit is cleared just before the next scom read/write. I am not sure\n>>>> what it was left uncleared until next scom read/write kicks in.  \n>>>\n>>> Because we don't care ?   \n>>\n>> looking at the code it looks like we didn't care. 
I sent out a patch\n>> that clears them once scom operation is complete.\n>>\n>>> It should not be enabled in HMEER...  \n>>\n>> Yes, we don't enable them in HMEER.\n>>\n>>>>  \n>>>>> unknown root cause -- HMEER should disable those exceptions).\n>>>>> This causes HMIs to be continually taken.\n>>>>>\n>>>>> HMI: Received HMI interrupt: HMER = 0x0040000000000000\n>>>>>\n>>>>> Add some attempt to handle this by clearing the HMER and HMEER.\n>>>>>\n>>>>> Try to clear HMER for other unknown HMIs (alternative is to not\n>>>>> recover).  \n>>>>\n>>>> I think we should be just ok with clearing out and masking them again.  \n>>>\n>>> Right but we need to understand why we are taking the HMI in the first\n>>> place since it's not enabled in HMEER unless something's wrong there.\n>>> Is that reproducible ?  \n>>\n>> We did debug it yesterday and found the reason. Akshay sent out a patch\n>> that fixes the issue. http://patchwork.ozlabs.org/patch/781434/\n> \n> Given that this bug was caused by Linux, and not due to an actual\n> HMI (and therefore would not be fixed by clearing the HMER/HMEER\n> bits), I wonder if this patch is still warranted. 
HMEER could be\n> messed up somehow, so maybe a simplified version that just notes\n> the unexpected HMI and masks out HMEER.\n> \n> Any opinions?\n\nYeah I agree with having simplified version so that it will help us to\ndetect if we at all mess up with HMEER in future.\n\nThanks,\n-Mahesh.","headers":{"Return-Path":"<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>","X-Original-To":["incoming@patchwork.ozlabs.org","skiboot@lists.ozlabs.org"],"Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","skiboot@lists.ozlabs.org"],"Received":["from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\t(using TLSv1.2 with cipher ADH-AES256-GCM-SHA384 (256/256 bits))\n\t(No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3wyChY4zH7z9s2s\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 16:35:09 +1000 (AEST)","from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3])\n\tby lists.ozlabs.org (Postfix) with ESMTP id 3wyChY45xwzDr4w\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 28 Jun 2017 16:35:09 +1000 (AEST)","from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com\n\t[148.163.156.1])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby lists.ozlabs.org (Postfix) with ESMTPS id 3wyCgH1vMXzDr4b\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 16:34:03 +1000 (AEST)","from pps.filterd (m0098396.ppops.net [127.0.0.1])\n\tby mx0a-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id\n\tv5S6Xxp6125154\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 02:34:01 -0400","from e23smtp01.au.ibm.com (e23smtp01.au.ibm.com [202.81.31.143])\n\tby mx0a-001b2d01.pphosted.com with ESMTP id 2bbreyjb8d-1\n\t(version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT)\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 02:34:00 -0400","from localhost\n\tby e23smtp01.au.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use\n\tOnly! 
Violators will be prosecuted\n\tfor <skiboot@lists.ozlabs.org> from <mahesh@linux.vnet.ibm.com>;\n\tWed, 28 Jun 2017 16:33:55 +1000","from d23relay06.au.ibm.com (202.81.31.225)\n\tby e23smtp01.au.ibm.com (202.81.31.207) with IBM ESMTP SMTP Gateway:\n\tAuthorized Use Only! Violators will be prosecuted; \n\tWed, 28 Jun 2017 16:33:54 +1000","from d23av05.au.ibm.com (d23av05.au.ibm.com [9.190.234.119])\n\tby d23relay06.au.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id\n\tv5S6Xi5m4260106\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 16:33:53 +1000","from d23av05.au.ibm.com (localhost [127.0.0.1])\n\tby d23av05.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout) with ESMTP id\n\tv5S6XJ9Y021831\n\tfor <skiboot@lists.ozlabs.org>; Wed, 28 Jun 2017 16:33:20 +1000","from [9.202.5.86] ([9.202.5.86])\n\tby d23av05.au.ibm.com (8.14.4/8.14.4/NCO v10.0 AVin) with ESMTP id\n\tv5S6XHjo021267; Wed, 28 Jun 2017 16:33:18 +1000"],"To":"Nicholas Piggin <npiggin@gmail.com>","References":"<20170623121101.30781-1-npiggin@gmail.com>\n\t<0ca45776-6b27-edcc-6675-267886ec3aa0@linux.vnet.ibm.com>\n\t<1498566765.3651.31.camel@kernel.crashing.org>\n\t<f7d7d224-cda2-1754-4506-2a77293fdc67@linux.vnet.ibm.com>\n\t<20170628144156.6e71c220@roar.ozlabs.ibm.com>","From":"Mahesh Jagannath Salgaonkar <mahesh@linux.vnet.ibm.com>","Date":"Wed, 28 Jun 2017 12:02:55 +0530","User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101\n\tThunderbird/52.2.0","MIME-Version":"1.0","In-Reply-To":"<20170628144156.6e71c220@roar.ozlabs.ibm.com>","Content-Language":"en-MW","X-TM-AS-MML":"disable","x-cbid":"17062806-1617-0000-0000-000001ED53BC","X-IBM-AV-DETECTION":"SAVI=unused REMOTE=unused XFE=unused","x-cbparentid":"17062806-1618-0000-0000-00004834CDAC","Message-Id":"<1005db51-d5ba-abbf-2f2d-68604ecb83ff@linux.vnet.ibm.com>","X-Proofpoint-Virus-Version":"vendor=fsecure engine=2.50.10432:, ,\n\tdefinitions=2017-06-28_03:, , signatures=0","X-Proofpoint-Spam-Details":"rule=outbound_notspam policy=outbound 
score=0\n\tspamscore=0 suspectscore=0\n\tmalwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam\n\tadjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000\n\tdefinitions=main-1706280104","Subject":"Re: [Skiboot] [RFC][PATCH] hmi: clear xscom and unknown bits from\n\tHMER","X-BeenThere":"skiboot@lists.ozlabs.org","X-Mailman-Version":"2.1.23","Precedence":"list","List-Id":"Mailing list for skiboot development <skiboot.lists.ozlabs.org>","List-Unsubscribe":"<https://lists.ozlabs.org/options/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=unsubscribe>","List-Archive":"<http://lists.ozlabs.org/pipermail/skiboot/>","List-Post":"<mailto:skiboot@lists.ozlabs.org>","List-Help":"<mailto:skiboot-request@lists.ozlabs.org?subject=help>","List-Subscribe":"<https://lists.ozlabs.org/listinfo/skiboot>,\n\t<mailto:skiboot-request@lists.ozlabs.org?subject=subscribe>","Cc":"Ryan Grimm <grimm@linux.vnet.ibm.com>, aksadiga@in.ibm.com,\n\tskiboot@lists.ozlabs.org","Content-Type":"text/plain; charset=\"utf-8\"","Content-Transfer-Encoding":"base64","Errors-To":"skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org","Sender":"\"Skiboot\"\n\t<skiboot-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>"}}]