From patchwork Wed Dec 6 17:39:27 2017
X-Patchwork-Submitter: Benjamin Herrenschmidt <benh@kernel.crashing.org>
X-Patchwork-Id: 845295
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: skiboot@lists.ozlabs.org
Date: Wed, 6 Dec 2017 11:39:27 -0600
Message-Id: <20171206173928.25628-5-benh@kernel.crashing.org>
X-Mailer: git-send-email 2.14.3
In-Reply-To: <20171206173928.25628-1-benh@kernel.crashing.org>
References: <20171206173928.25628-1-benh@kernel.crashing.org>
Subject: [Skiboot] [PATCH 5/6] xive: Fix occasional VC checkstops in xive_reset

The current workaround for the scrub bug described in
__xive_cache_scrub() has an issue in that it can leave dirty invalid
entries in the cache. When cleaning up EQs or VPs during reset, if we
then remove the underlying indirect page for these entries, the XIVE
will checkstop when trying to flush them out of the cache.

This replaces the existing workaround with a new pair of workarounds
for VPs and EQs:

 - The VP one does the dummy watch on another entry than the one we
   scrubbed (which does the job of pushing old stores out) using an
   entry that is known to be backed by a permanent indirect page.

 - The EQ one switches to a more efficient workaround which consists
   of doing a non-side-effect ESB load from the EQ's ESe control bits
   (sketched below).
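As a rough illustration of the EQ-side trick (a stand-alone sketch, not
skiboot code: the accessor below stands in for skiboot's in_be64() /
in_complete() pair, and the 0x20000 per-EQ stride and 0x800 load offset
simply mirror the values used in the diff), the whole workaround is a
single MMIO load whose result is thrown away:

#include <stdint.h>

/* Stand-in for a cache-inhibited big-endian MMIO load (assumption:
 * the real code uses skiboot's in_be64() followed by in_complete()
 * to make sure the load has actually been performed).
 */
static inline uint64_t esb_load(const volatile void *addr)
{
	return *(const volatile uint64_t *)addr;
}

/* Hypothetical helper: make the XIVE read the EQ back by loading its
 * ESB control bits.  The load has no side effect on the PQ state; it
 * only forces the HW to fetch the entry, ordered behind all the
 * preceding stores, which is what the old dummy cache watch achieved
 * at a higher cost.
 */
static void eq_esb_dummy_load(void *eq_mmio, uint32_t idx)
{
	/* Each EQ owns a 0x20000-byte slice of the ESB MMIO region;
	 * 0x800 is assumed to be the non-side-effect "get" offset of
	 * the ESB programming model.
	 */
	uint8_t *esb = (uint8_t *)eq_mmio + (uint64_t)idx * 0x20000;

	(void)esb_load(esb + 0x800);
}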
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Oliver O'Halloran
---
 hw/xive.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/hw/xive.c b/hw/xive.c
index 104e1e85..b08c6783 100644
--- a/hw/xive.c
+++ b/hw/xive.c
@@ -1251,6 +1251,48 @@ static int64_t __xive_cache_watch(struct xive *x, enum xive_cache_type ctype,
 				  void *new_data, bool light_watch,
 				  bool synchronous);
 
+static void xive_scrub_workaround_vp(struct xive *x, uint32_t block, uint32_t idx __unused)
+{
+	/* VP variant of the workaround described in __xive_cache_scrub(),
+	 * we need to be careful to use for that workaround an NVT that
+	 * sits on the same xive but is NOT part of a donated indirect
+	 * entry.
+	 *
+	 * The reason is that the dummy cache watch will re-create a
+	 * dirty entry in the cache, even if the entry is marked
+	 * invalid.
+	 *
+	 * Thus if we are about to dispose of the indirect entry backing
+	 * it, we'll cause a checkstop later on when trying to write it
+	 * out.
+	 *
+	 * Note: This means the workaround only works for block group
+	 * mode.
+	 */
+#ifdef USE_BLOCK_GROUP_MODE
+	__xive_cache_watch(x, xive_cache_vpc, block, INITIAL_VP_COUNT, 0,
+			   0, NULL, true, false);
+#else
+	/* WARNING: Some workarounds related to cache scrubs require us to
+	 * have at least one firmware owned (permanent) indirect entry for
+	 * each XIVE instance. This currently only happens in block group
+	 * mode
+	 */
+#warning Block group mode should not be disabled
+#endif
+}
+
+static void xive_scrub_workaround_eq(struct xive *x, uint32_t block __unused, uint32_t idx)
+{
+	void *mmio;
+
+	/* EQ variant of the workaround described in __xive_cache_scrub(),
+	 * a simple non-side effect load from ESn will do
+	 */
+	mmio = x->eq_mmio + idx * 0x20000;
+	in_complete(in_be64(mmio + 0x800));
+}
+
 static int64_t __xive_cache_scrub(struct xive *x, enum xive_cache_type ctype,
 				  uint64_t block, uint64_t idx,
 				  bool want_inval, bool want_disable)
@@ -1270,6 +1312,9 @@ static int64_t __xive_cache_scrub(struct xive *x, enum xive_cache_type ctype,
 	 * invalidate, then after the scrub, we do a dummy cache
 	 * watch which will make the HW read the data back, which
 	 * should be ordered behind all the preceding stores.
+	 *
+	 * Update: For EQs we can do a non-side effect ESB load instead
+	 * which is faster.
 	 */
 	want_inval = true;
 
@@ -1331,9 +1376,11 @@ static int64_t __xive_cache_scrub(struct xive *x, enum xive_cache_type ctype,
 
 	/* Workaround for HW bug described above (only applies to
 	 * EQC and VPC
 	 */
-	if (ctype == xive_cache_eqc || ctype == xive_cache_vpc)
-		__xive_cache_watch(x, ctype, block, idx, 0, 0, NULL,
-				   true, false);
+	if (ctype == xive_cache_eqc)
+		xive_scrub_workaround_eq(x, block, idx);
+	else if (ctype == xive_cache_vpc)
+		xive_scrub_workaround_vp(x, block, idx);
+
 	return 0;
 }
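For context, the control flow these two helpers slot into looks roughly
like the sketch below. The names and types here are hypothetical
stand-ins, not skiboot's real __xive_cache_scrub() (the register pokes
and completion polling are elided); it only shows where the per-type
workaround now runs once the scrub has completed:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real MMIO sequences. */
enum cache_type { CACHE_EQC, CACHE_VPC, CACHE_OTHER };
extern void issue_scrub(uint64_t block, uint64_t idx, bool want_inval);
extern void wait_scrub_complete(void);
extern void workaround_eq(uint64_t block, uint64_t idx); /* ESB load    */
extern void workaround_vp(uint64_t block, uint64_t idx); /* dummy watch */

static void cache_scrub(enum cache_type ctype, uint64_t block, uint64_t idx)
{
	/* HW bug workaround: as the comment in __xive_cache_scrub()
	 * describes, force an invalidate with the scrub...
	 */
	issue_scrub(block, idx, true);
	wait_scrub_complete();

	/* ...then make the HW read the entry back.  Before this patch a
	 * dummy cache watch on the scrubbed entry did that for both EQs
	 * and VPs; now EQs use the cheaper non-side-effect ESB load and
	 * VPs watch a permanently backed entry instead, so no dirty
	 * cache line is left behind for an entry whose indirect page is
	 * about to be taken away.
	 */
	if (ctype == CACHE_EQC)
		workaround_eq(block, idx);
	else if (ctype == CACHE_VPC)
		workaround_vp(block, idx);
}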