From patchwork Mon Nov 17 16:10:23 2008 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arnd Bergmann X-Patchwork-Id: 9171 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from ozlabs.org (localhost [127.0.0.1]) by ozlabs.org (Postfix) with ESMTP id 7F903DDF6A for ; Tue, 18 Nov 2008 03:11:28 +1100 (EST) X-Original-To: cbe-oss-dev@ozlabs.org Delivered-To: cbe-oss-dev@ozlabs.org Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.171]) by ozlabs.org (Postfix) with ESMTP id D6628DDDEC; Tue, 18 Nov 2008 03:10:32 +1100 (EST) Received: from dyn-9-152-222-41.boeblingen.de.ibm.com (blueice4n1.de.ibm.com [195.212.29.187]) by mrelayeu.kundenserver.de (node=mrelayeu8) with ESMTP (Nemesis) id 0ML31I-1L26gF3rHH-0003bw; Mon, 17 Nov 2008 17:10:29 +0100 From: Arnd Bergmann To: linuxppc-dev@ozlabs.org, benh@kernel.crashing.org, paulus@samba.org Date: Mon, 17 Nov 2008 17:10:23 +0100 User-Agent: KMail/1.9.9 X-Face: I@=L^?./?$U, EK.)V[4*>`zSqm0>65YtkOe>TFD'!aw?7OVv#~5xd\s, [~w]-J!)|%=]>=?utf-8?q?+=0A=09=7EohchhkRGW=3F=7C6=5FqTmkd=5Ft=3FLZC=23Q-=60=2E=60Y=2Ea=5E?= =?utf-8?q?3zb?=) =?utf-8?q?+U-JVN=5DWT=25cw=23=5BYo0=267C=26bL12wWGlZi=0A=09=7EJ=3B=5Cwg?= =?utf-8?q?=3B3zRnz?=, J"CT_)=\H'1/{?SR7GDu?WIopm.HaBG=QYj"NZD_[zrM\Gip^U MIME-Version: 1.0 Content-Disposition: inline Message-Id: <200811171710.24521.arnd@arndb.de> X-Provags-ID: V01U2FsdGVkX1+1yRt4OsD/WcKSsEjo+06K7TA156JpHme1Wt7 lNAU9aI0LLVpuZfhmCwxD6O2uhNnpoKnbq/DHvG8PaJCpqv5QH 86Eobvw+ItEb0blFo9cMw== Cc: cbe-oss-dev@ozlabs.org, GBIRAN@il.ibm.com Subject: [Cbe-oss-dev] [PATCH] powerpc/cell/axon-msi: retry on missing interrupt X-BeenThere: cbe-oss-dev@ozlabs.org X-Mailman-Version: 2.1.11 Precedence: list List-Id: Discussion about Open Source Software for the Cell Broadband Engine List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: cbe-oss-dev-bounces+patchwork-incoming=ozlabs.org@ozlabs.org Errors-To: cbe-oss-dev-bounces+patchwork-incoming=ozlabs.org@ozlabs.org The MSI capture logic for the axon bridge can sometimes lose interrupts in case of high DMA and interrupt load, when it signals an MSI interrupt to the MPIC interrupt controller while we are already handling another MSI. Each MSI vector gets written into a FIFO buffer in main memory using DMA, and that DMA access is normally flushed by the actual interrupt packet on the IOIF. An MMIO register in the MSIC holds the position of the last entry in the FIFO buffer that was written. However, reading that position does not flush the DMA, so that we can observe stale data in the buffer. In a stress test, we have observed the DMA to arrive up to 14 microseconds after reading the register. This patch works around this problem by retrying the access to the FIFO buffer. We can reliably detect the conditioning by writing an invalid MSI vector into the FIFO buffer after reading from it, assuming that all MSIs we get are valid. After detecting an invalid MSI vector, we udelay(1) in the interrupt cascade for up to 100 times before giving up. Signed-off-by: Arnd Bergmann Signed-off-by: Arnd Bergmann --- axon_msi.c | 36 +++++++++++++++++++++++++++++++----- 1 files changed, 31 insertions(+), 5 deletions(-) Index: linux-2.6/arch/powerpc/platforms/cell/axon_msi.c =================================================================== --- linux-2.6.orig/arch/powerpc/platforms/cell/axon_msi.c 2008-11-17 10:29:05.000000000 -0500 +++ linux-2.6/arch/powerpc/platforms/cell/axon_msi.c 2008-11-17 10:29:08.000000000 -0500 @@ -95,6 +95,7 @@ struct axon_msic *msic = get_irq_data(irq); u32 write_offset, msi; int idx; + int retry = 0; write_offset = dcr_read(msic->dcr_host, MSIC_WRITE_OFFSET_REG); pr_debug("axon_msi: original write_offset 0x%x\n", write_offset); @@ -102,7 +103,7 @@ /* write_offset doesn't wrap properly, so we have to mask it */ write_offset &= MSIC_FIFO_SIZE_MASK; - while (msic->read_offset != write_offset) { + while (msic->read_offset != write_offset && retry < 100) { idx = msic->read_offset / sizeof(__le32); msi = le32_to_cpu(msic->fifo_virt[idx]); msi &= 0xFFFF; @@ -110,13 +111,37 @@ pr_debug("axon_msi: woff %x roff %x msi %x\n", write_offset, msic->read_offset, msi); + if (msi < NR_IRQS && irq_map[msi].host == msic->irq_host) { + generic_handle_irq(msi); + msic->fifo_virt[idx] = cpu_to_le32(0xffffffff); + } else { + /* + * Reading the MSIC_WRITE_OFFSET_REG does not + * reliably flush the outstanding DMA to the + * FIFO buffer. Here we were reading stale + * data, so we need to retry. + */ + udelay(1); + retry++; + pr_debug("axon_msi: invalid irq 0x%x!\n", msi); + continue; + } + + if (retry) { + pr_debug("axon_msi: late irq 0x%x, retry %d\n", + msi, retry); + retry = 0; + } + msic->read_offset += MSIC_FIFO_ENTRY_SIZE; msic->read_offset &= MSIC_FIFO_SIZE_MASK; + } - if (msi < NR_IRQS && irq_map[msi].host == msic->irq_host) - generic_handle_irq(msi); - else - pr_debug("axon_msi: invalid irq 0x%x!\n", msi); + if (retry) { + printk(KERN_WARNING "axon_msi: irq timed out\n"); + + msic->read_offset += MSIC_FIFO_ENTRY_SIZE; + msic->read_offset &= MSIC_FIFO_SIZE_MASK; } desc->chip->eoi(irq); @@ -364,6 +389,7 @@ dn->full_name); goto out_free_fifo; } + memset(msic->fifo_virt, 0xff, MSIC_FIFO_SIZE_BYTES); msic->irq_host = irq_alloc_host(dn, IRQ_HOST_MAP_NOMAP, NR_IRQS, &msic_host_ops, 0);