From patchwork Wed Apr 29 21:41:48 2009
X-Patchwork-Id: 26653
From: Chris Torek
To: sparclinux@vger.kernel.org
Subject: interrupt problem with multiple vnet devices
Date: Wed, 29 Apr 2009 15:41:48 -0600
Message-Id: <200904292141.n3TLfmn16267@elf.torek.net>

One of our guys is working on a problem with Linux as a guest domain
client and multiple vnets bound to the same vswitch:

	ldm add-vnet vnet0 primary-vsw1 ldom1
	ldm add-vnet vnet1 primary-vsw1 ldom1

When configured this way, Linux does not boot properly, as it winds up
perpetually servicing vnet interrupts.

The fix he sent me is below, but I am very suspicious of it, as it
essentially removes the linked list of buckets that the irq code runs.
I think the real problem is that, when multiple vnet devices are on the
same vswitch, the interrupts get re-enabled "too soon", so that the
sun4v_ivec.S code ends up making the linked list circular (by queueing
work for a vnet cookie that is already queued).

It seems a little odd to me, though, that the sun4v_ivec.S code builds
the list backwards (by pushing each new entry onto the head of the
chain after setting its "next" to the current head).  I guess this is
just because it is too hard to add to the end of the list.  I would
probably do this with the assembly-code equivalent of:

	new->next = NULL;
	*tail = new;
	tail = &new->next;

but this requires adding a "tail" pointer to the per-cpu block, and of
course updating it in the irq_64.c code.  (Note that we are still
working with older code that has a sparc64 directory.)

(This patch also includes code to handle irq redistribution on ldom
guests that have a small number of CPUs, with a kernel built to run on
many more CPUs.  It is still not ideal, as it does not do
chip-then-strand distribution; I'm just including it here out of
laziness :-) / a desire not to break the diffs.)
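In C, that tail-insert scheme would look something like the sketch
below.  This is just a userspace toy to show the shape of it -- the
struct and function names are invented for the demo, not the kernel's:

	#include <stdio.h>

	struct bucket {
		struct bucket *next;
		int ino;		/* stand-in for the real contents */
	};

	/*
	 * Per-cpu queue state: the head, plus a tail pointer so new
	 * work is appended in arrival order instead of being pushed
	 * onto the head.
	 */
	struct percpu_queue {
		struct bucket *head;
		struct bucket **tail;
	};

	static void queue_init(struct percpu_queue *q)
	{
		q->head = NULL;
		q->tail = &q->head;
	}

	/* The three steps from the pseudocode above. */
	static void queue_bucket(struct percpu_queue *q, struct bucket *new)
	{
		new->next = NULL;
		*q->tail = new;
		q->tail = &new->next;
	}

	int main(void)
	{
		struct percpu_queue q;
		struct bucket a = { .ino = 1 }, b = { .ino = 2 }, c = { .ino = 3 };
		struct bucket *p;

		queue_init(&q);
		queue_bucket(&q, &a);
		queue_bucket(&q, &b);
		queue_bucket(&q, &c);

		/* Drains in FIFO order (1 2 3), not LIFO (3 2 1). */
		for (p = q.head; p; p = p->next)
			printf("%d\n", p->ino);
		return 0;
	}

The drain side would also have to reset the tail pointer back to &head
whenever it empties the queue, which is why this needs the extra field
in the per-cpu block.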
Chris

---
diff --git a/arch/sparc64/kernel/sun4v_ivec.S b/arch/sparc64/kernel/sun4v_ivec.S
index e2f8e1b..edcd71e 100644
--- a/arch/sparc64/kernel/sun4v_ivec.S
+++ b/arch/sparc64/kernel/sun4v_ivec.S
@@ -108,8 +108,7 @@ sun4v_dev_mondo:
 	sllx	%g3, 4, %g3
 	add	%g4, %g3, %g4
 
-1:	ldx	[%g1], %g2
-	stxa	%g2, [%g4] ASI_PHYS_USE_EC
+1:	stxa	%g0, [%g4] ASI_PHYS_USE_EC
 	stx	%g4, [%g1]
 
 	/* Signal the interrupt by setting (1 << pil) in %softint. */
diff --git a/arch/sparc64/kernel/irq.c b/arch/sparc64/kernel/irq.c
index 7872476..2584b9e 100644
--- a/arch/sparc64/kernel/irq.c
+++ b/arch/sparc64/kernel/irq.c
@@ -70,6 +70,7 @@ static unsigned long bucket_get_chain_pa(unsigned long bucket_pa)
 	return ret;
 }
 
+#if 0	/* This one is no longer needed. */
 static void bucket_clear_chain_pa(unsigned long bucket_pa)
 {
 	__asm__ __volatile__("stxa %%g0, [%0] %1"
@@ -79,6 +80,7 @@ static void bucket_clear_chain_pa(unsigned long bucket_pa)
 					     __irq_chain_pa)),
 			       "i" (ASI_PHYS_USE_EC));
 }
+#endif
 
 static unsigned int bucket_get_virt_irq(unsigned long bucket_pa)
 {
@@ -251,7 +253,7 @@ static int irq_choose_cpu(unsigned int virt_irq)
 	cpumask_t mask = irq_desc[virt_irq].affinity;
 	int cpuid;
 
-	if (cpus_equal(mask, CPU_MASK_ALL)) {
+	if (cpus_subset(mask, CPU_MASK_ALL)) {
 		static int irq_rover;
 		static DEFINE_SPINLOCK(irq_rover_lock);
 		unsigned long flags;
@@ -735,10 +737,12 @@ void handler_irq(int irq, struct pt_regs *regs)
 
 		next_pa = bucket_get_chain_pa(bucket_pa);
 		virt_irq = bucket_get_virt_irq(bucket_pa);
-		bucket_clear_chain_pa(bucket_pa);
 
 		desc = irq_desc + virt_irq;
 
+		if (desc->chip->set_affinity)
+			desc->chip->set_affinity(virt_irq, cpu_online_map);
+
 		desc->handle_irq(virt_irq, desc);
 
 		bucket_pa = next_pa;
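For what it's worth, here is my circular-list theory reduced to a
userspace toy (again, invented names, not the kernel code): if the same
bucket is pushed onto the head of the chain a second time before the
chain is drained, the list becomes circular -- in the simplest case the
bucket's "next" ends up pointing at itself -- and a drain loop like
handler_irq()'s never terminates.

	#include <stdio.h>

	struct bucket {
		struct bucket *next;
		int ino;
	};

	/*
	 * Head insertion, the C equivalent of what sun4v_dev_mondo does:
	 *	ldx	[%g1], %g2		->  new->next = head;
	 *	stxa	%g2, [%g4] ASI_...	->  (stored into the bucket)
	 *	stx	%g4, [%g1]		->  head = new;
	 */
	static void push_head(struct bucket **head, struct bucket *new)
	{
		new->next = *head;
		*head = new;
	}

	int main(void)
	{
		struct bucket vnet = { .ino = 7 };
		struct bucket *head = NULL;
		struct bucket *p;
		int n = 0;

		push_head(&head, &vnet);
		push_head(&head, &vnet);	/* queued again before drain */

		/* An uncapped walk here would spin forever, which looks
		 * just like perpetually servicing vnet interrupts. */
		for (p = head; p && n < 5; p = p->next, n++)
			printf("servicing ino %d\n", p->ino);
		printf("circular: %s\n", vnet.next == &vnet ? "yes" : "no");
		return 0;
	}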