From patchwork Wed Jul 15 21:43:04 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mike Mason
X-Patchwork-Id: 29833
Message-ID: <4A5E4D68.6070909@us.ibm.com>
Date: Wed, 15 Jul 2009 14:43:04 -0700
From: Mike Mason
To: linuxppc-dev@ozlabs.org, Paul Mackerras, benh@kernel.crashing.org,
 linasvepstas@gmail.com
Subject: [PATCH] Hold reference to device_node during EEH event handling
List-Id: Linux on PowerPC Developers Mail List

This patch increments the device_node reference counter when an EEH
error occurs and decrements the counter when the event has been
handled. This is to prevent the device_node from being released until
eeh_event_handler() has had a chance to deal with the event. We've
seen cases where the device_node is released too soon when an EEH
event occurs during a dlpar remove, causing the event handler to
attempt to access bad memory locations.

Please review and let me know of any concerns.

Signed-off-by: Mike Mason

--- a/arch/powerpc/platforms/pseries/eeh_event.c	2008-10-09 15:13:53.000000000 -0700
+++ b/arch/powerpc/platforms/pseries/eeh_event.c	2009-07-14 14:14:00.000000000 -0700
@@ -75,6 +75,14 @@ static int eeh_event_handler(void * dumm
 	if (event == NULL)
 		return 0;
 
+	/* EEH holds a reference to the device_node, so if it
+	 * equals 1 it's no longer valid and the event should
+	 * be ignored */
+	if (atomic_read(&event->dn->kref.refcount) == 1) {
+		of_node_put(event->dn);
+		return 0;
+	}
+
 	/* Serialize processing of EEH events */
 	mutex_lock(&eeh_event_mutex);
 	eeh_mark_slot(event->dn, EEH_MODE_RECOVERING);
@@ -86,6 +94,7 @@ static int eeh_event_handler(void * dumm
 
 	eeh_clear_slot(event->dn, EEH_MODE_RECOVERING);
 	pci_dev_put(event->dev);
+	of_node_put(event->dn);
 	kfree(event);
 	mutex_unlock(&eeh_event_mutex);
 
@@ -140,7 +149,7 @@ int eeh_send_failure_event (struct devic
 	if (dev)
 		pci_dev_get(dev);
 
-	event->dn = dn;
+	event->dn = of_node_get(dn);
 	event->dev = dev;
 
 	/* We may or may not be called in an interrupt context */
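
For reviewers who want to reason about the get/put pairing outside the kernel, the following is a minimal userspace sketch of the same pattern. It is an illustration under stated assumptions, not kernel code: struct node, node_get(), node_put(), send_event() and handle_event() are hypothetical stand-ins for struct device_node, of_node_get(), of_node_put(), eeh_send_failure_event() and eeh_event_handler(), and a plain int replaces the kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device_node; not the kernel type. */
struct node {
	int refcount;
	int released;	/* set once the last reference has been dropped */
};

/* Analogue of of_node_get(): take a reference and return the node. */
static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Analogue of of_node_put(): drop a reference, release at zero. */
static void node_put(struct node *n)
{
	if (!n)
		return;
	assert(n->refcount > 0);
	if (--n->refcount == 0)
		n->released = 1;	/* the kernel would free the node here */
}

/* Hypothetical stand-in for struct eeh_event. */
struct event {
	struct node *dn;
};

/* Queue side: the event pins the node for as long as it is pending. */
static struct event *send_event(struct node *dn)
{
	struct event *ev = malloc(sizeof(*ev));

	if (!ev)
		return NULL;
	ev->dn = node_get(dn);
	return ev;
}

/*
 * Handler side: returns 1 if the event was processed, 0 if it was
 * ignored because the event holds the only remaining reference.
 * The event's reference is dropped on both paths.
 */
static int handle_event(struct event *ev)
{
	int handled = 0;

	if (ev->dn->refcount > 1) {
		/* recovery work on ev->dn would go here */
		handled = 1;
	}
	node_put(ev->dn);
	free(ev);
	return handled;
}
```

In this sketch a node that still has an owner is processed normally, while a node whose owner dropped its reference during a pending event is skipped and released only when the handler gives up the last reference.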