From patchwork Sat Mar 20 14:05:13 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: jing zhang
X-Patchwork-Id: 48210
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id A9B59B7D0B
	for ; Sun, 21 Mar 2010 01:05:18 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751995Ab0CTOFQ (ORCPT );
	Sat, 20 Mar 2010 10:05:16 -0400
Received: from mail-gw0-f46.google.com ([74.125.83.46]:56514 "EHLO
	mail-gw0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751707Ab0CTOFP (ORCPT );
	Sat, 20 Mar 2010 10:05:15 -0400
Received: by gwaa18 with SMTP id a18so237398gwa.19
	for ; Sat, 20 Mar 2010 07:05:14 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=gmail.com; s=gamma;
	h=domainkey-signature:mime-version:received:in-reply-to:references
	:date:message-id:subject:from:to:cc:content-type;
	bh=jf9y4A45tSyB+IoRBLVxVOsAcHUi1zC4jIFPjZznhFc=;
	b=hYIoIACLeiVcDbclLQ1n6t/fvRTTSgfahBdIfDEKg/jDBxuLwcWk5YcfqH/lKU5IOI
	FILtshU0NqG3QxthYUQcAsJ4rz0IhpuqNTlu2B1ubmuscB44LQdUm6aFoNAWJgPmrnlC
	IPUPFo+FD6ZA/vZ9u2zzh+wNYppsIFhPRN81I=
DomainKey-Signature: a=rsa-sha1; c=nofws;
	d=gmail.com; s=gamma;
	h=mime-version:in-reply-to:references:date:message-id:subject:from:to
	:cc:content-type;
	b=l7lSkfnNv+WNY89nWYxo6SdDIykmmQ41Ddf96Oc3F4EAaAXBAxKqIIybr8dU1pBY7P
	gPU+4qAb30naXPt23TxHwk/692yShGq8sYP1X8pNJV8Apv1c/ITeLOe5Mq/PxkHG3bLN
	4FpT0N0WO5TN76V2jtQVP7lpuJvwk9Yxw2JNg=
MIME-Version: 1.0
Received: by 10.101.176.33 with SMTP id d33mr2191055anp.49.1269093914142;
	Sat, 20 Mar 2010 07:05:14 -0700 (PDT)
In-Reply-To: <67790F0F-9921-4A98-8DC6-DA1C00CE6CA9@sun.com>
References: <20100318174629.GK8256@thunk.org>
	<67790F0F-9921-4A98-8DC6-DA1C00CE6CA9@sun.com>
Date: Sat, 20 Mar 2010 22:05:13 +0800
Message-ID:
Subject: Re: [PATCH] ext4:
 memory leakage in ext4_discard_preallocations
From: jing zhang
To: Andreas Dilger
Cc: tytso@mit.edu, linux-ext4, Dave Kleikamp
Sender: linux-ext4-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-ext4@vger.kernel.org

2010/3/20, Andreas Dilger:
> On 2010-03-19, at 08:17, jing zhang wrote:
>>>>  	ext4_get_group_no_and_offset(sb, pa->pa_pstart, &group, NULL);
>>>> @@ -3811,6 +3813,12 @@ repeat:
>>>>  		list_del(&pa->u.pa_tmp_list);
>>>>  		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>>>>  	}
>>>> +	if (! list_empty(&list)) {
>>>> +		if (occurs++ < 2)
>>>> +			goto best_efforts;
>>>> +		else
>>>> +			BUG();
>>>> +	}
>>>>  	if (ac)
>>>>  		kmem_cache_free(ext4_ac_cachep, ac);
>>>>  }
>>>
>>> Hmm, I'm not sure that BUG() is appropriate here. If there is an
>>> I/O error reading the block bitmap, #1, retrying isn't going to help,
>>> and #2, bringing down the entire system just because of an I/O error
>>> in reading the block bitmap doesn't seem right.
>>
>> But disk hardware error is not rare,
>
> Exactly, which is the reason why it should not cause the system to
> hang. The filesystem should handle such errors gracefully if this is
> possible, return an error to the application, and/or marking the
> filesystem in error so that it will be checked on next boot, or similar.
>
>>> Right now, if there is a problem, we just end up leaving the
>>> preallocated list on the inode. Does that cause problems later on
>>> down the line which you have observed?
>>>
>>> - Ted
>>
>> and is there still chance to call the
>>     call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
>> function again later on? (I am not sure yet the chance does exist.)
>>
>> If no chance, how about the kmem_cache subsystem then?
>> After reboot, the file system is still reliable, or just with a few
>> lost blocks?
>>
>> Thus it is necessary, at least for me, to make sure whether the
>> chance exists.
>> - zj
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.

Evening,

Thanks, Andreas and Ted, for your explanations of how to handle the error
gracefully. I see now that the chance to call call_rcu() on the pa again
later may exist, since the pa has not yet been deleted from its group_list.

It also seems that there is more work to be done here.

	- zj

---
--- linux-2.6.32/fs/ext4/mballoc.c	2009-12-03 11:51:22.000000000 +0800
+++ fs/mballoc.c	2010-03-20 21:40:04.000000000 +0800
@@ -3788,14 +3788,14 @@ repeat:
 		err = ext4_mb_load_buddy(sb, group, &e4b);
 		if (err) {
 			ext4_error(sb, __func__, "Error in loading buddy "
-					"information for %u", group);
+				"information for group %u inode %lu", group, inode->i_ino);
 			continue;
 		}
 
 		bitmap_bh = ext4_read_block_bitmap(sb, group);
 		if (bitmap_bh == NULL) {
 			ext4_error(sb, __func__, "Error in reading block "
-					"bitmap for %u", group);
+				"bitmap for group %u inode %lu", group, inode->i_ino);
 			ext4_mb_release_desc(&e4b);
 			continue;
 		}
@@ -3811,6 +3811,14 @@ repeat:
 		list_del(&pa->u.pa_tmp_list);
 		call_rcu(&(pa)->u.pa_rcu, ext4_mb_pa_callback);
 	}
+	if (! list_empty(&list)) {
+		/*
+		 * we have to do something for the check in
+		 * the function, ext4_mb_discard_group_preallocations()
+		 */
+		list_for_each_entry(pa, &list, u.pa_tmp_list)
+			pa->pa_deleted = 0;
+	}
 	if (ac)
 		kmem_cache_free(ext4_ac_cachep, ac);
 }