bug in ext3 code causing OOM error on systems with small memory

Message ID 20100312135736.2f3f5b91.akpm@linux-foundation.org
State Not Applicable, archived

Commit Message

Andrew Morton March 12, 2010, 9:57 p.m. UTC
(cc's added)

On Sat, 6 Mar 2010 10:31:07 +0100
"Frans van de Wiel" <fvdw@fvdw.eu> wrote:

> Dear sirs
> 
> Recently I compiled the linux-2.6.33 kernel for my ARM9-based NAS using the orion5x mach.
> The kernel runs, but creating a sub directory outside the root on a big ext3 partition (in my case 500 GB) caused an OOM error.
> 
> journal_get_undo_access: No memory for committed data
> ext3_try_to_allocate_with_rsv: aborting transaction: Out of memory in __ext3_journal_get_undo_access
> 
> Now my NAS has only a tiny system memory (16 MB), but it worked fine on older kernels like 2.6.12.
> I am not an experienced C programmer, but I investigated the problem and think I found the reason. It seemed a good idea to share this with you, as it might be useful for others with the same problem, and I think it will speed up sub directory creation on big partitions.
> The problem is also present in the ext2 driver, but there it does not cause an OOM as there is no journaling; however, it causes a significant delay in directory creation.
> Creating a sub directory took in my case 25 seconds on a 500 GB disk. That's not acceptable.
> 
> It took me a while to figure out why, but it appeared that when trying to create a sub directory the driver starts to look for free blocks with a block group number that was not suitable (too high). The routine then checks all groups one by one to find a suitable group. As there are almost 4000 groups on a 500 GB partition, that takes time, and with ext3 the journaling of that action caused an out-of-memory situation. On ext2 it just took a long time to make a sub directory (up to 20 seconds or so).
> 
> The error was in the balloc.c file, where there is a routine to allocate new blocks.
> 
> By adding printk lines I finally found the place where the problem was. After comparing this file with the linux-2.6.12.6 version, it appeared that the newer version had dropped a check that made the loop continue without trying to allocate in case the group was not suitable, thereby skipping the time- and memory-intensive part of the loop for that group.
> I added that check again and voila, problem solved. I think on a more powerful system with more memory you will never notice the problem, but on the NAS with its limited hardware it caused an issue.
> 
> I attached a file showing the part of the balloc.c file with the problem and the correction made (the correction is in lines 117-120 of the attached file, between the lines marked /* fvdw */). I am not a C expert and just copied the check from the old version (of course adapting variable names to match the new version). But it seems to fix the problem. I checked with printk statements: the adapted routine allocates the same block as without this correction, it only skips unnecessary work. Maybe you can have a look at whether it is ok and will not cause other problems.
> The function at line 137 was causing the OOM error when called too many times in succession in ext3, and in ext2 it caused the delay in creating the directory.
> 
> Hope this information is useful to you. I am not an experienced C programmer, so my bug report may differ from your standards; sorry for this.
> 
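
The effect described in the report can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct group_desc`, `try_allocate_in_group()`, `find_group()` and the `expensive_calls` counter are hypothetical names invented for the illustration. It scans the groups one by one, as the report describes, and optionally skips groups that have no free blocks before doing the costly per-group work.

```c
#include <stdint.h>

/* Hypothetical stand-in for a block group descriptor (not the kernel struct). */
struct group_desc {
	uint16_t free_blocks_count;
};

/* Counts how often the costly per-group step is entered. */
static unsigned long expensive_calls;

/*
 * Hypothetical stand-in for the costly per-group allocation attempt.
 * Returns 0 if the group could satisfy the request, -1 otherwise.
 */
static int try_allocate_in_group(const struct group_desc *gd)
{
	expensive_calls++;
	return gd->free_blocks_count > 0 ? 0 : -1;
}

/*
 * Visit every group once, starting at `start` and wrapping around.
 * With skip_full set, groups without free blocks are skipped before
 * the costly step, which is the idea behind the check in the report.
 * Returns the index of the first usable group, or -1 if there is none.
 */
static long find_group(const struct group_desc *groups, unsigned long ngroups,
		       unsigned long start, int skip_full)
{
	unsigned long i;
	unsigned long group_no = start;

	for (i = 0; i < ngroups; i++, group_no = (group_no + 1) % ngroups) {
		if (skip_full && !groups[group_no].free_blocks_count)
			continue;	/* nothing to gain from this group */
		if (try_allocate_in_group(&groups[group_no]) == 0)
			return (long)group_no;
	}
	return -1;
}
```

In this model, with 4000 groups of which only the last has free blocks, a scan without the check enters the costly step 4000 times, while a scan with the check enters it once and still returns the same group.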

Thanks.  Here's Frans's patch:

Patch

--- a/fs/ext3/balloc.c~a
+++ a/fs/ext3/balloc.c
@@ -1581,6 +1581,8 @@  retry_alloc:
 		gdp = ext3_get_group_desc(sb, group_no, &gdp_bh);
 		if (!gdp)
 			goto io_error;
+		if (!gdp->bg_free_blocks_count)
+			continue;
 		free_blocks = le16_to_cpu(gdp->bg_free_blocks_count);
 		/*
 		 * skip this group if the number of