Message ID | 20090216190001.GB11788@mini-me.lan |
---|---|
State | Accepted, archived |

On Mon, 16 Feb 2009 14:00:01 -0500, I waved a wand and this message
magically appears in front of Theodore Tso:

> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
> >
> > So, yes, seems to be an inode allocation problem.
>
> Andres, Alex, others,
>
> I'm pretty sure the ENOSPC problem which you both found is an inode
> allocation problem. Some of you seem to have an easier time
> reproducing it than others; could you try this patch, and periodically
> scan your system logs for the message "ext4: find_group_flex failed,
> fallback succeeded"? If the problem goes away for you, and you find
> the occasional aforementioned message in your system log, that will
> confirm what I suspect, which is the bug is in fs/ext4/ialloc.c's
> find_group_flex() function. (If I'm wrong, the fallback code will
> activate only when the filesystem is genuinely out of inodes, which
> should be very rare.)

OK, I had to go look through the archives on the linux-ext4 mailing list
to see what the context was. For myself, this used to happen at least
once a week with 2.6.26, and less frequently with 2.6.27. I think that
2.6.28 with your patch should get rid of that problem altogether. I will
of course get in touch should I see any more of these find_group_flex
failures as that would mean your patch worked.

Thanks for your work on tracking this one down!

On 02/16/2009 08:00 PM, Theodore Tso wrote:
> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
>> So, yes, seems to be an inode allocation problem.
> I'm pretty sure the ENOSPC problem which you both found is an inode
> allocation problem. Some of you seem to have an easier time
> reproducing it than others; could you try this patch, and periodically
> scan your system logs for the message "ext4: find_group_flex failed,
> fallback succeeded"? If the problem goes away for you, and you find
> the occasional aforementioned message in your system log, that will
> confirm what I suspect, which is the bug is in fs/ext4/ialloc.c's
> find_group_flex() function. (If I'm wrong, the fallback code will
> activate only when the filesystem is genuinely out of inodes, which
> should be very rare.)
>
> More comments are in the patch header. My current long-term plan for
> dealing with this is to enhance find_group_orlov() and
> find_group_other() to understand flex_bg's.

Ok. I am now running with the patch enabled on two machines - but as the
issue occurred only 2 times in nearly 2 months on two machines...

Andres
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Theodore Tso wrote:
> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
>> So, yes, seems to be an inode allocation problem.
>>
>
> Andres, Alex, others,
>
> I'm pretty sure the ENOSPC problem which you both found is an inode
> allocation problem. Some of you seem to have an easier time
> reproducing it than others; could you try this patch, and periodically
> scan your system logs for the message "ext4: find_group_flex failed,
> fallback succeeded"? If the problem goes away for you, and you find
> the occasional aforementioned message in your system log, that will
> confirm what I suspect, which is the bug is in fs/ext4/ialloc.c's
> find_group_flex() function. (If I'm wrong, the fallback code will
> activate only when the filesystem is genuinely out of inodes, which
> should be very rare.)
>
> More comments are in the patch header. My current long-term plan for
> dealing with this is to enhance find_group_orlov() and
> find_group_other() to understand flex_bg's.

Ok, I finally got to where I can reliably hit this. Just as I was about
to install an ext4 with this patch in place, and the bug was preventing
the new initrd creation ;) But worked around that, and:

ext4: find_group_flex failed, fallback succeeded dir 258402
ext4: find_group_flex failed, fallback succeeded dir 258402
ext4: find_group_flex failed, fallback succeeded dir 258402
ext4: find_group_flex failed, fallback succeeded dir 258402
....

I'll see if I can dig a bit more as to why the find_group_flex failed,
if you think it's worth it, Ted.

-Eric

Eric Sandeen wrote:
> Theodore Tso wrote:
>> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
>>> So, yes, seems to be an inode allocation problem.
>>>
>> Andres, Alex, others,
>>
>> I'm pretty sure the ENOSPC problem which you both found is an inode
>> allocation problem. Some of you seem to have an easier time
>> reproducing it than others; could you try this patch, and periodically
>> scan your system logs for the message "ext4: find_group_flex failed,
>> fallback succeeded"? If the problem goes away for you, and you find
>> the occasional aforementioned message in your system log, that will
>> confirm what I suspect, which is the bug is in fs/ext4/ialloc.c's
>> find_group_flex() function. (If I'm wrong, the fallback code will
>> activate only when the filesystem is genuinely out of inodes, which
>> should be very rare.)
>>
>> More comments are in the patch header. My current long-term plan for
>> dealing with this is to enhance find_group_orlov() and
>> find_group_other() to understand flex_bg's.
>
> Ok, I finally got to where I can reliably hit this. Just as I was about
> to install an ext4 with this patch in place, and the bug was preventing
> the new initrd creation ;) But worked around that, and:
>
> ext4: find_group_flex failed, fallback succeeded dir 258402
> ext4: find_group_flex failed, fallback succeeded dir 258402
> ext4: find_group_flex failed, fallback succeeded dir 258402
> ext4: find_group_flex failed, fallback succeeded dir 258402
> ....
>
> I'll see if I can dig a bit more as to why the find_group_flex failed,
> if you think it's worth it, Ted.

FWIW my problem seems to be different than others have encountered; mine
persists past reboot, while other reporters have said that a reboot
(remount) makes the problem go away.

I seem to be encountering some silliness in find_group_flex when 2 out
of 3 groups are full (I "only" have 55k inodes left, all in the last
group).

-Eric

On Tue, Feb 17, 2009 at 02:08:21PM -0600, Eric Sandeen wrote:
> FWIW my problem seems to be different than others have encountered; mine
> persists past reboot, while other reporters have said that a reboot
> (remount) makes the problem go away.

It might or might not be the same problem, since the reporters were
doing this on a mounted root partition, and on a filesystem quite a bit
larger than your test filesystem; so it could be that the act of
shutting down and rebooting created/deleted various pid files, and
perturbed the filesystem to make the problem go away.

The other possibility is that it is the flex_bg-specific counters which
were introduced specifically for find_group_flex. I'm not wild about
them since they mean we have to take an extra flex_bg-specific spin lock
for every block and inode allocation. The Orlov algorithm only needs
the information when allocating directories, and since those are rarer
than file allocations, I think it should be OK to simply sum up the
necessary fields at directory allocation time instead of trying to
maintain separate counters (which could possibly get corrupted, although
I couldn't see a way that they could be getting out of sync with
reality).

- Ted
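
For readers following along, the "sum up the necessary fields at
directory allocation time" idea can be sketched in userspace C. This is
only an illustration under stated assumptions: struct group_desc, struct
flex_stats and sum_flex_group() are hypothetical names, not the kernel's
data structures, and the only thing borrowed from the thread is the
notion that a flex group spans 2^s_log_groups_per_flex block groups.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-block-group counters; illustrative only, this is
 * not the kernel's struct ext4_group_desc. */
struct group_desc {
	unsigned int free_inodes;
	unsigned int free_blocks;
	unsigned int used_dirs;
};

/* Totals for one flex group, computed on demand. */
struct flex_stats {
	unsigned long free_inodes;
	unsigned long free_blocks;
	unsigned long used_dirs;
};

/* Sum the counters of every block group that belongs to flex group
 * `flex`. Each flex group spans (1 << log_groups_per_flex) block
 * groups; the last flex group may be shorter. No separate running
 * counters are kept, so nothing can drift out of sync. */
static struct flex_stats sum_flex_group(const struct group_desc *groups,
					size_t ngroups, size_t flex,
					unsigned int log_groups_per_flex)
{
	struct flex_stats st = { 0, 0, 0 };
	size_t per_flex = (size_t)1 << log_groups_per_flex;
	size_t first = flex * per_flex;
	size_t last = first + per_flex;

	assert(groups != NULL);
	if (last > ngroups)
		last = ngroups;

	for (size_t g = first; g < last; g++) {
		st.free_inodes += groups[g].free_inodes;
		st.free_blocks += groups[g].free_blocks;
		st.used_dirs += groups[g].used_dirs;
	}
	return st;
}
```

The trade-off in this sketch is a short loop per directory allocation in
exchange for no extra lock and no extra state on the much more frequent
file and block allocations.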

On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message
magically appears in front of Eric Sandeen:

> FWIW my problem seems to be different than others have encountered;
> mine persists past reboot, while other reporters have said that a
> reboot (remount) makes the problem go away.
>
> I seem to be encountering some silliness in find_group_flex when 2 out
> of 3 groups are full (I "only" have 55k inodes left, all in the last
> group).

I've discovered a forced fsck clears this. HTH.

Alex Buell wrote:
> On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message
> magically appears in front of Eric Sandeen:
>
>> FWIW my problem seems to be different than others have encountered;
>> mine persists past reboot, while other reporters have said that a
>> reboot (remount) makes the problem go away.
>>
>> I seem to be encountering some silliness in find_group_flex when 2 out
>> of 3 groups are full (I "only" have 55k inodes left, all in the last
>> group).
>
> I've discovered a forced fsck clears this. HTH.

Do you have the output of the fsck run?

-Eric

On Tue, 17 Feb 2009 16:56:23 -0600, I waved a wand and this message
magically appears in front of Eric Sandeen:

> Alex Buell wrote:
> > On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message
> > magically appears in front of Eric Sandeen:
> >
> >> FWIW my problem seems to be different than others have encountered;
> >> mine persists past reboot, while other reporters have said that a
> >> reboot (remount) makes the problem go away.
> >>
> >> I seem to be encountering some silliness in find_group_flex when 2
> >> out of 3 groups are full (I "only" have 55k inodes left, all in
> >> the last group).
> >
> > I've discovered a forced fsck clears this. HTH.
>
> Do you have the output of the fsck run?

I'm afraid not, there's no way to save the output on a forced fsck
reboot.

Hi All,

On 02/17/2009 06:36 PM, Andres Freund wrote:
> On 02/16/2009 08:00 PM, Theodore Tso wrote:
>> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote:
>>> So, yes, seems to be an inode allocation problem.
>> I'm pretty sure the ENOSPC problem which you both found is an inode
>> allocation problem. Some of you seem to have an easier time
>> reproducing it than others; could you try this patch, and periodically
>> scan your system logs for the message "ext4: find_group_flex failed,
>> fallback succeeded"? If the problem goes away for you, and you find
>> the occasional aforementioned message in your system log, that will
>> confirm what I suspect, which is the bug is in fs/ext4/ialloc.c's
>> find_group_flex() function. (If I'm wrong, the fallback code will
>> activate only when the filesystem is genuinely out of inodes, which
>> should be very rare.)
>> More comments are in the patch header. My current long-term plan for
>> dealing with this is to enhance find_group_orlov() and
>> find_group_other() to understand flex_bg's.
> Ok. I am now running with the patch enabled on two machines - but as the
> issue occurred only 2 times in nearly 2 months on two machines...

Didn't take that long:

On one of the machines I got several thousand of:

[10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319
[10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319
[10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319
[10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319
(with different directories)

No userspace visible behaviour.

So it seems you were right. It seems sensible to put that patch without
printk in the kernel until the issue is fully solved...

Andres
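
With several thousand such lines, it can help to pull the directory
inode number out of each one before counting or grouping them. A minimal
sketch, assuming the message text matches the format quoted in this
thread; parse_fallback_dir() is a hypothetical helper and not part of
any existing tool:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 and store the directory inode number if `line` contains the
 * fallback message quoted in this thread, otherwise return 0. Any
 * prefix such as a "[10379.575904]" timestamp is skipped, because the
 * marker is located with strstr(). */
static int parse_fallback_dir(const char *line, unsigned long *dir_ino)
{
	static const char marker[] =
		"ext4: find_group_flex failed, fallback succeeded dir ";
	const char *p;

	assert(line != NULL && dir_ino != NULL);

	p = strstr(line, marker);
	if (p == NULL)
		return 0;

	/* sizeof(marker) - 1 is the marker length without the NUL. */
	return sscanf(p + sizeof(marker) - 1, "%lu", dir_ino) == 1;
}
```

Feeding each line of a saved kernel log through this helper gives the
set of affected directories, which is the detail the "(with different
directories)" remark above only summarises.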

On Wed, Feb 18, 2009 at 10:18:57PM +0100, Andres Freund wrote:
> On one of the machines I got several thousand of:
>
> [10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319
> [10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319
> [10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319
> [10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319
> (with different directories)

Ok, that's good. Good to know the workaround works.

Can you send me a dumpe2fs of the filesystem in question? I'm curious
what was going on...

> No userspace visible behaviour.
>
> So it seems you were right. It seems sensible to put that patch without
> printk in the kernel until the issue is fully solved...

Thanks for the report. I'll push the workaround patch to Linus for
2.6.29 to avoid this problem for now. I recently sent to linux-ext4
for comment a patch to revamp the Orlov allocator for flex_bg and to
use that instead of find_group_flex(), but no way that's going into
2.6.29 at this point....

- Ted

On 02/18/2009 10:29 PM, Theodore Tso wrote:
> Ok, that's good. Good to know the workaround works.
> Can you send me a dumpe2fs of the filesystem in question? I'm curious
> what was going on...

Will do as soon as I am at the same place as the machine. I guess that's
only interesting to you privately (size and so on)?

> Thanks for the report. I'll push the workaround patch to Linus for
> 2.6.29 to avoid this problem for now. I recently sent to linux-ext4
> for comment a patch to revamp the Orlov allocator for flex_bg and to
> use that instead of find_group_flex(), but no way that's going into
> 2.6.29 at this point....

Would it be helpful if I test that patch?

Andres

On Thu, Feb 19, 2009 at 03:18:45AM +0100, Andres Freund wrote:
> On 02/18/2009 10:29 PM, Theodore Tso wrote:
>> Ok, that's good. Good to know the workaround works.
>> Can you send me a dumpe2fs of the filesystem in question? I'm curious
>> what was going on...
> Will do as soon as I am at the same place as the machine. I guess that's
> only interesting to you privately (size and so on)?
>
>> Thanks for the report. I'll push the workaround patch to Linus for
>> 2.6.29 to avoid this problem for now. I recently sent to linux-ext4
>> for comment a patch to revamp the Orlov allocator for flex_bg and to
>> use that instead of find_group_flex(), but no way that's going into
>> 2.6.29 at this point....
> Would it be helpful if I test that patch?
>

Sure, I'll take all of the testing I can get. :-)

The patch is in the ext4 patch queue, and I also sent it to the
linux-ext4 list. The patch is also in patchwork:

http://patchwork.ozlabs.org/patch/23343/

The patch which I sent you earlier (available below) is a prerequisite:

http://patchwork.ozlabs.org/patch/23228/

- Ted

Theodore Tso wrote:
> On Thu, Feb 19, 2009 at 03:18:45AM +0100, Andres Freund wrote:
>> On 02/18/2009 10:29 PM, Theodore Tso wrote:
>>> Ok, that's good. Good to know the workaround works.
>>> Can you send me a dumpe2fs of the filesystem in question? I'm curious
>>> what was going on...
>> Will do as soon as I am at the same place as the machine. I guess that's
>> only interesting to you privately (size and so on)?
>>
>>> Thanks for the report. I'll push the workaround patch to Linus for
>>> 2.6.29 to avoid this problem for now. I recently sent to linux-ext4
>>> for comment a patch to revamp the Orlov allocator for flex_bg and to
>>> use that instead of find_group_flex(), but no way that's going into
>>> 2.6.29 at this point....
>> Would it be helpful if I test that patch?
>>
>
> Sure, I'll take all of the testing I can get. :-)
>
> The patch is in the ext4 patch queue, and I also sent it to the
> linux-ext4 list. The patch is also in patchwork:
>
> http://patchwork.ozlabs.org/patch/23343/
>
> The patch which I sent you earlier (available below) is a prerequisite:
>
> http://patchwork.ozlabs.org/patch/23228/

Ted, I hope the printk will be removed or at least ratelimited before it
gets upstream?

Thanks,
-Eric

On Thu, Feb 19, 2009 at 09:46:51AM -0600, Eric Sandeen wrote:
>
> Ted, I hope the printk will be removed or at least ratelimited before it
> gets upstream?
>

Yes, I've added a ratelimit.

- Ted

On 02/18/2009 10:29 PM, Theodore Tso wrote:
>> [10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319
>> [10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319
>> [10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319
>> [10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319
>> (with different directories)
> Can you send me a dumpe2fs of the filesystem in question? I'm curious
> what was going on...

Unfortunately the system was rebooted before I had the chance to do the
dump - since then the problem has not reemerged.

Would a dump after reboot still be useful?

Andres

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index a200059..21080ab 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -715,6 +715,13 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, int mode)
 	if (sbi->s_log_groups_per_flex) {
 		ret2 = find_group_flex(sb, dir, &group);
+		if (ret2 == -1) {
+			ret2 = find_group_other(sb, dir, &group);
+			if (ret2 == 0)
+				printk(KERN_NOTICE "ext4: find_group_flex "
+				       "failed, fallback succeeded dir %lu\n",
+				       dir->i_ino);
+		}
 		goto got_group;
 	}
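
The control flow of the patch (try the primary finder, fall back only
when it returns -1, and log only when the fallback succeeds) can be
modelled in userspace. Everything below is a hypothetical stand-in: the
finder functions are not the kernel's find_group_flex() or
find_group_other(), and the always-failing primary exists only to
exercise the fallback path, not to explain why the real function fails.

```c
#include <assert.h>
#include <stdio.h>

/* A finder returns 0 and sets *group on success, -1 on failure. */
typedef int (*group_finder)(const unsigned int *free_inodes, int ngroups,
			    int *group);

/* Stand-in for a primary heuristic that gives up. */
static int finder_always_fails(const unsigned int *free_inodes, int ngroups,
			       int *group)
{
	(void)free_inodes;
	(void)ngroups;
	(void)group;
	return -1;
}

/* Hypothetical fallback: first group with any free inode. */
static int finder_linear(const unsigned int *free_inodes, int ngroups,
			 int *group)
{
	for (int g = 0; g < ngroups; g++) {
		if (free_inodes[g] > 0) {
			*group = g;
			return 0;
		}
	}
	return -1;
}

/* Same shape as the patch: the fallback runs only after the primary
 * fails, and the notice is printed only if the fallback succeeds. */
static int pick_group(group_finder primary, group_finder fallback,
		      const unsigned int *free_inodes, int ngroups,
		      int *group, int *used_fallback)
{
	int ret;

	assert(primary && fallback && free_inodes && group && used_fallback);

	*used_fallback = 0;
	ret = primary(free_inodes, ngroups, group);
	if (ret == -1) {
		ret = fallback(free_inodes, ngroups, group);
		if (ret == 0) {
			*used_fallback = 1;
			fprintf(stderr,
				"primary finder failed, fallback succeeded\n");
		}
	}
	return ret;
}
```

With free inode counts of {0, 0, 55000}, similar to the "2 out of 3
groups are full" case described earlier in the thread, this sketch picks
the last group through the fallback. When every group is full, both
finders fail and nothing is logged, which matches the expectation that a
genuinely full filesystem still reports an error.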