Message ID | 1323376115-23881-2-git-send-email-jack@suse.cz |
---|---|
State | Not Applicable, archived |
On 12/8/11 2:28 PM, Jan Kara wrote:
> When insert_inode_locked() fails in ext3_new_inode() it most likely
> means inode bitmap got corrupted and we allocated again inode which
> is already in use. Also doing unlock_new_inode() during error recovery
> is wrong since inode does not have I_NEW set. Fix the problem by jumping
> to fail: (instead of fail_drop:) which declares filesystem error and
> does not call unlock_new_inode().
>
> Signed-off-by: Jan Kara <jack@suse.cz>

Reviewed-by: Eric Sandeen <sandeen@redhat.com>

I think ext2 could use the same treatment.

BTW, though, have you recently started seeing the issue? We have
people hitting this when resuming after suspend; it seems likely
that the bitmap did get corrupted though, based on some other
things seen in similar bugs.

-Eric

> ---
>  fs/ext3/ialloc.c |    8 ++++++--
>  1 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
> index 5c866e0..adae962 100644
> --- a/fs/ext3/ialloc.c
> +++ b/fs/ext3/ialloc.c
> @@ -525,8 +525,12 @@ got:
>  	if (IS_DIRSYNC(inode))
>  		handle->h_sync = 1;
>  	if (insert_inode_locked(inode) < 0) {
> -		err = -EINVAL;
> -		goto fail_drop;
> +		/*
> +		 * Likely a bitmap corruption causing inode to be allocated
> +		 * twice.
> +		 */
> +		err = -EIO;
> +		goto fail;
>  	}
>  	spin_lock(&sbi->s_next_gen_lock);
>  	inode->i_generation = sbi->s_next_generation++;

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 08-12-11 14:46:09, Eric Sandeen wrote:
> On 12/8/11 2:28 PM, Jan Kara wrote:
> > When insert_inode_locked() fails in ext3_new_inode() it most likely
> > means inode bitmap got corrupted and we allocated again inode which
> > is already in use. Also doing unlock_new_inode() during error recovery
> > is wrong since inode does not have I_NEW set. Fix the problem by jumping
> > to fail: (instead of fail_drop:) which declares filesystem error and
> > does not call unlock_new_inode().
> >
> > Signed-off-by: Jan Kara <jack@suse.cz>
>
> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
>
> I think ext2 could use the same treatment.
>
> BTW, though, have you recently started seeing the issue? We have
> people hitting this when resuming after suspend; it seems likely
> that the bitmap did get corrupted though, based on some other
> things seen in similar bugs.

Interesting. I've got a report from IBM testing ext3 on SLE11 SP2 kernel
(3.0 based). Their filesystem got damaged (might be HW issue, not sure yet)
and they also observed warnings from unlock_new_inode().

Honza

> > ---
> >  fs/ext3/ialloc.c |    8 ++++++--
> >  1 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
> > index 5c866e0..adae962 100644
> > --- a/fs/ext3/ialloc.c
> > +++ b/fs/ext3/ialloc.c
> > @@ -525,8 +525,12 @@ got:
> >  	if (IS_DIRSYNC(inode))
> >  		handle->h_sync = 1;
> >  	if (insert_inode_locked(inode) < 0) {
> > -		err = -EINVAL;
> > -		goto fail_drop;
> > +		/*
> > +		 * Likely a bitmap corruption causing inode to be allocated
> > +		 * twice.
> > +		 */
> > +		err = -EIO;
> > +		goto fail;
> >  	}
> >  	spin_lock(&sbi->s_next_gen_lock);
> >  	inode->i_generation = sbi->s_next_generation++;
On 12/8/11 4:28 PM, Jan Kara wrote:
> On Thu 08-12-11 14:46:09, Eric Sandeen wrote:
>> On 12/8/11 2:28 PM, Jan Kara wrote:
>>> When insert_inode_locked() fails in ext3_new_inode() it most likely
>>> means inode bitmap got corrupted and we allocated again inode which
>>> is already in use. Also doing unlock_new_inode() during error recovery
>>> is wrong since inode does not have I_NEW set. Fix the problem by jumping
>>> to fail: (instead of fail_drop:) which declares filesystem error and
>>> does not call unlock_new_inode().
>>>
>>> Signed-off-by: Jan Kara <jack@suse.cz>
>>
>> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
>>
>> I think ext2 could use the same treatment.
>>
>> BTW, though, have you recently started seeing the issue? We have
>> people hitting this when resuming after suspend; it seems likely
>> that the bitmap did get corrupted though, based on some other
>> things seen in similar bugs.
> Interesting. I've got a report from IBM testing ext3 on SLE11 SP2 kernel
> (3.0 based). Their filesystem got damaged (might be HW issue, not sure yet)
> and they also observed warnings from unlock_new_inode().

It may be that it has been failing in other ways, but now we get
the WARN_ON and the long backtrace so it's reported more frequently...

I think there might be a hibernate issue that is causing the underlying
corruption, trying to look into that now.

-Eric

> Honza
>>> ---
>>>  fs/ext3/ialloc.c |    8 ++++++--
>>>  1 files changed, 6 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
>>> index 5c866e0..adae962 100644
>>> --- a/fs/ext3/ialloc.c
>>> +++ b/fs/ext3/ialloc.c
>>> @@ -525,8 +525,12 @@ got:
>>>  	if (IS_DIRSYNC(inode))
>>>  		handle->h_sync = 1;
>>>  	if (insert_inode_locked(inode) < 0) {
>>> -		err = -EINVAL;
>>> -		goto fail_drop;
>>> +		/*
>>> +		 * Likely a bitmap corruption causing inode to be allocated
>>> +		 * twice.
>>> +		 */
>>> +		err = -EIO;
>>> +		goto fail;
>>>  	}
>>>  	spin_lock(&sbi->s_next_gen_lock);
>>>  	inode->i_generation = sbi->s_next_generation++;
diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 5c866e0..adae962 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -525,8 +525,12 @@ got:
 	if (IS_DIRSYNC(inode))
 		handle->h_sync = 1;
 	if (insert_inode_locked(inode) < 0) {
-		err = -EINVAL;
-		goto fail_drop;
+		/*
+		 * Likely a bitmap corruption causing inode to be allocated
+		 * twice.
+		 */
+		err = -EIO;
+		goto fail;
 	}
 	spin_lock(&sbi->s_next_gen_lock);
 	inode->i_generation = sbi->s_next_generation++;
When insert_inode_locked() fails in ext3_new_inode() it most likely
means inode bitmap got corrupted and we allocated again inode which
is already in use. Also doing unlock_new_inode() during error recovery
is wrong since inode does not have I_NEW set. Fix the problem by jumping
to fail: (instead of fail_drop:) which declares filesystem error and
does not call unlock_new_inode().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext3/ialloc.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
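
For readers who want to see the two error paths side by side, here is a minimal userspace sketch of the behaviour the commit message describes. It is a toy model and not kernel code: every `toy_` name, the `TOY_I_NEW` flag, the fixed-size `hashed` table, and the `warnings` and `fs_errors` counters are assumptions made up for illustration. The only things taken from the patch are the idea that I_NEW is set only when insertion succeeds, that unlocking an inode without I_NEW triggers a warning, and that the patched path returns -EIO and declares a filesystem error without unlocking. The errno returned by the toy insert function is likewise an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define TOY_I_NEW   0x1u  /* hypothetical stand-in for the I_NEW state bit */
#define TOY_MAX_INO 16    /* hypothetical size of the toy inode hash */

struct toy_inode {
	unsigned long ino;
	unsigned int state;
};

struct toy_sb {
	bool hashed[TOY_MAX_INO]; /* inode numbers already in the toy hash */
	int fs_errors;            /* times a filesystem error was declared */
	int warnings;             /* times unlock ran without TOY_I_NEW set */
};

/* Stand-in for insert_inode_locked(): sets TOY_I_NEW only on success. */
static int toy_insert_inode_locked(struct toy_sb *sb, struct toy_inode *inode)
{
	assert(inode->ino < TOY_MAX_INO);
	if (sb->hashed[inode->ino])
		return -EBUSY; /* inode number is already in use */
	sb->hashed[inode->ino] = true;
	inode->state |= TOY_I_NEW;
	return 0;
}

/* Stand-in for unlock_new_inode(): counts a warning if TOY_I_NEW is unset. */
static void toy_unlock_new_inode(struct toy_sb *sb, struct toy_inode *inode)
{
	if (!(inode->state & TOY_I_NEW))
		sb->warnings++;
	inode->state &= ~TOY_I_NEW;
}

/* Pre-patch behaviour: -EINVAL, then a "fail_drop"-style path that unlocks. */
static int toy_new_inode_old(struct toy_sb *sb, struct toy_inode *inode)
{
	if (toy_insert_inode_locked(sb, inode) < 0) {
		toy_unlock_new_inode(sb, inode); /* inode never had TOY_I_NEW */
		return -EINVAL;
	}
	return 0;
}

/* Patched behaviour: -EIO, then a "fail"-style path that never unlocks. */
static int toy_new_inode_patched(struct toy_sb *sb, struct toy_inode *inode)
{
	if (toy_insert_inode_locked(sb, inode) < 0) {
		sb->fs_errors++; /* declare the filesystem inconsistent */
		return -EIO;
	}
	return 0;
}
```

Allocating the same inode number twice through `toy_new_inode_old()` bumps `warnings`, which mirrors the unlock_new_inode() warnings reported in this thread. Doing the same through `toy_new_inode_patched()` leaves `warnings` untouched and bumps `fs_errors` instead.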