ext4: fix reference counting bug on block allocation error

Message ID: 577E3B53.3070502@oracle.com
State: Awaiting Upstream, archived

Commit Message

Vegard Nossum July 7, 2016, 11:21 a.m. UTC
On 07/06/2016 03:57 PM, Vegard Nossum wrote:
> If we hit this error when mounted with errors=continue or
> errors=remount-ro:
>
>      EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata
>
> then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
> continue. However, ext4_mb_release_context() is the wrong thing to call
> here since we are still actually using the allocation context.
>
> Instead, handle it the same way that we handle other errors, except that
> we retry the allocation instead of immediately returning an error (if we
> were mounted with errors=continue, then ext4_mb_mark_diskspace_used()
> should have fixed the original error and will either succeed or give a
> different error; if we were mounted with errors=remount-ro, then it will
> not be able to fix the original error and will raise a different error).

This did not work the way I expected: I am now getting stuck in an
infinite loop in which it tries (and fails) to allocate the same
blocks over and over.

The attached new patch just returns on error instead of trying to fix
the problem.


Vegard
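
The two failure modes discussed above (releasing the context while it is
still in use, and retrying an allocation that fails the same way every
time) can be illustrated with a small user-space toy model. This is only a
sketch, not kernel code: struct toy_ctx, toy_mark_used(), toy_release(),
toy_alloc_retry(), toy_alloc_errout() and the max_tries guard are all
hypothetical names invented for illustration, and the model assumes that
the exit path releases the context once more after the early release.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for an allocation context; not the kernel struct. */
struct toy_ctx {
	int refs;     /* references currently held on the context */
	int attempts; /* number of allocation attempts made */
};

/* Models a "mark blocks used" step that rejects the same blocks every time. */
static int toy_mark_used(struct toy_ctx *ac)
{
	ac->attempts++;
	return -EAGAIN;
}

/* Models releasing the context: drops exactly one reference. */
static void toy_release(struct toy_ctx *ac)
{
	ac->refs--;
}

/*
 * Retry-style handling: on -EAGAIN, release the context and try again.
 * max_tries exists only so that this demo terminates; without it the
 * loop would never exit, because toy_mark_used() never succeeds.
 */
static int toy_alloc_retry(struct toy_ctx *ac, int max_tries)
{
	int err;

	assert(max_tries > 0);
	ac->refs = 1;
	ac->attempts = 0;
	for (;;) {
		err = toy_mark_used(ac);
		if (err != -EAGAIN)
			break;
		toy_release(ac); /* early release while still in use */
		if (ac->attempts >= max_tries)
			break;
	}
	toy_release(ac); /* assumed: normal release on the exit path */
	return err;
}

/* Error-out handling: any error ends the attempt with a single release. */
static int toy_alloc_errout(struct toy_ctx *ac)
{
	int err;

	ac->refs = 1;
	ac->attempts = 0;
	err = toy_mark_used(ac);
	toy_release(ac); /* the only release, on the exit path */
	return err;
}
```

In this model, toy_alloc_retry() drops one reference per failed attempt plus
one more on exit, so the count ends up below zero, while toy_alloc_errout()
makes a single attempt and leaves the count balanced at zero. Both return
-EAGAIN to the caller.
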
Patch

From 09084a94d2b176e33ee52a0bb41409988e6ff744 Mon Sep 17 00:00:00 2001
From: Vegard Nossum <vegard.nossum@oracle.com>
Date: Wed, 6 Jul 2016 15:01:36 +0200
Subject: [PATCH] ext4: fix reference counting bug on block allocation error

If we hit this error when mounted with errors=continue or
errors=remount-ro:

    EXT4-fs error (device loop0): ext4_mb_mark_diskspace_used:2940: comm ext4.exe: Allocating blocks 5090-6081 which overlap fs metadata

then ext4_mb_new_blocks() will call ext4_mb_release_context() and try to
continue. However, ext4_mb_release_context() is the wrong thing to call
here since we are still actually using the allocation context.

Instead, just error out. We could retry the allocation, but there is a
possibility of getting stuck in an infinite loop instead, so this seems
safer.

Fixes: 8556e8f3b6 ("ext4: Don't allow new groups to be added during block allocation")
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
---
 fs/ext4/mballoc.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c1ab3ec..e65400a 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4513,18 +4513,7 @@  repeat:
 	}
 	if (likely(ac->ac_status == AC_STATUS_FOUND)) {
 		*errp = ext4_mb_mark_diskspace_used(ac, handle, reserv_clstrs);
-		if (*errp == -EAGAIN) {
-			/*
-			 * drop the reference that we took
-			 * in ext4_mb_use_best_found
-			 */
-			ext4_mb_release_context(ac);
-			ac->ac_b_ex.fe_group = 0;
-			ac->ac_b_ex.fe_start = 0;
-			ac->ac_b_ex.fe_len = 0;
-			ac->ac_status = AC_STATUS_CONTINUE;
-			goto repeat;
-		} else if (*errp) {
+		if (*errp) {
 			ext4_discard_allocated_blocks(ac);
 			goto errout;
 		} else {
-- 
1.9.1