Message ID | 1282463086.16502.38.camel@brekeke |
---|---|
State | New, archived |
On Sun, 2010-08-22 at 10:44 +0300, Artem Bityutskiy wrote:
> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> > I manage to reproduce it with the backtrace [1].
>
> Matthieu, your work-around patch or something very close should
> certainly be applied to the UBIFS tree, but I still would like to find
> out what exactly happened in your setup.
>
> I see 2 possibilities:
>
> 1. An error happened and 'ubifs_garbage_collect()' returned while
> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> and the master node would not be written. But may be for some reasons we
> did not switch to R/O mode, dunno.
>
> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> 'ubifs_garbage_collect_leb()' directly, which can return while
> c->gc_lnum is -1. And we do not handle this.
>
> Would you please be patient enough to reproduce the issue once again with
> the following patch, which was created against the latest ubifs-2.6.git, but
> you should be easily able to apply it to your tree.

Hi, any news?
Artem Bityutskiy wrote:
> On Sun, 2010-08-22 at 10:44 +0300, Artem Bityutskiy wrote:
>> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
>>> I manage to reproduce it with the backtrace [1].
>>
>> Matthieu, your work-around patch or something very close should
>> certainly be applied to the UBIFS tree, but I still would like to find
>> out what exactly happened in your setup.
>>
>> I see 2 possibilities:
>>
>> 1. An error happened and 'ubifs_garbage_collect()' returned while
>> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
>> and the master node would not be written. But may be for some reasons we
>> did not switch to R/O mode, dunno.
>>
>> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
>> 'ubifs_garbage_collect_leb()' directly, which can return while
>> c->gc_lnum is -1. And we do not handle this.
>>
>> Would you please be patient enough to reproduce the issue once again with
>> the following patch, which was created against the latest ubifs-2.6.git, but
>> you should be easily able to apply it to your tree.
>
> Hi, any news?
>

Not much. I was busy with another subject, but I will try ASAP.

Matthieu

PS: any idea/comment on the handling of interrupted write page by UBI/UBIFS?
> PS : any idea/comment on the handling of interrupted write page by
> UBI/UBIFS ?

Err, I think these are handled perfectly well. I read your e-mails; they were a little messy, but I did not find anything UBIFS does not handle. I sent you a fix for your oops.

Would you please re-formulate your questions nicely in a separate e-mail, if you still have them?
Artem Bityutskiy wrote:
> On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
>> I manage to reproduce it with the backtrace [1].
>
> Matthieu, your work-around patch or something very close should
> certainly be applied to the UBIFS tree, but I still would like to find
> out what exactly happened in your setup.
>
> I see 2 possibilities:
>
> 1. An error happened and 'ubifs_garbage_collect()' returned while
> c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> and the master node would not be written. But may be for some reasons we
> did not switch to R/O mode, dunno.
>
> 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> 'ubifs_garbage_collect_leb()' directly, which can return while
> c->gc_lnum is -1. And we do not handle this.
>
> Would you please be patient enough to reproduce the issue once again with
> the following patch, which was created against the latest ubifs-2.6.git, but
> you should be easily able to apply it to your tree.

None of these checks triggered, only the dump in ubifs_write_master.

Matthieu
On Fri, 2010-09-24 at 17:31 +0200, Matthieu CASTET wrote:
> Artem Bityutskiy wrote:
> > On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> >> I manage to reproduce it with the backtrace [1].
> >
> > Matthieu, your work-around patch or something very close should
> > certainly be applied to the UBIFS tree, but I still would like to find
> > out what exactly happened in your setup.
> >
> > I see 2 possibilities:
> >
> > 1. An error happened and 'ubifs_garbage_collect()' returned while
> > c->gc_lnum was -1. But in this case we should have switched to R/O mode,
> > and the master node would not be written. But may be for some reasons we
> > did not switch to R/O mode, dunno.
> >
> > 2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
> > 'ubifs_garbage_collect_leb()' directly, which can return while
> > c->gc_lnum is -1. And we do not handle this.
> >
> > Would you please be patient enough to reproduce the issue once again with
> > the following patch, which was created against the latest ubifs-2.6.git, but
> > you should be easily able to apply it to your tree.
> None of these check happen.
>
> only the dump in ubifs_write_master.

Hmm... This is weird. I think I need your UBIFS. Is it possible to share? You can take vanilla 2.6.27 and put all UBIFS stuff there. Or send patches against ubifs-v2.6.27.git.
diff --git a/fs/ubifs/budget.c b/fs/ubifs/budget.c
index c8ff0d1..aa433cd 100644
--- a/fs/ubifs/budget.c
+++ b/fs/ubifs/budget.c
@@ -83,6 +83,10 @@ static int run_gc(struct ubifs_info *c)
 	down_read(&c->commit_sem);
 	lnum = ubifs_garbage_collect(c, 1);
 	up_read(&c->commit_sem);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() returned %d", lnum);
+		dump_stack();
+	}
 	if (lnum < 0)
 		return lnum;
 
diff --git a/fs/ubifs/gc.c b/fs/ubifs/gc.c
index 396f24a..0e78832 100644
--- a/fs/ubifs/gc.c
+++ b/fs/ubifs/gc.c
@@ -807,12 +807,20 @@ int ubifs_garbage_collect(struct ubifs_info *c, int anyway)
 		goto out;
 	}
 out_unlock:
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() is returning %d", ret);
+		dump_stack();
+	}
 	mutex_unlock(&wbuf->io_mutex);
 	return ret;
 
 out:
 	ubifs_assert(ret < 0);
 	ubifs_assert(ret != -ENOSPC && ret != -EAGAIN);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() is returning %d", ret);
+		dump_stack();
+	}
 	ubifs_wbuf_sync_nolock(wbuf);
 	ubifs_ro_mode(c, ret);
 	mutex_unlock(&wbuf->io_mutex);
diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index d321bae..44df514 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -162,6 +162,10 @@ again:
 	mutex_unlock(&wbuf->io_mutex);
 
 	lnum = ubifs_garbage_collect(c, 0);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect() returned %d", lnum);
+		dump_stack();
+	}
 	if (lnum < 0) {
 		err = lnum;
 		if (err != -ENOSPC)
diff --git a/fs/ubifs/recovery.c b/fs/ubifs/recovery.c
index daae9e1..3058256 100644
--- a/fs/ubifs/recovery.c
+++ b/fs/ubifs/recovery.c
@@ -1126,6 +1126,10 @@ int ubifs_rcvry_gc_commit(struct ubifs_info *c)
 	dbg_rcvry("GC'ing LEB %d", lnum);
 	mutex_lock_nested(&wbuf->io_mutex, wbuf->jhead);
 	err = ubifs_garbage_collect_leb(c, &lp);
+	if (c->gc_lnum == -1) {
+		ubifs_err("gc_lnum is -1! ubifs_garbage_collect_leb() returned %d", err);
+		dump_stack();
+	}
 	if (err >= 0) {
 		int err2 = ubifs_wbuf_sync_nolock(wbuf);
 
On Wed, 2010-07-28 at 09:40 +0200, Matthieu CASTET wrote:
> I manage to reproduce it with the backtrace [1].

Matthieu, your work-around patch or something very close should
certainly be applied to the UBIFS tree, but I still would like to find
out what exactly happened in your setup.

I see 2 possibilities:

1. An error happened and 'ubifs_garbage_collect()' returned while
c->gc_lnum was -1. But in this case we should have switched to R/O mode,
and the master node would not be written. But may be for some reasons we
did not switch to R/O mode, dunno.

2. More likely scenario: in 'ubifs_rcvry_gc_commit()' we call
'ubifs_garbage_collect_leb()' directly, which can return while
c->gc_lnum is -1. And we do not handle this.

Would you please be patient enough to reproduce the issue once again with
the following patch, which was created against the latest ubifs-2.6.git, but
you should be easily able to apply it to your tree.

Artem.
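
For readers who want to see the shape of the check outside the kernel, here is a minimal user-space sketch of the second scenario. It is only a toy model, not UBIFS code: 'struct fs_model', 'check_gc_lnum()', 'model_collect_leb()' and 'model_recovery()' are hypothetical names made up for illustration, and the real 'ubifs_garbage_collect_leb()' is far more involved. The sketch assumes only what the message states, namely that a collector can return successfully while gc_lnum is -1 and that the caller should notice.

```c
#include <stdio.h>

/*
 * Hypothetical, stripped-down stand-in for the one field of
 * struct ubifs_info that matters here. User-space model only.
 */
struct fs_model {
	int gc_lnum;	/* LEB reserved for GC, or -1 when none is reserved */
};

/*
 * Hypothetical helper mirroring the debugging check in the patch:
 * report when a call returned with no reserved GC LEB.
 * Returns 1 if gc_lnum is -1, 0 otherwise.
 */
static int check_gc_lnum(const struct fs_model *c, const char *caller, int ret)
{
	if (c->gc_lnum != -1)
		return 0;
	fprintf(stderr, "gc_lnum is -1! %s returned %d\n", caller, ret);
	return 1;
}

/*
 * Toy collector for scenario 2: it reports success (returns 0)
 * but leaves gc_lnum at -1.
 */
static int model_collect_leb(struct fs_model *c)
{
	c->gc_lnum = -1;
	return 0;
}

/* Toy recovery path: calls the collector directly, then applies the check. */
static int model_recovery(struct fs_model *c)
{
	int err = model_collect_leb(c);

	return check_gc_lnum(c, "model_collect_leb()", err);
}
```

In this model, calling 'model_recovery()' on a structure whose gc_lnum starts out valid makes the check fire, which is the situation the patch is meant to catch at each call site.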