Message ID | 20090804225619.GB11097@duck.suse.cz
---|---
State | Not Applicable, archived
Hi,

On Wed, Aug 05, 2009 at 12:56:19AM +0200, Jan Kara wrote:
> Thanks for testing. So you seem to be really stressing the path where
> creation of new files / directories fails (probably due to group quota).

Yes, there are 29 groups over quota out of a total of 4499. Those are
mainly spammed websites, and they are therefore under quite some stress
from the constant attempts to add new "data".

> I have one idea what could cause your filesystem corruption, although
> it's a wild guess... Please try the attached oneliner.

Running since yesterday.

> Also your corruption reminded me that Al Viro has been fixing problems
> where we could cache one inode twice when a filesystem was mounted over
> NFS, and that could also lead to filesystem corruption. So I'm adding
> him to CC just in case he has some idea. BTW Al, what do you think about
> the problem I describe in the attached patch? I'm not sure if it can
> cause real problems, but in theory it could...

Should we upgrade the NFS clients as well? (now running 2.6.28.9)

Sylvain
On Thu, Aug 06, 2009 at 03:15:56PM +0200, Sylvain Rochet wrote:
> On Wed, Aug 05, 2009 at 12:56:19AM +0200, Jan Kara wrote:
> > Also your corruption reminded me that Al Viro has been fixing problems
> > where we could cache one inode twice when a filesystem was mounted
> > over NFS, and that could also lead to filesystem corruption.
>
> Should we upgrade the NFS clients as well? (now running 2.6.28.9)

The client version shouldn't matter.

--b.

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello,

On Thu 06-08-09 15:15:56, Sylvain Rochet wrote:
> > I have one idea what could cause your filesystem corruption, although
> > it's a wild guess... Please try the attached oneliner.
>
> Running since yesterday.

Any news after a week of running? How often did the corruption happen
previously?

Honza
Hi!

On Thu, Aug 13, 2009 at 12:34:53AM +0200, Jan Kara wrote:
> Any news after a week of running? How often did the corruption happen
> previously?

Sorry for the late answer, I was lurking at HAR ;-)

So far everything is fine, but the problem happened only once on this
server, so we cannot conclude anything after just a few weeks. However,
I now have physical access back, so we will switch back to the former
server, where the problem happened quite frequently; then we will see!

By the way, syslogd is happy, eating about 350 MiB of kernel logs a day ;)

Sylvain
On Thu, Aug 20, 2009 at 07:19:53PM +0200, Sylvain Rochet wrote:
> So, everything is fine, but the problem happened only once on this
> server, so we cannot conclude anything after a few weeks. However,
> I now have physical access back, so we will switch back to the former
> server where the problem happened quite frequently, then we will see!

Not to derail the thread, but you were definitely seeing the same issues
with stock 2.6.30.4, right?

We had all sorts of corruption happening to files served via NFS with
2.6.28 and 2.6.29, but everything was magically fixed on 2.6.30 (though
we needed a lot of fscking). I never did track down which change fixed
it, since it took a while to reproduce.

Hmm. I just noticed what seems to be a new occurrence of "deleted inode
referenced" on a box with 2.6.30. We saw many when we first upgraded to
2.6.30, due to the corruption caused by 2.6.29, but those all occurred
within a day or so and were fsck'd. I would have thought the backup
sweeps would have tripped over that inode well before now...

Just wondering if you can confirm that the errors you saw with 2.6.30.4
were not left over from older kernels.

Cheers,

Simon-
Hi,

On Thu, Aug 20, 2009 at 05:00:35PM -0700, Simon Kirby wrote:
> Not to derail the thread, but you were definitely seeing the same issues
> with stock 2.6.30.4, right?

Nope, the last issue we had came from 2.6.28.9. We upgraded to 2.6.30.3
on Jan's advice, then we "upgraded" to 2.6.30.3 with Jan's first patch to
add some debug output (0001-ext3-Debug-unlinking-of-inodes.patch).
Finally we upgraded to 2.6.30.4 with Jan's first and second patches
(0001-fs-Make-sure-data-stored-into-inode-is-properly-see.patch), the
latter adding an smp_mb() to the unlock_new_inode() function.

> We had all sorts of corruption happening to files served via NFS with
> 2.6.28 and 2.6.29, but everything was magically fixed on 2.6.30
> (though we needed a lot of fscking). I never did track down which
> change fixed it, since it took a while to reproduce.

Same here, everything has been fine since 2.6.30. We will switch back to
the quad-core server, where the corruption happen(ed), in a few days. We
are now using a bi-Opteron server because we suspected hardware issues on
the quad-core; the corruption happened only once on the bi-Opteron (which
is IMHO sufficient evidence to discard a hardware issue). I guess the
issue was (or is) somehow SMP-related.

And yep, we also spent a long time playing with fsck ;-) Luckily, the
corruption only occurs on new files, and new files are mostly caches,
sessions, logs, and such, so fsck used its chainsaw on mostly
not-really-important files.

> Hmm. I just noticed what seems to be a new occurrence of "deleted inode
> referenced" on a box with 2.6.30. We saw many when we first upgraded to
> 2.6.30, due to the corruption caused by 2.6.29, but those all occurred
> within a day or so and were fsck'd. I would have thought the backup
> sweeps would have tripped over that inode well before now...
>
> Just wondering if you can confirm that the errors you saw with 2.6.30.4
> were not left over from older kernels.

The few garbaged inodes from 2.6.28.9 (and earlier) were pushed to
lost+found to prevent future use of them. We did an fsck when we moved to
2.6.30.4 that fixed everything. We have not yet seen any corruption with
2.6.30.4.

Sylvain
From 78513d3a5628fda0f8d685d732b7bc73bd4c9222 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Wed, 5 Aug 2009 00:42:21 +0200
Subject: [PATCH] fs: Make sure data stored into inode is properly seen
 before unlocking new inode

In theory it could happen that on one CPU we initialize a new inode but
the clearing of I_NEW | I_LOCK gets reordered before some of the
initialization. Thus on another CPU we could return a not fully uptodate
inode from iget_locked().

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/inode.c | 1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 901bad1..e9a8e77 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -696,6 +696,7 @@ void unlock_new_inode(struct inode *inode)
 	 * just created it (so there can be no old holders
 	 * that haven't tested I_LOCK).
 	 */
+	smp_mb();
 	WARN_ON((inode->i_state & (I_LOCK|I_NEW)) != (I_LOCK|I_NEW));
 	inode->i_state &= ~(I_LOCK|I_NEW);
 	wake_up_inode(inode);
-- 
1.6.0.2