Message ID | 87r5pfw6ew.fsf@openvz.org |
---|---|
State | Not Applicable, archived |
Headers | show |
On Mon, Jan 25, 2010 at 12:15:51AM +0300, Dmitry Monakhov wrote: > > It's not a solution. You get an _attempted_ remount ro making writes > > fail, even if it's going to be unsuccessful. No go... > We have two options for new writers: > 1) Fail it via -EROFS > Yes, remount may fail, but it is really unlikely. > 2) Defer(block) new writers on until we complete or fail remount > for example like follows. Do you like second solution ? Umm... I wonder what the locking implications would be... Frankly, I suspect that what we really want is this: * per-superblock write count of some kind, bumped when we decide that writeback is inevitable and dropped when we are done with it (the same thing goes for async part of unlink(), etc.) * fs_may_remount_ro() checking that write count So basically we try to push those short-term writers to completion and if new ones had come while we'd been doing that (or some are really stuck) we fail remount with -EBUSY. As a short-term solution the second patch would do probably (-stable and .33), but in the next cycle I'd rather see something addressing the real problem. fs_may_remount_ro() in its current form is really broken by design - it should not scan any lists (which is where your race comes from, BTW) -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jan 24, 2010 at 09:37:07PM +0000, Al Viro wrote: > On Mon, Jan 25, 2010 at 12:15:51AM +0300, Dmitry Monakhov wrote: > > > > It's not a solution. You get an _attempted_ remount ro making writes > > > fail, even if it's going to be unsuccessful. No go... > > We have two options for new writers: > > 1) Fail it via -EROFS > > Yes, remount may fail, but it is really unlikely. > > 2) Defer(block) new writers on until we complete or fail remount > > for example like follows. Do you like second solution ? > > Umm... I wonder what the locking implications would be... Frankly, > I suspect that what we really want is this: > * per-superblock write count of some kind, bumped when we decide > that writeback is inevitable and dropped when we are done with it (the > same thing goes for async part of unlink(), etc.) > * fs_may_remount_ro() checking that write count > So basically we try to push those short-term writers to completion and > if new ones had come while we'd been doing that (or some are really > stuck) we fail remount with -EBUSY. Perhaps we could utilise the filesystem freeze infrastructure - it already has hooks for intercepting new writers and modifcations, and filesystems have to flush any current modifications before the freeze completes. It sounds very similar to the requirements needed here... Cheers, Dave.
Al Viro <viro@ZenIV.linux.org.uk> writes: > On Mon, Jan 25, 2010 at 12:15:51AM +0300, Dmitry Monakhov wrote: > >> > It's not a solution. You get an _attempted_ remount ro making writes >> > fail, even if it's going to be unsuccessful. No go... >> We have two options for new writers: >> 1) Fail it via -EROFS >> Yes, remount may fail, but it is really unlikely. >> 2) Defer(block) new writers on until we complete or fail remount >> for example like follows. Do you like second solution ? > > Umm... I wonder what the locking implications would be... Frankly, > I suspect that what we really want is this: > * per-superblock write count of some kind, bumped when we decide > that writeback is inevitable and dropped when we are done with it (the > same thing goes for async part of unlink(), etc.) > * fs_may_remount_ro() checking that write count > So basically we try to push those short-term writers to completion and > if new ones had come while we'd been doing that (or some are really > stuck) we fail remount with -EBUSY. > > As a short-term solution the second patch would do probably (-stable and .33), > but in the next cycle I'd rather see something addressing the real problem. > fs_may_remount_ro() in its current form is really broken by design - it > should not scan any lists (which is where your race comes from, BTW) This is not actually true. The race happens not only because fs_may_remount_ro() is not atomic, but because we have two stages 1) fs_may_remount_ro() 2) sync_filesystem() Even when we make first stage atomic, we still have race between second stage and new writers. BTW: Your idea about per-sb counter may be useful here but it must be not reference count, but it may be used like i_version For example: mnt_want_write() { mnt->mnt_sb->s_wr_count++; } mnt_drop_write() { mnt->mnt_sb->s_wr_count++; } do_remount_sb { cur = mnt->mnt_sb->s_wr_count; if (fs_may_remount_ro()) return -EBUSY; sync_filesystem() if (cur != mnt->mnt_sb->s_wr_count) return -EBUSY; } -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> On Sun, Jan 24, 2010 at 09:37:07PM +0000, Al Viro wrote: > > On Mon, Jan 25, 2010 at 12:15:51AM +0300, Dmitry Monakhov wrote: > > > > > > It's not a solution. You get an _attempted_ remount ro making writes > > > > fail, even if it's going to be unsuccessful. No go... > > > We have two options for new writers: > > > 1) Fail it via -EROFS > > > Yes, remount may fail, but it is really unlikely. > > > 2) Defer(block) new writers on until we complete or fail remount > > > for example like follows. Do you like second solution ? > > > > Umm... I wonder what the locking implications would be... Frankly, > > I suspect that what we really want is this: > > * per-superblock write count of some kind, bumped when we decide > > that writeback is inevitable and dropped when we are done with it (the > > same thing goes for async part of unlink(), etc.) > > * fs_may_remount_ro() checking that write count > > So basically we try to push those short-term writers to completion and > > if new ones had come while we'd been doing that (or some are really > > stuck) we fail remount with -EBUSY. > > Perhaps we could utilise the filesystem freeze infrastructure - it > already has hooks for intercepting new writers and modifcations, > and filesystems have to flush any current modifications before the freeze > completes. It sounds very similar to the requirements needed here... There are filesystems (e.g. ext2 or UDF) which don't support freezing so it's not an option at least short term... Honza
diff --git a/fs/namespace.c b/fs/namespace.c index c768f73..daf3c5a 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -196,6 +196,15 @@ int __mnt_is_readonly(struct vfsmount *mnt) return 1; if (mnt->mnt_sb->s_flags & MS_RDONLY) return 1; + if (mnt->mnt_sb->s_flags & MS_RO_REMOUNT) { + int ret = 0; + /* Serialize against remount */ + down_read(&mnt->mnt_sb->s_umount); + if (mnt->mnt_sb->s_flags & MS_RDONLY) + ret = 1; + up_read(&mnt->mnt_sb->s_umount); + return ret; + } return 0; } EXPORT_SYMBOL_GPL(__mnt_is_readonly);