Message ID | 310fb77f-dfed-1196-c4ee-30d5138ee5a2@huawei.com |
---|---|
State | New |
Headers | show |
Series | unix_io.c: fix deadlock problem in unix_write_blk64 | expand |
On Mon, Nov 21, 2022 at 10:00:49AM +0800, zhanchengbin wrote: > The process is deadlocked, and an I/O error occurs when logs > are replayed. Because in the I/O error handling function, I/O > is sent again and catch the mutexlock of CACHE_MTX. This is a legitimate bug, but the propsoed fix is not safe. There is a reason why we take the cache mutex, and that's because we need to prevent another thread from modifying the cache, possibly by ejecting the cache entry that we are in the middle of cleaning when raw_write_blk() is being called in reuse_cache(). Fortunately, we're safe on the read side, because we currently very carefully do not call raw_read_blk() while holding the CACHE_MUTEX. Instead, we write the data from the user-supplied buffer, and *then* take the cache mutex, and then save the data from the user-supplied buffer into the cache. So the problem is only on the write side, and what I think we need to do is to lift the call to channel->write_error() to the ultimate callers of raw_write_blk(), so that we return the error code to its callers in reuse_cache(), flush_cache_blocks(), and unix_write_blk64(), and let those upper-level functions call the write handler --- after they've had the chance to release any mutexes. - Ted
On Wed, Jan 25, 2023 at 01:18:06AM -0500, Theodore Ts'o wrote: > > Fortunately, we're safe on the read side, because we currently very > carefully do not call raw_read_blk() while holding the CACHE_MUTEX. > Instead, we write the data from the user-supplied buffer, and *then* ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > take the cache mutex, and then save the data from the user-supplied > buffer into the cache. Correction: the last sentence of this paragraph should begin: "Instead, we READ the data INTO the user-supplied buffer, ..." I have a series of patches that should address the deadlock issue, that I'll be sending out shortly. - Ted
diff --git a/lib/ext2fs/unix_io.c b/lib/ext2fs/unix_io.c index e53db333..89d82b48 100644 --- a/lib/ext2fs/unix_io.c +++ b/lib/ext2fs/unix_io.c @@ -210,7 +210,8 @@ static char *safe_getenv(const char *arg) static errcode_t raw_read_blk(io_channel channel, struct unix_private_data *data, unsigned long long block, - int count, void *bufv) + int count, void *bufv, + int cache_lock) { errcode_t retval; ssize_t size; @@ -331,16 +332,22 @@ error_unlock: mutex_unlock(data, BOUNCE_MTX); if (actual >= 0 && actual < size) memset((char *) buf+actual, 0, size-actual); - if (channel->read_error) + if (channel->read_error) { + if (cache_lock) + mutex_unlock(data, CACHE_MTX); retval = (channel->read_error)(channel, block, count, buf, size, actual, retval); + if (cache_lock) + mutex_lock(data, CACHE_MTX); + } return retval; } static errcode_t raw_write_blk(io_channel channel, struct unix_private_data *data, unsigned long long block, - int count, const void *bufv) + int count, const void *bufv, + int cache_lock) { ssize_t size; ext2_loff_t location; @@ -482,9 +489,14 @@ bounce_write: error_unlock: mutex_unlock(data, BOUNCE_MTX); error_out: - if (channel->write_error) + if (channel->write_error) { + if (cache_lock) + mutex_unlock(data, CACHE_MTX); retval = (channel->write_error)(channel, block, count, buf, size, actual, retval); + if (cache_lock) + mutex_lock(data, CACHE_MTX); + } return retval; } @@ -576,16 +588,22 @@ static struct unix_cache *find_cached_block(struct unix_private_data *data, /* * Reuse a particular cache entry for another block. */ -static void reuse_cache(io_channel channel, struct unix_private_data *data, +static errcode_t reuse_cache(io_channel channel, struct unix_private_data *data, struct unix_cache *cache, unsigned long long block) { - if (cache->dirty && cache->in_use) - raw_write_blk(channel, data, cache->block, 1, cache->buf); + errcode_t retval = 0; + if (cache->dirty && cache->in_use) { + retval = raw_write_blk(channel, data, cache->block, 1, cache->buf, 1); + if (retval) + return retval; + } cache->in_use = 1; cache->dirty = 0; cache->block = block; cache->access_time = ++data->access_time; + + return retval; }