Message ID | 20081208140138.GA17700@mit.edu |
---|---|
State | Superseded, archived |
Headers | show |
Ted-san, Thank you for your reviewing. > Toshiyuki-san, > > My apologies for not having a chance to review your patches; things > have been rather busy for me. There were a couple of shortcomings in > your patch series, and I think there is a better way of solving the > issue at hand. First of all, patches #1 and #2 use a new function > which is not actually defined until patches #3 and #4, respectively. > This can make it difficult for people who are trying to use "git > bisect" to try to localize the root cause of a problem. > > Secondly, the introduction of a large number of wrapper functions > increases stack utilization, and makes the call depth deeper (and in > the long run, each increase in call depth makes the code that much > harder to trace through and understand), and so it should be done only > as last resort. Fortunately, there is a simpler way of fixing this > problem. I include the inter-diff below, but I will fold this into > the current blkdev_releasepage() patches that are in the ext4 patch > queue. > > Best regards, > > - Ted > > P.S. Note that this patch is functionally identical to what you > proposed in your patch series, but since the gfp_wait mask already > controlls whether or not log_wait_commit() is called, instead of > introducing a new functional parameter, we just mask off __GFP_WAIT > before calling jbd2_journal_try_to_free_buffers. I'll make a similar > change to the ext3 patch, and attach the two revised patches to this > mail thread. Through the idea as follows, I agree to your change. ----------------------------------------------------------------------------- To tell the truth, at first, I imagined the same patch as yours to fix this problem. But I have made another patch because I thought that ext3(or ext4) should not know the contents of the processing of journal_try_to_free_buffers in detail. (ext3 should not know there is a possibility to call journal_wait_for_transaction_sync_data from journal_try_to_free_buffers.) So, I have made a new function, journal_try_to_free_metadata_buffers to release only metadata buffer_heads. (I wanted a function which is the almost same as journal_try_to_free_buffers except calling journal_wait_for_transaction_sync_data from it.) However, this new function needed big changes, you know. I reconsidered what was the most suitable patch to fix this problem after I read your mail (patch). And then, I thought that it was important to make the most concise patch to fix only a root cause. Big patch is not easy to understand even if it is more logical one. Therefore, there is the fact that ext3_release_metadata must not sleep because it can get a spinlock, and then, only changing ext3_release_metadata to the logic to make it not sleep is the simplest fix for this problem. ----------------------------------------------------------------------------- Best Regards, Toshiyuki Okajima -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Dec 12, 2008 at 09:54:18AM +0900, Toshiyuki Okajima wrote: > To tell the truth, at first, I imagined the same patch as yours to fix this > problem. But I have made another patch because I thought that ext3(or ext4) > should not know the contents of the processing of journal_try_to_free_buffers > in detail. (ext3 should not know there is a possibility to call > journal_wait_for_transaction_sync_data from journal_try_to_free_buffers.) I agree, but ext3 doesn't need to know that. What my change did was to mask off the _GFP_WAIT flag, which prohibits the function it calls from blocking, because it knows that its caller is holding a spinlock. And actually, come to think of it. We can do even better; the right fix is to have blkdev_releasepage() mask off the _GFP_WAIT flag; this is the function which is taking spinlock, and by masking off the __GFP_WAIT flag, this is simply requesting all of the downstream functions not to block, but to do the best job they can do without blocking. It doesn't need to know whether it's going to call log_wait_commit(), or anything else; all it needs to do is request "please don't block". That means we only make the request once, in the function which is taking spinlock, so all of the per-filesystem implementations of release_metadata() don't need to know that its caller is holding a spinlock. - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Ted-san, Theodore Tso wrote: > On Fri, Dec 12, 2008 at 09:54:18AM +0900, Toshiyuki Okajima wrote: > > > To tell the truth, at first, I imagined the same patch as yours to fix this > > > problem. But I have made another patch because I thought that ext3(or ext4) > > > should not know the contents of the processing of journal_try_to_free_buffers > > > in detail. (ext3 should not know there is a possibility to call > > > journal_wait_for_transaction_sync_data from journal_try_to_free_buffers.) > > I agree, but ext3 doesn't need to know that. What my change did was > to mask off the _GFP_WAIT flag, which prohibits the function it calls > from blocking, because it knows that its caller is holding a spinlock. > > And actually, come to think of it. We can do even better; the right > fix is to have blkdev_releasepage() mask off the _GFP_WAIT flag; this > is the function which is taking spinlock, and by masking off the > __GFP_WAIT flag, this is simply requesting all of the downstream > functions not to block, but to do the best job they can do without > blocking. It doesn't need to know whether it's going to call > log_wait_commit(), or anything else; all it needs to do is request > "please don't block". > That means we only make the request once, in the function which is > taking spinlock, so all of the per-filesystem implementations of > release_metadata() don't need to know that its caller is holding a > spinlock. OK. I agree your last change. I also think blkdev_releasepage must do something so that downstream functions of it don't sleep. Masking off the _GFP_WAIT flag is the easiest achievement of it. Besides, I think it is not valid implementation that brings us a care about ei->client_releasepage's sleeping. Additional Information: I did an easy test with your last change and then I haven't experienced any errors. Regards, Toshiyuki Okajima -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index e77a059..543f3d0 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3049,6 +3049,12 @@ static int ext4_releasepage(struct page *page, gfp_t wait) * mapped via the block device. Since these pages could have journal heads * which would prevent try_to_free_buffers() from freeing them, we must use * jbd2 layer's try_to_free_buffers() function to release them. + * + * Note: we have to strip the __GFP_WAIT flag before calling + * jbd2_journal_try_to_free_buffers because blkdev_releasepage is + * called while holding a spinlock (bdev_inode.client_lock). + * Fortunately the metadata buffers we are interested are freed right + * away and do not require calling journal_wait_for_transaction_sync_data(). */ int ext4_release_metadata(void *client, struct page *page, gfp_t wait) { @@ -3061,7 +3067,8 @@ int ext4_release_metadata(void *client, struct page *page, gfp_t wait) BUG_ON(EXT4_SB(sb) == NULL); journal = EXT4_SB(sb)->s_journal; if (journal != NULL) - return jbd2_journal_try_to_free_buffers(journal, page, wait); + return jbd2_journal_try_to_free_buffers(journal, page, + wait & ~__GFP_WAIT); else return try_to_free_buffers(page); }