[1/2] fs: ext4: use BUG_ON if writepage call comes from direct reclaim

Message ID 1530591079-33813-1-git-send-email-yang.shi@linux.alibaba.com
State New

Commit Message

Yang Shi July 3, 2018, 4:11 a.m.
Direct reclaim doesn't write out filesystem pages; only kswapd can do
that. So, if the call comes from direct reclaim, it is definitely a bug.

Mel Gorman also mentioned "Ultimately, this will be a BUG_ON." in
commit 94054fa3fca1fd78db02cb3d68d5627120f0a1d4 ("xfs: warn if direct
reclaim tries to writeback pages").

Although that commit is for xfs, ext4 has similar behavior, so elevate
WARN_ON to BUG_ON.

Also correct the comment accordingly.

Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
---
 fs/ext4/inode.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Theodore Y. Ts'o July 3, 2018, 10:39 a.m. | #1
On Tue, Jul 03, 2018 at 12:11:18PM +0800, Yang Shi wrote:
> direct reclaim doesn't write out filesystem page, only kswapd could do
> it. So, if the call comes from direct reclaim, it is definitely a bug.
> 
> And, Mel Gorman also mentioned "Ultimately, this will be a BUG_ON." In
> commit 94054fa3fca1fd78db02cb3d68d5627120f0a1d4 ("xfs: warn if direct
> reclaim tries to writeback pages").
> 
> Although it is for xfs, ext4 has the similar behavior, so elevate
> WARN_ON to BUG_ON.
> 
> And, correct the comment accordingly.
> 
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: Andreas Dilger <adilger.kernel@dilger.ca>
> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>

What's the upside of crashing the kernel if the file system can handle it?

       	   	     	      	  	    - Ted
Yang Shi July 3, 2018, 5:05 p.m. | #2
On 7/3/18 3:39 AM, Theodore Y. Ts'o wrote:
> On Tue, Jul 03, 2018 at 12:11:18PM +0800, Yang Shi wrote:
>> direct reclaim doesn't write out filesystem page, only kswapd could do
>> it. So, if the call comes from direct reclaim, it is definitely a bug.
>>
>> And, Mel Gorman also mentioned "Ultimately, this will be a BUG_ON." In
>> commit 94054fa3fca1fd78db02cb3d68d5627120f0a1d4 ("xfs: warn if direct
>> reclaim tries to writeback pages").
>>
>> Although it is for xfs, ext4 has the similar behavior, so elevate
>> WARN_ON to BUG_ON.
>>
>> And, correct the comment accordingly.
>>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: "Theodore Ts'o" <tytso@mit.edu>
>> Cc: Andreas Dilger <adilger.kernel@dilger.ca>
>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> What's the upside of crashing the kernel if the file system can handle it?

I'm not sure it is a good choice to let the filesystem handle such a vital
VM regression. IMHO, writing out a filesystem page from direct reclaim
context is a vital VM bug. It means something is definitely wrong in the VM.
It should never happen.

It sounds OK to have the filesystem throw a warning and handle it, but I'm
worried that someone will just ignore the warning, and it should *never*
be ignored.

Yang

>
>         	   	     	      	  	    - Ted
Yang Shi July 3, 2018, 11:10 p.m. | #3
On 7/3/18 10:05 AM, Yang Shi wrote:
>
>
> On 7/3/18 3:39 AM, Theodore Y. Ts'o wrote:
>> On Tue, Jul 03, 2018 at 12:11:18PM +0800, Yang Shi wrote:
>>> direct reclaim doesn't write out filesystem page, only kswapd could do
>>> it. So, if the call comes from direct reclaim, it is definitely a bug.
>>>
>>> And, Mel Gorman also mentioned "Ultimately, this will be a BUG_ON." In
>>> commit 94054fa3fca1fd78db02cb3d68d5627120f0a1d4 ("xfs: warn if direct
>>> reclaim tries to writeback pages").
>>>
>>> Although it is for xfs, ext4 has the similar behavior, so elevate
>>> WARN_ON to BUG_ON.
>>>
>>> And, correct the comment accordingly.
>>>
>>> Cc: Mel Gorman <mgorman@techsingularity.net>
>>> Cc: "Theodore Ts'o" <tytso@mit.edu>
>>> Cc: Andreas Dilger <adilger.kernel@dilger.ca>
>>> Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
>> What's the upside of crashing the kernel if the file system can
>> handle it?

BTW, the comment does sound misleading. Direct reclaim is not a
legitimate context to call writepage. I'd like to correct that at least.

Thanks,
Yang

>
> I'm not sure if it is a good choice to let filesystem handle such 
> vital VM regression. IMHO, writing out filesystem page from direct 
> reclaim context is a vital VM bug. It means something is definitely 
> wrong in VM. It should never happen.
>
> It sounds ok to have filesystem throw out warning and handle it, but 
> I'm not sure if someone will just ignore the warning, but it should 
> *never* be ignored.
>
> Yang
>
>>
>>                                                 - Ted
>
Theodore Y. Ts'o July 3, 2018, 11:43 p.m. | #4
On Tue, Jul 03, 2018 at 10:05:04AM -0700, Yang Shi wrote:
> I'm not sure if it is a good choice to let filesystem handle such vital VM
> regression. IMHO, writing out filesystem page from direct reclaim context is
> a vital VM bug. It means something is definitely wrong in VM. It should
> never happen.

If it does happen, it should happen reliably; this isn't the sort of
thing where some linked list had gotten corrupted.  This would be a
structural problem in the VM code.

So presumably, if the WARN_ON triggered, it should be noticed by VM
developers, and they should fix it.

In general, though, BUG_ON's should be avoided unless there really is
no way to recover.

> It sounds ok to have filesystem throw out warning and handle it, but I'm not
> sure if someone will just ignore the warning, but it should *never* be
> ignored.

If a kernel developer (a VM developer in this case) ignores a warning,
that's just simply professional malpractice.  In general WARN_ON's
should only be used as a sign of a kernel bug.  So they should never
be ignored.

						- Ted
Michal Hocko July 4, 2018, 2:03 p.m. | #5
On Tue 03-07-18 10:05:04, Yang Shi wrote:
> 
> 
> On 7/3/18 3:39 AM, Theodore Y. Ts'o wrote:
> > On Tue, Jul 03, 2018 at 12:11:18PM +0800, Yang Shi wrote:
> > > direct reclaim doesn't write out filesystem page, only kswapd could do
> > > it. So, if the call comes from direct reclaim, it is definitely a bug.
> > > 
> > > And, Mel Gorman also mentioned "Ultimately, this will be a BUG_ON." In
> > > commit 94054fa3fca1fd78db02cb3d68d5627120f0a1d4 ("xfs: warn if direct
> > > reclaim tries to writeback pages").
> > > 
> > > Although it is for xfs, ext4 has the similar behavior, so elevate
> > > WARN_ON to BUG_ON.
> > > 
> > > And, correct the comment accordingly.
> > > 
> > > Cc: Mel Gorman <mgorman@techsingularity.net>
> > > Cc: "Theodore Ts'o" <tytso@mit.edu>
> > > Cc: Andreas Dilger <adilger.kernel@dilger.ca>
> > > Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
> > What's the upside of crashing the kernel if the file system can handle it?
> 
> I'm not sure if it is a good choice to let filesystem handle such vital VM
> regression. IMHO, writing out filesystem page from direct reclaim context is
> a vital VM bug. It means something is definitely wrong in VM. It should
> never happen.

Could you be more specific about the vital part please? Issuing
writeback from direct reclaim surely can be sub-optimal. But since
we have quite large stacks, it shouldn't overflow immediately even for
more complex storage setups. So what is the _vital_ bug here?

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2ea07ef..089e388 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2071,7 +2071,7 @@  static int __ext4_journalled_writepage(struct page *page,
  * This function can get called via...
  *   - ext4_writepages after taking page lock (have journal handle)
  *   - journal_submit_inode_data_buffers (no journal handle)
- *   - shrink_page_list via the kswapd/direct reclaim (no journal handle)
+ *   - shrink_page_list via the kswapd (no journal handle)
  *   - grab_page_cache when doing write_begin (have journal handle)
  *
  * We don't do any block allocation in this function. If we have page with
@@ -2148,10 +2148,10 @@  static int ext4_writepage(struct page *page,
 		    (inode->i_sb->s_blocksize == PAGE_SIZE)) {
 			/*
 			 * For memory cleaning there's no point in writing only
-			 * some buffers. So just bail out. Warn if we came here
-			 * from direct reclaim.
+			 * some buffers. So just bail out. It is a bug if we
+			 * came here from direct reclaim.
 			 */
-			WARN_ON_ONCE((current->flags & (PF_MEMALLOC|PF_KSWAPD))
+			BUG_ON((current->flags & (PF_MEMALLOC|PF_KSWAPD))
 							== PF_MEMALLOC);
 			unlock_page(page);
 			return 0;