[Trusty] Hang and corruption in dcache_shrink_list

Message ID 53EA1DD8.90601@canonical.com
State New
Pull-request

git://kernel.ubuntu.com/rtg/ubuntu-trusty.git

Message

Tim Gardner Aug. 12, 2014, 1:59 p.m. UTC
On 08/08/2014 12:18 PM, Dave Chiluk wrote:
> BugLink: http://bugs.launchpad.net/bugs/1354157
> 
> The patchset described here and committed to upstream
> https://lkml.org/lkml/2014/5/4/7
> can be cleanly cherry-picked, and has been applied to
> git://kernel.ubuntu.com/chiluk/ubuntu-trusty.git branch 70318
> The end user reports that the above patch set fixes the issue.
> 
> I'd like to get some additional review on this patchset before
> submitting it for actual inclusion.
> 
> That all being said, I think we should strongly consider pulling in
> 046b961b4
> b2b80195d
> 9f12600fe
> and b2b80195d88
> as well, as these look to be more related to a possible lock starvation
> case.
> 
> Open for discussion.
> Dave.
> 

I did an identical patch set for a different bug
(http://bugs.launchpad.net/bugs/1354234). Shall we dup them?

The following changes since commit 0a0caf5157bffb6340de804b3a8c94e045f68fb7:

  ahci_xgene: Use correct OOB tunning parameters for APM X-Gene SoC AHCI
SATA Host controller driver. (2014-08-12 06:40:52 -0600)

are available in the git repository at:

  git://kernel.ubuntu.com/rtg/ubuntu-trusty.git lp1354234-dcache-shrink-list-corruption

for you to fetch changes up to 3b1cbde08418dbd8cc2d819b9e7412303818deba:

  dcache: don't need rcu in shrink_dentry_list() (2014-08-12 07:47:39 -0600)

----------------------------------------------------------------
Al Viro (7):
      fold d_kill() and d_free()
      fold try_prune_one_dentry()
      new helper: dentry_free()
      expand the call of dentry_lru_del() in dentry_kill()
      dentry_kill(): don't try to remove from shrink list
      don't remove from shrink list in select_collect()
      more graceful recovery in umount_collect()

Miklos Szeredi (1):
      dcache: don't need rcu in shrink_dentry_list()

 fs/dcache.c            | 315 ++++++++++++++++----------------------------
 include/linux/dcache.h |   2 ++
 2 files changed, 103 insertions(+), 214 deletions(-)
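
For anyone reviewing, the request above could be fetched and checked roughly as follows. This is only a sketch: the repository, branch, and commit IDs are copied from the request, while the helper name review_pull is hypothetical, and it assumes you run it inside a clone of the Trusty kernel tree that already contains the base commit.

```shell
#!/bin/sh
# Sketch only: all four values are copied from the pull request above.
REPO=git://kernel.ubuntu.com/rtg/ubuntu-trusty.git
BRANCH=lp1354234-dcache-shrink-list-corruption
BASE=0a0caf5157bffb6340de804b3a8c94e045f68fb7
TIP=3b1cbde08418dbd8cc2d819b9e7412303818deba

# review_pull is a hypothetical helper name, not part of any tool.
# Run it inside a clone that already contains $BASE.
review_pull() {
    # Fetch the advertised branch; the result is recorded in FETCH_HEAD.
    git fetch "$REPO" "$BRANCH" || return 1
    # Check that the fetched tip matches the commit named in the request.
    [ "$(git rev-parse FETCH_HEAD)" = "$TIP" ] || return 1
    # List the commits on top of the base, oldest first, then the diffstat.
    git log --oneline --reverse "$BASE..FETCH_HEAD"
    git diff --stat "$BASE" FETCH_HEAD
}
```

Calling review_pull with no arguments should print the eight commits from the shortlog followed by the diffstat; a non-zero return means the fetch failed or the tip did not match.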

Comments

Dave Chiluk Aug. 14, 2014, 6:58 p.m. UTC | #1
I did some more review of the recent dcache changes that have gone in
upstream, and I think it's appropriate for us to bring those in as well.

I have pushed my branch 70318 to
http://kernel.ubuntu.com/git?p=chiluk/ubuntu-trusty.git;a=shortlog;h=refs/heads/70318
for review.

Basically I included all upstream patches related to locking (or their
requisite code cleanup patches) that pertain to dcache_shrink_list.

Dave.

Tim Gardner Aug. 15, 2014, 1:42 p.m. UTC | #2
Dave - if you would log the type and results of your testing in the bug
report, I'd be willing to apply this patch set. It looks like these
patches will solve at least two reporters' problems.

rtg


Dave Chiluk Aug. 19, 2014, 2:19 p.m. UTC | #3
I was able to run xfstests to completion without a hang or crash, and I
was also able to run the script described in
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1354157
without issue. That said, I was unable to induce the crash with that
script earlier. The user has also reported to me that the above fixes
have resolved their issues.

So rtg, I think you are good to go ahead and pull those fixes.

Dave Chiluk



Tim Gardner Aug. 19, 2014, 2:39 p.m. UTC | #4