scsi: scsi_transport_sas: Fix memory leak when removing devices
diff mbox series

Message ID 20200109081206.14813-1-ike.pan@canonical.com
State New
Headers show
Series
  • scsi: scsi_transport_sas: Fix memory leak when removing devices
Related show

Commit Message

Ike Panhc Jan. 9, 2020, 8:12 a.m. UTC
From: John Garry <john.garry@huawei.com>

BugLink: https://launchpad.net/bugs/1854550

Removing a non-host rphy causes a memory leak:

root@(none)$ echo 0 > /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:10/phy-0:0:10/sas_phy/phy-0:0:10/enable
[   79.857888] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone
root@(none)$ echo scan > /sys/kernel/debug/kmemleak
[  131.656603] kmemleak: 3 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@(none)$ more /sys/kernel/debug/kmemleak
unreferenced object 0xffff041da5c66000 (size 256):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 5e c6 a5 1d 04 ff ff 01 00 00 00 00 00 00 00  .^..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] kmem_cache_alloc+0x188/0x260
    [<(____ptrval____)>] bsg_setup_queue+0x48/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff041d8c075400 (size 128):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 40 25 97 1d 00 ff ff 00 00 00 00 00 00 00 00  .@%.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
    [<(____ptrval____)>] blk_mq_realloc_tag_set_tags.part.70+0x48/0xd8
    [<(____ptrval____)>] blk_mq_alloc_tag_set+0x1dc/0x530
    [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff041da5c65e00 (size 256):
  comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
    [<(____ptrval____)>] blk_mq_alloc_tag_set+0x254/0x530
    [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
    [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
    [<(____ptrval____)>] sas_probe_devices+0x168/0x208
    [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
    [<(____ptrval____)>] process_one_work+0x3f8/0x690
    [<(____ptrval____)>] worker_thread+0x70/0x6a0
    [<(____ptrval____)>] kthread+0x1b8/0x1c0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
root@(none)$

It turns out that we don't clean up the request queue fully for bsg
devices, as the blk mq tags for the request queue are not freed.

Fix by doing the queue removal in one place - in sas_rphy_remove() -
instead of unregistering the queue in sas_rphy_remove() and finally
cleaning up the queue in calling blk_cleanup_queue() from
sas_end_device_release() or sas_expander_release().

Function bsg_remove_queue() can handle a NULL pointer q, so remove the
precheck in sas_rphy_remove().

Fixes: 651a013649943 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
Link: https://lore.kernel.org/r/1574242755-94156-1-git-send-email-john.garry@huawei.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit 82ea3e0e129e2ab913dd6684bab7a6e5e9896dee)
Signed-off-by: Ike Panhc <ike.pan@canonical.com>
---
 drivers/scsi/scsi_transport_sas.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

Comments

Ike Panhc Jan. 9, 2020, 8:17 a.m. UTC | #1
Forget to edit the subject. Patch is for focal.
Seth Forshee Jan. 21, 2020, 11:11 p.m. UTC | #2
On Thu, Jan 09, 2020 at 04:12:06PM +0800, Ike Panhc wrote:
> From: John Garry <john.garry@huawei.com>
> 
> BugLink: https://launchpad.net/bugs/1854550
> 
> Removing a non-host rphy causes a memory leak:
> 
> root@(none)$ echo 0 > /sys/devices/platform/HISI0162:01/host0/port-0:0/expander-0:0/port-0:0:10/phy-0:0:10/sas_phy/phy-0:0:10/enable
> [   79.857888] hisi_sas_v2_hw HISI0162:01: dev[7:1] is gone
> root@(none)$ echo scan > /sys/kernel/debug/kmemleak
> [  131.656603] kmemleak: 3 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> root@(none)$ more /sys/kernel/debug/kmemleak
> unreferenced object 0xffff041da5c66000 (size 256):
>   comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
>   hex dump (first 32 bytes):
>     00 5e c6 a5 1d 04 ff ff 01 00 00 00 00 00 00 00  .^..............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<(____ptrval____)>] kmem_cache_alloc+0x188/0x260
>     [<(____ptrval____)>] bsg_setup_queue+0x48/0x1a8
>     [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
>     [<(____ptrval____)>] sas_probe_devices+0x168/0x208
>     [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
>     [<(____ptrval____)>] process_one_work+0x3f8/0x690
>     [<(____ptrval____)>] worker_thread+0x70/0x6a0
>     [<(____ptrval____)>] kthread+0x1b8/0x1c0
>     [<(____ptrval____)>] ret_from_fork+0x10/0x18
> unreferenced object 0xffff041d8c075400 (size 128):
>   comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
>   hex dump (first 32 bytes):
>     00 40 25 97 1d 00 ff ff 00 00 00 00 00 00 00 00  .@%.............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
>     [<(____ptrval____)>] blk_mq_realloc_tag_set_tags.part.70+0x48/0xd8
>     [<(____ptrval____)>] blk_mq_alloc_tag_set+0x1dc/0x530
>     [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
>     [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
>     [<(____ptrval____)>] sas_probe_devices+0x168/0x208
>     [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
>     [<(____ptrval____)>] process_one_work+0x3f8/0x690
>     [<(____ptrval____)>] worker_thread+0x70/0x6a0
>     [<(____ptrval____)>] kthread+0x1b8/0x1c0
>     [<(____ptrval____)>] ret_from_fork+0x10/0x18
> unreferenced object 0xffff041da5c65e00 (size 256):
>   comm "kworker/u128:1", pid 549, jiffies 4294898543 (age 113.728s)
>   hex dump (first 32 bytes):
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<(____ptrval____)>] __kmalloc_node+0x1a8/0x2c8
>     [<(____ptrval____)>] blk_mq_alloc_tag_set+0x254/0x530
>     [<(____ptrval____)>] bsg_setup_queue+0xe8/0x1a8
>     [<(____ptrval____)>] sas_rphy_add+0x108/0x2d0
>     [<(____ptrval____)>] sas_probe_devices+0x168/0x208
>     [<(____ptrval____)>] sas_discover_domain+0x660/0x9c8
>     [<(____ptrval____)>] process_one_work+0x3f8/0x690
>     [<(____ptrval____)>] worker_thread+0x70/0x6a0
>     [<(____ptrval____)>] kthread+0x1b8/0x1c0
>     [<(____ptrval____)>] ret_from_fork+0x10/0x18
> root@(none)$
> 
> It turns out that we don't clean up the request queue fully for bsg
> devices, as the blk mq tags for the request queue are not freed.
> 
> Fix by doing the queue removal in one place - in sas_rphy_remove() -
> instead of unregistering the queue in sas_rphy_remove() and finally
> cleaning up the queue in calling blk_cleanup_queue() from
> sas_end_device_release() or sas_expander_release().
> 
> Function bsg_remove_queue() can handle a NULL pointer q, so remove the
> precheck in sas_rphy_remove().
> 
> Fixes: 651a013649943 ("scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough")
> Link: https://lore.kernel.org/r/1574242755-94156-1-git-send-email-john.garry@huawei.com
> Signed-off-by: John Garry <john.garry@huawei.com>
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> (cherry picked from commit 82ea3e0e129e2ab913dd6684bab7a6e5e9896dee)
> Signed-off-by: Ike Panhc <ike.pan@canonical.com>

Applied to focal/master-next, thanks!

Patch
diff mbox series

diff --git a/drivers/scsi/scsi_transport_sas.c b/drivers/scsi/scsi_transport_sas.c
index ef138c57e2a6..182fd25c7c43 100644
--- a/drivers/scsi/scsi_transport_sas.c
+++ b/drivers/scsi/scsi_transport_sas.c
@@ -1391,9 +1391,6 @@  static void sas_expander_release(struct device *dev)
 	struct sas_rphy *rphy = dev_to_rphy(dev);
 	struct sas_expander_device *edev = rphy_to_expander_device(rphy);
 
-	if (rphy->q)
-		blk_cleanup_queue(rphy->q);
-
 	put_device(dev->parent);
 	kfree(edev);
 }
@@ -1403,9 +1400,6 @@  static void sas_end_device_release(struct device *dev)
 	struct sas_rphy *rphy = dev_to_rphy(dev);
 	struct sas_end_device *edev = rphy_to_end_device(rphy);
 
-	if (rphy->q)
-		blk_cleanup_queue(rphy->q);
-
 	put_device(dev->parent);
 	kfree(edev);
 }
@@ -1634,8 +1628,7 @@  sas_rphy_remove(struct sas_rphy *rphy)
 	}
 
 	sas_rphy_unlink(rphy);
-	if (rphy->q)
-		bsg_unregister_queue(rphy->q);
+	bsg_remove_queue(rphy->q);
 	transport_remove_device(dev);
 	device_del(dev);
 }