
[bpf-next,V2] net/xdp: Fix suspicious RCU usage warning

Message ID 1534143879-8380-1-git-send-email-tariqt@mellanox.com
State Changes Requested, archived

Commit Message

Tariq Toukan Aug. 13, 2018, 7:04 a.m. UTC
Fix the warning below by calling rhashtable_lookup_fast(), which takes
the RCU read lock itself, instead of rhashtable_lookup(), which expects
the caller to hold it. Also restructure the surrounding code for
readability.

[  342.450870] WARNING: suspicious RCU usage
[  342.455856] 4.18.0-rc2+ #17 Tainted: G           O
[  342.462210] -----------------------------
[  342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage!
[  342.476568]
[  342.476568] other info that might help us debug this:
[  342.476568]
[  342.486978]
[  342.486978] rcu_scheduler_active = 2, debug_locks = 1
[  342.495211] 4 locks held by modprobe/3934:
[  342.500265]  #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core]
[  342.511953]  #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20
[  342.521109]  #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core]
[  342.531642]  #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0
[  342.541206]
[  342.541206] stack backtrace:
[  342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G           O      4.18.0-rc2+ #17
[  342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015
[  342.565606] Call Trace:
[  342.568861]  dump_stack+0x78/0xb3
[  342.573086]  xdp_rxq_info_unreg+0x3f5/0x6b0
[  342.578285]  ? __call_rcu+0x220/0x300
[  342.582911]  mlx5e_free_rq+0x38/0xc0 [mlx5_core]
[  342.588602]  mlx5e_close_channel+0x20/0x120 [mlx5_core]
[  342.594976]  mlx5e_close_channels+0x26/0x40 [mlx5_core]
[  342.601345]  mlx5e_close_locked+0x44/0x50 [mlx5_core]
[  342.607519]  mlx5e_close+0x42/0x60 [mlx5_core]
[  342.613005]  __dev_close_many+0xb1/0x120
[  342.617911]  dev_close_many+0xa2/0x170
[  342.622622]  rollback_registered_many+0x148/0x460
[  342.628401]  ? __lock_acquire+0x48d/0x11b0
[  342.633498]  ? unregister_netdev+0xe/0x20
[  342.638495]  rollback_registered+0x56/0x90
[  342.643588]  unregister_netdevice_queue+0x7e/0x100
[  342.649461]  unregister_netdev+0x18/0x20
[  342.654362]  mlx5e_remove+0x2a/0x50 [mlx5_core]
[  342.659944]  mlx5_remove_device+0xe5/0x110 [mlx5_core]
[  342.666208]  mlx5_unregister_interface+0x39/0x90 [mlx5_core]
[  342.673038]  cleanup+0x5/0xbfc [mlx5_core]
[  342.678094]  __x64_sys_delete_module+0x16b/0x240
[  342.683725]  ? do_syscall_64+0x1c/0x210
[  342.688476]  do_syscall_64+0x5a/0x210
[  342.693025]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/xdp.c | 13 +++----------
 1 file changed, 3 insertions(+), 10 deletions(-)

V1 -> V2:
* Use rhashtable_lookup_fast and make some code movements, per Daniel's
  and Alexei's comments.

Please queue to -stable v4.18.
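
The difference between the two lookup helpers can be illustrated with a small userspace sketch. Everything named mock_* below, along with rcu_depth and warned, is a hypothetical stand-in and not kernel API: the counter models RCU read-side nesting, and the flag models the lockdep "suspicious RCU usage" report. The sketch assumes, as the lock list in the trace suggests, that holding mem_id_lock alone does not satisfy the RCU check in rhashtable_lookup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of RCU read-side nesting depth (not kernel API). */
static int rcu_depth;
/* Hypothetical flag standing in for the lockdep warning. */
static bool warned;

static void mock_rcu_read_lock(void)
{
	rcu_depth++;
}

static void mock_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Models rhashtable_lookup(): the caller must already be inside an
 * RCU read-side critical section, otherwise the check complains. */
static const int *mock_lookup(const int *slot)
{
	if (rcu_depth == 0)
		warned = true;
	return slot;
}

/* Models rhashtable_lookup_fast(): wraps the plain lookup in its own
 * RCU read-side critical section, so the caller needs no RCU lock. */
static const int *mock_lookup_fast(const int *slot)
{
	const int *obj;

	mock_rcu_read_lock();
	obj = mock_lookup(slot);
	mock_rcu_read_unlock();
	return obj;
}
```

Calling mock_lookup() with rcu_depth at zero sets the flag, while mock_lookup_fast() does not, which mirrors why switching helpers is expected to silence the warning.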

Comments

Jesper Dangaard Brouer Aug. 13, 2018, 9:13 a.m. UTC | #1
On Mon, 13 Aug 2018 10:04:39 +0300
Tariq Toukan <tariqt@mellanox.com> wrote:

> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 3dd99e1c04f5..8b1c7b699982 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>  
>  	mutex_lock(&mem_id_lock);
>  
> -	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
> -	if (!xa) {
> -		mutex_unlock(&mem_id_lock);
> -		return;
> -	}
> -
> -	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
> -	WARN_ON(err);
> -
> -	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
> +	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
> +	if (xa && rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
> +		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>  
>  	mutex_unlock(&mem_id_lock);
>  }

This is wrong.

The function rhashtable_remove_fast() returns zero on success.
Please fix and send a V3.

Look at the example in [1], section "Object removal".
[1] https://lwn.net/Articles/751374/
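
The return-value issue can be reproduced in a minimal userspace sketch. The mock table and the three helper functions are hypothetical stand-ins, not the kernel implementation; the only property carried over from the kernel is the documented convention that rhashtable_remove_fast() returns 0 on success and -ENOENT when the entry is not found. The sketch is not the V3 patch, only an illustration of the two checks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot table standing in for mem_id_ht. */
struct mock_table {
	int key;        /* key of the stored entry */
	bool occupied;  /* true while the entry is in the table */
};

/* Follows the rhashtable_remove_fast() return convention:
 * 0 on success, -ENOENT if the entry is not in the table. */
static int mock_remove_fast(struct mock_table *ht, int key)
{
	assert(ht != NULL);
	if (!ht->occupied || ht->key != key)
		return -ENOENT;
	ht->occupied = false;
	return 0;
}

/* V2-style check: a nonzero return is treated as "removed", so the
 * free step would be skipped exactly when removal succeeds. */
static bool v2_would_free(struct mock_table *ht, int key)
{
	return mock_remove_fast(ht, key) != 0;
}

/* Corrected check: free only when removal returned 0. */
static bool fixed_would_free(struct mock_table *ht, int key)
{
	return mock_remove_fast(ht, key) == 0;
}
```

With an entry present, v2_would_free() removes it but reports false, so in the real code the allocator would be unlinked from the table and never handed to call_rcu(). fixed_would_free() reports true in the same situation.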
Tariq Toukan Aug. 13, 2018, 9:24 a.m. UTC | #2
On 13/08/2018 12:13 PM, Jesper Dangaard Brouer wrote:
> On Mon, 13 Aug 2018 10:04:39 +0300
> Tariq Toukan <tariqt@mellanox.com> wrote:
> 
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 3dd99e1c04f5..8b1c7b699982 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>>   
>>   	mutex_lock(&mem_id_lock);
>>   
>> -	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
>> -	if (!xa) {
>> -		mutex_unlock(&mem_id_lock);
>> -		return;
>> -	}
>> -
>> -	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
>> -	WARN_ON(err);
>> -
>> -	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>> +	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
>> +	if (xa && rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
>> +		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>>   
>>   	mutex_unlock(&mem_id_lock);
>>   }
> 
> This is wrong.
> 
> The function rhashtable_remove_fast() returns zero on success.
> Please fix and send a V3.
> 
> Look at the example in [1], section "Object removal".
> [1] https://lwn.net/Articles/751374/
> 

Right, thanks Jesper!
V3 Sent.
Daniel Borkmann Aug. 13, 2018, 9:26 a.m. UTC | #3
On 08/13/2018 11:13 AM, Jesper Dangaard Brouer wrote:
> On Mon, 13 Aug 2018 10:04:39 +0300
> Tariq Toukan <tariqt@mellanox.com> wrote:
> 
>> diff --git a/net/core/xdp.c b/net/core/xdp.c
>> index 3dd99e1c04f5..8b1c7b699982 100644
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -105,16 +105,9 @@ static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
>>  
>>  	mutex_lock(&mem_id_lock);
>>  
>> -	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
>> -	if (!xa) {
>> -		mutex_unlock(&mem_id_lock);
>> -		return;
>> -	}
>> -
>> -	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
>> -	WARN_ON(err);
>> -
>> -	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>> +	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
>> +	if (xa && rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
>> +		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
>>  
>>  	mutex_unlock(&mem_id_lock);
>>  }
> 
> This is wrong.
> 
> The function rhashtable_remove_fast() returns zero on success.
> Please fix and send a V3.

Good catch. The suggestion in https://patchwork.ozlabs.org/patch/945121/ did
check for success, so it seems this slipped through while adapting it.

> Look at the example in [1], section "Object removal".
> [1] https://lwn.net/Articles/751374/

Patch

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 3dd99e1c04f5..8b1c7b699982 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -105,16 +105,9 @@  static void __xdp_rxq_info_unreg_mem_model(struct xdp_rxq_info *xdp_rxq)
 
 	mutex_lock(&mem_id_lock);
 
-	xa = rhashtable_lookup(mem_id_ht, &id, mem_id_rht_params);
-	if (!xa) {
-		mutex_unlock(&mem_id_lock);
-		return;
-	}
-
-	err = rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params);
-	WARN_ON(err);
-
-	call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
+	xa = rhashtable_lookup_fast(mem_id_ht, &id, mem_id_rht_params);
+	if (xa && rhashtable_remove_fast(mem_id_ht, &xa->node, mem_id_rht_params))
+		call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
 
 	mutex_unlock(&mem_id_lock);
 }