diff mbox series

[net] net/smc: cancel event worker during device removal

Message ID 20200306134518.84416-1-kgraul@linux.ibm.com
State Changes Requested
Delegated to: David Miller
Series [net] net/smc: cancel event worker during device removal

Commit Message

Karsten Graul March 6, 2020, 1:45 p.m. UTC
During IB device removal, cancel the event worker before the device
structure is freed. In the worker, check if the device is being
terminated and do not proceed with the event work in that case.

Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
---
 net/smc/smc_ib.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Leon Romanovsky March 8, 2020, 3:01 p.m. UTC | #1
On Fri, Mar 06, 2020 at 02:45:18PM +0100, Karsten Graul wrote:
> During IB device removal, cancel the event worker before the device
> structure is freed. In the worker, check if the device is being
> terminated and do not proceed with the event work in that case.
>
> Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
> Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
> ---
>  net/smc/smc_ib.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> index d6ba186f67e2..5e4e64a9aa4b 100644
> --- a/net/smc/smc_ib.c
> +++ b/net/smc/smc_ib.c
> @@ -240,6 +240,9 @@ static void smc_ib_port_event_work(struct work_struct *work)
>  		work, struct smc_ib_device, port_event_work);
>  	u8 port_idx;
>
> +	if (list_empty(&smcibdev->list))
> +		return;
> +

How can it be true if you are not holding "smc_ib_devices.lock" during
execution of smc_ib_port_event_work()?

>  	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
>  		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
>  		clear_bit(port_idx, &smcibdev->port_event_mask);
> @@ -582,6 +585,7 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
>  	smc_smcr_terminate_all(smcibdev);
>  	smc_ib_cleanup_per_ibdev(smcibdev);
>  	ib_unregister_event_handler(&smcibdev->event_handler);
> +	cancel_work_sync(&smcibdev->port_event_work);
>  	kfree(smcibdev);
>  }
>
> --
> 2.17.1
>
Karsten Graul March 8, 2020, 7:59 p.m. UTC | #2
On 08/03/2020 16:01, Leon Romanovsky wrote:
> On Fri, Mar 06, 2020 at 02:45:18PM +0100, Karsten Graul wrote:
>> During IB device removal, cancel the event worker before the device
>> structure is freed. In the worker, check if the device is being
>> terminated and do not proceed with the event work in that case.
>>
>> Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
>> Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
>> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
>> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
>> ---
>>  net/smc/smc_ib.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>> index d6ba186f67e2..5e4e64a9aa4b 100644
>> --- a/net/smc/smc_ib.c
>> +++ b/net/smc/smc_ib.c
>> @@ -240,6 +240,9 @@ static void smc_ib_port_event_work(struct work_struct *work)
>>  		work, struct smc_ib_device, port_event_work);
>>  	u8 port_idx;
>>
>> +	if (list_empty(&smcibdev->list))
>> +		return;
>> +
> 
> How can it be true if you are not holding "smc_ib_devices.lock" during
> execution of smc_ib_port_event_work()?
> 

It is true when smc_ib_remove_dev() runs before the work has actually started.
Other than that it's only a shortcut to return earlier. When the item is
removed from the list after the check, the processing just takes a
little bit longer... it's still safe.

>>  	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
>>  		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
>>  		clear_bit(port_idx, &smcibdev->port_event_mask);
>> @@ -582,6 +585,7 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
>>  	smc_smcr_terminate_all(smcibdev);
>>  	smc_ib_cleanup_per_ibdev(smcibdev);
>>  	ib_unregister_event_handler(&smcibdev->event_handler);
>> +	cancel_work_sync(&smcibdev->port_event_work);
>>  	kfree(smcibdev);
>>  }
>>
>> --
>> 2.17.1
>>
Leon Romanovsky March 9, 2020, 8:04 a.m. UTC | #3
On Sun, Mar 08, 2020 at 08:59:33PM +0100, Karsten Graul wrote:
> On 08/03/2020 16:01, Leon Romanovsky wrote:
> > On Fri, Mar 06, 2020 at 02:45:18PM +0100, Karsten Graul wrote:
> >> During IB device removal, cancel the event worker before the device
> >> structure is freed. In the worker, check if the device is being
> >> terminated and do not proceed with the event work in that case.
> >>
> >> Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
> >> Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
> >> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
> >> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
> >> ---
> >>  net/smc/smc_ib.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> >> index d6ba186f67e2..5e4e64a9aa4b 100644
> >> --- a/net/smc/smc_ib.c
> >> +++ b/net/smc/smc_ib.c
> >> @@ -240,6 +240,9 @@ static void smc_ib_port_event_work(struct work_struct *work)
> >>  		work, struct smc_ib_device, port_event_work);
> >>  	u8 port_idx;
> >>
> >> +	if (list_empty(&smcibdev->list))
> >> +		return;
> >> +
> >
> > How can it be true if you are not holding "smc_ib_devices.lock" during
> > execution of smc_ib_port_event_work()?
> >
>
> It is true when smc_ib_remove_dev() runs before the work has actually started.
> Other than that it's only a shortcut to return earlier. When the item is
> removed from the list after the check, the processing just takes a
> little bit longer... it's still safe.

The check itself may be safe, but it can't fix the syzkaller bug reported above.
As you said, smc_ib_remove_dev() can be called immediately after
your list_empty() check, and we return to the original behavior.

The correct design would be to ensure that smc_ib_port_event_work() is
executed only when smcibdev->list is not empty.

Thanks
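
To illustrate the point about the unlocked check, here is a minimal userspace
sketch of one way to keep the "is it still on the list?" test and the work
itself inside the same critical section as the unlink. It is only an analogy:
struct locked_dev, locked_event_work() and locked_unlink() are hypothetical
names, the pthread mutex stands in for "smc_ib_devices.lock", and a bool
replaces the list membership test. It is not the SMC code and not necessarily
the design meant above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct smc_ib_device (userspace analogy). */
struct locked_dev {
	pthread_mutex_t *lock;	/* plays the role of smc_ib_devices.lock */
	bool on_list;		/* plays the role of !list_empty(&dev->list) */
	int events_handled;
};

/*
 * Event work: the membership check and the processing happen under the
 * same lock, so an unlink cannot slip in between the two.
 * Returns true if the event was processed.
 */
static bool locked_event_work(struct locked_dev *dev)
{
	bool ran = false;

	assert(dev && dev->lock);
	pthread_mutex_lock(dev->lock);
	if (dev->on_list) {
		dev->events_handled++;
		ran = true;
	}
	pthread_mutex_unlock(dev->lock);
	return ran;
}

/* Removal side: unlink under the same lock the worker takes. */
static void locked_unlink(struct locked_dev *dev)
{
	assert(dev && dev->lock);
	pthread_mutex_lock(dev->lock);
	dev->on_list = false;
	pthread_mutex_unlock(dev->lock);
}
```

Note that this only orders the check against the unlink. It does not by itself
keep the structure alive while a worker is running, so the removal path still
has to wait for the worker before freeing the memory.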

>
> >>  	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
> >>  		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
> >>  		clear_bit(port_idx, &smcibdev->port_event_mask);
> >> @@ -582,6 +585,7 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
> >>  	smc_smcr_terminate_all(smcibdev);
> >>  	smc_ib_cleanup_per_ibdev(smcibdev);
> >>  	ib_unregister_event_handler(&smcibdev->event_handler);
> >> +	cancel_work_sync(&smcibdev->port_event_work);
> >>  	kfree(smcibdev);
> >>  }
> >>
> >> --
> >> 2.17.1
> >>
>
> --
> Karsten
>
> (I'm a dude)
>
Karsten Graul March 9, 2020, 9:40 a.m. UTC | #4
On 09/03/2020 09:04, Leon Romanovsky wrote:
> On Sun, Mar 08, 2020 at 08:59:33PM +0100, Karsten Graul wrote:
>> On 08/03/2020 16:01, Leon Romanovsky wrote:
>>> On Fri, Mar 06, 2020 at 02:45:18PM +0100, Karsten Graul wrote:
>>>> During IB device removal, cancel the event worker before the device
>>>> structure is freed. In the worker, check if the device is being
>>>> terminated and do not proceed with the event work in that case.
>>>>
>>>> Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
>>>> Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
>>>> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
>>>> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
>>>> ---
>>>>  net/smc/smc_ib.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
>>>> index d6ba186f67e2..5e4e64a9aa4b 100644
>>>> --- a/net/smc/smc_ib.c
>>>> +++ b/net/smc/smc_ib.c
>>>> @@ -240,6 +240,9 @@ static void smc_ib_port_event_work(struct work_struct *work)
>>>>  		work, struct smc_ib_device, port_event_work);
>>>>  	u8 port_idx;
>>>>
>>>> +	if (list_empty(&smcibdev->list))
>>>> +		return;
>>>> +
>>>
>>> How can it be true if you are not holding "smc_ib_devices.lock" during
>>> execution of smc_ib_port_event_work()?
>>>
>>
>> It is true when smc_ib_remove_dev() runs before the work has actually started.
>> Other than that it's only a shortcut to return earlier. When the item is
>> removed from the list after the check, the processing just takes a
>> little bit longer... it's still safe.
> 
> The check itself may be safe, but it can't fix the syzkaller bug reported above.
> As you said, smc_ib_remove_dev() can be called immediately after
> your list_empty() check, and we return to the original behavior.
> 
> The correct design would be to ensure that smc_ib_port_event_work() is
> executed only when smcibdev->list is not empty.
> 
> Thanks
> 

The fix I had in mind was the

	cancel_work_sync(&smcibdev->port_event_work);

to wait for a running port_event_work to finish before smcibdev is freed.
I can remove the list_empty() check if that is too confusing.
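
As a rough userspace analogy of that ordering (wait for the worker, then free),
the sketch below uses a pthread in place of the workqueue item. All names
(struct demo_dev, demo_add_dev(), demo_remove_dev(), DEMO_MAX_PORTS) are
hypothetical and exist only for this illustration. pthread_join() is only a
stand-in: unlike cancel_work_sync(), it cannot cancel work that has not
started yet, it just waits for the thread to finish.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define DEMO_MAX_PORTS 2	/* assumption for the demo only */

/* Hypothetical stand-in for struct smc_ib_device (userspace analogy). */
struct demo_dev {
	pthread_t worker;		/* plays the role of port_event_work */
	int worker_started;
	unsigned long port_event_mask;
	int ports_handled;
};

/* Worker body: it dereferences dev, so dev must outlive the worker. */
static void *demo_port_event_work(void *arg)
{
	struct demo_dev *dev = arg;
	int port_idx;

	for (port_idx = 0; port_idx < DEMO_MAX_PORTS; port_idx++) {
		if (dev->port_event_mask & (1UL << port_idx)) {
			dev->port_event_mask &= ~(1UL << port_idx);
			dev->ports_handled++;
		}
	}
	return NULL;
}

/* Allocate a device and start its event worker. */
static struct demo_dev *demo_add_dev(unsigned long mask)
{
	struct demo_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->port_event_mask = mask;
	if (pthread_create(&dev->worker, NULL, demo_port_event_work, dev) == 0)
		dev->worker_started = 1;
	return dev;
}

/*
 * Removal: wait for the worker first, and only then free the structure.
 * Returns how many ports the worker handled before the device went away.
 */
static int demo_remove_dev(struct demo_dev *dev)
{
	int handled;

	assert(dev != NULL);
	if (dev->worker_started)
		pthread_join(dev->worker, NULL);	/* analogous to cancel_work_sync() */
	handled = dev->ports_handled;
	free(dev);
	return handled;
}
```

Swapping the join and the free in demo_remove_dev() would reintroduce a
use-after-free of the same shape as the one this patch is meant to close.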

>>
>>>>  	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
>>>>  		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
>>>>  		clear_bit(port_idx, &smcibdev->port_event_mask);
>>>> @@ -582,6 +585,7 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
>>>>  	smc_smcr_terminate_all(smcibdev);
>>>>  	smc_ib_cleanup_per_ibdev(smcibdev);
>>>>  	ib_unregister_event_handler(&smcibdev->event_handler);
>>>> +	cancel_work_sync(&smcibdev->port_event_work);
>>>>  	kfree(smcibdev);
>>>>  }
>>>>
>>>> --
>>>> 2.17.1
>>>>
>>
>> --
>> Karsten
>>
>> (I'm a dude)
>>
Leon Romanovsky March 9, 2020, 1:19 p.m. UTC | #5
On Mon, Mar 09, 2020 at 10:40:16AM +0100, Karsten Graul wrote:
> On 09/03/2020 09:04, Leon Romanovsky wrote:
> > On Sun, Mar 08, 2020 at 08:59:33PM +0100, Karsten Graul wrote:
> >> On 08/03/2020 16:01, Leon Romanovsky wrote:
> >>> On Fri, Mar 06, 2020 at 02:45:18PM +0100, Karsten Graul wrote:
> >>>> During IB device removal, cancel the event worker before the device
> >>>> structure is freed. In the worker, check if the device is being
> >>>> terminated and do not proceed with the event work in that case.
> >>>>
> >>>> Fixes: a4cf0443c414 ("smc: introduce SMC as an IB-client")
> >>>> Reported-by: syzbot+b297c6825752e7a07272@syzkaller.appspotmail.com
> >>>> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
> >>>> Reviewed-by: Ursula Braun <ubraun@linux.ibm.com>
> >>>> ---
> >>>>  net/smc/smc_ib.c | 4 ++++
> >>>>  1 file changed, 4 insertions(+)
> >>>>
> >>>> diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
> >>>> index d6ba186f67e2..5e4e64a9aa4b 100644
> >>>> --- a/net/smc/smc_ib.c
> >>>> +++ b/net/smc/smc_ib.c
> >>>> @@ -240,6 +240,9 @@ static void smc_ib_port_event_work(struct work_struct *work)
> >>>>  		work, struct smc_ib_device, port_event_work);
> >>>>  	u8 port_idx;
> >>>>
> >>>> +	if (list_empty(&smcibdev->list))
> >>>> +		return;
> >>>> +
> >>>
> >>> How can it be true if you are not holding "smc_ib_devices.lock" during
> >>> execution of smc_ib_port_event_work()?
> >>>
> >>
> >> It is true when smc_ib_remove_dev() runs before the work has actually started.
> >> Other than that it's only a shortcut to return earlier. When the item is
> >> removed from the list after the check, the processing just takes a
> >> little bit longer... it's still safe.
> >
> > The check itself may be safe, but it can't fix the syzkaller bug reported above.
> > As you said, smc_ib_remove_dev() can be called immediately after
> > your list_empty() check, and we return to the original behavior.
> >
> > The correct design would be to ensure that smc_ib_port_event_work() is
> > executed only when smcibdev->list is not empty.
> >
> > Thanks
> >
>
> The fix I had in mind was the
>
> 	cancel_work_sync(&smcibdev->port_event_work);
>
> to wait for a running port_event_work to finish before smcibdev is freed.
> I can remove the list_empty() check if that is too confusing.

Yes, please.

Thanks

>
> >>
> >>>>  	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
> >>>>  		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
> >>>>  		clear_bit(port_idx, &smcibdev->port_event_mask);
> >>>> @@ -582,6 +585,7 @@ static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
> >>>>  	smc_smcr_terminate_all(smcibdev);
> >>>>  	smc_ib_cleanup_per_ibdev(smcibdev);
> >>>>  	ib_unregister_event_handler(&smcibdev->event_handler);
> >>>> +	cancel_work_sync(&smcibdev->port_event_work);
> >>>>  	kfree(smcibdev);
> >>>>  }
> >>>>
> >>>> --
> >>>> 2.17.1
> >>>>
> >>
> >> --
> >> Karsten
> >>
> >> (I'm a dude)
> >>
>
> --
> Karsten
>
> (I'm a dude)
>

Patch

diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index d6ba186f67e2..5e4e64a9aa4b 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -240,6 +240,9 @@  static void smc_ib_port_event_work(struct work_struct *work)
 		work, struct smc_ib_device, port_event_work);
 	u8 port_idx;
 
+	if (list_empty(&smcibdev->list))
+		return;
+
 	for_each_set_bit(port_idx, &smcibdev->port_event_mask, SMC_MAX_PORTS) {
 		smc_ib_remember_port_attr(smcibdev, port_idx + 1);
 		clear_bit(port_idx, &smcibdev->port_event_mask);
@@ -582,6 +585,7 @@  static void smc_ib_remove_dev(struct ib_device *ibdev, void *client_data)
 	smc_smcr_terminate_all(smcibdev);
 	smc_ib_cleanup_per_ibdev(smcibdev);
 	ib_unregister_event_handler(&smcibdev->event_handler);
+	cancel_work_sync(&smcibdev->port_event_work);
 	kfree(smcibdev);
 }