[net] r8152: Re-order napi_disable in rtl8152_close

Message ID 20191120194020.8796-1-pmalani@chromium.org
State Accepted
Delegated to: David Miller

Commit Message

Prashant Malani Nov. 20, 2019, 7:40 p.m. UTC
Both rtl_work_func_t() and rtl8152_close() call napi_disable().
Since the two calls aren't protected by a lock, if the close
function starts executing before the work function, we can get into a
situation where the napi_disable() function is called twice in
succession (first by rtl8152_close(), then by set_carrier()).

In such a situation, the second call would loop indefinitely, since
rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
bit.

The rtl8152_close() function in turn issues a
cancel_delayed_work_sync(), and so it would wait indefinitely for the
rtl_work_func_t() to complete. Since rtl8152_close() is called by a
process holding rtnl_lock() which is requested by other processes, this
eventually leads to a system deadlock and crash.

Re-order the napi_disable() call to occur after the work function
disabling and urb cancellation calls are issued.

Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
Reported-by: http://crbug.com/1017928
Signed-off-by: Prashant Malani <pmalani@chromium.org>
---
 drivers/net/usb/r8152.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
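
The ordering problem described in the commit message can be sketched as a small single-threaded userspace model. This is only an illustration, not driver code: `struct model_dev` and every `model_*` function are hypothetical names, the "hang" is reported as a `false` return instead of an endless wait, and the model assumes the work path pairs its napi_disable() with a later napi_enable().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not a kernel type. */
struct model_dev {
	bool napi_sched;     /* models the NAPI_STATE_SCHED bit */
	bool work_enable;    /* models the WORK_ENABLE flag */
	bool work_in_flight; /* work function already passed its WORK_ENABLE check */
};

/* Models napi_disable(): it can only proceed while the SCHED bit is free.
 * The real call would keep waiting; the model returns false instead. */
static bool model_napi_disable(struct model_dev *d)
{
	if (d->napi_sched)
		return false;
	d->napi_sched = true;
	return true;
}

/* Models napi_enable(): releases the SCHED bit. */
static void model_napi_enable(struct model_dev *d)
{
	d->napi_sched = false;
}

/* Models cancel_delayed_work_sync(): an in-flight work run must finish,
 * and that run is assumed to disable and then re-enable NAPI. */
static bool model_finish_in_flight_work(struct model_dev *d)
{
	if (!d->work_in_flight)
		return true;
	d->work_in_flight = false;
	if (!model_napi_disable(d))
		return false; /* second disable in a row: would never return */
	model_napi_enable(d);
	return true;
}

/* Models the close path with either ordering.
 * Returns true if close completes, false if the model detects the hang. */
static bool model_close(struct model_dev *d, bool napi_disable_first)
{
	if (napi_disable_first && !model_napi_disable(d))
		return false;
	d->work_enable = false;
	if (!model_finish_in_flight_work(d))
		return false;
	if (!napi_disable_first && !model_napi_disable(d))
		return false;
	return true;
}
```

With a work run in flight, `model_close(&dev, true)` (the old order) reports the hang, while `model_close(&dev, false)` (the re-ordered sequence) completes and leaves NAPI disabled.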

Comments

David Miller Nov. 20, 2019, 8:49 p.m. UTC | #1
From: Prashant Malani <pmalani@chromium.org>
Date: Wed, 20 Nov 2019 11:40:21 -0800

> Both rtl_work_func_t() and rtl8152_close() call napi_disable().
> Since the two calls aren't protected by a lock, if the close
> function starts executing before the work function, we can get into a
> situation where the napi_disable() function is called twice in
> succession (first by rtl8152_close(), then by set_carrier()).
> 
> In such a situation, the second call would loop indefinitely, since
> rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
> bit.
> 
> The rtl8152_close() function in turn issues a
> cancel_delayed_work_sync(), and so it would wait indefinitely for the
> rtl_work_func_t() to complete. Since rtl8152_close() is called by a
> process holding rtnl_lock() which is requested by other processes, this
> eventually leads to a system deadlock and crash.
> 
> Re-order the napi_disable() call to occur after the work function
> disabling and urb cancellation calls are issued.
> 
> Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
> Reported-by: http://crbug.com/1017928
> Signed-off-by: Prashant Malani <pmalani@chromium.org>

Applied and queued up for -stable, thanks.

Hayes Wang Nov. 21, 2019, 2:13 a.m. UTC | #2
Prashant Malani [mailto:pmalani@chromium.org]
> Sent: Thursday, November 21, 2019 3:40 AM

> Both rtl_work_func_t() and rtl8152_close() call napi_disable().
> Since the two calls aren't protected by a lock, if the close
> function starts executing before the work function, we can get into a
> situation where the napi_disable() function is called twice in
> succession (first by rtl8152_close(), then by set_carrier()).
> 
> In such a situation, the second call would loop indefinitely, since
> rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
> bit.
> 
> The rtl8152_close() function in turn issues a
> cancel_delayed_work_sync(), and so it would wait indefinitely for the
> rtl_work_func_t() to complete. Since rtl8152_close() is called by a
> process holding rtnl_lock() which is requested by other processes, this
> eventually leads to a system deadlock and crash.
> 
> Re-order the napi_disable() call to occur after the work function
> disabling and urb cancellation calls are issued.
> 
> Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
> Reported-by: http://crbug.com/1017928
> Signed-off-by: Prashant Malani <pmalani@chromium.org>

Acked-by: Hayes Wang <hayeswang@realtek.com>

Thanks

Best Regards,
Hayes

Hayes Wang Nov. 21, 2019, 3 a.m. UTC | #3
Prashant Malani [mailto:pmalani@chromium.org]
> Sent: Thursday, November 21, 2019 3:40 AM
[...]
> @@ -4283,10 +4283,10 @@ static int rtl8152_close(struct net_device *netdev)
>  	unregister_pm_notifier(&tp->pm_notifier);
>  #endif
>  	tasklet_disable(&tp->tx_tl);

Should tasklet_disable() be moved, too?

> -	napi_disable(&tp->napi);
>  	clear_bit(WORK_ENABLE, &tp->flags);
>  	usb_kill_urb(tp->intr_urb);
>  	cancel_delayed_work_sync(&tp->schedule);
> +	napi_disable(&tp->napi);
>  	netif_stop_queue(netdev);

Best Regards,
Hayes

Prashant Malani Nov. 21, 2019, 8:50 a.m. UTC | #4
On Wed, Nov 20, 2019 at 7:00 PM Hayes Wang <hayeswang@realtek.com> wrote:
>
> Prashant Malani [mailto:pmalani@chromium.org]
> > Sent: Thursday, November 21, 2019 3:40 AM
> [...]
> > @@ -4283,10 +4283,10 @@ static int rtl8152_close(struct net_device *netdev)
> >       unregister_pm_notifier(&tp->pm_notifier);
> >  #endif
> >       tasklet_disable(&tp->tx_tl);
>
> Should tasklet_disable() be moved, too?

Perhaps; I'm not familiar enough with what the tasklet bottom half does
to say conclusively. It is probably best to leave it as is (somewhat
symmetrical to rtl8152_open()) if it is not causing any races in its
current location?
Moving it after cancel_delayed_work_sync() would effectively not do
much, IIUC, since WORK_ENABLE is already cleared by then and that is one
of the guard clauses inside bottom_half() (see:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/usb/r8152.c#n2423),
so effectively it's disabled as soon as WORK_ENABLE is cleared. I
might be mistaken here, though.
>
> > -     napi_disable(&tp->napi);
> >       clear_bit(WORK_ENABLE, &tp->flags);
> >       usb_kill_urb(tp->intr_urb);
> >       cancel_delayed_work_sync(&tp->schedule);
> > +     napi_disable(&tp->napi);
> >       netif_stop_queue(netdev);
>
> Best Regards,
> Hayes
>
>
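
The guard-clause argument in the reply above can be illustrated with a minimal userspace sketch. It is not the driver's bottom_half(): `struct model_tp`, `MODEL_WORK_ENABLE`, and `model_bottom_half()` are hypothetical names, and the only behaviour modelled is "return early when WORK_ENABLE is clear".

```c
#include <stdbool.h>

/* Hypothetical flag bit, named after the driver's WORK_ENABLE flag. */
enum { MODEL_WORK_ENABLE = 1u << 0 };

/* Hypothetical stand-in for the per-device state. */
struct model_tp {
	unsigned long flags;
	int tx_runs; /* counts how often the TX work actually ran */
};

/* Models a tasklet body that checks WORK_ENABLE before doing any work. */
static void model_bottom_half(struct model_tp *tp)
{
	/* Guard clause: once the flag is cleared, the body is a no-op. */
	if (!(tp->flags & MODEL_WORK_ENABLE))
		return;
	tp->tx_runs++;
}
```

In this model, a run scheduled after the flag is cleared does nothing, which is the sense in which the tasklet is "effectively disabled" as soon as WORK_ENABLE is cleared, wherever tasklet_disable() sits in the close path.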

Patch

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d4a95b50bda6b..4d34c01826f30 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -4283,10 +4283,10 @@  static int rtl8152_close(struct net_device *netdev)
 	unregister_pm_notifier(&tp->pm_notifier);
 #endif
 	tasklet_disable(&tp->tx_tl);
-	napi_disable(&tp->napi);
 	clear_bit(WORK_ENABLE, &tp->flags);
 	usb_kill_urb(tp->intr_urb);
 	cancel_delayed_work_sync(&tp->schedule);
+	napi_disable(&tp->napi);
 	netif_stop_queue(netdev);
 
 	res = usb_autopm_get_interface(tp->intf);