Message ID | 20191120194020.8796-1-pmalani@chromium.org |
---|---|
State | Accepted |
Delegated to: | David Miller |
Series | [net] r8152: Re-order napi_disable in rtl8152_close |
From: Prashant Malani <pmalani@chromium.org>
Date: Wed, 20 Nov 2019 11:40:21 -0800

> Both rtl_work_func_t() and rtl8152_close() call napi_disable().
> Since the two calls aren't protected by a lock, if the close
> function starts executing before the work function, we can get into a
> situation where the napi_disable() function is called twice in
> succession (first by rtl8152_close(), then by set_carrier()).
>
> In such a situation, the second call would loop indefinitely, since
> rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
> bit.
>
> The rtl8152_close() function in turn issues a
> cancel_delayed_work_sync(), and so it would wait indefinitely for the
> rtl_work_func_t() to complete. Since rtl8152_close() is called by a
> process holding rtnl_lock() which is requested by other processes, this
> eventually leads to a system deadlock and crash.
>
> Re-order the napi_disable() call to occur after the work function
> disabling and urb cancellation calls are issued.
>
> Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
> Reported-by: http://crbug.com/1017928
> Signed-off-by: Prashant Malani <pmalani@chromium.org>

Applied and queued up for -stable, thanks.
Prashant Malani [mailto:pmalani@chromium.org]
> Sent: Thursday, November 21, 2019 3:40 AM
> Both rtl_work_func_t() and rtl8152_close() call napi_disable().
> Since the two calls aren't protected by a lock, if the close
> function starts executing before the work function, we can get into a
> situation where the napi_disable() function is called twice in
> succession (first by rtl8152_close(), then by set_carrier()).
>
> In such a situation, the second call would loop indefinitely, since
> rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
> bit.
>
> The rtl8152_close() function in turn issues a
> cancel_delayed_work_sync(), and so it would wait indefinitely for the
> rtl_work_func_t() to complete. Since rtl8152_close() is called by a
> process holding rtnl_lock() which is requested by other processes, this
> eventually leads to a system deadlock and crash.
>
> Re-order the napi_disable() call to occur after the work function
> disabling and urb cancellation calls are issued.
>
> Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
> Reported-by: http://crbug.com/1017928
> Signed-off-by: Prashant Malani <pmalani@chromium.org>

Acked-by: Hayes Wang <hayeswang@realtek.com>

Thanks

Best Regards,
Hayes
Prashant Malani [mailto:pmalani@chromium.org]
> Sent: Thursday, November 21, 2019 3:40 AM
[...]
> @@ -4283,10 +4283,10 @@ static int rtl8152_close(struct net_device
> *netdev)
>  	unregister_pm_notifier(&tp->pm_notifier);
>  #endif
>  	tasklet_disable(&tp->tx_tl);

Should tasklet_disable() be moved, too?

> -	napi_disable(&tp->napi);
>  	clear_bit(WORK_ENABLE, &tp->flags);
>  	usb_kill_urb(tp->intr_urb);
>  	cancel_delayed_work_sync(&tp->schedule);
> +	napi_disable(&tp->napi);
>  	netif_stop_queue(netdev);

Best Regards,
Hayes
On Wed, Nov 20, 2019 at 7:00 PM Hayes Wang <hayeswang@realtek.com> wrote:
>
> Prashant Malani [mailto:pmalani@chromium.org]
> > Sent: Thursday, November 21, 2019 3:40 AM
> [...]
> > @@ -4283,10 +4283,10 @@ static int rtl8152_close(struct net_device
> > *netdev)
> >  	unregister_pm_notifier(&tp->pm_notifier);
> >  #endif
> >  	tasklet_disable(&tp->tx_tl);
>
> Should tasklet_disable() be moved, too?

Perhaps; I'm not familiar enough with what the tasklet bottom half does
to say conclusively. It is probably best to leave it as is (somewhat
symmetrical to rtl8152_open()) if it is not causing any races in its
current location.

Moving it after cancel_delayed_work_sync() would not do much, IIUC,
since WORK_ENABLE is already cleared by then, and that is one of the
guard clauses inside bottom_half() (see:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/usb/r8152.c#n2423).
So, effectively, the tasklet is disabled as soon as WORK_ENABLE is
cleared. I might be mistaken here, though.

>
> > -	napi_disable(&tp->napi);
> >  	clear_bit(WORK_ENABLE, &tp->flags);
> >  	usb_kill_urb(tp->intr_urb);
> >  	cancel_delayed_work_sync(&tp->schedule);
> > +	napi_disable(&tp->napi);
> >  	netif_stop_queue(netdev);
>
> Best Regards,
> Hayes
>
>
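The guard-clause argument in the reply above can be illustrated with a small userspace model. This is only a sketch, not the driver's bottom_half(): the names model_dev, MODEL_WORK_ENABLE, model_bottom_half() and model_clear_work_enable() are hypothetical, and C11 atomics stand in for the kernel's test_bit()/clear_bit(). It shows that once the enable flag is cleared, a handler that checks the flag first returns without doing any work, even if it is still scheduled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for WORK_ENABLE. */
enum { MODEL_WORK_ENABLE = 1 };

/* Hypothetical device state: a flags word plus a counter of TX passes. */
struct model_dev {
    atomic_uint flags;
    unsigned int tx_runs;
};

/* Model of a bottom half with a guard clause: bail out if work is disabled. */
static bool model_bottom_half(struct model_dev *d)
{
    assert(d != NULL);

    if (!(atomic_load(&d->flags) & (unsigned int)MODEL_WORK_ENABLE))
        return false;   /* guard clause: flag cleared, do nothing */

    d->tx_runs++;       /* stand-in for the real transmit work */
    return true;
}

/* Model of clear_bit(WORK_ENABLE, ...): atomically drop the enable flag. */
static void model_clear_work_enable(struct model_dev *d)
{
    assert(d != NULL);
    atomic_fetch_and(&d->flags, ~(unsigned int)MODEL_WORK_ENABLE);
}
```

In this model, calling model_bottom_half() after model_clear_work_enable() leaves tx_runs unchanged, which is the behaviour the reply relies on. Whether the real tasklet has other side effects before its guard clauses is not covered by the sketch.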
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d4a95b50bda6b..4d34c01826f30 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -4283,10 +4283,10 @@ static int rtl8152_close(struct net_device *netdev)
 	unregister_pm_notifier(&tp->pm_notifier);
 #endif
 	tasklet_disable(&tp->tx_tl);
-	napi_disable(&tp->napi);
 	clear_bit(WORK_ENABLE, &tp->flags);
 	usb_kill_urb(tp->intr_urb);
 	cancel_delayed_work_sync(&tp->schedule);
+	napi_disable(&tp->napi);
 	netif_stop_queue(netdev);
 
 	res = usb_autopm_get_interface(tp->intf);
Both rtl_work_func_t() and rtl8152_close() call napi_disable().
Since the two calls aren't protected by a lock, if the close
function starts executing before the work function, we can get into a
situation where the napi_disable() function is called twice in
succession (first by rtl8152_close(), then by set_carrier()).

In such a situation, the second call would loop indefinitely, since
rtl8152_close() doesn't call napi_enable() to clear the NAPI_STATE_SCHED
bit.

The rtl8152_close() function in turn issues a
cancel_delayed_work_sync(), and so it would wait indefinitely for the
rtl_work_func_t() to complete. Since rtl8152_close() is called by a
process holding rtnl_lock() which is requested by other processes, this
eventually leads to a system deadlock and crash.

Re-order the napi_disable() call to occur after the work function
disabling and urb cancellation calls are issued.

Change-Id: I6ef0b703fc214998a037a68f722f784e1d07815e
Reported-by: http://crbug.com/1017928
Signed-off-by: Prashant Malani <pmalani@chromium.org>

---
 drivers/net/usb/r8152.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
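The double napi_disable() described in the commit message can be modeled in userspace. The following is a minimal sketch under stated assumptions, not kernel code: model_napi, model_napi_disable(), model_napi_enable() and double_disable_hangs() are hypothetical names, a C11 atomic_bool stands in for the NAPI_STATE_SCHED bit, and the wait loop is bounded by max_spins so the example terminates where the real napi_disable() would keep waiting.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the NAPI_STATE_SCHED bit. */
struct model_napi {
    atomic_bool sched;
};

/*
 * Bounded model of napi_disable(): keep trying to take the SCHED bit.
 * The kernel function waits without a limit; this model gives up after
 * max_spins attempts. Returns true if the bit was acquired.
 */
static bool model_napi_disable(struct model_napi *n, unsigned int max_spins)
{
    for (unsigned int i = 0; i < max_spins; i++) {
        /* atomic_exchange returns the old value: false means we took it */
        if (!atomic_exchange(&n->sched, true))
            return true;
    }
    return false;
}

/* Model of napi_enable(): release the bit so a later disable can succeed. */
static void model_napi_enable(struct model_napi *n)
{
    atomic_store(&n->sched, false);
}

/*
 * Two disables in a row with no enable in between, as in the race from
 * the commit message. Returns true if the second call could not finish.
 */
static bool double_disable_hangs(void)
{
    struct model_napi n;

    atomic_init(&n.sched, false);

    bool first = model_napi_disable(&n, 1000);   /* close path */
    assert(first);                                /* fresh state always succeeds */
    bool second = model_napi_disable(&n, 1000);  /* work-function path */

    return !second;
}
```

In this model the second disable never acquires the bit unless model_napi_enable() runs in between. With the patch applied, the work function has already been cancelled by the time the close path disables NAPI, so the second call in the model has no counterpart.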