diff mbox

[v2] wpa_supplicant: Denitialize the driver if the last user went away

Message ID 1414657490-22551-1-git-send-email-lkundrak@v3.sk
State Changes Requested
Headers show

Commit Message

Lubomir Rintel Oct. 30, 2014, 8:24 a.m. UTC
It might be that the underlying infrastrucutre went away and the state is no
longer valid. We ought to reinitialize it once a device appears again.

This is the case when the nl80211 devices disappear and cfg8011 is remoed
afterwards. The netlink handle is no longer valid (returns ENOENT) and a new
one needs to be open if it's loaded back.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
---
Changes since v1:
    - wpa_s->global_drv_priv unset moved to correct place
 wpa_supplicant/wpa_supplicant.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Jouni Malinen Nov. 1, 2014, 3:08 p.m. UTC | #1
On Thu, Oct 30, 2014 at 09:24:50AM +0100, Lubomir Rintel wrote:
> It might be that the underlying infrastrucutre went away and the state is no
> longer valid. We ought to reinitialize it once a device appears again.
> 
> This is the case when the nl80211 devices disappear and cfg8011 is remoed
> afterwards. The netlink handle is no longer valid (returns ENOENT) and a new
> one needs to be open if it's loaded back.

This is not the way the global_init/global_deinit was originally meant
to be used, i.e., the assumption was more like this getting initialized
once and then remain available even if no interfaces are enabled, e.g.,
for get_interfaces() call to work.

> diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
> @@ -3925,6 +3925,25 @@ static int wpa_supplicant_init_iface(struct wpa_supplicant *wpa_s,
> +/* Deinitialize the driver if we're the last user. */
> +static void wpa_drv_cleanup(struct wpa_supplicant *wpa_s)
> +{
> +	struct wpa_global *global = wpa_s->global;
> +	struct wpa_supplicant *iface;
> +	int i;
> +
> +	for (iface = global->ifaces; iface; iface = iface->next)
> +		if (iface != wpa_s && iface->driver == wpa_s->driver)
> +			return;
> +
> +	for (i = 0; wpa_drivers[i]; i++)
> +		if (global->drv_priv[i] == wpa_s->global_drv_priv)
> +			global->drv_priv[i] = NULL;
> +
> +	if (wpa_s->driver->global_deinit)

That would be a NULL pointer dereference if interface initialization
fails early enough (wpa_s->driver == NULL in that case).

In addition to this, there is something wrong in how the global deinit
gets handled, i.e., at least one more place can trigger a SIGSEGV when
running through mac80211_hwsim test cases. As such, I cannot really
apply this until those issues have been found and fixed. Then again, I'm
not sure whether this is really the best way of handling the issue which
I'm assuming is the main reason for this patch, i.e., need to
re-initialize some socket for the case of cfg80211 getting reloaded.
Lubomir Rintel Nov. 11, 2014, 1:05 p.m. UTC | #2
On Sat, 2014-11-01 at 17:08 +0200, Jouni Malinen wrote:
> On Thu, Oct 30, 2014 at 09:24:50AM +0100, Lubomir Rintel wrote:
> > It might be that the underlying infrastrucutre went away and the state is no
> > longer valid. We ought to reinitialize it once a device appears again.
> > 
> > This is the case when the nl80211 devices disappear and cfg8011 is remoed
> > afterwards. The netlink handle is no longer valid (returns ENOENT) and a new
> > one needs to be open if it's loaded back.
> 
> This is not the way the global_init/global_deinit was originally meant
> to be used, i.e., the assumption was more like this getting initialized
> once and then remain available even if no interfaces are enabled, e.g.,
> for get_interfaces() call to work.

I guess this is something specific to Windows NDIS drivers (only one
that seems to implement get_interfaces).

It seems to me that get_interfaces works whether the driver is
initialized or not; maybe it shouldn't even be passwd the global_priv
structure as it's unused anyway?

> > diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
> > @@ -3925,6 +3925,25 @@ static int wpa_supplicant_init_iface(struct wpa_supplicant *wpa_s,
> > +/* Deinitialize the driver if we're the last user. */
> > +static void wpa_drv_cleanup(struct wpa_supplicant *wpa_s)
> > +{
> > +	struct wpa_global *global = wpa_s->global;
> > +	struct wpa_supplicant *iface;
> > +	int i;
> > +
> > +	for (iface = global->ifaces; iface; iface = iface->next)
> > +		if (iface != wpa_s && iface->driver == wpa_s->driver)
> > +			return;
> > +
> > +	for (i = 0; wpa_drivers[i]; i++)
> > +		if (global->drv_priv[i] == wpa_s->global_drv_priv)
> > +			global->drv_priv[i] = NULL;
> > +
> > +	if (wpa_s->driver->global_deinit)
> 
> That would be a NULL pointer dereference if interface initialization
> fails early enough (wpa_s->driver == NULL in that case).

I see. Will fix.

> In addition to this, there is something wrong in how the global deinit
> gets handled, i.e., at least one more place can trigger a SIGSEGV when
> running through mac80211_hwsim test cases.

I've run the test suite. It runs now for hours, but I noticed no crashes
so far. However I see a lot of timeout and FAIL messages. I'm wondering
if some tests are expected to fail in master tree, or I need to fix my
setup:

[lkundrak@fedora20-2 hwsim]$ ./run-all.sh
START offchannel_tx_roc_gas 1/764
Exception: ANQP_GET command failed
FAIL offchannel_tx_roc_gas 0.283542 2014-11-11 13:28:00.406416
...
START p2p_device_group_remove 420/764
Exception: [Errno 111] Connection refused
FAIL p2p_device_group_remove 0.104766 2014-11-11 13:58:50.080920
...

> As such, I cannot really
> apply this until those issues have been found and fixed. Then again, I'm
> not sure whether this is really the best way of handling the issue which
> I'm assuming is the main reason for this patch, i.e., need to
> re-initialize some socket for the case of cfg80211 getting reloaded.

Correct. What would be the other possible fix? Address that in kernel?

Lubo
Jouni Malinen Nov. 23, 2014, 7:55 p.m. UTC | #3
On Tue, Nov 11, 2014 at 02:05:53PM +0100, Lubomir Rintel wrote:
> On Sat, 2014-11-01 at 17:08 +0200, Jouni Malinen wrote:
> > This is not the way the global_init/global_deinit was originally meant
> > to be used, i.e., the assumption was more like this getting initialized
> > once and then remain available even if no interfaces are enabled, e.g.,
> > for get_interfaces() call to work.
> 
> I guess this is something specific to Windows NDIS drivers (only one
> that seems to implement get_interfaces).
> 
> It seems to me that get_interfaces works whether the driver is
> initialized or not; maybe it shouldn't even be passwd the global_priv
> structure as it's unused anyway?

This is not really used currently, but that does not change the reasons
for the design. Global context was expected to be available and remain
available even if driver interfaces were not in use.

> I've run the test suite. It runs now for hours, but I noticed no crashes
> so far. However I see a lot of timeout and FAIL messages. I'm wondering
> if some tests are expected to fail in master tree, or I need to fix my
> setup:

All the test cases pass in my tests (or well, if running under heavy
load, couple of test cases may fail out of about 800 every now and
then).

> [lkundrak@fedora20-2 hwsim]$ ./run-all.sh
> START offchannel_tx_roc_gas 1/764
> Exception: ANQP_GET command failed
> FAIL offchannel_tx_roc_gas 0.283542 2014-11-11 13:28:00.406416
> ...
> START p2p_device_group_remove 420/764
> Exception: [Errno 111] Connection refused
> FAIL p2p_device_group_remove 0.104766 2014-11-11 13:58:50.080920
> ...

That "Connection refused" is one of the most common indication of
hostapd or wpa_supplicant having crashed during the test..

> > As such, I cannot really
> > apply this until those issues have been found and fixed. Then again, I'm
> > not sure whether this is really the best way of handling the issue which
> > I'm assuming is the main reason for this patch, i.e., need to
> > re-initialize some socket for the case of cfg80211 getting reloaded.
> 
> Correct. What would be the other possible fix? Address that in kernel?

It could be enough to just re-initialize the netlink sockets in global
context within driver_nl80211.c if they start showing errors that could
indicate that cfg80211 has been reloaded.
Jouni Malinen Nov. 23, 2014, 8:12 p.m. UTC | #4
On Sun, Nov 23, 2014 at 09:55:24PM +0200, Jouni Malinen wrote:
> On Tue, Nov 11, 2014 at 02:05:53PM +0100, Lubomir Rintel wrote:
> > Correct. What would be the other possible fix? Address that in kernel?
> 
> It could be enough to just re-initialize the netlink sockets in global
> context within driver_nl80211.c if they start showing errors that could
> indicate that cfg80211 has been reloaded.

Actually, it may be easier to simply always deinit the nl80211_global
allocated resources when the last interface gets removed from
nl80211_global::interfaces list at the end of
wpa_driver_nl80211_deinit() (and then re-init them on next interface add
in wpa_driver_nl80211_drv_init()). I'd assume this would be pretty close
to what you were trying to do, but limited to be within driver_nl80211.c
and as such, not affecting any other uses of the global driver wrapper
context.
diff mbox

Patch

diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c
index 5a4d8dc..2099ef8 100644
--- a/wpa_supplicant/wpa_supplicant.c
+++ b/wpa_supplicant/wpa_supplicant.c
@@ -3925,6 +3925,25 @@  static int wpa_supplicant_init_iface(struct wpa_supplicant *wpa_s,
 	return 0;
 }
 
+/* Deinitialize the driver if we're the last user. */
+static void wpa_drv_cleanup(struct wpa_supplicant *wpa_s)
+{
+	struct wpa_global *global = wpa_s->global;
+	struct wpa_supplicant *iface;
+	int i;
+
+	for (iface = global->ifaces; iface; iface = iface->next)
+		if (iface != wpa_s && iface->driver == wpa_s->driver)
+			return;
+
+	for (i = 0; wpa_drivers[i]; i++)
+		if (global->drv_priv[i] == wpa_s->global_drv_priv)
+			global->drv_priv[i] = NULL;
+
+	if (wpa_s->driver->global_deinit)
+		wpa_s->driver->global_deinit (wpa_s->global_drv_priv);
+	wpa_s->global_drv_priv = NULL;
+}
 
 static void wpa_supplicant_deinit_iface(struct wpa_supplicant *wpa_s,
 					int notify, int terminate)
@@ -3967,6 +3986,8 @@  static void wpa_supplicant_deinit_iface(struct wpa_supplicant *wpa_s,
 	if (wpa_s->drv_priv)
 		wpa_drv_deinit(wpa_s);
 
+	wpa_drv_cleanup(wpa_s);
+
 	if (notify)
 		wpas_notify_iface_removed(wpa_s);