Patchwork [v4,24/25] VLAN: do not remove VLAN interfaces due to kernel bug

login
register
mail settings
Submitter michael-dev@fami-braun.de
Date July 27, 2013, 7:56 p.m.
Message ID <20130727195627.17627.11155.stgit@ray-controller>
Download mbox | patch
Permalink /patch/262526/
State Superseded
Headers show

Comments

michael-dev@fami-braun.de - July 27, 2013, 7:56 p.m.
Since running this hostapd version, I continously
see kernel warnings. I suspect them to be due to
interfaces removal now occuring on STA freeing,
which is happen frequently on my test node.

This patch makes interface freeing optional
and deferred by default.

Signed-hostap: Michael Braun <michael-dev@fami-braun.de>

WARNING: at /opt/openwrt/stage2/trunk/build_dir/target-powerpc_uClibc-0.9.33.2/linux-mpc85xx_p1020wlan-fem/compat-wireless-2013-04-16/net/mac80211/wpa.c:68
[  876.001783] [eef81e10] [f57eef2c] rate_control_get_rate+0xb4/0x100 [mac80211] (unreliable)
[  876.010055] [eef81e30] [f57fbd24] ieee80211_proberesp_get+0xbcc/0x1320 [mac80211]
[  876.017540] [eef81eb0] [f57fc598] ieee80211_get_buffered_bc+0x120/0x140 [mac80211]
[  876.025111] [eef81f10] [f5bf8a1c] ath9k_beacon_tasklet+0x434/0x684 [ath9k]
[  876.031988] [eef81f70] [c002c514] tasklet_action+0xe0/0x174
[  876.037558] [eef81fa0] [c002ca6c] __do_softirq+0xe0/0x18c
[  876.042952] [eef81ff0] [c000bef8] call_do_softirq+0x14/0x24
[  876.048522] [c03d5e70] [c000459c] do_softirq+0x80/0xc0
[  876.053657] [c03d5e90] [c002cc54] irq_exit+0x60/0x8c
[  876.058616] [c03d5ea0] [c0004374] do_IRQ+0x12c/0x150
[  876.063580] [c03d5ed0] [c000dee8] ret_from_except+0x0/0x18
[  876.069066] --- Exception: 501 at cpu_idle+0x84/0xd0
[  876.069066]     LR = cpu_idle+0x84/0xd0
[  876.077846] [c03d5f90] [c00078ec] cpu_idle+0x44/0xd0 (unreliable)
[  876.083938] [c03d5fb0] [c0002724] rest_init+0x68/0x78
[  876.088989] [c03d5fc0] [c037a7bc] start_kernel+0x2e0/0x2f4
[  876.094469] [c03d5ff0] [c00003f8] skpinv+0x2e4/0x320

[ 2523.961923] ------------[ cut here ]------------
[ 2523.966547] WARNING: at /opt/openwrt/stage2/trunk/build_dir/target-powerpc_uClibc-0.9.33.2/linux-mpc85xx_p1020wlan-fem/compat-wireless-2013-04-16/net/mac80211/chan.c:218
[ 2524.209752] [ef8a9990] [f5803838] ieee80211_iter_chan_contexts_atomic+0x378/0x3f4 [mac80211] (unreliable)
[ 2524.219326] [ef8a99c0] [f5803cac] ieee80211_recalc_smps_chanctx+0x304/0x358 [mac80211]
[ 2524.227247] [ef8a99e0] [f5804218] ieee80211_vif_release_channel+0x48/0x68 [mac80211]
[ 2524.234993] [ef8a99f0] [f57f2fd0] ieee80211_aes_cmac_key_free+0x2c04/0x4648 [mac80211]
[ 2524.242922] [ef8a9a10] [f56fb96c] cfg80211_stop_ap+0x74/0x3e8 [cfg80211]
[ 2524.249623] [ef8a9a30] [f56de740] cfg80211_leave+0xdc/0x9a0 [cfg80211]
[ 2524.256149] [ef8a9a50] [f56de984] cfg80211_leave+0x320/0x9a0 [cfg80211]
[ 2524.262767] [ef8a9ab0] [c004a054] notifier_call_chain+0x4c/0xa4
[ 2524.268685] [ef8a9ad0] [c0243148] call_netdevice_notifiers+0x5c/0x70
[ 2524.275035] [ef8a9ae0] [c02431b0] __dev_close_many+0x54/0xf4
[ 2524.280690] [ef8a9b00] [c0243320] dev_close_many+0x84/0xfc
[ 2524.286172] [ef8a9b30] [c0243470] rollback_registered_many+0xd8/0x2d0
[ 2524.292607] [ef8a9b60] [c0243710] rollback_registered+0x2c/0x4c
[ 2524.298524] [ef8a9b80] [c024504c] unregister_netdevice_queue+0x80/0xb8
[ 2524.305056] [ef8a9b90] [f57edcf8] ieee80211_if_remove+0x8c/0xb4 [mac80211]
[ 2524.311934] [ef8a9ba0] [f57f37fc] ieee80211_aes_cmac_key_free+0x3430/0x4648 [mac80211]
[ 2524.319853] [ef8a9bb0] [f56e6968] cfg80211_wext_giwscan+0x1154/0x11c0 [cfg80211]
[ 2524.327252] [ef8a9bc0] [c0269838] genl_rcv_msg+0x1ec/0x228
[ 2524.332734] [ef8a9c30] [c0268c40] netlink_rcv_skb+0x64/0xd4
[ 2524.338303] [ef8a9c50] [c0269638] genl_rcv+0x28/0x3c
[ 2524.343265] [ef8a9c60] [c02685f0] netlink_unicast+0x168/0x224
[ 2524.349007] [ef8a9ca0] [c0268a28] netlink_sendmsg+0x2b0/0x32c
[ 2524.354758] [ef8a9cf0] [c022fb1c] sock_sendmsg+0x80/0xa4
[ 2524.360066] [ef8a9dd0] [c022fe60] __sys_sendmsg+0x1f8/0x2b4
[ 2524.365635] [ef8a9ef0] [c0231c68] sys_sendmsg+0x40/0x74
[ 2524.370859] [ef8a9f40] [c000d824] ret_from_syscall+0x0/0x3c
---
 hostapd/config_file.c |    2 +
 hostapd/hostapd.conf  |    7 +++
 src/ap/ap_config.c    |    2 +
 src/ap/ap_config.h    |    1 
 src/ap/vlan_init.c    |  105 ++++++++++++++++++++++++++++++++++++++++++-------
 5 files changed, 102 insertions(+), 15 deletions(-)
Sergey Ryazanov - July 31, 2013, 6:53 p.m.
Hello Michael,

2013/7/27 Michael Braun <michael-dev@fami-braun.de>:
> Since running this hostapd version, I continously
> see kernel warnings. I suspect them to be due to
> interfaces removal now occuring on STA freeing,
> which is happen frequently on my test node.
>
> This patch makes interface freeing optional
> and deferred by default.
>
May be I missing something, but why you are try to avoid triggering of
the kernel bug by placing workaround inside hostapd? May be we should
fix kernel bug itself?
michael-dev@fami-braun.de - Aug. 1, 2013, 7:42 a.m.
Hi,

Am 31.07.2013 20:53, schrieb Sergey Ryazanov:
> May be I missing something, but why you are try to avoid triggering of
> the kernel bug by placing workaround inside hostapd? May be we should
> fix kernel bug itself?

of course the kernel (or whatever actually is the cause) should be fixed 
as well.
Although the main reason for this change was the kernel issue, I think 
that limiting the churn for creating and removing interfaces could also 
help to mitigate some special kinds of DOS attacks, where the kernel 
would become busy removing and creating interfaces all the time. So this 
change is useful even without any kernel bug.
Additionally, with this change I can actually use this changeset in 
production without the aps regularly crashing, so it really helps.

Regards,
  M. Braun
Sergey Ryazanov - Aug. 1, 2013, 6:06 p.m.
Hi,

2013/8/1 michael-dev <michael-dev@fami-braun.de>:
> Hi,
>
> Am 31.07.2013 20:53, schrieb Sergey Ryazanov:
>
>> May be I missing something, but why you are try to avoid triggering of
>> the kernel bug by placing workaround inside hostapd? May be we should
>> fix kernel bug itself?
>
> of course the kernel (or whatever actually is the cause) should be fixed as
> well.
>
If so, could you create a bug report and send it to the linux-wireless list?

> Although the main reason for this change was the kernel issue, I think that
> limiting the churn for creating and removing interfaces could also help to
> mitigate some special kinds of DOS attacks, where the kernel would become
> busy removing and creating interfaces all the time. So this change is useful
> even without any kernel bug.
>
In such case you need a more generic mechanism, which limit client
connection rate. Or you prefer to add such workaround for each
time-consuming operation?

> Additionally, with this change I can actually use this changeset in
> production without the aps regularly crashing, so it really helps.
>
Yes, may be this patch improves situation, but why you are try to push
them into the main hostapd code, where it could stay forever, instead
of keep them locally, until kernel fix would be ready.

Patch

diff --git a/hostapd/config_file.c b/hostapd/config_file.c
index bf865ea..684cc90 100644
--- a/hostapd/config_file.c
+++ b/hostapd/config_file.c
@@ -2575,6 +2575,8 @@  static int hostapd_config_fill(struct hostapd_config *conf,
 #ifndef CONFIG_NO_VLAN
 		} else if (os_strcmp(buf, "dynamic_vlan") == 0) {
 			bss->ssid.dynamic_vlan = atoi(pos);
+		} else if (os_strcmp(buf, "dynamic_vlan_release_timeout") == 0) {
+			bss->ssid.dynamic_vlan_release_timeout = atoi(pos);
 		} else if (os_strcmp(buf, "vlan_file") == 0) {
 			if (hostapd_config_read_vlan_file(bss, pos)) {
 				wpa_printf(MSG_ERROR, "Line %d: failed to "
diff --git a/hostapd/hostapd.conf b/hostapd/hostapd.conf
index a1dbdfe..849a800 100644
--- a/hostapd/hostapd.conf
+++ b/hostapd/hostapd.conf
@@ -852,6 +852,13 @@  own_ip_addr=127.0.0.1
 # 1 = option; use default interface if RADIUS server does not include VLAN ID
 # 2 = required; reject authentication if RADIUS server does not include VLAN ID
 #dynamic_vlan=0
+#
+# Removing dynamic interfaces seems to trigger kernel bugs, so it is disabled.
+# This gives a timeout for removing the interface after it has been disabled.
+# -1 = disabled
+#  0 = remove immediately
+#  >0  deferred removal
+#dynamic_vlan_release_timeout=30
 
 # VLAN interface list for dynamic VLAN mode is read from a separate text file.
 # This list is used to map VLAN ID from the RADIUS server to a network
diff --git a/src/ap/ap_config.c b/src/ap/ap_config.c
index 03da6a0..edc63b3 100644
--- a/src/ap/ap_config.c
+++ b/src/ap/ap_config.c
@@ -91,6 +91,8 @@  void hostapd_config_defaults_bss(struct hostapd_bss_config *bss)
 	bss->radius_das_time_window = 300;
 
 	bss->sae_anti_clogging_threshold = 5;
+
+	bss->ssid.dynamic_vlan_release_timeout = 30;
 }
 
 
diff --git a/src/ap/ap_config.h b/src/ap/ap_config.h
index d58a3d1..bd6c83d 100644
--- a/src/ap/ap_config.h
+++ b/src/ap/ap_config.h
@@ -67,6 +67,7 @@  struct hostapd_ssid {
 #define DYNAMIC_VLAN_OPTIONAL 1
 #define DYNAMIC_VLAN_REQUIRED 2
 	int dynamic_vlan;
+	int dynamic_vlan_release_timeout;
 #define DYNAMIC_VLAN_NAMING_WITHOUT_DEVICE 0
 #define DYNAMIC_VLAN_NAMING_WITH_DEVICE 1
 #define DYNAMIC_VLAN_NAMING_END 2
diff --git a/src/ap/vlan_init.c b/src/ap/vlan_init.c
index 3b7cdfb..4fc1996 100644
--- a/src/ap/vlan_init.c
+++ b/src/ap/vlan_init.c
@@ -40,6 +40,10 @@  struct full_dynamic_vlan {
 	int s; /* socket on which to listen for new/removed interfaces. */
 };
 
+struct vlan_removal_info {
+	int ifidx;
+	char ifname[IFNAMSIZ+1];
+};
 
 static int ifconfig_helper(const char *if_name, int up)
 {
@@ -1109,6 +1113,9 @@  struct hostapd_vlan * vlan_add_dynamic(struct hostapd_data *hapd,
 {
 	struct hostapd_vlan *n, *i, *prev;
 	char *ifname, *pos;
+#ifdef CONFIG_FULL_DYNAMIC_VLAN
+	int ifidx, created = 0;
+#endif /* CONFIG_FULL_DYNAMIC_VLAN */
 
 	if (vlan == NULL
 	    || vlan_untagged(&vlan_id) <= 0
@@ -1168,14 +1175,21 @@  struct hostapd_vlan * vlan_add_dynamic(struct hostapd_data *hapd,
 	            artifical_id, pos);
 	os_free(ifname);
 
-	if (hostapd_vlan_if_add(hapd, n->ifname)) {
-		os_free(n);
-		return NULL;
-	}
 #ifdef CONFIG_FULL_DYNAMIC_VLAN
-	// if id_add returned the ifidx, it could be reused here
+	ifidx = if_nametoindex(n->ifname);
+	if (!ifidx) {
+		/* interface does not yet exist */
+		if (hostapd_vlan_if_add(hapd, n->ifname)) {
+			/* interface could not be created */
+			os_free(n);
+			return NULL;
+		}
+		created = 1;
+	}
+	hapd_get_dynamic_iface(NULL, n->ifname, created, hapd);
+	// if if_add returned the ifidx, it could be reused here
 	n->ifidx = if_nametoindex(n->ifname);
-#endif
+#endif /* CONFIG_FULL_DYNAMIC_VLAN */
 
 	/* insert into list */
 	if (!prev) {
@@ -1194,12 +1208,50 @@  struct hostapd_vlan * vlan_add_dynamic(struct hostapd_data *hapd,
 }
 
 
+#ifdef CONFIG_FULL_DYNAMIC_VLAN
+/**
+ * ap_handle_timer - Per STA timer handler
+ * @eloop_ctx: struct hostapd_data *
+ * @timeout_ctx: struct sta_info *
+ *
+ * This function is called to check station activity and to remove inactive
+ * stations.
+ */
+void vlan_remove_timer(void *eloop_ctx, void *timeout_ctx)
+{
+	struct hostapd_data *hapd = eloop_ctx;
+	struct vlan_removal_info *ctx = timeout_ctx;
+
+	wpa_printf(MSG_DEBUG, "VLAN: %s(ifname=%s, ifidx=%d)", __func__, ctx->ifname, ctx->ifidx);
+
+	if (hapd_put_dynamic_iface(NULL, ctx->ifname, hapd) != 1)
+		goto out;
+	
+	/* if somebody else removed that interface and thus a new one
+	 * has been created, we don't need to remove the new one.
+	 */
+	if (ctx->ifidx != 0 && if_nametoindex(ctx->ifname) != ctx->ifidx)
+		goto out;
+
+	hostapd_vlan_if_remove(hapd, ctx->ifname);
+
+out:
+	os_free(ctx);
+}
+#endif /* CONFIG_FULL_DYNAMIC_VLAN */
+
+
 int vlan_remove_dynamic(struct hostapd_data *hapd, vlan_t vlan_id)
 {
 	struct hostapd_vlan *vlan;
-	char buf[IFNAMSIZ+1];
+	char ifname[IFNAMSIZ+1];
+#ifdef CONFIG_FULL_DYNAMIC_VLAN
+	int ifidx;
+	struct vlan_removal_info *ctx = NULL;
+#endif /* CONFIG_FULL_DYNAMIC_VLAN */
 
-	if (vlan_untagged(&vlan->vlan_id) == VLAN_ID_WILDCARD)
+	if (vlan_untagged(&vlan->vlan_id) == VLAN_ID_WILDCARD
+	    || !vlan_notempty(&vlan_id))
 		return 1;
 
 	wpa_printf(MSG_DEBUG, "VLAN: %s(vlan_id=%s)", __func__,
@@ -1218,14 +1270,37 @@  int vlan_remove_dynamic(struct hostapd_data *hapd, vlan_t vlan_id)
 	if (vlan == NULL)
 		return 1;
 
-	if (vlan->dynamic_vlan == 0) {
-		os_strncpy(buf, vlan->ifname, sizeof(buf));
-#ifdef CONFIG_FULL_DYNAMIC_VLAN
-		vlan_dellink(buf, vlan->ifidx, hapd);
-#endif
-		vlan = NULL; /* freed by vlan_dellink */
-		hostapd_vlan_if_remove(hapd, buf);
+	if (vlan->dynamic_vlan != 0)
+		return 0;
+
+	if (hapd->conf->ssid.dynamic_vlan_release_timeout < 0)
+		return 0;
+
+	os_strncpy(ifname, vlan->ifname, sizeof(ifname));
+#ifndef CONFIG_FULL_DYNAMIC_VLAN
+	hostapd_vlan_if_remove(hapd, ifname);
+#else
+	ifidx = vlan->ifidx;
+
+	vlan_dellink(ifname, ifidx, hapd);
+	vlan = NULL; /* freed by vlan_dellink */
+
+	if (hapd->conf->ssid.dynamic_vlan_release_timeout > 0)
+		ctx = os_zalloc(sizeof(*ctx));
+
+	if (hapd->conf->ssid.dynamic_vlan_release_timeout == 0 || !ctx) {
+		/* timeout = 0 */
+		if (hapd_put_dynamic_iface(NULL, ifname, hapd) == 1)
+			hostapd_vlan_if_remove(hapd, ifname);
+		return 0;
 	}
 
+	ctx->ifidx = ifidx;
+	os_strncpy(ctx->ifname, ifname, sizeof(ctx->ifname));
+
+	eloop_register_timeout(hapd->conf->ssid.dynamic_vlan_release_timeout, 0,
+	                       vlan_remove_timer, hapd, ctx);
+
+#endif /* CONFIG_FULL_DYNAMIC_VLAN */
 	return 0;
 }