diff mbox series

[v9,02/16] AP: Address PTK rekey issues

Message ID 20200104221015.90469-3-alexander@wetzel-home.de
State Changes Requested
Headers show
Series Seamless PTK rekeys | expand

Commit Message

Alexander Wetzel Jan. 4, 2020, 10:10 p.m. UTC
Rekeying a pairwise key using only keyid 0 (PTK0 rekeys) has many broken
implementations and should be avoided for both security and usability
reasons.
The effects can be triggered by either end of the connection and range
from hardly noticeable disconnects over long connection freezes up to
leaking clear text MPDUs which can be used to calculate the outgoing PTK.

To avoid the issues replace PTK0 rekeys by default with disconnects and
add the new option "wpa_deny_ptk0_rekey" to let the user control the
behavior.

Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de>
---
 hostapd/config_file.c  |  9 +++++++++
 hostapd/hostapd.conf   | 32 ++++++++++++++++++++++++++++++++
 src/ap/ap_config.c     |  1 +
 src/ap/ap_config.h     |  1 +
 src/ap/wpa_auth.c      | 23 +++++++++++++++++++++--
 src/ap/wpa_auth.h      |  1 +
 src/ap/wpa_auth_glue.c | 12 ++++++++++++
 src/common/defs.h      |  6 ++++++
 8 files changed, 83 insertions(+), 2 deletions(-)

Comments

Jouni Malinen Jan. 6, 2020, 9:03 a.m. UTC | #1
On Sat, Jan 04, 2020 at 11:10:01PM +0100, Alexander Wetzel wrote:
> Rekeying a pairwise key using only keyid 0 (PTK0 rekeys) has many broken
> implementations and should be avoided for both security and usability
> reasons.
> The effects can be triggered by either end of the connection and range
> from hardly noticeable disconnects over long connection freezes up to
> leaking clear text MPDUs which can be used to calculate the outgoing PTK.
> 
> To avoid the issues replace PTK0 rekeys by default with disconnects and
> add the new option "wpa_deny_ptk0_rekey" to let the user control the
> behavior.

I don't think it is appropriate to force disconnections by default for
all existing systems. It would seem fine to provide an option to
explicitly request such behavior, but this by-default-behavior looks
like a too drastic approach to take when there are multiple drivers that
have over years added various workarounds to avoid many of the issues
for most common cases.
Alexander Wetzel Jan. 6, 2020, 11:38 a.m. UTC | #2
Am 06.01.20 um 10:03 schrieb Jouni Malinen:
> On Sat, Jan 04, 2020 at 11:10:01PM +0100, Alexander Wetzel wrote:
>> Rekeying a pairwise key using only keyid 0 (PTK0 rekeys) has many broken
>> implementations and should be avoided for both security and usability
>> reasons.
>> The effects can be triggered by either end of the connection and range
>> from hardly noticeable disconnects over long connection freezes up to
>> leaking clear text MPDUs which can be used to calculate the outgoing PTK.
>>
>> To avoid the issues replace PTK0 rekeys by default with disconnects and
>> add the new option "wpa_deny_ptk0_rekey" to let the user control the
>> behavior.
> 
> I don't think it is appropriate to force disconnections by default for
> all existing systems. It would seem fine to provide an option to
> explicitly request such behavior, but this by-default-behavior looks
> like a too drastic approach to take when there are multiple drivers that
> have over years added various workarounds to avoid many of the issues
> for most common cases.
> 

I'm aware of the issue to be controversial. For me this looks more risky 
than the alternative, but I can't rule out I miss something...

Have you some evidence of rekeying to be really working in real-world 
scenarios? It can work with carefully selected cards/drivers but in 
mixed environments it seems to be asking for trouble.

I've done some tests with abysmal results with HW available to me.
I tried to google issues related to that and did find some - but much 
less than expected - candidates ranging from very likely to maybe. Some 
were even discussing GTK rekeys when the log evidence made clear that 
they had a PTK rekey issue.

I can only explain the missing outcry by the assumption that rekeying is 
seldom used and the few users hit by it are using the workarounds for 
"strange wifi issues" like disabling HW encryption and/or A-MPDU 
aggregation or switching to other HW. Coupled with the normal 1h "dead 
time" you need to hit the first problem and that most systems seem to 
recovering already with a reconnect that may just be possible...

Set your AP to rekey all 30s and start a download or video streaming and 
check what happens with your different devices. (I'm normally testing 
close to the AP.)
For me with Windows the "common" case was a disconnect, due to 
encrypting the eapol #4 with the new key. I was initially assuming that 
these reconnects were by design to recover from whatever went wrong and 
was wondering why linux systems did freeze instead. Fixing eapol#4 races 
may therefore make the situation worse.

Here list for devices/System which got at least some quick look and what 
I found out (this data is roughly two years old):

The only working devices I found are
  - ath10k with linux (not tested with Windows)
  - Windows Surface Pro 3 running Win 10 (I think many/most/all Marvell
    drivers could be ok)
  - iPhone (I think it was a iPhone X but may be off one generation)

Known devices/systems not able to rekey the PTK correctly:
  - any linux system using ath9k with mac80211 from kernel < 4.20
  - HTC 10 smartphone (running Linageos)
  - Nexus 5x smartphone (stock)
  - Samsung Galaxy S5 smartphone (running Linageos)
  - Samsung Galaxy Tab S3 (stock)
  - Samsung Series 6 SmartTV
  - Dell notebook using some broadcom fullmac NIC (Windows 7)
  - AWUS036ACH USB stick (Win10)

So there is at least some evidence that many developers did not consider 
the races and the potential consequences of ignoring them has.

There is basically no way that the ath9k driver is not messing up a 
rekey with ongoing traffic. It's also so far the only known driver which 
can leak the PTK. Especially with TKIP this looks extremely dangerous, 
where an attacker can trigger the rekeys at will.
I've not tried it, but generate MIC errors at a busy moment and you will 
get MPDUs fully set up for encryption with only the encryption missing. 
And that also works for retransmits, so you can get the same MPDU 
encrypted and decrypted... Which for my understanding then allows you to 
calculate the PTK and decrypt any previously captured MPDUs using the PTK.

Now this is of course nothing more than a driver issue but looking at 
the lists I would be very surprised if not at least 50% of common used 
devices are broken today and expect rates around 90%. (With most 
already reconnecting, mitigating the user impact.)

For me the issue started with these two tickets, when I got frustrated 
by the strange wlan freezes and could no longer deny a pattern after 
having ignored it for years:
  - https://bugzilla.kernel.org/show_bug.cgi?id=92451
  - https://dev.archive.openwrt.org/ticket/18966

I was able to pin the issue to PTK rekeying with clear evidence showing 
what was going on. But nothing happened and so some years later I 
decided to try to fix that myself. A important mile stone was the fix 
for mac80211 62872a9b9a10 ("mac80211: Fix PTK rekey freezes and clear 
text leak"), which is to my best knowledge the first documentation what 
must be considered when replacing a PTK key besides the eapol#4 race - 
which is pretty harmless compared to the other races a driver can have.

Keeping the old behavior as default will for sure cause less pain, after 
all the problems are as old as WPA and the users must have reached some 
kind of arrangement with the situation. But knowing that this can leak 
PTKs it looks like the only secure approach is enforce the reconnects.

The reasoning is, that since we do not have a outcry of users with PTK 
rekey issues today and most devices are already reconnecting on rekey 
there should not be many users which should notice the difference.
But quite some of them may leak PTK keys.

I would of course feel better if more would test PTK rekeying in their 
setups or at least get some data how often PTK rekeying is enabled by 
default and what are the intervals.

Not changing the default is ok when we are sure that only ath9k with 
mac80211 < 4.20 is leaking clear text MPDUs and no other of the broken 
devices fell into the same trap. Or - since my understanding of the used 
cypto is still lacking - someone can confirm that having an encrypted 
MPDU and the same MPDU decrypted (with all the same headers and PN) does 
not allow to calculate the PTK.


Alexander
diff mbox series

Patch

diff --git a/hostapd/config_file.c b/hostapd/config_file.c
index 21c9ab288..b88071a58 100644
--- a/hostapd/config_file.c
+++ b/hostapd/config_file.c
@@ -2884,6 +2884,15 @@  static int hostapd_config_fill(struct hostapd_config *conf,
 		bss->wpa_gmk_rekey = atoi(pos);
 	} else if (os_strcmp(buf, "wpa_ptk_rekey") == 0) {
 		bss->wpa_ptk_rekey = atoi(pos);
+	} else if (os_strcmp(buf, "wpa_deny_ptk0_rekey") == 0) {
+		bss->wpa_deny_ptk0_rekey = atoi(pos);
+		if (bss->wpa_deny_ptk0_rekey < 0 ||
+		    bss->wpa_deny_ptk0_rekey > 2) {
+			wpa_printf(MSG_ERROR,
+				   "Line %d: Invalid wpa_deny_ptk0_rekey=%d; allowed range 0..2",
+				   line, bss->wpa_deny_ptk0_rekey);
+			return 1;
+		}
 	} else if (os_strcmp(buf, "wpa_group_update_count") == 0) {
 		char *endp;
 		unsigned long val = strtoul(pos, &endp, 0);
diff --git a/hostapd/hostapd.conf b/hostapd/hostapd.conf
index 263a04e8f..b61176717 100644
--- a/hostapd/hostapd.conf
+++ b/hostapd/hostapd.conf
@@ -903,6 +903,8 @@  eapol_key_index_workaround=0
 
 # EAP reauthentication period in seconds (default: 3600 seconds; 0 = disable
 # reauthentication).
+# Reauthentications may enforce a disconnect, check the related setting
+# "wpa_deny_ptk0_rekey" for details.
 #eap_reauth_period=3600
 
 # Use PAE group address (01:80:c2:00:00:03) instead of individual target
@@ -1607,8 +1609,38 @@  own_ip_addr=127.0.0.1
 
 # Maximum lifetime for PTK in seconds. This can be used to enforce rekeying of
 # PTK to mitigate some attacks against TKIP deficiencies.
+# Warning: PTK rekeying is buggy with many drivers/devices and the only secure
+# method to rekey the PTK without Extended Key ID support requires a disconnect.
+# Check the related setting "wpa_deny_ptk0_rekey" for details.
 #wpa_ptk_rekey=600
 
+# Workaround for PTK rekey issues
+#
+# Rekeying the PTK without using "Extended Key ID for Individually Addressed
+# Frames" can - depending on the cards/drivers - reduce both the security and
+# the stability of connections. Both ends can freeze the connection or even leak
+# the PTK by repeating MPDU's without encryption. To avoid the issues hostapd is
+# now by default replacing all PTK rekeys using only keyid 0 (PTK0 rekeys) with
+# disconnects.
+# This is a regression for setups able to rekey the PTK and actually do so.
+#
+# Eap reauthentication depends on replacing the PTK and is therefore just
+# another way to rekey the PTK and affected by the setting, too.
+#
+# Unfortunately there is no way to detect if a remote stations can safely rekey
+# the PTK and the default is therefore to replace all those rekeys with a
+# disconnect when not two keyids (Extended Key ID for Individually Addressed
+# Frames) are available for it.
+#
+# If you are sure your cards/drivers don't need the workaround the old behavior
+# can be restored by setting wpa_deny_ptk0_rekey = 0.
+#
+# Available options:
+# 0 = always rekey when configured/instructed (old behavior)
+# 1 = only rekey when the local driver is able to do it bug free
+# 2 = never allow PTK0 rekeys (default)
+# wpa_deny_ptk0_rekey = 2
+
 # The number of times EAPOL-Key Message 1/4 and Message 3/4 in the RSN 4-Way
 # Handshake are retried per 4-Way Handshake attempt.
 # (dot11RSNAConfigPairwiseUpdateCount)
diff --git a/src/ap/ap_config.c b/src/ap/ap_config.c
index 68af3c1d1..98a7f6e26 100644
--- a/src/ap/ap_config.c
+++ b/src/ap/ap_config.c
@@ -64,6 +64,7 @@  void hostapd_config_defaults_bss(struct hostapd_bss_config *bss)
 
 	bss->wpa_group_rekey = 600;
 	bss->wpa_gmk_rekey = 86400;
+	bss->wpa_deny_ptk0_rekey = PTK0_REKEY_ALLOW_NEVER;
 	bss->wpa_group_update_count = 4;
 	bss->wpa_pairwise_update_count = 4;
 	bss->wpa_disable_eapol_key_retries =
diff --git a/src/ap/ap_config.h b/src/ap/ap_config.h
index 7e4b9262c..0b4483ea1 100644
--- a/src/ap/ap_config.h
+++ b/src/ap/ap_config.h
@@ -368,6 +368,7 @@  struct hostapd_bss_config {
 	int wpa_strict_rekey;
 	int wpa_gmk_rekey;
 	int wpa_ptk_rekey;
+	int wpa_deny_ptk0_rekey;
 	u32 wpa_group_update_count;
 	u32 wpa_pairwise_update_count;
 	int wpa_disable_eapol_key_retries;
diff --git a/src/ap/wpa_auth.c b/src/ap/wpa_auth.c
index 6611b0e53..5c52f0f76 100644
--- a/src/ap/wpa_auth.c
+++ b/src/ap/wpa_auth.c
@@ -755,8 +755,17 @@  static void wpa_request_new_ptk(struct wpa_state_machine *sm)
 	if (sm == NULL)
 		return;
 
-	sm->PTKRequest = TRUE;
-	sm->PTK_valid = 0;
+	if (sm->wpa_auth->conf.wpa_deny_ptk0_rekey) {
+		wpa_printf(MSG_WARNING,
+			   "WPA: PTK0 rekey not allowed, disconnect " MACSTR,
+			   MAC2STR(sm->addr));
+		sm->Disconnect = TRUE;
+		/* Try to encourage the STA reconnect */
+		sm->disconnect_reason = WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA;
+	} else {
+		sm->PTKRequest = TRUE;
+		sm->PTK_valid = 0;
+	}
 }
 
 
@@ -1775,6 +1784,16 @@  int wpa_auth_sm_event(struct wpa_state_machine *sm, enum wpa_event event)
 			sm->Init = FALSE;
 			sm->AuthenticationRequest = TRUE;
 			break;
+		} else {
+			if (sm->wpa_auth->conf.wpa_deny_ptk0_rekey) {
+				wpa_printf(MSG_WARNING,
+					   "WPA: PTK0 rekey not allowed, disconnect "
+					   MACSTR, MAC2STR(sm->addr));
+				sm->Disconnect = TRUE;
+				/* Try to encourage the STA reconnect */
+				sm->disconnect_reason = WLAN_REASON_CLASS3_FRAME_FROM_NONASSOC_STA;
+				break;
+			}
 		}
 		if (sm->GUpdateStationKeys) {
 			/*
diff --git a/src/ap/wpa_auth.h b/src/ap/wpa_auth.h
index 933a4b8ed..0e44cbe0b 100644
--- a/src/ap/wpa_auth.h
+++ b/src/ap/wpa_auth.h
@@ -176,6 +176,7 @@  struct wpa_auth_config {
 	int wpa_strict_rekey;
 	int wpa_gmk_rekey;
 	int wpa_ptk_rekey;
+	int wpa_deny_ptk0_rekey;
 	u32 wpa_group_update_count;
 	u32 wpa_pairwise_update_count;
 	int wpa_disable_eapol_key_retries;
diff --git a/src/ap/wpa_auth_glue.c b/src/ap/wpa_auth_glue.c
index b655ae57b..951c7364b 100644
--- a/src/ap/wpa_auth_glue.c
+++ b/src/ap/wpa_auth_glue.c
@@ -1348,6 +1348,18 @@  int hostapd_setup_wpa(struct hostapd_data *hapd)
 		_conf.tx_status = 1;
 	if (hapd->iface->drv_flags & WPA_DRIVER_FLAGS_AP_MLME)
 		_conf.ap_mlme = 1;
+
+	if (!(hapd->iface->drv_flags & WPA_DRIVER_FLAGS_WIRED) &&
+	    (hapd->conf->wpa_deny_ptk0_rekey == PTK0_REKEY_ALLOW_NEVER ||
+	     (hapd->conf->wpa_deny_ptk0_rekey == PTK0_REKEY_ALLOW_LOCAL_OK &&
+	      !(hapd->iface->drv_flags & WPA_DRIVER_FLAGS_SAFE_PTK0_REKEYS)))) {
+		wpa_msg(hapd->msg_ctx, MSG_INFO, "Disable PTK0 rekey support");
+		_conf.wpa_deny_ptk0_rekey = 1;
+	} else {
+		wpa_dbg(hapd->msg_ctx, MSG_DEBUG, "Support PTK0 rekey");
+		_conf.wpa_deny_ptk0_rekey = 0;
+	}
+
 	hapd->wpa_auth = wpa_init(hapd->own_addr, &_conf, &cb, hapd);
 	if (hapd->wpa_auth == NULL) {
 		wpa_printf(MSG_ERROR, "WPA initialization failed.");
diff --git a/src/common/defs.h b/src/common/defs.h
index e2fa4b291..1cf027340 100644
--- a/src/common/defs.h
+++ b/src/common/defs.h
@@ -423,4 +423,10 @@  enum chan_width {
 	CHAN_WIDTH_UNKNOWN
 };
 
+enum ptk0_rekey_handling {
+	PTK0_REKEY_ALLOW_ALWAYS,
+	PTK0_REKEY_ALLOW_LOCAL_OK,
+	PTK0_REKEY_ALLOW_NEVER
+};
+
 #endif /* DEFS_H */