Message ID | 20190912192002.6105-1-alexander@wetzel-home.de |
---|---|
State | Changes Requested |
Headers | show |
Series | Bypass QDISC for control frames on Linux | expand |
On Thu, Sep 12, 2019 at 09:20:02PM +0200, Alexander Wetzel wrote: > Let wpa_supplicant/hostapd generated control frames bypass the QDISC > when the linux kernel supports the feature, to make sure the frames > are reaching the driver and are not delayed or dropped by QDISC. What impact would this have on reliability of sending out the EAPOL frames in cases where there is significant traffic load on the interface? Documentation for PACKET_QDISC_BYPASS seems to imply that this option is targeted for traffic testing applications and as a side effect, this increases packet drops when transmit queues are busy. Would all those drops be frames that go through qdisc or would this also apply for the frames that go out through the socket that uses PACKET_QDISC_BYPASS (i.e., these EAPOL frames)? > This fixes a potential race when rekeying a connection under load: > Without that an EAPOL#4 frame send out by wpa_supplicant may not have > reached the driver when switching over to the new key, causing the AP to > disconnect the STA. It would seem to me that it would be better to send and receive EAPOL frames using nl80211 commands/events (i.e., same mechanism that is used to perform authentication/association and key configuration) to get matching behavior for all these management operations. > The intend of this patch is to make sure the eapol#4 frame has at least > reached the driver when we replace the key. Based on PACKET_QDISC_BYPASS documentation, I'm not convinced this "makes sure" of that.. Using nl80211 commands for sending EAPOL-Key msg 4/4 and setting the key on the other hand would make that pretty clear (or would provide clear error reporting if that is not the case). > Now this is is just the quick & easy fix to make sure EAPOL#4 is at > least in the driver queues when we replace the key, as discussed here: > http://lists.infradead.org/pipermail/hostap/2019-September/040514.html That other patch, i.e., EAPOL-over-nl80211 would seem to be a better direction to finally get working. Was anyone still working on that or can I assume that no one is going to address my comment on it (use it more consistently for all EAPOL frames rather than have a special case for EAPOL-Key)?
> On Thu, Sep 12, 2019 at 09:20:02PM +0200, Alexander Wetzel wrote: >> Let wpa_supplicant/hostapd generated control frames bypass the QDISC >> when the linux kernel supports the feature, to make sure the frames >> are reaching the driver and are not delayed or dropped by QDISC. > > What impact would this have on reliability of sending out the EAPOL > frames in cases where there is significant traffic load on the > interface? Documentation for PACKET_QDISC_BYPASS seems to imply that > this option is targeted for traffic testing applications and as a side > effect, this increases packet drops when transmit queues are busy. Would > all those drops be frames that go through qdisc or would this also apply > for the frames that go out through the socket that uses > PACKET_QDISC_BYPASS (i.e., these EAPOL frames)? > First, I do not think the patch should be merged as it is. PTK rekeying is at the moment mostly broken and this patch is just replacing one broken solution with a less broken one. I was hoping for some discussion here how to generally handle that. That did not happen so far. Second, I'll soon have a quite radical patch ready to address the PTK rekey issues with what seems to be the only solution: Make rekeying optional and use reconnects by default as replacement. There are just too many broken implementations in the wild which in at least one case enable plain text attacks on the outgoing PTK: Ath9k with mac80211 from a kernel < 4.20 can repeat lost MPDUs but skips the encryption... Generally all cards using HW crypto and mac80211 likely have some issues with a kernel 4.20, which more or less added proper rekey support, handling the corner cases. (ath10k is the known exception so far.) But quite many authors seem to have missed at least some of the corner cases PTK rekeying with only keyid 0 has I invite anyone believing that rekey is working in his setup to set the rekey intervall to 30s and try a download. Even with an AP able to handle that error free many devices - probably most if my samples are representative - will have have issues. Most common are download aborts, caused to to a reconnect due to an error during rekeying. Making that to the default action and allow users knowing what they do to override it seems to be the best way out of the PTK rekey mess. But pack to this patch here: The patch would disconnect a STA when the socket refuse to accept the MPDU. Now the easy fix would be to simply ignore send errors and trust that one of the retransmits will be accepted and can get through. But this should only be a fallback strategy when we can't use CONTROL PORT. That said the current solution using QDISC is probably worse then just hoping the HW is accepting the first send attempt. Short of a really loaded connection the driver should be able to accept the MPDU. And when the driver can't only the drives with special PTK rekeying hacks can hope to rekey the PTK when we are using QDISC. The price of using QDISC is, that when the send call returns one skb has probably been send out, but next to sure not our EAPOL frame we just added at the end of the queue. Assuming die QDISC queues are filled it may take quite some time the MPDU is OTA and has therefore a very good chance to get encrypted with the new key. Which will never be decrypted by the AP and also cause a disconnect. Setting PACKET_QDISC_BYPASS on the socket is just "jumping" QDISC. - We lose retransmits handling when the HW is busy and not accepting skb's. (Which is kind of dangerous to outsource in the first place) + We get the guarantee that the skb is at least in a driver buffer when the send call returns (instead of a QDISC bufer) >> This fixes a potential race when rekeying a connection under load: >> Without that an EAPOL#4 frame send out by wpa_supplicant may not have >> reached the driver when switching over to the new key, causing the AP to >> disconnect the STA. > > It would seem to me that it would be better to send and receive EAPOL > frames using nl80211 commands/events (i.e., same mechanism that is used > to perform authentication/association and key configuration) to get > matching behavior for all these management operations. > Yes, but CONTROL PORT is optionally and many drivers are not implementing it. >> The intend of this patch is to make sure the eapol#4 frame has at least >> reached the driver when we replace the key. > > Based on PACKET_QDISC_BYPASS documentation, I'm not convinced this > "makes sure" of that.. Using nl80211 commands for sending EAPOL-Key msg > 4/4 and setting the key on the other hand would make that pretty clear > (or would provide clear error reporting if that is not the case). > I checked the code, not the documentation. The ioctl can only return when the skb has been either accepted by the driver or hit an error somewhere. >> Now this is is just the quick & easy fix to make sure EAPOL#4 is at >> least in the driver queues when we replace the key, as discussed here: >> http://lists.infradead.org/pipermail/hostap/2019-September/040514.html > > That other patch, i.e., EAPOL-over-nl80211 would seem to be a better > direction to finally get working. Was anyone still working on that or > can I assume that no one is going to address my comment on it (use it > more consistently for all EAPOL frames rather than have a special case > for EAPOL-Key)? > If nobody is else is I'll sooner or more likely later pick that up. I just fear the final patch will be quite invasive when done proper and is merging the eapol paths from hostapd and wpa_supplicant. And while it should be the preferred path when available so far only mac80211 is supporting it from the in-tree drivers. Alexander
On Sun, Dec 29, 2019 at 7:10 PM Alexander Wetzel <alexander@wetzel-home.de> wrote: > > > On Thu, Sep 12, 2019 at 09:20:02PM +0200, Alexander Wetzel wrote: > >> Let wpa_supplicant/hostapd generated control frames bypass the QDISC > >> when the linux kernel supports the feature, to make sure the frames > >> are reaching the driver and are not delayed or dropped by QDISC. > > > > What impact would this have on reliability of sending out the EAPOL > > frames in cases where there is significant traffic load on the > > interface? Documentation for PACKET_QDISC_BYPASS seems to imply that > > this option is targeted for traffic testing applications and as a side > > effect, this increases packet drops when transmit queues are busy. Would > > all those drops be frames that go through qdisc or would this also apply > > for the frames that go out through the socket that uses > > PACKET_QDISC_BYPASS (i.e., these EAPOL frames)? > > > > First, I do not think the patch should be merged as it is. PTK rekeying > is at the moment mostly broken and this patch is just replacing one > broken solution with a less broken one. I was hoping for some discussion > here how to generally handle that. That did not happen so far. > > Second, I'll soon have a quite radical patch ready to address the PTK > rekey issues with what seems to be the only solution: Make rekeying > optional and use reconnects by default as replacement. There are just > too many broken implementations in the wild which in at least one case > enable plain text attacks on the outgoing PTK: Ath9k with mac80211 from > a kernel < 4.20 can repeat lost MPDUs but skips the encryption... > Generally all cards using HW crypto and mac80211 likely have some issues > with a kernel 4.20, which more or less added proper rekey support, > handling the corner cases. (ath10k is the known exception so far.) > But quite many authors seem to have missed at least some of the corner > cases PTK rekeying with only keyid 0 has I invite anyone believing that > rekey is working in his setup to set the rekey intervall to 30s and try > a download. Even with an AP able to handle that error free many devices > - probably most if my samples are representative - will have have > issues. Most common are download aborts, caused to to a reconnect due to > an error during rekeying. Making that to the default action and allow > users knowing what they do to override it seems to be the best way out > of the PTK rekey mess. > > But pack to this patch here: > The patch would disconnect a STA when the socket refuse to accept the > MPDU. Now the easy fix would be to simply ignore send errors and trust > that one of the retransmits will be accepted and can get through. > But this should only be a fallback strategy when we can't use CONTROL PORT. > > That said the current solution using QDISC is probably worse then just > hoping the HW is accepting the first send attempt. Short of a really > loaded connection the driver should be able to accept the MPDU. And when > the driver can't only the drives with special PTK rekeying hacks can > hope to rekey the PTK when we are using QDISC. > > The price of using QDISC is, that when the send call returns one skb has > probably been send out, but next to sure not our EAPOL frame we just > added at the end of the queue. Assuming die QDISC queues are filled it > may take quite some time the MPDU is OTA and has therefore a very good > chance to get encrypted with the new key. Which will never be decrypted > by the AP and also cause a disconnect. > > Setting PACKET_QDISC_BYPASS on the socket is just "jumping" QDISC. > - We lose retransmits handling when the HW is busy and not accepting > skb's. (Which is kind of dangerous to outsource in the first place) > + We get the guarantee that the skb is at least in a driver buffer when > the send call returns (instead of a QDISC bufer) > > > >> This fixes a potential race when rekeying a connection under load: > >> Without that an EAPOL#4 frame send out by wpa_supplicant may not have > >> reached the driver when switching over to the new key, causing the AP to > >> disconnect the STA. > > > > It would seem to me that it would be better to send and receive EAPOL > > frames using nl80211 commands/events (i.e., same mechanism that is used > > to perform authentication/association and key configuration) to get > > matching behavior for all these management operations. > > > > Yes, but CONTROL PORT is optionally and many drivers are not > implementing it. > > >> The intend of this patch is to make sure the eapol#4 frame has at least > >> reached the driver when we replace the key. > > > > Based on PACKET_QDISC_BYPASS documentation, I'm not convinced this > > "makes sure" of that.. Using nl80211 commands for sending EAPOL-Key msg > > 4/4 and setting the key on the other hand would make that pretty clear > > (or would provide clear error reporting if that is not the case). > > > > I checked the code, not the documentation. The ioctl can only return > when the skb has been either accepted by the driver or hit an error > somewhere. > > >> Now this is is just the quick & easy fix to make sure EAPOL#4 is at > >> least in the driver queues when we replace the key, as discussed here: > >> http://lists.infradead.org/pipermail/hostap/2019-September/040514.html > > > > That other patch, i.e., EAPOL-over-nl80211 would seem to be a better > > direction to finally get working. Was anyone still working on that or > > can I assume that no one is going to address my comment on it (use it > > more consistently for all EAPOL frames rather than have a special case > > for EAPOL-Key)? > > > > If nobody is else is I'll sooner or more likely later pick that up. > I just fear the final patch will be quite invasive when done proper and > is merging the eapol paths from hostapd and wpa_supplicant. Sorry, it went out of my radar, I can pick that up. This patch + flushing EAPoL in the driver rekeying works fine. > And while it should be the preferred path when available so far only > mac80211 is supporting it from the in-tree drivers.
diff --git a/src/drivers/driver_nl80211.c b/src/drivers/driver_nl80211.c index f699fb53f..a8372cfc0 100644 --- a/src/drivers/driver_nl80211.c +++ b/src/drivers/driver_nl80211.c @@ -1992,6 +1992,7 @@ static void * wpa_driver_nl80211_drv_init(void *ctx, const char *ifname, { struct wpa_driver_nl80211_data *drv; struct i802_bss *bss; + int enabled = 1; if (global_priv == NULL) return NULL; @@ -2038,9 +2039,12 @@ static void * wpa_driver_nl80211_drv_init(void *ctx, const char *ifname, if (drv->eapol_tx_sock < 0) goto failed; - if (drv->data_tx_status) { - int enabled = 1; + /* The qdisc could delay/drop frames. Bypass it when supported. */ + setsockopt(drv->eapol_tx_sock, SOL_PACKET, PACKET_QDISC_BYPASS, + &enabled, sizeof(enabled)); + if (drv->data_tx_status) { + enabled = 1; if (setsockopt(drv->eapol_tx_sock, SOL_SOCKET, SO_WIFI_STATUS, &enabled, sizeof(enabled)) < 0) { wpa_printf(MSG_DEBUG, diff --git a/src/l2_packet/l2_packet_linux.c b/src/l2_packet/l2_packet_linux.c index 291c9dd26..1bd7d15d1 100644 --- a/src/l2_packet/l2_packet_linux.c +++ b/src/l2_packet/l2_packet_linux.c @@ -274,6 +274,7 @@ struct l2_packet_data * l2_packet_init( struct l2_packet_data *l2; struct ifreq ifr; struct sockaddr_ll ll; + int enabled = 1; l2 = os_zalloc(sizeof(struct l2_packet_data)); if (l2 == NULL) @@ -294,6 +295,14 @@ struct l2_packet_data * l2_packet_init( os_free(l2); return NULL; } + + /* The qdisc could delay/drop the eapol #3 frame, causing the frame to + * be encrypted with the new key or not to be send out at all when + * rekeying a busy link. Bypass it when supported. + */ + setsockopt(l2->fd, SOL_PACKET, PACKET_QDISC_BYPASS, + &enabled, sizeof(enabled)); + os_memset(&ifr, 0, sizeof(ifr)); os_strlcpy(ifr.ifr_name, l2->ifname, sizeof(ifr.ifr_name)); if (ioctl(l2->fd, SIOCGIFINDEX, &ifr) < 0) {
Let wpa_supplicant/hostapd generated control frames bypass the QDISC when the linux kernel supports the feature, to make sure the frames are reaching the driver and are not delayed or dropped by QDISC. This fixes a potential race when rekeying a connection under load: Without that an EAPOL#4 frame send out by wpa_supplicant may not have reached the driver when switching over to the new key, causing the AP to disconnect the STA. Signed-off-by: Alexander Wetzel <alexander@wetzel-home.de> --- The intend of this patch is to make sure the eapol#4 frame has at least reached the driver when we replace the key. Now this is is just the quick & easy fix to make sure EAPOL#4 is at least in the driver queues when we replace the key, as discussed here: http://lists.infradead.org/pipermail/hostap/2019-September/040514.html I've cleaned up the tested hack a bit and added the same bypass for hostapd. I don't like the two independent eapol paths and think it would make sense to let nl80211 handle eapol frames also for wpa_supplicant and not just for hostapd. But the changes to get that implemented are quite far reaching and not directly linked to the problem at hand. I've also started looking how to unify the two paths, but this will be quite invasive and nothing I see in the near future. The patch here is therefore just enabling PACKET_QDISC_BYPASS on the existing sockets we use to transmit eapol frames: When we can't bypass the QDISC - probably the kernel is < 3.14 - we simply continue anyhow, thus keeping the existing behavior. (A warning here seems a bit like overkill, but that's just my gut feeling...) Technically it would be sufficient to only apply the second part of the patch (l2_packet_linux.c): Really critical is only the eapol#4 frame. But having some qdisc drop any 802.11 management frames - and then ONLY eapol frames - did not look like a good idea. src/drivers/driver_nl80211.c | 8 ++++++-- src/l2_packet/l2_packet_linux.c | 9 +++++++++ 2 files changed, 15 insertions(+), 2 deletions(-)