Message ID | d9eb10a4-31dc-b87f-2d75-90e9a9e1a361@mail.com |
---|---|
State | Not Applicable |
Headers | show |
Series | wpa_supplicant 2.10 - Scan failed | expand |
Update as I've been doing a few more checks. I should specify these tests are with the broadcom-wl 6.30.223.271-357 driver on a Broadcom BCM43142 controller. First, I checked why the new SCS/MSCS was making the scan fail. The reason is, it causes nl80211_scan_common in src/drivers/driver_nl80211_scan.c to add some extra IEs in the scan request, via > nla_put(msg, NL80211_ATTR_IE, params->extra_ies_len, params->extra_ies) (commenting out this block also makes the scans succeed again) and these extra_ies cause the cfg80211 kernel module to reject the scan request with EINVAL based on the following check in linux/net/wireless/nl80211.c in nl80211_trigger_scan : > if (ie_len > wiphy->max_scan_ie_len) > return -EINVAL; In this case, ie_len is set to 13 in the scan request and max_scan_ie_len is set (default initialized) to 0 in the wl driver, hence the reported behavior. To double check I recompiled my broadcom-wl driver with this added initialization line: > wdev->wiphy->max_scan_ie_len = 1024; (meaning the driver still completely ignores the extra IEs, but it declares that it accepts some) and indeed the scan also started to work again with the original 2.10 release. So my takeaway is that one of these two should happen: 1) wpa_supplicant should check if the driver accepts extra IEs in the scan request before actually putting any. 2) drivers should be patched to make sure they accept extra IEs in the scan request even if they then silently discard them. Then I'm not qualified to say which of these two (or something else entirely) is the proper fix. However the fact that the cfg80211 module has an explicit check for it makes me guess that drivers are allowed to limit extra IEs and that wpa_supplicant should honor this limit ? Best, Étienne Morice On 1/27/22 13:06, Étienne Morice wrote: > Hi, > > Following the update from 2.9 to 2.10 in arch, we had some scan failures > reported. > > Original report is here: > > https://bugs.archlinux.org/task/73495 > > Upon updating and restarting wpa_supplicant, scan starts to fail with a > log of this kind: > >> systemd[1]: Started WPA supplicant daemon (interface-specific version). >> Jan 26 22:33:45 ... wpa_supplicant[422]: Successfully initialized >> wpa_supplicant >> Jan 26 22:33:45 ... wpa_supplicant[422]: wlp3s0: >> CTRL-EVENT-SCAN-FAILED ret=-22 retry=1 >> Jan 26 22:33:46 ... wpa_supplicant[422]: wlp3s0: >> CTRL-EVENT-SCAN-FAILED ret=-22 retry=1 > > In one case at least bisecting the problem was possible, here are the > conclusions pasted from the arch thread: > > ===================================== > > For me the first commits that make it stop working are: > a11804724 MSCS: Add support to send MSCS Request frames > c005283c4 SCS: Sending of SCS Request frames > > These introduced in wpa_supplicant/wpa_supplicant.c, in > wpas_ext_capab_byte, two new flags: > ``` > case 6: /* Bits 48-55 */ > *pos |= 0x40; /* Bit 54 - SCS */ > ``` > and > ``` > case 10: /* Bits 80-87 */ > *pos |= 0x20; /* Bit 85 - Mirrored SCS */ > ``` > > If I comment these out, or set mscs = scs = false at the beginning of > the function: > ``` > diff --git a/wpa_supplicant/wpa_supplicant.c > b/wpa_supplicant/wpa_supplicant.c > index d37a994f9..561e5f3ed 100644 > --- a/wpa_supplicant/wpa_supplicant.c > +++ b/wpa_supplicant/wpa_supplicant.c > @@ -1883,7 +1883,7 @@ int wpa_supplicant_set_suites(struct > wpa_supplicant *wpa_s, > > static void wpas_ext_capab_byte(struct wpa_supplicant *wpa_s, u8 *pos, > int idx) > { > - bool scs = true, mscs = true; > + bool scs = false, mscs = false; > > *pos = 0x00; > ``` > I get it to work again. I'm posting from hostap_2_10 with this > modification right now. > > Also, commenting the rest of the first breaking commit made no > difference, so it seems to me that it's really these flags that cause > the problem for me and not the rest of code depending on them introduced > at the same time. > > Now the problem is, I have no idea what these actually do and how they > cause the scan to fail. > > Most crucially, I don't know at all if wpa_supplicant is doing something > wrong here, or if that code is actually correct but these flags are > somehow propagated down the stack to components (drivers ?) that cannot > handle them yet. So I'm not even sure if this should be fixed upstream > here or in other packages. > > ====================================== > > Any idea how to proceed from here ? > > Best, > > Étienne Morice > >
Hi Etienne, On 1/30/22 00:25, Étienne Morice wrote: > Update as I've been doing a few more checks. > > I should specify these tests are with the broadcom-wl 6.30.223.271-357 > driver on a Broadcom BCM43142 controller. > > First, I checked why the new SCS/MSCS was making the scan fail. > > The reason is, it causes nl80211_scan_common in > src/drivers/driver_nl80211_scan.c to add some extra IEs in the scan > request, via As a sidenote: Similar breakage was observed a while back with broadcom-wl when Debian and Arch enabled CONFIG_WNM for their wpa_supplicant distributions [0]. The kernel exposes the maximum supported bytes to userspace with which can be retrieved in wiphy_info_handler. See the attached patch. Does scanning work with thid patch applied to v2.10? [0] https://bugs.archlinux.org/task/65588 Best David > > > nla_put(msg, NL80211_ATTR_IE, params->extra_ies_len, params->extra_ies) > > (commenting out this block also makes the scans succeed again) > > and these extra_ies cause the cfg80211 kernel module to reject the scan > request with EINVAL based on the following check in > linux/net/wireless/nl80211.c in nl80211_trigger_scan : > > > if (ie_len > wiphy->max_scan_ie_len) > > return -EINVAL; > > In this case, ie_len is set to 13 in the scan request and > max_scan_ie_len is set (default initialized) to 0 in the wl driver, > hence the reported behavior. > > To double check I recompiled my broadcom-wl driver with this added > initialization line: > > > wdev->wiphy->max_scan_ie_len = 1024; > > (meaning the driver still completely ignores the extra IEs, but it > declares that it accepts some) > > and indeed the scan also started to work again with the original 2.10 > release. > > So my takeaway is that one of these two should happen: > 1) wpa_supplicant should check if the driver accepts extra IEs in the > scan request before actually putting any. > 2) drivers should be patched to make sure they accept extra IEs in the > scan request even if they then silently discard them. > > Then I'm not qualified to say which of these two (or something else > entirely) is the proper fix. However the fact that the cfg80211 module > has an explicit check for it makes me guess that drivers are allowed to > limit extra IEs and that wpa_supplicant should honor this limit ? > > Best, > Étienne Morice > > On 1/27/22 13:06, Étienne Morice wrote: >> Hi, >> >> Following the update from 2.9 to 2.10 in arch, we had some scan failures >> reported. >> >> Original report is here: >> >> https://bugs.archlinux.org/task/73495 >> >> Upon updating and restarting wpa_supplicant, scan starts to fail with a >> log of this kind: >> >>> systemd[1]: Started WPA supplicant daemon (interface-specific version). >>> Jan 26 22:33:45 ... wpa_supplicant[422]: Successfully initialized >>> wpa_supplicant >>> Jan 26 22:33:45 ... wpa_supplicant[422]: wlp3s0: >>> CTRL-EVENT-SCAN-FAILED ret=-22 retry=1 >>> Jan 26 22:33:46 ... wpa_supplicant[422]: wlp3s0: >>> CTRL-EVENT-SCAN-FAILED ret=-22 retry=1 >> >> In one case at least bisecting the problem was possible, here are the >> conclusions pasted from the arch thread: >> >> ===================================== >> >> For me the first commits that make it stop working are: >> a11804724 MSCS: Add support to send MSCS Request frames >> c005283c4 SCS: Sending of SCS Request frames >> >> These introduced in wpa_supplicant/wpa_supplicant.c, in >> wpas_ext_capab_byte, two new flags: >> ``` >> case 6: /* Bits 48-55 */ >> *pos |= 0x40; /* Bit 54 - SCS */ >> ``` >> and >> ``` >> case 10: /* Bits 80-87 */ >> *pos |= 0x20; /* Bit 85 - Mirrored SCS */ >> ``` >> >> If I comment these out, or set mscs = scs = false at the beginning of >> the function: >> ``` >> diff --git a/wpa_supplicant/wpa_supplicant.c >> b/wpa_supplicant/wpa_supplicant.c >> index d37a994f9..561e5f3ed 100644 >> --- a/wpa_supplicant/wpa_supplicant.c >> +++ b/wpa_supplicant/wpa_supplicant.c >> @@ -1883,7 +1883,7 @@ int wpa_supplicant_set_suites(struct >> wpa_supplicant *wpa_s, >> >> static void wpas_ext_capab_byte(struct wpa_supplicant *wpa_s, u8 *pos, >> int idx) >> { >> - bool scs = true, mscs = true; >> + bool scs = false, mscs = false; >> >> *pos = 0x00; >> ``` >> I get it to work again. I'm posting from hostap_2_10 with this >> modification right now. >> >> Also, commenting the rest of the first breaking commit made no >> difference, so it seems to me that it's really these flags that cause >> the problem for me and not the rest of code depending on them introduced >> at the same time. >> >> Now the problem is, I have no idea what these actually do and how they >> cause the scan to fail. >> >> Most crucially, I don't know at all if wpa_supplicant is doing something >> wrong here, or if that code is actually correct but these flags are >> somehow propagated down the stack to components (drivers ?) that cannot >> handle them yet. So I'm not even sure if this should be fixed upstream >> here or in other packages. >> >> ====================================== >> >> Any idea how to proceed from here ? >> >> Best, >> >> Étienne Morice >> >> > > _______________________________________________ > Hostap mailing list > Hostap@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/hostap
Hi David, On 1/30/22 13:49, David Bauer wrote: > > The kernel exposes the maximum supported bytes to userspace with > which can be retrieved in wiphy_info_handler. > > See the attached patch. Does scanning work with thid patch applied to > v2.10? > > [0] https://bugs.archlinux.org/task/65588 > > Best > David Thanks ! This almost works. The patch doesn't fix the problem as-is, but I suspect it's just a typo in the direction of the comparison in the check in nl80211_scan_common. If I flip the direction as attached, it then works with the orginial patch + the modification applied to 2.10. Best, Étienne
Hi Etienne, On 1/30/22 15:00, Étienne Morice wrote: > Hi David, > > On 1/30/22 13:49, David Bauer wrote: > > > > > The kernel exposes the maximum supported bytes to userspace with > > which can be retrieved in wiphy_info_handler. > > > > See the attached patch. Does scanning work with thid patch applied to > > v2.10? > > > > [0] https://bugs.archlinux.org/task/65588 > > > > Best > > David > > Thanks ! This almost works. The patch doesn't fix the problem as-is, but > I suspect it's just a typo in the direction of the comparison in the > check in nl80211_scan_common. If I flip the direction as attached, it > then works with the orginial patch + the modification applied to 2.10. Great to hear! One additional question (As i do not have the hardware): Does scan & association work with CONFIG_WNM enabled? As indicated in my last name, this broke broadcom-wl on 2.9, from my understanding the patch should also solve this. Best David > > Best, > Étienne
Hi David, On 1/30/22 20:01, David Bauer wrote: > Hi Etienne, > > On 1/30/22 15:00, Étienne Morice wrote: >> Hi David, >> >> On 1/30/22 13:49, David Bauer wrote: >> >> > >> > The kernel exposes the maximum supported bytes to userspace with >> > which can be retrieved in wiphy_info_handler. >> > >> > See the attached patch. Does scanning work with thid patch applied to >> > v2.10? >> > >> > [0] https://bugs.archlinux.org/task/65588 >> > >> > Best >> > David >> >> Thanks ! This almost works. The patch doesn't fix the problem as-is, but >> I suspect it's just a typo in the direction of the comparison in the >> check in nl80211_scan_common. If I flip the direction as attached, it >> then works with the orginial patch + the modification applied to 2.10. > > Great to hear! > > One additional question (As i do not have the hardware): Does scan & > association > work with CONFIG_WNM enabled? As indicated in my last name, this broke > broadcom-wl > on 2.9, from my understanding the patch should also solve this. > > Best > David Yes, it works. I can indeed reproduce the same problem on 2.10 if I set the [m]scs flags to false in wpas_wpas_ext_capab_byte (which by itself works) but turn on CONFIG_WNM=y, and then I get the same failure with ie_len=6 getting passed down to cfg80211. With the patch on top it scans and associates correctly with CONFIG_WNM=y (both with and without the [m]scs flags). Best, Étienne.
diff --git a/wpa_supplicant/wpa_supplicant.c b/wpa_supplicant/wpa_supplicant.c index d37a994f9..561e5f3ed 100644 --- a/wpa_supplicant/wpa_supplicant.c +++ b/wpa_supplicant/wpa_supplicant.c @@ -1883,7 +1883,7 @@ int wpa_supplicant_set_suites(struct wpa_supplicant *wpa_s, static void wpas_ext_capab_byte(struct wpa_supplicant *wpa_s, u8 *pos, int idx) { - bool scs = true, mscs = true; + bool scs = false, mscs = false; *pos = 0x00; ```