Message ID | 20140409225048.F13EA140263@ushik.mtv.corp.google.com |
---|---|
State | Superseded |
Headers | show |
On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: > --- a/wpa_supplicant/scan.c > +++ b/wpa_supplicant/scan.c > @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) > wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); > /* TODO: what to do if stopping fails? */ > return -1; > + } else { > + wpa_s->sched_scanning = 0; > + wpa_supplicant_notify_scanning(wpa_s, 0); > } It looks like the current implementation assumes that there will be a driver event to indicate that sched_scan was stopped and wpa_s->sched_scanning is cleared to 0 when that event is processed. Is there a driver that does not end up indicating that event? I would assume there was a reason for postponing, but then again, I now remember that there were open questions on a related area.. I think there was a request to revert commit cf70d2981df1be7005fb90c2532e737ab39bc119 and handle some of the changes in mac80211 to make operations synchronous. I guess we should get open item resolved as well.. I'm not completely sure whether these two changes have any dependencies to each other, though.
On Fri, Apr 11, 2014 at 9:09 AM, Jouni Malinen <j@w1.fi> wrote: > On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: >> --- a/wpa_supplicant/scan.c >> +++ b/wpa_supplicant/scan.c >> @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) >> wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); >> /* TODO: what to do if stopping fails? */ >> return -1; >> + } else { >> + wpa_s->sched_scanning = 0; >> + wpa_supplicant_notify_scanning(wpa_s, 0); >> } > > It looks like the current implementation assumes that there will be a > driver event to indicate that sched_scan was stopped and > wpa_s->sched_scanning is cleared to 0 when that event is processed. Is > there a driver that does not end up indicating that event? Indeed, driver is not calling cfg80211_sched_scan_stopped() to send NL80211_CMD_SCHED_SCAN_STOPPED message. It is calling cfg80211_scan_done(aborted) to send NL80211_CMD_SCAN_ABORTED that is doing EVENT_SCAN_RESULTS and finally cleans wpa_s->scanning state. However, I tried to resolve next problem that is not disappearing even if we send message above from the driver: wpas_stop_pno() { wpa_supplicant_stop_sched_scan() if (wpa_s->wpa_state == WPA_SCANNING) wpa_supplicant_req_scan(wpa_s, 0, 0); <<< usual scan } wpa_supplicant_scan() { ... if (wpa_s->scanning) { ... wpa_supplicant_req_scan(wpa_s, 1, 0); <<< postponing scan for 1 sec return; } ... } Scan is postponed for 1 sec and I just tried to eliminate it. It is slightly different from problem below where Raja tried to sync between two sched scan requests. > > I would assume there was a reason for postponing, but then again, I now > remember that there were open questions on a related area.. I think > there was a request to revert commit > cf70d2981df1be7005fb90c2532e737ab39bc119 and handle some of the changes > in mac80211 to make operations synchronous. I guess we should get open > item resolved as well.. I'm not completely sure whether these two > changes have any dependencies to each other, though. I think Raja's change is related to "coordination" between sched scan by wpa_supplicant and explicit one. > > -- > Jouni Malinen PGP id EFC895FA
On Fri, Apr 11, 2014 at 09:51:14AM -0700, Dmitry Shmidt wrote: > On Fri, Apr 11, 2014 at 9:09 AM, Jouni Malinen <j@w1.fi> wrote: > > On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: > >> --- a/wpa_supplicant/scan.c > >> +++ b/wpa_supplicant/scan.c > >> @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) > >> wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); > >> /* TODO: what to do if stopping fails? */ > >> return -1; > >> + } else { > >> + wpa_s->sched_scanning = 0; > >> + wpa_supplicant_notify_scanning(wpa_s, 0); I have this in my pending queue, but I'm not completely sure what to do about this. Is there still a case where this would be needed? If so, I guess I can apply this since I have not really had a chance to do any real sched_scan testing myself and no one has come up with additional justification for delaying the notification until the driver has actually completed stopping of sched_scan. > > I would assume there was a reason for postponing, but then again, I now > > remember that there were open questions on a related area.. I think > > there was a request to revert commit > > cf70d2981df1be7005fb90c2532e737ab39bc119 and handle some of the changes > > in mac80211 to make operations synchronous. I guess we should get open > > item resolved as well.. I'm not completely sure whether these two > > changes have any dependencies to each other, though. > > I think Raja's change is related to "coordination" between sched scan > by wpa_supplicant > and explicit one. This revert request has not gone anywhere either.. Would someone still remember what the issue here is and whether there is still need to either revert that commit cf70d2981df1be7005fb90c2532e737ab39bc119 ('wpa_supplicant: Schedule PNO on completion of ongoing sched_scan') or fix something in it? It would be nice to see a debug log showing the issue since I don't have very convenient test setup for sched_scan.
On Sat, Mar 7, 2015 at 11:46 AM, Jouni Malinen <j@w1.fi> wrote: > On Fri, Apr 11, 2014 at 09:51:14AM -0700, Dmitry Shmidt wrote: >> On Fri, Apr 11, 2014 at 9:09 AM, Jouni Malinen <j@w1.fi> wrote: >> > On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: >> >> --- a/wpa_supplicant/scan.c >> >> +++ b/wpa_supplicant/scan.c >> >> @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) >> >> wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); >> >> /* TODO: what to do if stopping fails? */ >> >> return -1; >> >> + } else { >> >> + wpa_s->sched_scanning = 0; >> >> + wpa_supplicant_notify_scanning(wpa_s, 0); > > I have this in my pending queue, but I'm not completely sure what to do > about this. > > Is there still a case where this would be needed? If so, I guess I can > apply this since I have not really had a chance to do any real > sched_scan testing myself and no one has come up with additional > justification for delaying the notification until the driver has > actually completed stopping of sched_scan. When wifi manager explicitly stops PNO (by set pno 0), wpa_supplicant sends stop_pno command to the driver. Theoretically it is "sync" call and PNO is stopped immediately. However, it takes time for event to propagate back: nl80211_stop_sched_scan() -> __cfg80211_stop_sched_scan() -> nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED); so when wpa_supplciant gets NL80211_CMD_SCHED_SCAN_STOPPED that will be translated to EVENT_SCHED_SCAN_STOPPED to clear wpa_s->scanning it will be too late - code in scan function will postpone scan for 1 sec: if (wpa_s->scanning) { /* * If we are already in scanning state, we shall reschedule the * the incoming scan request. */ wpa_dbg(wpa_s, MSG_DEBUG, "Already scanning - Reschedule the incoming scan req"); wpa_supplicant_req_scan(wpa_s, 1, 0); return; } So unless you see some problem with this patch, I would recommend to take it to avoid 1 sec delay after PNO termination. > >> > I would assume there was a reason for postponing, but then again, I now >> > remember that there were open questions on a related area.. I think >> > there was a request to revert commit >> > cf70d2981df1be7005fb90c2532e737ab39bc119 and handle some of the changes >> > in mac80211 to make operations synchronous. I guess we should get open >> > item resolved as well.. I'm not completely sure whether these two >> > changes have any dependencies to each other, though. >> >> I think Raja's change is related to "coordination" between sched scan >> by wpa_supplicant >> and explicit one. > > This revert request has not gone anywhere either.. Would someone still > remember what the issue here is and whether there is still need to > either revert that commit cf70d2981df1be7005fb90c2532e737ab39bc119 > ('wpa_supplicant: Schedule PNO on completion of ongoing sched_scan') or > fix something in it? > > It would be nice to see a debug log showing the issue since I don't have > very convenient test setup for sched_scan. > > -- > Jouni Malinen PGP id EFC895FA > _______________________________________________ > HostAP mailing list > HostAP@lists.shmoo.com > http://lists.shmoo.com/mailman/listinfo/hostap
On Mon, 2015-03-09 at 11:29 -0700, Dmitry Shmidt wrote: > On Sat, Mar 7, 2015 at 11:46 AM, Jouni Malinen <j@w1.fi> wrote: > > On Fri, Apr 11, 2014 at 09:51:14AM -0700, Dmitry Shmidt wrote: > >> On Fri, Apr 11, 2014 at 9:09 AM, Jouni Malinen <j@w1.fi> wrote: > >> > On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: > >> >> --- a/wpa_supplicant/scan.c > >> >> +++ b/wpa_supplicant/scan.c > >> >> @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) > >> >> wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); > >> >> /* TODO: what to do if stopping fails? */ > >> >> return -1; > >> >> + } else { > >> >> + wpa_s->sched_scanning = 0; > >> >> + wpa_supplicant_notify_scanning(wpa_s, 0); > > > > I have this in my pending queue, but I'm not completely sure what to do > > about this. > > > > Is there still a case where this would be needed? If so, I guess I can > > apply this since I have not really had a chance to do any real > > sched_scan testing myself and no one has come up with additional > > justification for delaying the notification until the driver has > > actually completed stopping of sched_scan. > > When wifi manager explicitly stops PNO (by set pno 0), wpa_supplicant sends > stop_pno command to the driver. Theoretically it is "sync" call and > PNO is stopped > immediately. However, it takes time for event to propagate back: > nl80211_stop_sched_scan() -> __cfg80211_stop_sched_scan() -> > nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED); > so when wpa_supplciant gets NL80211_CMD_SCHED_SCAN_STOPPED > that will be translated to EVENT_SCHED_SCAN_STOPPED to clear > wpa_s->scanning > it will be too late Which driver are you using? ISTR fixing a number of issues in this area. johannes
On Mon, Mar 9, 2015 at 12:43 PM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2015-03-09 at 11:29 -0700, Dmitry Shmidt wrote: >> On Sat, Mar 7, 2015 at 11:46 AM, Jouni Malinen <j@w1.fi> wrote: >> > On Fri, Apr 11, 2014 at 09:51:14AM -0700, Dmitry Shmidt wrote: >> >> On Fri, Apr 11, 2014 at 9:09 AM, Jouni Malinen <j@w1.fi> wrote: >> >> > On Wed, Apr 09, 2014 at 03:48:07PM -0700, Dmitry Shmidt wrote: >> >> >> --- a/wpa_supplicant/scan.c >> >> >> +++ b/wpa_supplicant/scan.c >> >> >> @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) >> >> >> wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); >> >> >> /* TODO: what to do if stopping fails? */ >> >> >> return -1; >> >> >> + } else { >> >> >> + wpa_s->sched_scanning = 0; >> >> >> + wpa_supplicant_notify_scanning(wpa_s, 0); >> > >> > I have this in my pending queue, but I'm not completely sure what to do >> > about this. >> > >> > Is there still a case where this would be needed? If so, I guess I can >> > apply this since I have not really had a chance to do any real >> > sched_scan testing myself and no one has come up with additional >> > justification for delaying the notification until the driver has >> > actually completed stopping of sched_scan. >> >> When wifi manager explicitly stops PNO (by set pno 0), wpa_supplicant sends >> stop_pno command to the driver. Theoretically it is "sync" call and >> PNO is stopped >> immediately. However, it takes time for event to propagate back: >> nl80211_stop_sched_scan() -> __cfg80211_stop_sched_scan() -> >> nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED); >> so when wpa_supplciant gets NL80211_CMD_SCHED_SCAN_STOPPED >> that will be translated to EVENT_SCHED_SCAN_STOPPED to clear >> wpa_s->scanning >> it will be too late > > Which driver are you using? ISTR fixing a number of issues in this area. We are using Broadcom and Qualcomm drivers. However, I am not sure that driver is relevant here - all the calls I described above are done by kernel wireless stack and wpa_supplicant. > > johannes >
On Mon, 2015-03-09 at 12:46 -0700, Dmitry Shmidt wrote: > >> immediately. However, it takes time for event to propagate back: > >> nl80211_stop_sched_scan() -> __cfg80211_stop_sched_scan() -> > >> nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED); > >> so when wpa_supplciant gets NL80211_CMD_SCHED_SCAN_STOPPED > >> that will be translated to EVENT_SCHED_SCAN_STOPPED to clear > >> wpa_s->scanning > >> it will be too late > > > > Which driver are you using? ISTR fixing a number of issues in this area. > > We are using Broadcom and Qualcomm drivers. However, I am not sure that > driver is relevant here - all the calls I described above are done by kernel > wireless stack and wpa_supplicant. Right but I think we fixed the stack to report the event back before the stop_sched_scan() call can return? But my memory of this is very vague, I might very well be mistaken and would have to check the code. johannes
On Mon, Mar 9, 2015 at 12:56 PM, Johannes Berg <johannes@sipsolutions.net> wrote: > On Mon, 2015-03-09 at 12:46 -0700, Dmitry Shmidt wrote: >> >> immediately. However, it takes time for event to propagate back: >> >> nl80211_stop_sched_scan() -> __cfg80211_stop_sched_scan() -> >> >> nl80211_send_sched_scan(rdev, dev, NL80211_CMD_SCHED_SCAN_STOPPED); >> >> so when wpa_supplciant gets NL80211_CMD_SCHED_SCAN_STOPPED >> >> that will be translated to EVENT_SCHED_SCAN_STOPPED to clear >> >> wpa_s->scanning >> >> it will be too late >> > >> > Which driver are you using? ISTR fixing a number of issues in this area. >> >> We are using Broadcom and Qualcomm drivers. However, I am not sure that >> driver is relevant here - all the calls I described above are done by kernel >> wireless stack and wpa_supplicant. > > Right but I think we fixed the stack to report the event back before the > stop_sched_scan() call can return? But my memory of this is very vague, > I might very well be mistaken and would have to check the code. If you can hint me what this change was about, I'll try to find it. Meanwhile the only change I see between 3.19 and 3.4 in __cfg80211_stop_sched_scan() is to use rtnl_lock instead of sched_scan_mtx. Other than that the logic is the same - send stop command to the driver and send NL80211_CMD_SCHED_SCAN_STOPPED back to user space. And I see (I didn't try 3.19) that it will come "too late". > > johannes >
On Mon, 2015-03-09 at 13:07 -0700, Dmitry Shmidt wrote: > If you can hint me what this change was about, I'll try to find it. > Meanwhile the only change I see between 3.19 and 3.4 in > __cfg80211_stop_sched_scan() is to use rtnl_lock instead of sched_scan_mtx. > Other than that the logic is the same - send stop command to the driver and send > NL80211_CMD_SCHED_SCAN_STOPPED back to user space. > And I see (I didn't try 3.19) that it will come "too late". Looks like cfg80211 could do this correctly, but mac80211 insists on doing the scheduled work. Given that locks the RTNL it's not going to be simple to actually implement it. It therefore seems far more likely then that I'm just confused and was thinking about something else, sorry about that. johannes
On Mon, Mar 9, 2015 at 10:11 PM, Johannes Berg <johannes@sipsolutions.net> wrote: > > On Mon, 2015-03-09 at 13:07 -0700, Dmitry Shmidt wrote: > > > If you can hint me what this change was about, I'll try to find it. > > Meanwhile the only change I see between 3.19 and 3.4 in > > __cfg80211_stop_sched_scan() is to use rtnl_lock instead of sched_scan_mtx. > > Other than that the logic is the same - send stop command to the driver and send > > NL80211_CMD_SCHED_SCAN_STOPPED back to user space. > > And I see (I didn't try 3.19) that it will come "too late". > > Looks like cfg80211 could do this correctly, but mac80211 insists on > doing the scheduled work. Given that locks the RTNL it's not going to be > simple to actually implement it. > Is this really about kernel-mode though? wpa_s is single threaded (eloop), so even if the notification comes up right away, the event is still only queued on the thread.. That said, I think Dmitry's version might hurt some state. It's safer to just schedule the scan on the eloop with a very small delay (even 0), so that the pending sched-scan event has a chance to get processed. Arik
On Mon, Mar 9, 2015 at 1:17 PM, Arik Nemtsov <arik@wizery.com> wrote: > On Mon, Mar 9, 2015 at 10:11 PM, Johannes Berg > <johannes@sipsolutions.net> wrote: >> >> On Mon, 2015-03-09 at 13:07 -0700, Dmitry Shmidt wrote: >> >> > If you can hint me what this change was about, I'll try to find it. >> > Meanwhile the only change I see between 3.19 and 3.4 in >> > __cfg80211_stop_sched_scan() is to use rtnl_lock instead of sched_scan_mtx. >> > Other than that the logic is the same - send stop command to the driver and send >> > NL80211_CMD_SCHED_SCAN_STOPPED back to user space. >> > And I see (I didn't try 3.19) that it will come "too late". >> >> Looks like cfg80211 could do this correctly, but mac80211 insists on >> doing the scheduled work. Given that locks the RTNL it's not going to be >> simple to actually implement it. >> > > Is this really about kernel-mode though? wpa_s is single threaded > (eloop), so even if the notification comes up right away, the event is > still only queued on the thread.. This is what I see as well. > > That said, I think Dmitry's version might hurt some state. It's safer > to just schedule the scan on the eloop with a very small delay (even > 0), so that the pending sched-scan event has a chance to get > processed. Changing scanning function may affect others, while resetting PNO state should not. Moreover later EVENT_SCHED_SCAN_STOPPED will do nothing if sched_scan_timed_out or pno_sched_pending will not be set. if (wpa_s->sched_scan_timed_out) { wpa_supplicant_req_sched_scan(wpa_s); } else if (wpa_s->pno_sched_pending) { wpa_s->pno_sched_pending = 0; wpas_start_pno(wpa_s); } > > Arik
On Fri, Mar 13, 2015 at 8:16 PM, Dmitry Shmidt <dimitrysh@google.com> wrote: > Changing scanning function may affect others, while resetting PNO > state should not. Moreover later EVENT_SCHED_SCAN_STOPPED > will do nothing if sched_scan_timed_out or pno_sched_pending will not > be set. > if (wpa_s->sched_scan_timed_out) { > wpa_supplicant_req_sched_scan(wpa_s); > } else if (wpa_s->pno_sched_pending) { > wpa_s->pno_sched_pending = 0; > wpas_start_pno(wpa_s); > } Your change can hurt sched-scan users. For instance if wpa_s->sched_scan_timed_out is set, then your patch will introduce a time-window where a regular scan might come in before the EVENT_SCHED_SCAN_STOPPED has a chance to arrive. You introduce a similar race for wpa_s->pno_sched_pending (you even guarantee it there I believe?). I guess the easiest fix is to have another pno related flag (wpa_s->restart_scan_after_pno ?) and checking it in EVENT_SCHED_SCAN_STOPPED for restarting the regular scan. Arik
diff --git a/wpa_supplicant/scan.c b/wpa_supplicant/scan.c index 1d4e6e5..af34887 100644 --- a/wpa_supplicant/scan.c +++ b/wpa_supplicant/scan.c @@ -261,6 +261,9 @@ int wpa_supplicant_stop_sched_scan(struct wpa_supplicant *wpa_s) wpa_dbg(wpa_s, MSG_DEBUG, "stopping sched_scan failed!"); /* TODO: what to do if stopping fails? */ return -1; + } else { + wpa_s->sched_scanning = 0; + wpa_supplicant_notify_scanning(wpa_s, 0); } return ret;
Change-Id: I6deca387f7c64e4125e85ecfd585e1cff6931ab1 Signed-off-by: Dmitry Shmidt <dimitrysh@google.com> --- wpa_supplicant/scan.c | 3 +++ 1 file changed, 3 insertions(+)