poll: prevent missed events if _qproc is NULL

Message ID: 1356960060-1263-1-git-send-email-normalperson@yhbt.net
State: Superseded, archived
Delegated to: David Miller

Commit Message

Eric Wong Dec. 31, 2012, 1:21 p.m. UTC
This patch seems to fix my issue with ppoll() being stuck on my
SMP machine: http://article.gmane.org/gmane.linux.file-systems/70414

The change to sock_poll_wait() in
commit 626cf236608505d376e4799adb4f7eb00a8594af
  (poll: add poll_requested_events() and poll_does_not_wait() functions)
seems to have allowed additional cases where the SMP memory barrier
is not issued before checking for readiness.

In my case, this affects the select()-family of functions
which register descriptors once and set _qproc to NULL before
checking events again (after poll_schedule_timeout() returns).
The set_mb() barrier in poll_schedule_timeout() appears to be
insufficient on my SMP x86-64 machine (as it's only an xchg()).

This may also be related to the epoll issue described by
Andreas Voellmy in http://thread.gmane.org/gmane.linux.kernel/1408782/

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
Cc: David Miller <davem@davemloft.net>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andreas Voellmy <andreas.voellmy@yale.edu>
Cc: "Junchang(Jason) Wang" <junchang.wang@yale.edu>
Cc: netdev@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org
---
 If this patch is correct, I think we can just drop the
 poll_does_not_wait() function entirely since poll_wait()
 does the same check anyways...

 include/net/sock.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
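
The control-flow change described in the commit message can be modeled in user space. The following is only a sketch under stated assumptions, not kernel code: every model_* name is hypothetical, the bodies of poll_does_not_wait() and poll_wait() are assumed from their descriptions in this thread, and the barrier that follows the comment in the patched function (cut off in the diff context) is assumed to be a full memory barrier, modeled here with a C11 sequentially consistent fence.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel definitions. */
struct model_wait_queue {
	int registrations;	/* how many times a waiter was queued */
};

struct model_poll_table {
	void (*qproc)(struct model_wait_queue *wq);	/* models _qproc */
};

static int model_barriers_issued;	/* counts full barriers for comparison */

/* Models smp_mb(): a full fence, plus bookkeeping for the comparison. */
static void model_smp_mb(void)
{
	atomic_thread_fence(memory_order_seq_cst);
	model_barriers_issued++;
}

/* Models poll_does_not_wait(): true when nothing would be registered. */
static bool model_poll_does_not_wait(const struct model_poll_table *p)
{
	return p == NULL || p->qproc == NULL;
}

/* Models poll_wait(): registers only when a callback is set. */
static void model_poll_wait(struct model_wait_queue *wq,
			    struct model_poll_table *p)
{
	if (p && p->qproc && wq)
		p->qproc(wq);
}

/* Registration callback used while the table still has a qproc. */
static void model_register(struct model_wait_queue *wq)
{
	assert(wq != NULL);
	wq->registrations++;
}

/* Pre-patch shape: the barrier is skipped whenever qproc is NULL. */
static void model_sock_poll_wait_old(struct model_wait_queue *wq,
				     struct model_poll_table *p)
{
	if (!model_poll_does_not_wait(p) && wq) {
		model_poll_wait(wq, p);
		model_smp_mb();
	}
}

/* Patched shape: the barrier runs whenever there is a wait queue. */
static void model_sock_poll_wait_patched(struct model_wait_queue *wq,
					 struct model_poll_table *p)
{
	if (wq) {
		if (!model_poll_does_not_wait(p))
			model_poll_wait(wq, p);
		model_smp_mb();
	}
}

/* Returns how many barriers a single call issues for the given inputs. */
int model_barriers_for(bool patched, bool has_qproc, bool has_wq)
{
	struct model_wait_queue wq = { 0 };
	struct model_poll_table pt = { has_qproc ? model_register : NULL };

	model_barriers_issued = 0;
	if (patched)
		model_sock_poll_wait_patched(has_wq ? &wq : NULL, &pt);
	else
		model_sock_poll_wait_old(has_wq ? &wq : NULL, &pt);
	return model_barriers_issued;
}
```

In this model the two variants differ only for a NULL qproc with a non-NULL wait queue: model_barriers_for(false, false, true) issues no barrier, while model_barriers_for(true, false, true) issues one. That is the select()-style second pass the commit message describes.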

Comments

Eric Wong Dec. 31, 2012, 11:24 p.m. UTC | #1
Eric Wong <normalperson@yhbt.net> wrote:
> This patch seems to fix my issue with ppoll() being stuck on my
> SMP machine: http://article.gmane.org/gmane.linux.file-systems/70414

OK, it doesn't fix my issue, but it seems to make it harder to hit...

> The change to sock_poll_wait() in
> commit 626cf236608505d376e4799adb4f7eb00a8594af
>   (poll: add poll_requested_events() and poll_does_not_wait() functions)
> seems to have allowed additional cases where the SMP memory barrier
> is not issued before checking for readiness.
> 
> In my case, this affects the select()-family of functions
> which register descriptors once and set _qproc to NULL before
> checking events again (after poll_schedule_timeout() returns).
> The set_mb() barrier in poll_schedule_timeout() appears to be
> insufficient on my SMP x86-64 machine (as it's only an xchg()).
> 
> This may also be related to the epoll issue described by
> Andreas Voellmy in http://thread.gmane.org/gmane.linux.kernel/1408782/

However, I believe my patch will still fix Andreas' issue with epoll
due to how ep_modify() uses a NULL qproc when calling ->poll().

(I've never been able to reproduce Andreas' issue on my 4-core system,
 but he's been hitting it since 3.4 (at least))
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Junchang(Jason) Wang Jan. 1, 2013, 4:58 p.m. UTC | #2
Hi Eric and list,

Thanks a lot. The patch solves our (Andreas's and my) issue with
epoll. Here's our test program:
https://github.com/AndreasVoellmy/epollbug/blob/master/epollbug.c
We are using Linux 3.7.1 on a server with 80 cores.

Cheers!

--Jason

Eric Dumazet Jan. 1, 2013, 6:42 p.m. UTC | #3
On Mon, 2012-12-31 at 13:21 +0000, Eric Wong wrote:
> This patch seems to fix my issue with ppoll() being stuck on my
> SMP machine: http://article.gmane.org/gmane.linux.file-systems/70414
> 
> The change to sock_poll_wait() in
> commit 626cf236608505d376e4799adb4f7eb00a8594af
>   (poll: add poll_requested_events() and poll_does_not_wait() functions)
> seems to have allowed additional cases where the SMP memory barrier
> is not issued before checking for readiness.
> 
> In my case, this affects the select()-family of functions
> which register descriptors once and set _qproc to NULL before
> checking events again (after poll_schedule_timeout() returns).
> The set_mb() barrier in poll_schedule_timeout() appears to be
> insufficient on my SMP x86-64 machine (as it's only an xchg()).
> 
> This may also be related to the epoll issue described by
> Andreas Voellmy in http://thread.gmane.org/gmane.linux.kernel/1408782/

Hmm, the change seems not very logical to me.

If it helps, I would like to understand the real issue.

commit 626cf236608505d376e4799adb4f7eb00a8594af should not have this
side effect, at least for the poll()/select() functions. I am not yet
very confident about the epoll() changes.

I suspect a race already existed before this commit; it would be nice to
track it down properly.
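
For reference, the ordering requirement under discussion can be sketched with C11 atomics. This is a hedged user-space model, not the kernel code and not a diagnosis of the race: all names are hypothetical, waiter_queued stands in for "a waiter is on the socket's wait queue", data_ready stands in for the socket state that poll checks, and the two fences stand in for the barrier in sock_poll_wait() and the one it is assumed to pair with on the wakeup side (wq_has_sleeper()).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state; static storage starts out false. */
static atomic_bool data_ready;		/* models "socket has an event" */
static atomic_bool waiter_queued;	/* models "wait queue is active" */

/* Puts the model back into its initial state. */
void model_reset(void)
{
	atomic_store(&data_ready, false);
	atomic_store(&waiter_queued, false);
}

/*
 * Poller side: make the waiter visible, issue a full fence, then check
 * readiness. Returns true if the event is seen, so no sleep is needed.
 */
bool model_poller_sees_event(void)
{
	atomic_store_explicit(&waiter_queued, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&data_ready, memory_order_relaxed);
}

/*
 * Waker side: publish the event, issue a full fence, then check for a
 * waiter. Returns true if a waiter is seen, so a wakeup would be sent.
 */
bool model_waker_sees_waiter(void)
{
	atomic_store_explicit(&data_ready, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&waiter_queued, memory_order_relaxed);
}
```

With both fences in place, two threads running these functions concurrently cannot both get false: either the poller sees the event or the waker sees the waiter. If one side skips its fence, both loads may observe the old values, which is the missed-event pattern. In the select() case described above, the registration happened on an earlier pass, so the store to waiter_queued models state set up before the re-check rather than in the same call.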

Patch

diff --git a/include/net/sock.h b/include/net/sock.h
index c945fba..1923e48 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1925,8 +1925,9 @@  static inline bool wq_has_sleeper(struct socket_wq *wq)
 static inline void sock_poll_wait(struct file *filp,
 		wait_queue_head_t *wait_address, poll_table *p)
 {
-	if (!poll_does_not_wait(p) && wait_address) {
-		poll_wait(filp, wait_address, p);
+	if (wait_address) {
+		if (!poll_does_not_wait(p))
+			poll_wait(filp, wait_address, p);
 		/* We need to be sure we are in sync with the
 		 * socket flags modification.
 		 *