diff mbox series

[ovs-dev,v2,3/4] Loop to connect unix socket until success

Message ID 20231118010703.4154866-4-ihrachys@redhat.com
State Changes Requested
Delegated to: Simon Horman
Headers show
Series Improve unixctl AF_UNIX backlog handling | expand

Checks

Context Check Description
ovsrobot/apply-robot warning apply and check: warning
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Ihar Hrachyshka Nov. 18, 2023, 1:07 a.m. UTC
In nonblocking mode, a POSIX socket may temporarily fail to connect(),
which is indicated by returning EINPROGRESS. This patch will loop
waiting for connect() to succeed, or until EINPROGRESS happens.

An alternative - and probably a better - path to deal with this error
would be to extend unix stream_class to support connect API, that would
allow the caller to repeat connect() attempts asynchronously.

Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
---
 lib/socket-util-unix.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Simon Horman Nov. 20, 2023, 4:22 p.m. UTC | #1
On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote:
> In nonblocking mode, a POSIX socket may temporarily fail to connect(),
> which is indicated by returning EINPROGRESS. This patch will loop
> waiting for connect() to succeed, or until EINPROGRESS happens.
> 
> An alternative - and probably a better - path to deal with this error
> would be to extend unix stream_class to support connect API, that would
> allow the caller to repeat connect() attempts asynchronously.

Hi Ihar,

Yes, that probably would be better. But I expect it is also more complex.
So perhaps such work is appropriate as a follow-up at some point. Because,
assuming the patch is correct (I have some questions), it does seem to
improve things. And I'm all for improving things even if incrementally.

> Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> ---
>  lib/socket-util-unix.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
> index 0053a61b1..230705ba3 100644
> --- a/lib/socket-util-unix.c
> +++ b/lib/socket-util-unix.c
> @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock,
>          int dirfd;
>  
>          error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, linkname);
> -        if (!error
> -            && connect(fd, (struct sockaddr*) &un, un_len)
> -            && errno != EINPROGRESS) {
> -            error = errno;
> +        if (!error) {
> +            for (;;) {
> +                if (!connect(fd, (struct sockaddr *)&un, un_len)) {
> +                    break;
> +                }
> +                if (errno != EINPROGRESS) {
> +                    error = errno;
> +                    break;
> +                }
> +            }

I have a few questions about this.

1. I see that patch 4/4 also takes into account EAGAIN as an
   alternative for EINPROGRESS, which is consistent with my
   reading of connect(2) wrt Unix domain sockets on Linux.
   And, reading connect(3p) I see that EINPROGRESS is the
   correct error to check for wrt POSIX.

   But I am not entirely clear that calling connect() again
   is the correct behaviour. Is this clear to you?

2. Is it possible that this loop never exits?
   Or is otherwise abusive resource-wise.

>          }
>          free_sockaddr_un(dirfd, linkname);
>  
> -- 
> 2.41.0
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
Ihar Hrachyshka Nov. 20, 2023, 5:40 p.m. UTC | #2
On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <horms@ovn.org> wrote:

> On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote:
> > In nonblocking mode, a POSIX socket may temporarily fail to connect(),
> > which is indicated by returning EINPROGRESS. This patch will loop
> > waiting for connect() to succeed, or until EINPROGRESS happens.
> >
> > An alternative - and probably a better - path to deal with this error
> > would be to extend unix stream_class to support connect API, that would
> > allow the caller to repeat connect() attempts asynchronously.
>
> Hi Ihar,
>
> Yes, that probably would be better. But I expect it is also more complex.
> So perhaps such work is appropriate as a follow-up at some point. Because,
> assuming the patch is correct (I have some questions), it does seem to
> improve things. And I'm all for improving things even if incrementally.
>

This patch is clearly controversial (and I hate it), thanks for starting
this discussion. (Perhaps I should have posted this patch from the series
as RFC not to suggest I am happy about it. I am not.)


>
> > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > ---
> >  lib/socket-util-unix.c | 14 ++++++++++----
> >  1 file changed, 10 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
> > index 0053a61b1..230705ba3 100644
> > --- a/lib/socket-util-unix.c
> > +++ b/lib/socket-util-unix.c
> > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock,
> >          int dirfd;
> >
> >          error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd,
> linkname);
> > -        if (!error
> > -            && connect(fd, (struct sockaddr*) &un, un_len)
> > -            && errno != EINPROGRESS) {
> > -            error = errno;
> > +        if (!error) {
> > +            for (;;) {
> > +                if (!connect(fd, (struct sockaddr *)&un, un_len)) {
> > +                    break;
> > +                }
> > +                if (errno != EINPROGRESS) {
> > +                    error = errno;
> > +                    break;
> > +                }
> > +            }
>
> I have a few questions about this.
>
> 1. I see that patch 4/4 also takes into account EAGAIN as an
>    alternative for EINPROGRESS, which is consistent with my
>    reading of connect(2) wrt Unix domain sockets on Linux.
>    And, reading connect(3p) I see that EINPROGRESS is the
>    correct error to check for wrt POSIX.
>
>    But I am not entirely clear that calling connect() again
>    is the correct behaviour. Is this clear to you?
>

It's not. Perhaps there are bugs in the implementation beyond non-graceful
handling of EAGAIN for Unix sockets.

For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS,
then the code should select() to check if FD is ready for write.
(Alternatively, consequent calls to connect() may return EALREADY.)

This particular snippet from the Linux connect 2 man page describes the
proper way to deal with EINPROGRESS:
https://man7.org/linux/man-pages/man2/connect.2.html

```
After
              select(2) indicates writability, use getsockopt(2) to read
              the SO_ERROR option at level SOL_SOCKET to determine
              whether connect() completed successfully (SO_ERROR is
              zero) or unsuccessfully (SO_ERROR is one of the usual
              error codes listed here, explaining the reason for the
              failure).
```

I don't see any references to SO_ERROR in the openvswitch tree.

Then there's the Linux case for AF_INET sockets that returns EAGAIN instead
of EINPROGRESS. In this case, I think consequent connect() calls may
require different treatment (repeating connect attempts). What I didn't
consider in this patch, I think, is that this EAGAIN behavior is Linux
specific, and - probably - other Unixes may decide to return EINPROGRESS in
the same case.

---

Is it safe to call connect() after EAGAIN? Depends on the definition of
"safe". I think it's "safe" to do it with the current implementation of
AF_UNIX stream_connect in kernel, see:
https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559

Here, once the connect() call times out and bails out to "goto out" with
errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated
structures for the socket.

Does it mean we can safely rely on this behavior not to change in Linux AND
be consistent with any other systems that may decide to EAGAIN on AF_UNIX
sockets? I dunno. :(


>
> 2. Is it possible that this loop never exits?
>    Or is otherwise abusive resource-wise.
>

Abuse could be dealt with usleep() on each iteration (perhaps with a
backoff). The infinite loop situation would require a mechanism to timeout
after a particular number of attempts. We could definitely implement this.
But - perhaps at this point it would be better to implement the "connect"
unix class endpoint. I will take a look into it in the next revision.

---

I suggest we do the following for the next revision.

1. In unix_make_connect, call connect() once. If it returns 0, tag the
stream connected (stream->state).
2. If connect() returns EAGAIN or EINPROGRESS, bail out from
unix_make_connect but return fd. Otherwise return error.
3. Implement connect() endpoint to unconditionally call connect(). If it
returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is
still connecting. If it's a different error, tag the stream disconnected.

AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit
answer from connect(), but no loop will be introduced. (3) will need to be
written from scratch. I think check_connection_completion may also need to
be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting.

Thoughts?


>
> >          }
> >          free_sockaddr_un(dirfd, linkname);
> >
> > --
> > 2.41.0
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
>
>
Simon Horman Nov. 21, 2023, 10:54 a.m. UTC | #3
On Mon, Nov 20, 2023 at 12:40:58PM -0500, Ihar Hrachyshka wrote:
> On Mon, Nov 20, 2023 at 11:22 AM Simon Horman <horms@ovn.org> wrote:
> 
> > On Sat, Nov 18, 2023 at 01:07:02AM +0000, Ihar Hrachyshka wrote:
> > > In nonblocking mode, a POSIX socket may temporarily fail to connect(),
> > > which is indicated by returning EINPROGRESS. This patch will loop
> > > waiting for connect() to succeed, or until EINPROGRESS happens.
> > >
> > > An alternative - and probably a better - path to deal with this error
> > > would be to extend unix stream_class to support connect API, that would
> > > allow the caller to repeat connect() attempts asynchronously.
> >
> > Hi Ihar,
> >
> > Yes, that probably would be better. But I expect it is also more complex.
> > So perhaps such work is appropriate as a follow-up at some point. Because,
> > assuming the patch is correct (I have some questions), it does seem to
> > improve things. And I'm all for improving things even if incrementally.
> >
> 
> This patch is clearly controversial (and I hate it), thanks for starting
> this discussion. (Perhaps I should have posted this patch from the series
> as RFC not to suggest I am happy about it. I am not.)
> 
> 
> >
> > > Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
> > > ---
> > >  lib/socket-util-unix.c | 14 ++++++++++----
> > >  1 file changed, 10 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
> > > index 0053a61b1..230705ba3 100644
> > > --- a/lib/socket-util-unix.c
> > > +++ b/lib/socket-util-unix.c
> > > @@ -361,10 +361,16 @@ make_unix_socket(int style, bool nonblock,
> > >          int dirfd;
> > >
> > >          error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd,
> > linkname);
> > > -        if (!error
> > > -            && connect(fd, (struct sockaddr*) &un, un_len)
> > > -            && errno != EINPROGRESS) {
> > > -            error = errno;
> > > +        if (!error) {
> > > +            for (;;) {
> > > +                if (!connect(fd, (struct sockaddr *)&un, un_len)) {
> > > +                    break;
> > > +                }
> > > +                if (errno != EINPROGRESS) {
> > > +                    error = errno;
> > > +                    break;
> > > +                }
> > > +            }
> >
> > I have a few questions about this.
> >
> > 1. I see that patch 4/4 also takes into account EAGAIN as an
> >    alternative for EINPROGRESS, which is consistent with my
> >    reading of connect(2) wrt Unix domain sockets on Linux.
> >    And, reading connect(3p) I see that EINPROGRESS is the
> >    correct error to check for wrt POSIX.
> >
> >    But I am not entirely clear that calling connect() again
> >    is the correct behaviour. Is this clear to you?
> >
> 
> It's not. Perhaps there are bugs in the implementation beyond non-graceful
> handling of EAGAIN for Unix sockets.
> 
> For example, AFAIU, POSIX implies that if connect() returns EINPROGRESS,
> then the code should select() to check if FD is ready for write.
> (Alternatively, consequent calls to connect() may return EALREADY.)
> 
> This particular snippet from the Linux connect 2 man page describes the
> proper way to deal with EINPROGRESS:
> https://man7.org/linux/man-pages/man2/connect.2.html
> 
> ```
> After
>               select(2) indicates writability, use getsockopt(2) to read
>               the SO_ERROR option at level SOL_SOCKET to determine
>               whether connect() completed successfully (SO_ERROR is
>               zero) or unsuccessfully (SO_ERROR is one of the usual
>               error codes listed here, explaining the reason for the
>               failure).
> ```
> 
> I don't see any references to SO_ERROR in the openvswitch tree.
> 
> Then there's the Linux case for AF_INET sockets that returns EAGAIN instead
> of EINPROGRESS. In this case, I think consequent connect() calls may
> require different treatment (repeating connect attempts). What I didn't
> consider in this patch, I think, is that this EAGAIN behavior is Linux
> specific, and - probably - other Unixes may decide to return EINPROGRESS in
> the same case.
> 
> ---
> 
> Is it safe to call connect() after EAGAIN? Depends on the definition of
> "safe". I think it's "safe" to do it with the current implementation of
> AF_UNIX stream_connect in kernel, see:
> https://github.com/torvalds/linux/blob/98b1cc82c4affc16f5598d4fa14b1858671b2263/net/unix/af_unix.c#L1559
> 
> Here, once the connect() call times out and bails out to "goto out" with
> errno set to EAGAIN, it unix_release_sock(newsk, 0); the allocated
> structures for the socket.
> 
> Does it mean we can safely rely on this behavior not to change in Linux AND
> be consistent with any other systems that may decide to EAGAIN on AF_UNIX
> sockets? I dunno. :(

I think we are on the same page here :(

> > 2. Is it possible that this loop never exits?
> >    Or is otherwise abusive resource-wise.
> >
> 
> Abuse could be dealt with usleep() on each iteration (perhaps with a
> backoff). The infinite loop situation would require a mechanism to timeout
> after a particular number of attempts. We could definitely implement this.
> But - perhaps at this point it would be better to implement the "connect"
> unix class endpoint. I will take a look into it in the next revision.

Yes, that is a good point.
Perhaps rather than bandage things up, in ways that we are unclear of
the consequences, a new approach is worth investigating.

> 
> ---
> 
> I suggest we do the following for the next revision.
> 
> 1. In unix_make_connect, call connect() once. If it returns 0, tag the
> stream connected (stream->state).
> 2. If connect() returns EAGAIN or EINPROGRESS, bail out from
> unix_make_connect but return fd. Otherwise return error.
> 3. Implement connect() endpoint to unconditionally call connect(). If it
> returns EAGAIN (on Linux) or EINPROGRESS (POSIX), consider the stream is
> still connecting. If it's a different error, tag the stream disconnected.
> 
> AFAIR (1) is already done. (2) will need to know that EAGAIN is a legit
> answer from connect(), but no loop will be introduced. (3) will need to be
> written from scratch. I think check_connection_completion may also need to
> be adjusted to return EAGAIN when Linux AF_UNIX socket is still connecting.

1. Agreed
2. That is my reading of connect(2) [*]
3. I'm a little clear how the "still connecting case" will be handled

[*] https://www.man7.org/linux/man-pages/man2/connect.2.html
diff mbox series

Patch

diff --git a/lib/socket-util-unix.c b/lib/socket-util-unix.c
index 0053a61b1..230705ba3 100644
--- a/lib/socket-util-unix.c
+++ b/lib/socket-util-unix.c
@@ -361,10 +361,16 @@  make_unix_socket(int style, bool nonblock,
         int dirfd;
 
         error = make_sockaddr_un(connect_path, &un, &un_len, &dirfd, linkname);
-        if (!error
-            && connect(fd, (struct sockaddr*) &un, un_len)
-            && errno != EINPROGRESS) {
-            error = errno;
+        if (!error) {
+            for (;;) {
+                if (!connect(fd, (struct sockaddr *)&un, un_len)) {
+                    break;
+                }
+                if (errno != EINPROGRESS) {
+                    error = errno;
+                    break;
+                }
+            }
         }
         free_sockaddr_un(dirfd, linkname);