Message ID | 20191114173225.21199-7-fw@strlen.de |
---|---|
State | Superseded, archived |
Series | [RFC] mptcp: wmem accounting and nonblocking io support |
On Thu, 2019-11-14 at 18:32 +0100, Florian Westphal wrote:
> This disables transmit of new data until the peer has acked
> enough mptcp data to get below the wspace write threshold (more than
> half of wspace upperlimit is available again).
>
> Also have poll not report EPOLLOUT in this case, it's not relevant if a
> subflow is writeable.
>
> The latter is a temporary workaround that is needed because mptcp_poll
> walks the subflows and calls __tcp_poll on each of them.
> Because subflow ssk is usually writable, we will have to undo that
> if the mptcp sndbuf is exhausted. This won't be needed anymore once
> __tcp_poll is removed, I am working on this.
>
> Signed-off-by: Florian Westphal <fw@strlen.de>
> ---
>  net/mptcp/protocol.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 2144e80b8704..83be407e1dd6 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -406,6 +406,18 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>  		return ret;
>  	}
>
> +	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
> +
> +	mptcp_clean_una(sk);
> +
> +	while (!sk_stream_memory_free(sk)) {
> +		ret = sk_stream_wait_memory(sk, &timeo);
> +		if (ret)
> +			goto out;
> +
> +		mptcp_clean_una(sk);
> +	}
> +

Can we move the above loop to the non fallback case only ? e.g. after
the below !mptcp_subflow_get(msk) checks?

If so, we could have a single loop checking for:

!sk_stream_memory_free(sk) || !mptcp_subflow_get_send()

(together with the next patch)

Cheers,

Paolo
On Mon, 2019-11-18 at 12:29 +0100, Paolo Abeni wrote:
> Can we move the above loop to the non fallback case only ? e.g. after
> the below !mptcp_subflow_get(msk) checks?
>
> If so, we could have a single loop checking for:
>
> !sk_stream_memory_free(sk) || !mptcp_subflow_get_send()
>
> (together with the next patch)

Dumb me! I meant "with patch 9/14" - where a similar loop is added.

/P
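For readers following the thread, the single combined wait loop suggested above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct wait_model_msk`, `struct wait_model_event` and every `wait_model_*` function are hypothetical names, not kernel symbols, and the "subflow available" check merely stands in for whatever `mptcp_subflow_get_send()` does in patch 9/14, which is not shown in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the MPTCP-level socket state. */
struct wait_model_msk {
	int sndbuf;           /* send buffer upper limit, in bytes */
	int wmem_queued;      /* bytes queued and not yet acked at the MPTCP level */
	int usable_subflows;  /* assumption: subflows currently able to take data */
};

/* Hypothetical wakeup: the peer acks some bytes and/or a subflow becomes usable. */
struct wait_model_event {
	int acked;
	int subflows_up;
};

/* Models the role of sk_stream_memory_free(): room left below sndbuf. */
bool wait_model_memory_free(const struct wait_model_msk *m)
{
	return m->wmem_queued < m->sndbuf;
}

/* Assumed stand-in for "mptcp_subflow_get_send() returned a subflow". */
bool wait_model_subflow_available(const struct wait_model_msk *m)
{
	return m->usable_subflows > 0;
}

/*
 * One loop covering both conditions. Returns the number of wakeups
 * consumed before sending may proceed, or -1 when the wakeups run out
 * first (a stand-in for a timeout or a nonblocking send failing).
 */
int wait_model_wait_to_send(struct wait_model_msk *m,
			    const struct wait_model_event *ev, size_t n)
{
	size_t used = 0;

	while (!wait_model_memory_free(m) || !wait_model_subflow_available(m)) {
		if (used == n)
			return -1;

		/* Roughly what mptcp_clean_una() achieves: acked data frees space. */
		m->wmem_queued -= ev[used].acked;
		if (m->wmem_queued < 0)
			m->wmem_queued = 0;

		m->usable_subflows += ev[used].subflows_up;
		used++;
	}

	return (int)used;
}
```

With a full buffer and no usable subflow, the loop in this model keeps waiting until both conditions clear, which is the point of merging the two checks into one place.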
Paolo Abeni <pabeni@redhat.com> wrote:
> On Thu, 2019-11-14 at 18:32 +0100, Florian Westphal wrote:
> > This disables transmit of new data until the peer has acked
> > enough mptcp data to get below the wspace write threshold (more than
> > half of wspace upperlimit is available again).
> >
> > Also have poll not report EPOLLOUT in this case, it's not relevant if a
> > subflow is writeable.
> >
> > The latter is a temporary workaround that is needed because mptcp_poll
> > walks the subflows and calls __tcp_poll on each of them.
> > Because subflow ssk is usually writable, we will have to undo that
> > if the mptcp sndbuf is exhausted. This won't be needed anymore once
> > __tcp_poll is removed, I am working on this.
> >
> > Signed-off-by: Florian Westphal <fw@strlen.de>
> > ---
> >  net/mptcp/protocol.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 2144e80b8704..83be407e1dd6 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -406,6 +406,18 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> >  		return ret;
> >  	}
> >
> > +	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
> > +
> > +	mptcp_clean_una(sk);
> > +
> > +	while (!sk_stream_memory_free(sk)) {
> > +		ret = sk_stream_wait_memory(sk, &timeo);
> > +		if (ret)
> > +			goto out;
> > +
> > +		mptcp_clean_una(sk);
> > +	}
> > +
>
> Can we move the above loop to the non fallback case only ? e.g. after
> the below !mptcp_subflow_get(msk) checks?
>
> If so, we could have a single loop checking for:
>
> !sk_stream_memory_free(sk) || !mptcp_subflow_get_send()
>
> (together with the next patch)

It would be easy to do if I remove

	if (!msg_data_left(msg)) {
		pr_debug("empty send");
		ret = sock_sendmsg(ssk->sk_socket, msg);

any idea why this is there in the first place?
On Mon, 2019-11-18 at 13:11 +0100, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > On Thu, 2019-11-14 at 18:32 +0100, Florian Westphal wrote:
> > > This disables transmit of new data until the peer has acked
> > > enough mptcp data to get below the wspace write threshold (more than
> > > half of wspace upperlimit is available again).
> > >
> > > Also have poll not report EPOLLOUT in this case, it's not relevant if a
> > > subflow is writeable.
> > >
> > > The latter is a temporary workaround that is needed because mptcp_poll
> > > walks the subflows and calls __tcp_poll on each of them.
> > > Because subflow ssk is usually writable, we will have to undo that
> > > if the mptcp sndbuf is exhausted. This won't be needed anymore once
> > > __tcp_poll is removed, I am working on this.
> > >
> > > Signed-off-by: Florian Westphal <fw@strlen.de>
> > > ---
> > >  net/mptcp/protocol.c | 18 ++++++++++++++++--
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > > index 2144e80b8704..83be407e1dd6 100644
> > > --- a/net/mptcp/protocol.c
> > > +++ b/net/mptcp/protocol.c
> > > @@ -406,6 +406,18 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> > >  		return ret;
> > >  	}
> > >
> > > +	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
> > > +
> > > +	mptcp_clean_una(sk);
> > > +
> > > +	while (!sk_stream_memory_free(sk)) {
> > > +		ret = sk_stream_wait_memory(sk, &timeo);
> > > +		if (ret)
> > > +			goto out;
> > > +
> > > +		mptcp_clean_una(sk);
> > > +	}
> > > +
> >
> > Can we move the above loop to the non fallback case only ? e.g. after
> > the below !mptcp_subflow_get(msk) checks?
> >
> > If so, we could have a single loop checking for:
> >
> > !sk_stream_memory_free(sk) || !mptcp_subflow_get_send()
> >
> > (together with the next patch)
>
> It would be easy to do if I remove
>
> 	if (!msg_data_left(msg)) {
> 		pr_debug("empty send");
> 		ret = sock_sendmsg(ssk->sk_socket, msg);
>
> any idea why this is there in the first place?

Uhmm... that looks like a left-over from the initial implementation ?!?
possibly trying to deal with fastopen?!?

I think it can be dropped. @Peter / @Mat do you know better?

/P
On Mon, 18 Nov 2019, Paolo Abeni wrote:

> On Mon, 2019-11-18 at 13:11 +0100, Florian Westphal wrote:
>> Paolo Abeni <pabeni@redhat.com> wrote:
>>> On Thu, 2019-11-14 at 18:32 +0100, Florian Westphal wrote:
>>>> This disables transmit of new data until the peer has acked
>>>> enough mptcp data to get below the wspace write threshold (more than
>>>> half of wspace upperlimit is available again).
>>>>
>>>> Also have poll not report EPOLLOUT in this case, it's not relevant if a
>>>> subflow is writeable.
>>>>
>>>> The latter is a temporary workaround that is needed because mptcp_poll
>>>> walks the subflows and calls __tcp_poll on each of them.
>>>> Because subflow ssk is usually writable, we will have to undo that
>>>> if the mptcp sndbuf is exhausted. This won't be needed anymore once
>>>> __tcp_poll is removed, I am working on this.
>>>>
>>>> Signed-off-by: Florian Westphal <fw@strlen.de>
>>>> ---
>>>>  net/mptcp/protocol.c | 18 ++++++++++++++++--
>>>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>>> index 2144e80b8704..83be407e1dd6 100644
>>>> --- a/net/mptcp/protocol.c
>>>> +++ b/net/mptcp/protocol.c
>>>> @@ -406,6 +406,18 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>>>>  		return ret;
>>>>  	}
>>>>
>>>> +	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
>>>> +
>>>> +	mptcp_clean_una(sk);
>>>> +
>>>> +	while (!sk_stream_memory_free(sk)) {
>>>> +		ret = sk_stream_wait_memory(sk, &timeo);
>>>> +		if (ret)
>>>> +			goto out;
>>>> +
>>>> +		mptcp_clean_una(sk);
>>>> +	}
>>>> +
>>>
>>> Can we move the above loop to the non fallback case only ? e.g. after
>>> the below !mptcp_subflow_get(msk) checks?
>>>
>>> If so, we could have a single loop checking for:
>>>
>>> !sk_stream_memory_free(sk) || !mptcp_subflow_get_send()
>>>
>>> (together with the next patch)
>>
>> It would be easy to do if I remove
>>
>> 	if (!msg_data_left(msg)) {
>> 		pr_debug("empty send");
>> 		ret = sock_sendmsg(ssk->sk_socket, msg);
>>
>> any idea why this is there in the first place?
>
> Uhmm... that looks like a left-over from the initial implementation ?!?
> possibly trying to deal with fastopen?!?
>
> I think it can be dropped. @Peter / @Mat do you know better?
>

Paolo's right, that's a leftover. You're welcome to drop it.

--
Mat Martineau
Intel
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2144e80b8704..83be407e1dd6 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -406,6 +406,18 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		return ret;
 	}
 
+	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
+
+	mptcp_clean_una(sk);
+
+	while (!sk_stream_memory_free(sk)) {
+		ret = sk_stream_wait_memory(sk, &timeo);
+		if (ret)
+			goto out;
+
+		mptcp_clean_una(sk);
+	}
+
 	ssk = mptcp_subflow_get(msk);
 	if (!ssk) {
 		release_sock(sk);
@@ -421,8 +433,6 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	pr_debug("conn_list->subflow=%p", ssk);
 
 	lock_sock(ssk);
-	mptcp_clean_una(sk);
-	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	while (msg_data_left(msg)) {
 		ret = mptcp_sendmsg_frag(sk, ssk, msg, NULL, &timeo, &mss_now,
 					 &size_goal);
@@ -1312,6 +1322,10 @@ static __poll_t mptcp_poll(struct file *file, struct socket *sock,
 		tcp_sock = mptcp_subflow_tcp_socket(subflow);
 		ret |= __tcp_poll(tcp_sock->sk);
 	}
+
+	if (!sk_stream_is_writeable(sk))
+		ret &= ~(EPOLLOUT|EPOLLWRNORM);
+
 	release_sock(sk);
 
 	return ret;
This disables transmit of new data until the peer has acked
enough mptcp data to get below the wspace write threshold (more than
half of wspace upperlimit is available again).

Also have poll not report EPOLLOUT in this case, it's not relevant if a
subflow is writeable.

The latter is a temporary workaround that is needed because mptcp_poll
walks the subflows and calls __tcp_poll on each of them.
Because subflow ssk is usually writable, we will have to undo that
if the mptcp sndbuf is exhausted. This won't be needed anymore once
__tcp_poll is removed, I am working on this.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/mptcp/protocol.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
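The poll workaround described in the commit message can be illustrated with a small userspace model. It is a sketch under assumptions, not the kernel code: `struct model_sock`, the `MODEL_EPOLL*` constants and the `model_*` functions are hypothetical names. The writeability rule used here (free space must be at least half of the queued bytes) follows my understanding of the generic stream-socket helpers; the exact threshold the patch relies on may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local copies of the poll bits; values mirror the Linux EPOLL* constants. */
#define MODEL_EPOLLIN      0x001u
#define MODEL_EPOLLOUT     0x004u
#define MODEL_EPOLLWRNORM  0x100u

/* Hypothetical stand-in for the two struct sock fields that matter here. */
struct model_sock {
	int sndbuf;       /* send buffer upper limit, in bytes */
	int wmem_queued;  /* bytes queued and not yet acked by the peer */
};

/* Free space left in the send buffer. */
int model_wspace(const struct model_sock *sk)
{
	return sk->sndbuf - sk->wmem_queued;
}

/* Assumed rule: writeable once free space is at least half of what is queued. */
bool model_is_writeable(const struct model_sock *sk)
{
	return model_wspace(sk) >= (sk->wmem_queued >> 1);
}

/* Models the sendmsg wait condition: any room at all below sndbuf. */
bool model_memory_free(const struct model_sock *sk)
{
	return sk->wmem_queued < sk->sndbuf;
}

/*
 * Model of the poll path: OR together what each subflow reports, then
 * clear the write bits if the MPTCP-level send buffer is not writeable.
 */
unsigned int model_poll(const struct model_sock *msk,
			const unsigned int *subflow_masks, size_t n)
{
	unsigned int ret = 0;
	size_t i;

	for (i = 0; i < n; i++)
		ret |= subflow_masks[i];

	if (!model_is_writeable(msk))
		ret &= ~(MODEL_EPOLLOUT | MODEL_EPOLLWRNORM);

	return ret;
}
```

In this model a subflow that reports itself writeable no longer makes the whole connection look writeable while the MPTCP-level buffer is exhausted, which is the effect the patch aims for in mptcp_poll.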