tcp: accept socket after TCP_DEFER_ACCEPT period

Message ID Pine.LNX.4.58.0910192259280.2971@u.domain.uli
State Accepted, archived
Delegated to: David Miller

Commit Message

Julian Anastasov Oct. 19, 2009, 8:01 p.m. UTC
Willy Tarreau and many others have, in recent years, raised
concerns about what happens when the TCP_DEFER_ACCEPT period
expires for clients that sent a bare ACK. They prefer that clients
which actively resend the ACK in response to our SYN-ACK
retransmissions be converted from open requests to sockets and
queued to the listener for accepting once the deferring period
has finished. The application server can then decide to wait
longer for data or to terminate the connection properly with FIN
if read() returns EAGAIN, which indicates that the socket was
accepted after the deferring period.

This change can still have side effects for applications that
always expect to see data on the accepted socket. Others can be
prepared to work in both modes (with or without a TCP_DEFER_ACCEPT
period): their data processing can ignore the read=EAGAIN
notification and allocate resources for clients that proved to
have no data to send during the deferring period. OTOH, servers
that use TCP_DEFER_ACCEPT=1 as a flag (not as a timeout) to wait
for data will notice clients that didn't send data for 3 seconds
but that still resend ACKs.

Thanks to Willy Tarreau for the initial idea and to
Eric Dumazet for reviewing and testing the change.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
---
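
For illustration only, here is a minimal userspace sketch of how a
server might tell apart the cases described above after accept().
It assumes a non-blocking peek with recv(); the enum and function
names are hypothetical and are not part of this patch.

```c
#include <errno.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical result codes for this sketch; not part of the patch. */
enum conn_state {
	CONN_HAS_DATA,     /* request data is already queued */
	CONN_NO_DATA_YET,  /* accepted after the deferring period, nothing sent */
	CONN_PEER_CLOSED,  /* peer closed its side without queued data */
	CONN_ERROR         /* any other recv() failure */
};

/* Ask the kernel to defer accept() until data arrives, bounded by `seconds`. */
int enable_defer_accept(int listen_fd, int seconds)
{
	return setsockopt(listen_fd, IPPROTO_TCP, TCP_DEFER_ACCEPT,
			  &seconds, sizeof(seconds));
}

/* Peek at a freshly accepted socket without blocking or consuming data. */
enum conn_state classify_accepted(int fd)
{
	char byte;
	ssize_t n = recv(fd, &byte, 1, MSG_PEEK | MSG_DONTWAIT);

	if (n > 0)
		return CONN_HAS_DATA;
	if (n == 0)
		return CONN_PEER_CLOSED;
	if (errno == EAGAIN || errno == EWOULDBLOCK)
		return CONN_NO_DATA_YET;
	return CONN_ERROR;
}
```

On CONN_NO_DATA_YET a server could either keep waiting (for example
via poll()) or close() the socket to terminate the connection with FIN.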

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet Oct. 19, 2009, 8:04 p.m. UTC | #1
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

Thanks Julian

Patch

diff -urp v2.6.31/linux/net/ipv4/tcp_minisocks.c linux/net/ipv4/tcp_minisocks.c
--- v2.6.31/linux/net/ipv4/tcp_minisocks.c	2009-09-11 10:27:17.000000000 +0300
+++ linux/net/ipv4/tcp_minisocks.c	2009-10-16 10:29:19.000000000 +0300
@@ -641,8 +641,8 @@  struct sock *tcp_check_req(struct sock *
 	if (!(flg & TCP_FLAG_ACK))
 		return NULL;
 
-	/* If TCP_DEFER_ACCEPT is set, drop bare ACK. */
-	if (inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
+	/* While TCP_DEFER_ACCEPT is active, drop bare ACK. */
+	if (req->retrans < inet_csk(sk)->icsk_accept_queue.rskq_defer_accept &&
 	    TCP_SKB_CB(skb)->end_seq == tcp_rsk(req)->rcv_isn + 1) {
 		inet_rsk(req)->acked = 1;
 		return NULL;
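
To make the effect of the changed condition easier to see, here is a
small standalone model of the old and new checks. It is only a sketch:
the struct and function names are hypothetical, and it assumes that
rskq_defer_accept holds a SYN-ACK retransmit budget (0 meaning the
option is off), which is what the comparison with req->retrans suggests.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel fields touched by the hunk. */
struct defer_model {
	unsigned int retrans;      /* models req->retrans */
	unsigned int defer_accept; /* models rskq_defer_accept, 0 = off */
};

/* Old check: drop every bare ACK whenever the option is set. */
bool drop_bare_ack_old(const struct defer_model *m, bool bare_ack)
{
	return m->defer_accept && bare_ack;
}

/* New check: drop bare ACKs only while the retransmit budget is not used up. */
bool drop_bare_ack_new(const struct defer_model *m, bool bare_ack)
{
	return m->retrans < m->defer_accept && bare_ack;
}
```

In this model, once retrans reaches defer_accept the new check stops
dropping the bare ACK, so the request can be turned into a socket,
while the old check keeps dropping it for as long as the option is set.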