Patchwork [V3,5/8] sctp: use limited socket backlog

login
register
mail settings
Submitter Zhu Yi
Date March 5, 2010, 4:01 a.m.
Message ID <1267761707-15605-5-git-send-email-yi.zhu@intel.com>
Download mbox | patch
Permalink /patch/46994/
State Accepted
Delegated to: David Miller
Headers show

Comments

Zhu Yi - March 5, 2010, 4:01 a.m.
Make sctp adapt to the limited socket backlog change.

Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Cc: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
---
 net/sctp/input.c  |   42 +++++++++++++++++++++++++++---------------
 net/sctp/socket.c |    3 +++
 2 files changed, 30 insertions(+), 15 deletions(-)
Eric Dumazet - March 5, 2010, 6:28 a.m.
Le vendredi 05 mars 2010 à 12:01 +0800, Zhu Yi a écrit :
> Make sctp adapt to the limited socket backlog change.
> 
> Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
> Cc: Sridhar Samudrala <sri@us.ibm.com>
> Signed-off-by: Zhu Yi <yi.zhu@intel.com>

This patch looks wrong.


> -static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb)
> +static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb)
>  {
>  	struct sctp_chunk *chunk = SCTP_INPUT_CB(skb)->chunk;
>  	struct sctp_ep_common *rcvr = chunk->rcvr;
> +	int ret;
>  
> -	/* Hold the assoc/ep while hanging on the backlog queue.
> -	 * This way, we know structures we need will not disappear from us
> -	 */
> -	if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type)
> -		sctp_association_hold(sctp_assoc(rcvr));
> -	else if (SCTP_EP_TYPE_SOCKET == rcvr->type)
> -		sctp_endpoint_hold(sctp_ep(rcvr));
> -	else
> -		BUG();
> +	ret = sk_add_backlog_limited(sk, skb);
> +	if (!ret) {
> +		/* Hold the assoc/ep while hanging on the backlog queue.
> +		 * This way, we know structures we need will not disappear
> +		 * from us
> +		 */
> +		if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type)
> +			sctp_association_hold(sctp_assoc(rcvr));
> +		else if (SCTP_EP_TYPE_SOCKET == rcvr->type)
> +			sctp_endpoint_hold(sctp_ep(rcvr));
> +		else
> +			BUG();
> +	}
> +	return ret;
>  
> -	sk_add_backlog(sk, skb);
>  }
>  

As advertized by comment, we should hold the association *before*
accessing backlog queue.

If order is not important, comment should be relaxed somehow ?


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Zhu Yi - March 5, 2010, 11:05 a.m.
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> As advertized by comment, we should hold the association *before*

> accessing backlog queue.


> If order is not important, comment should be relaxed somehow ?


I don't see how the order is important here. We are under sock_lock 
here thus nobody will race with us. IMHO, the comment talks about
if a packet is queued into the backlog, we need to increase the assoc/ep
reference count. Otherwise the assoc/ep might be disappeared when
we are about to process it (by sctp_backlog_rcv) sometime later.

Thanks,
-yi
Eric Dumazet - March 5, 2010, 1:24 p.m.
Le vendredi 05 mars 2010 à 19:05 +0800, Zhu, Yi a écrit :

> I don't see how the order is important here. We are under sock_lock 
> here thus nobody will race with us. IMHO, the comment talks about
> if a packet is queued into the backlog, we need to increase the assoc/ep
> reference count. Otherwise the assoc/ep might be disappeared when
> we are about to process it (by sctp_backlog_rcv) sometime later.
> 

OK then.

Its strange this protocol has to increase a refcount for each queued
frame in its backlog, but this is unrelated to your changes anyway.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vlad Yasevich - March 5, 2010, 1:30 p.m.
Zhu, Yi wrote:
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> As advertized by comment, we should hold the association *before*
>> accessing backlog queue.
> 
>> If order is not important, comment should be relaxed somehow ?
> 
> I don't see how the order is important here. We are under sock_lock 
> here thus nobody will race with us. IMHO, the comment talks about
> if a packet is queued into the backlog, we need to increase the assoc/ep
> reference count. Otherwise the assoc/ep might be disappeared when
> we are about to process it (by sctp_backlog_rcv) sometime later.
> 
> Thanks,
> -yi

Yes, that's correct.  The order is not really important since we
are under lock and are actually already holding a ref.  However
the ref will be dropped once we exit the function, so the function
takes an additional ref that is held while the packet is backloged.

You could get rid of the extra nesting though by returning early
if backlog failed.

-vlad
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/sctp/input.c b/net/sctp/input.c
index c0c973e..cbc0636 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -75,7 +75,7 @@  static struct sctp_association *__sctp_lookup_association(
 					const union sctp_addr *peer,
 					struct sctp_transport **pt);
 
-static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb);
+static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb);
 
 
 /* Calculate the SCTP checksum of an SCTP packet.  */
@@ -265,8 +265,13 @@  int sctp_rcv(struct sk_buff *skb)
 	}
 
 	if (sock_owned_by_user(sk)) {
+		if (sctp_add_backlog(sk, skb)) {
+			sctp_bh_unlock_sock(sk);
+			sctp_chunk_free(chunk);
+			skb = NULL; /* sctp_chunk_free already freed the skb */
+			goto discard_release;
+		}
 		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG);
-		sctp_add_backlog(sk, skb);
 	} else {
 		SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_SOFTIRQ);
 		sctp_inq_push(&chunk->rcvr->inqueue, chunk);
@@ -336,8 +341,10 @@  int sctp_backlog_rcv(struct sock *sk, struct sk_buff *skb)
 		sctp_bh_lock_sock(sk);
 
 		if (sock_owned_by_user(sk)) {
-			sk_add_backlog(sk, skb);
-			backloged = 1;
+			if (sk_add_backlog_limited(sk, skb))
+				sctp_chunk_free(chunk);
+			else
+				backloged = 1;
 		} else
 			sctp_inq_push(inqueue, chunk);
 
@@ -362,22 +369,27 @@  done:
 	return 0;
 }
 
-static void sctp_add_backlog(struct sock *sk, struct sk_buff *skb)
+static int sctp_add_backlog(struct sock *sk, struct sk_buff *skb)
 {
 	struct sctp_chunk *chunk = SCTP_INPUT_CB(skb)->chunk;
 	struct sctp_ep_common *rcvr = chunk->rcvr;
+	int ret;
 
-	/* Hold the assoc/ep while hanging on the backlog queue.
-	 * This way, we know structures we need will not disappear from us
-	 */
-	if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type)
-		sctp_association_hold(sctp_assoc(rcvr));
-	else if (SCTP_EP_TYPE_SOCKET == rcvr->type)
-		sctp_endpoint_hold(sctp_ep(rcvr));
-	else
-		BUG();
+	ret = sk_add_backlog_limited(sk, skb);
+	if (!ret) {
+		/* Hold the assoc/ep while hanging on the backlog queue.
+		 * This way, we know structures we need will not disappear
+		 * from us
+		 */
+		if (SCTP_EP_TYPE_ASSOCIATION == rcvr->type)
+			sctp_association_hold(sctp_assoc(rcvr));
+		else if (SCTP_EP_TYPE_SOCKET == rcvr->type)
+			sctp_endpoint_hold(sctp_ep(rcvr));
+		else
+			BUG();
+	}
+	return ret;
 
-	sk_add_backlog(sk, skb);
 }
 
 /* Handle icmp frag needed error. */
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index f6d1e59..dfc5c12 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -3720,6 +3720,9 @@  SCTP_STATIC int sctp_init_sock(struct sock *sk)
 	SCTP_DBG_OBJCNT_INC(sock);
 	percpu_counter_inc(&sctp_sockets_allocated);
 
+	/* Set socket backlog limit. */
+	sk->sk_backlog.limit = sysctl_sctp_rmem[1];
+
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
 	local_bh_enable();