[v2] RDS: fix rds-ping spinlock recursion

Submitter jeff.liu
Date Oct. 9, 2012, 4:57 a.m.
Message ID <5073AEB7.9060309@oracle.com>
Permalink /patch/190202/
State Accepted
Delegated to: David Miller

Comments

jeff.liu - Oct. 9, 2012, 4:57 a.m.
Hello,

This is the revised patch for fixing rds-ping spinlock recursion according to Venkat's suggestions.

The RDS ping/pong over TCP feature has been broken for years (2.6.39 to 3.6.0), because we
have to set TCP cork and call kernel_sendmsg() between ping and pong, and both need to
lock "struct sock *sk". However, this lock is already held before the rds_tcp_data_ready()
callback is triggered. As a result, we always hit a spinlock recursion, which results
in a system panic.

Given that RDS ping is only used to test connectivity and not for serious performance measurements,
we can queue the pong transmit to rds_wq as a delayed response.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
CC: David S. Miller <davem@davemloft.net>
CC: James Morris <james.l.morris@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
 net/rds/send.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
David Miller - Oct. 9, 2012, 5:58 p.m.
From: Jeff Liu <jeff.liu@oracle.com>
Date: Tue, 09 Oct 2012 12:57:27 +0800

> This is the revised patch for fixing rds-ping spinlock recursion according to Venkat's suggestions.
> 
> The RDS ping/pong over TCP feature has been broken for years (2.6.39 to 3.6.0), because we
> have to set TCP cork and call kernel_sendmsg() between ping and pong, and both need to
> lock "struct sock *sk". However, this lock is already held before the rds_tcp_data_ready()
> callback is triggered. As a result, we always hit a spinlock recursion, which results
> in a system panic.
>
> Given that RDS ping is only used to test connectivity and not for serious performance measurements,
> we can queue the pong transmit to rds_wq as a delayed response.
> 
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> CC: Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
> CC: David S. Miller <davem@davemloft.net>
> CC: James Morris <james.l.morris@oracle.com>
> Signed-off-by: Jie Liu <jeff.liu@oracle.com>

Applied and queued up for -stable, thanks.

Patch

diff --git a/net/rds/send.c b/net/rds/send.c
index 96531d4..88eace5 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1122,7 +1122,7 @@  rds_send_pong(struct rds_connection *conn, __be16 dport)
 	rds_stats_inc(s_send_pong);
 
 	if (!test_bit(RDS_LL_SEND_FULL, &conn->c_flags))
-		rds_send_xmit(conn);
+		queue_delayed_work(rds_wq, &conn->c_send_w, 0);
 
 	rds_message_put(rm);
 	return 0;
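
The locking problem described in the commit message can be illustrated with a small userspace analogy. This is only a sketch, not the kernel code: every `toy_*` name below is hypothetical, the toy lock counts a recursive acquire instead of spinning forever, and a single pending flag stands in for queueing `conn->c_send_w` on `rds_wq`. It assumes the receive callback runs with the socket lock held, as the description states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock. A real spinlock would spin forever
 * (or panic) on recursion; this toy version just counts the attempt. */
struct toy_lock {
	bool held;
	int recursion_hits;
};

static bool toy_lock_acquire(struct toy_lock *l)
{
	if (l->held) {
		l->recursion_hits++;	/* recursion: lock is already held */
		return false;
	}
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = false;
}

/* Hypothetical connection: sk_lock stands in for the socket lock,
 * send_work_pending for a queued send work item. */
struct toy_conn {
	struct toy_lock sk_lock;
	bool send_work_pending;
	int pongs_sent;
};

/* Transmit path: it needs the socket lock, like the cork/sendmsg calls. */
static bool toy_xmit(struct toy_conn *c)
{
	if (!toy_lock_acquire(&c->sk_lock))
		return false;		/* the kernel would recurse on the lock here */
	c->pongs_sent++;
	toy_lock_release(&c->sk_lock);
	return true;
}

/* Receive callback: runs with the socket lock already held. */
static void toy_data_ready(struct toy_conn *c, bool defer)
{
	toy_lock_acquire(&c->sk_lock);
	if (defer)
		c->send_work_pending = true;	/* analogous to queueing the work */
	else
		toy_xmit(c);			/* analogous to transmitting inline */
	toy_lock_release(&c->sk_lock);
}

/* Worker: runs later, outside the callback, with no lock held. */
static void toy_run_worker(struct toy_conn *c)
{
	if (c->send_work_pending) {
		c->send_work_pending = false;
		toy_xmit(c);
	}
}
```

Calling `toy_data_ready(&conn, false)` records a recursion hit and sends nothing, while `toy_data_ready(&conn, true)` followed by `toy_run_worker(&conn)` sends the pong without touching the lock recursively. The cost is that the pong goes out slightly later, which the commit message argues is acceptable for a connectivity test.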