RDSRDMA: Fix cleanup of rds_iw_mr_pool

Message ID 20110913194101.7497.65515.stgit@build.ogc.int
State Rejected, archived
Delegated to: David Miller

Commit Message

Jonathan Lallinger Sept. 13, 2011, 7:41 p.m. UTC
In the rds_iw_mr_pool struct the free_pinned field keeps track of
memory pinned by free MRs. While this field is incremented properly
upon allocation, it is never decremented upon unmapping. This would
cause the rds_rdma module to crash the kernel upon unloading, by
triggering the BUG_ON in the rds_iw_destroy_mr_pool function.

This change keeps track of the MRs that become unpinned, so that
free_pinned can be decremented appropriately.

Signed-off-by: Jonathan Lallinger <jonathan@ogc.us>
Signed-off-by: Steve Wise <swise@ogc.us>
---

 net/rds/iw_rdma.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)
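
The accounting problem described in the commit message can be sketched in user space as follows. This is only a model under stated assumptions: struct mr_pool_model and the pool_* helpers are hypothetical names, C11 atomics stand in for the kernel's atomic_t, and the teardown check is assumed to fire whenever the counter is nonzero.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the free_pinned field of struct rds_iw_mr_pool. */
struct mr_pool_model {
	atomic_int free_pinned;	/* pages pinned by free MRs */
};

static void pool_init(struct mr_pool_model *pool)
{
	atomic_init(&pool->free_pinned, 0);
}

/* Pinned pages are charged to the pool (the increment that already exists). */
static void pool_account_pinned(struct mr_pool_model *pool, int pages)
{
	atomic_fetch_add(&pool->free_pinned, pages);
}

/* A flush reports how many pages it unpinned (the decrement being added). */
static void pool_flush(struct mr_pool_model *pool, int unpinned)
{
	atomic_fetch_sub(&pool->free_pinned, unpinned);
}

/* Models the teardown check: nonzero means the BUG_ON would trigger. */
static int pool_destroy_would_bug(struct mr_pool_model *pool)
{
	return atomic_load(&pool->free_pinned) != 0;
}

/* Usage: without the flush-side decrement the teardown check trips. */
static void pool_model_demo(void)
{
	struct mr_pool_model pool;

	pool_init(&pool);
	pool_account_pinned(&pool, 8);
	assert(pool_destroy_would_bug(&pool));	/* counter never dropped */
	pool_flush(&pool, 8);
	assert(!pool_destroy_would_bug(&pool));	/* balanced again */
}
```

The point of the model is the invariant: every page added to the counter must be subtracted again before teardown, otherwise the final check sees a nonzero value.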


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jonathan Lallinger Sept. 27, 2011, 5:26 p.m. UTC | #1
Hello David,

I am ashamed I made the same mistake twice. This happened because I had 
two git trees (I made a second one when kernel.org went down based off 
the github remote). I fixed, built, and ran several tests on the patch, 
and then sent the wrong patch from an old git tree (which was never 
build tested).

I can assure you I have a working patch, and it has been tested by the 
QA group at Chelsio and it builds/runs but there are still additional 
bugs in rds. So once I resolve those I will resend the correct patch 
with some additional fixes.

I am sorry about this and it won't happen again.

Thanks,
  Jonathan

David Miller wrote:
> From: Jonathan Lallinger <jonathan@ogc.us>
> Date: Tue, 13 Sep 2011 14:41:01 -0500
>
>   
>> @@ -548,6 +550,7 @@ static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
>>  		spin_unlock_irqrestore(&pool->list_lock, flags);
>>  	}
>>  
>> +	atomic_sub(unpinned, &poll->free_pinned);
>>  	atomic_sub(ncleaned, &pool->dirty_count);
>>  	atomic_sub(nfreed, &pool->item_count);
>>  
>>     
>
> net/rds/iw_rdma.c: In function ‘rds_iw_flush_mr_pool’:
> net/rds/iw_rdma.c:553:24: error: ‘poll’ undeclared (first use in this function)
> net/rds/iw_rdma.c:553:24: note: each undeclared identifier is reported only once for each function it appears in
>
> If you didn't even build test it, I know you didn't test its
> functionality either.
>
> This is crazy.
>
> Well if it's not important enough to even build test this change
> before you post it, then it obviously doesn't matter if the RDMA
> module crashes the kernel when it's unloaded.
>   

Steve Wise Sept. 28, 2011, 8:03 p.m. UTC | #2
On 09/27/2011 12:26 PM, Jonathan Lallinger wrote:
> Hello David,
>
> I am ashamed I made the same mistake twice. This happened because I had two git trees (I made a second one when 
> kernel.org went down based off the github remote). I fixed, built, and ran several tests on the patch, and then sent 
> the wrong patch from an old git tree (which was never build tested).
>
> I can assure you I have a working patch, and it has been tested by the QA group at Chelsio and it builds/runs but 
> there are still additional bugs in rds. So once I resolve those I will resend the correct patch with some additional 
> fixes.

Hey Jonathan,

I think you should get this patch resubmitted as-is (the correct patch though ;).  If we hit other issues in testing, 
then we can submit more patches.  The problems Chelsio is seeing may be backport issues and not upstream bugs.


Steve.


Patch

diff --git a/net/rds/iw_rdma.c b/net/rds/iw_rdma.c
index 7c1c873..050256d 100644
--- a/net/rds/iw_rdma.c
+++ b/net/rds/iw_rdma.c
@@ -84,7 +84,8 @@  static int rds_iw_map_fastreg(struct rds_iw_mr_pool *pool,
 static void rds_iw_free_fastreg(struct rds_iw_mr_pool *pool, struct rds_iw_mr *ibmr);
 static unsigned int rds_iw_unmap_fastreg_list(struct rds_iw_mr_pool *pool,
 			struct list_head *unmap_list,
-			struct list_head *kill_list);
+			struct list_head *kill_list,
+			int *unpinned);
 static void rds_iw_destroy_fastreg(struct rds_iw_mr_pool *pool, struct rds_iw_mr *ibmr);
 
 static int rds_iw_get_device(struct rds_sock *rs, struct rds_iw_device **rds_iwdev, struct rdma_cm_id **cm_id)
@@ -499,7 +500,7 @@  static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
 	LIST_HEAD(unmap_list);
 	LIST_HEAD(kill_list);
 	unsigned long flags;
-	unsigned int nfreed = 0, ncleaned = 0, free_goal;
+	unsigned int nfreed = 0, ncleaned = 0, unpinned = 0, free_goal;
 	int ret = 0;
 
 	rds_iw_stats_inc(s_iw_rdma_mr_pool_flush);
@@ -524,7 +525,8 @@  static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
 	 * will be destroyed by the unmap function.
 	 */
 	if (!list_empty(&unmap_list)) {
-		ncleaned = rds_iw_unmap_fastreg_list(pool, &unmap_list, &kill_list);
+		ncleaned = rds_iw_unmap_fastreg_list(pool, &unmap_list,
+						     &kill_list, &unpinned);
 		/* If we've been asked to destroy all MRs, move those
 		 * that were simply cleaned to the kill list */
 		if (free_all)
@@ -548,6 +550,7 @@  static int rds_iw_flush_mr_pool(struct rds_iw_mr_pool *pool, int free_all)
 		spin_unlock_irqrestore(&pool->list_lock, flags);
 	}
 
+	atomic_sub(unpinned, &poll->free_pinned);
 	atomic_sub(ncleaned, &pool->dirty_count);
 	atomic_sub(nfreed, &pool->item_count);
 
@@ -828,7 +831,8 @@  static void rds_iw_free_fastreg(struct rds_iw_mr_pool *pool,
 
 static unsigned int rds_iw_unmap_fastreg_list(struct rds_iw_mr_pool *pool,
 				struct list_head *unmap_list,
-				struct list_head *kill_list)
+				struct list_head *kill_list,
+				int *unpinned)
 {
 	struct rds_iw_mapping *mapping, *next;
 	unsigned int ncleaned = 0;
@@ -855,6 +859,7 @@  static unsigned int rds_iw_unmap_fastreg_list(struct rds_iw_mr_pool *pool,
 
 		spin_lock_irqsave(&pool->list_lock, flags);
 		list_for_each_entry_safe(mapping, next, unmap_list, m_list) {
+			*unpinned += mapping->m_sg.len;
 			list_move(&mapping->m_list, &laundered);
 			ncleaned++;
 		}
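
Two details of the patch as posted are worth noting. The build log quoted in the comments shows that the new atomic_sub() line names `poll`, which is undeclared in that function; `pool` was presumably intended. In addition, the caller declares `unpinned` as unsigned int while the helper takes an `int *`, which a compiler may flag (for example under GCC's -Wpointer-sign). The sketch below is a hypothetical user-space model of the out-parameter pattern with one consistent type; struct mapping_model and unmap_list_model are illustrative names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified mapping: only the pinned-page count matters. */
struct mapping_model {
	unsigned int sg_len;		/* stands in for mapping->m_sg.len */
	struct mapping_model *next;
};

/*
 * Walks the list, returns the number of mappings cleaned, and adds each
 * mapping's page count to *unpinned. The out-parameter is unsigned int
 * so that it matches the variable the caller passes in.
 */
static unsigned int unmap_list_model(const struct mapping_model *head,
				     unsigned int *unpinned)
{
	unsigned int ncleaned = 0;
	const struct mapping_model *m;

	for (m = head; m != NULL; m = m->next) {
		*unpinned += m->sg_len;
		ncleaned++;
	}
	return ncleaned;
}

/* Usage: two mappings of 5 and 3 pages give ncleaned == 2, unpinned == 8. */
static void unmap_list_model_demo(void)
{
	struct mapping_model b = { 3, NULL };
	struct mapping_model a = { 5, &b };
	unsigned int unpinned = 0;

	assert(unmap_list_model(&a, &unpinned) == 2);
	assert(unpinned == 8);
}
```

The caller would then subtract the accumulated total from the pool's counter once, after the list walk, rather than once per mapping.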