From patchwork Fri Jun 14 18:33:43 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kamal Mostafa X-Patchwork-Id: 251483 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 40E6C2C008F for ; Sat, 15 Jun 2013 04:34:06 +1000 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UnYot-0007ps-GH; Fri, 14 Jun 2013 18:33:55 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UnYok-0007n5-6j for kernel-team@lists.ubuntu.com; Fri, 14 Jun 2013 18:33:46 +0000 Received: from c-67-160-231-42.hsd1.ca.comcast.net ([67.160.231.42] helo=fourier) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1UnYoj-000216-PN; Fri, 14 Jun 2013 18:33:45 +0000 Received: from kamal by fourier with local (Exim 4.80) (envelope-from ) id 1UnYoh-0008Kk-A2; Fri, 14 Jun 2013 11:33:43 -0700 From: Kamal Mostafa To: Alex Elder Subject: [ 3.8.y.z extended stable ] Patch "libceph: must hold mutex for reset_changed_osds()" has been added to staging queue Date: Fri, 14 Jun 2013 11:33:43 -0700 Message-Id: <1371234823-32001-1-git-send-email-kamal@canonical.com> X-Mailer: git-send-email 1.8.1.2 X-Extended-Stable: 3.8 Cc: Kamal Mostafa , Sage Weil , kernel-team@lists.ubuntu.com X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled libceph: must hold mutex for reset_changed_osds() to the linux-3.8.y-queue branch of the 3.8.y.z extended stable tree which can be found at: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.8.y-queue This patch is scheduled to be released in version 3.8.13.3. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.8.y.z tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Kamal ------ From 1579343b25d154140632bbc3b5cafb543c62f3ea Mon Sep 17 00:00:00 2001 From: Alex Elder Date: Wed, 15 May 2013 16:28:33 -0500 Subject: libceph: must hold mutex for reset_changed_osds() commit 14d2f38df67fadee34625fcbd282ee22514c4846 upstream. An osd client has a red-black tree describing its osds, and occasionally we would get crashes due to one of these trees tree becoming corrupt somehow. The problem turned out to be that reset_changed_osds() was being called without protection of the osd client request mutex. That function would call __reset_osd() for any osd that had changed, and __reset_osd() would call __remove_osd() for any osd with no outstanding requests, and finally __remove_osd() would remove the corresponding entry from the red-black tree. Thus, the tree was getting modified without having any lock protection, and was vulnerable to problems due to concurrent updates. This appears to be the only osd tree updating path that has this problem. It can be fairly easily fixed by moving the call up a few lines, to just before the request mutex gets dropped in kick_requests(). This resolves: http://tracker.ceph.com/issues/5043 Signed-off-by: Alex Elder Reviewed-by: Sage Weil Signed-off-by: Kamal Mostafa --- net/ceph/osd_client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- 1.8.1.2 diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index eb9a444..b7b980d 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1339,13 +1339,13 @@ static void kick_requests(struct ceph_osd_client *osdc, int force_resend) __register_request(osdc, req); __unregister_linger_request(osdc, req); } + reset_changed_osds(osdc); mutex_unlock(&osdc->request_mutex); if (needmap) { dout("%d requests for down osds, need new map\n", needmap); ceph_monc_request_next_osdmap(&osdc->client->monc); } - reset_changed_osds(osdc); }