From patchwork Tue Jul 9 23:01:38 2013
X-Patchwork-Submitter: Anatol Pomazau
X-Patchwork-Id: 257893
From: Anatol Pomazau
To: linux-ext4@vger.kernel.org
Cc: tytso@mit.edu, Anatol Pomozov
Subject: [PATCH] ext4: Rate limit printk in buffer_io_error()
Date: Tue, 9 Jul 2013 16:01:38 -0700
Message-Id: <1373410898-26826-1-git-send-email-anatol@google.com>
X-Mailer: git-send-email 1.8.3
X-Mailing-List: linux-ext4@vger.kernel.org

From: Anatol Pomozov

If there are a lot of outstanding buffered IOs when a device is taken
offline (due to hardware errors etc.), ext4_end_bio() prints out a
message for each failed logical block. While this is desirable, we see
thousands of such lines being printed out before the serial console
gets overwhelmed, causing ext4_end_bio() to wait for the printk to
complete. This in itself isn't a disaster, except for the detail that
this function is being called with the queue lock held. This causes any
other function in the block layer to spin on its spin_lock_irqsave()
while the serial console is draining.
If the NMI watchdog is enabled on this machine then it eventually comes
along and shoots the machine in the head. The end result is that losing
any one disk causes the machine to go down.

This patch rate limits the printk to band-aid around the problem.

Tested: xfstests
Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
Signed-off-by: Anatol Pomozov
---
 fs/ext4/page-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 4acf1f7..4c9d5e7 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 
 #include "ext4_jbd2.h"
 #include "xattr.h"
@@ -214,7 +215,7 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags)
 static void buffer_io_error(struct buffer_head *bh)
 {
 	char b[BDEVNAME_SIZE];
-	printk(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",
+	printk_ratelimited(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",
 			bdevname(bh->b_bdev, b),
 			(unsigned long long)bh->b_blocknr);
 }