From patchwork Tue Jul  9 14:02:50 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Sidorov, Andrei" <Andrei.Sidorov@arrisi.com>
X-Patchwork-Id: 257718
From: "Sidorov, Andrei" <Andrei.Sidorov@arrisi.com>
To: "linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: UBI: torture after scrub
Thread-Topic: torture after scrub
Thread-Index: Ac58rP7hHeFFI2yfQ0uQ3gApn67p4w==
Date: Tue, 9 Jul 2013 14:02:50 +0000
Message-ID: 
 <C0F0BC787567C848B2C90989451123DA3D27CF9F@ATLEXMBX4.ARRS.ARRISI.com>

Hi,

There is something a bit odd in UBI's scrubbing behaviour.
Throughout my power-cut test logs I see many sequences like this:

bitflips detected in PEB 30
PEB 30 gets scrubbed
PEB 30 gets erased and later reused
bitflips detected in PEB 30
....

Shouldn't UBI torture-test the source PEB after scrubbing it? Otherwise
scrubbing that PEB looks pointless: it is erased, reused, and then shows
bitflips again.
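
To make the loop above concrete, here is a toy model in C. It is purely
illustrative and is not UBI code: struct toy_peb, toy_torture_passes() and
toy_count_scrubs() are hypothetical names, and the model assumes a block
that is genuinely degrading, so that it shows bitflips on every use and
fails a torture test. Real bitflips can also be transient, which this
sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of one physical eraseblock (PEB); not UBI code. */
struct toy_peb {
	bool weak;	/* assumed: block is degrading, bitflips on every use */
	bool retired;	/* block was taken out of service */
};

/* Toy torture test: in this model it passes only for a healthy block. */
static bool toy_torture_passes(const struct toy_peb *peb)
{
	return !peb->weak;
}

/*
 * Run up to `cycles` rounds of "use, scrub on bitflips, erase, reuse" and
 * return how many scrubs were triggered. If torture_after_scrub is set,
 * a scrubbed block is tortured before reuse and retired when it fails.
 */
static int toy_count_scrubs(struct toy_peb *peb, int cycles,
			    bool torture_after_scrub)
{
	int scrubs = 0;

	assert(cycles >= 0);
	for (int i = 0; i < cycles && !peb->retired; i++) {
		if (!peb->weak)
			continue;	/* clean read, nothing to scrub */
		scrubs++;		/* data is moved to another PEB */
		if (torture_after_scrub && !toy_torture_passes(peb))
			peb->retired = true;
		/* otherwise: plain erase, block goes back to the free pool */
	}
	return scrubs;
}
```

In this model a weak block is scrubbed once per cycle for as long as it
keeps being reused, whereas with torture_after_scrub set it is scrubbed
once and then retired.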

The change below is based on Linux 3.3, but it applies almost unchanged
to the 3.10 tree (however, I'm not sure whether the synchronous erasure in
wear_leveling_worker adds to write amplification).

Does it look reasonable?
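
As a reading aid, the return-code handling the patch introduces can be
sketched as a standalone C model. This is a simplification under stated
assumptions, not the kernel code: the TOY_* constants, toy_copy_leb() and
toy_needs_torture() are hypothetical stand-ins, and the numeric values are
arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for UBI_IO_BITFLIPS; the value here is arbitrary. */
enum { TOY_IO_BITFLIPS = 1 };

/* Stand-ins for a few MOVE_* codes, in the order used by the patch. */
enum {
	TOY_MOVE_CANCEL_RACE = 1,
	TOY_MOVE_SOURCE_BITFLIPS,
	TOY_MOVE_SOURCE_RD_ERR,
};

/*
 * Model of the copy path: read_err is the result of reading the source
 * PEB, later_err is the result of all the steps that follow the read.
 * A later error always wins; bitflips are reported only on success.
 */
static int toy_copy_leb(int read_err, int later_err)
{
	int err_bitflips = 0;

	/* Later steps never produce the "source bitflips" code themselves. */
	assert(later_err != TOY_MOVE_SOURCE_BITFLIPS);

	if (read_err == TOY_IO_BITFLIPS)
		err_bitflips = TOY_MOVE_SOURCE_BITFLIPS;
	else if (read_err)
		return TOY_MOVE_SOURCE_RD_ERR;

	return later_err ? later_err : err_bitflips;
}

/*
 * Model of the worker's decision: torture the source PEB on erase if it
 * was already being scrubbed, or if the move reported source bitflips.
 */
static bool toy_needs_torture(bool scrubbing, int move_result)
{
	return scrubbing || move_result == TOY_MOVE_SOURCE_BITFLIPS;
}
```

The point of keeping err_bitflips separate is that a read with bitflips
must not mask a later failure such as a cancelled move; the bitflip code is
returned only when everything else succeeded.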

Index: linux/drivers/mtd/ubi/eba.c
===================================================================
--- linux.orig/drivers/mtd/ubi/eba.c
+++ linux/drivers/mtd/ubi/eba.c
@@ -986,6 +986,7 @@ int ubi_eba_copy_leb(struct ubi_device *
 		     struct ubi_vid_hdr *vid_hdr)
 {
 	int err, vol_id, lnum, data_size, aldata_size, idx;
+	int err_bitflips = 0;
 	struct ubi_volume *vol;
 	uint32_t crc;
 
@@ -1060,7 +1061,9 @@ int ubi_eba_copy_leb(struct ubi_device *
 	mutex_lock(&ubi->buf_mutex);
 	dbg_wl("read %d bytes of data", aldata_size);
 	err = ubi_io_read_data(ubi, ubi->peb_buf1, from, 0, aldata_size);
-	if (err && err != UBI_IO_BITFLIPS) {
+	if (err == UBI_IO_BITFLIPS) {
+		err_bitflips = MOVE_SOURCE_BITFLIPS;
+	} else if (err) {
 		ubi_warn("error %d while reading data from PEB %d",
 			 err, from);
 		err = MOVE_SOURCE_RD_ERR;
@@ -1164,7 +1167,7 @@ out_unlock_buf:
 	mutex_unlock(&ubi->buf_mutex);
 out_unlock_leb:
 	leb_write_unlock(ubi, vol_id, lnum);
-	return err;
+	return err ? err : err_bitflips;
 }
 
 /**
Index: linux/drivers/mtd/ubi/ubi.h
===================================================================
--- linux.orig/drivers/mtd/ubi/ubi.h
+++ linux/drivers/mtd/ubi/ubi.h
@@ -112,6 +112,7 @@ enum {
  *
  * MOVE_CANCEL_RACE: canceled because the volume is being deleted, the source
  *                   PEB was put meanwhile, or there is I/O on the source PEB
+ * MOVE_SOURCE_BITFLIPS: PEB moved, but there were bitflips in the source PEB
  * MOVE_SOURCE_RD_ERR: canceled because there was a read error from the source
  *                     PEB
  * MOVE_TARGET_RD_ERR: canceled because there was a read error from the target
@@ -124,6 +125,7 @@ enum {
  */
 enum {
 	MOVE_CANCEL_RACE = 1,
+	MOVE_SOURCE_BITFLIPS,
 	MOVE_SOURCE_RD_ERR,
 	MOVE_TARGET_RD_ERR,
 	MOVE_TARGET_WR_ERR,
Index: linux/drivers/mtd/ubi/wl.c
===================================================================
--- linux.orig/drivers/mtd/ubi/wl.c
+++ linux/drivers/mtd/ubi/wl.c
@@ -797,6 +797,10 @@ static int wear_leveling_worker(struct u
 			scrubbing = 1;
 			goto out_not_moved;
 		}
+		if (err == MOVE_SOURCE_BITFLIPS) {
+			scrubbing = 1;
+			goto out_moved;
+		}
 		if (err == MOVE_CANCEL_BITFLIPS || err == MOVE_TARGET_WR_ERR ||
 		    err == MOVE_TARGET_RD_ERR) {
 			/*
@@ -830,6 +834,7 @@ static int wear_leveling_worker(struct u
 		ubi_assert(0);
 	}
 
+out_moved:
 	/* The PEB has been successfully moved */
 	if (scrubbing)
 		ubi_msg("scrubbed PEB %d (LEB %d:%d), data moved to PEB %d",
@@ -845,7 +850,7 @@ static int wear_leveling_worker(struct u
 	ubi->move_to_put = ubi->wl_scheduled = 0;
 	spin_unlock(&ubi->wl_lock);
 
-	err = schedule_erase(ubi, e1, 0);
+	err = schedule_erase(ubi, e1, scrubbing);
 	if (err) {
 		kmem_cache_free(ubi_wl_entry_slab, e1);
 		if (e2)