diff mbox

[U-Boot,1/2] mtd: vf610_nfc: remove caching of page in buffer

Message ID 1427965511-30658-1-git-send-email-stefan@agner.ch
State Superseded
Delegated to: Scott Wood
Headers show

Commit Message

Stefan Agner April 2, 2015, 9:05 a.m. UTC
To improve performance we remember the current page in the buffer
and avoid reading it twice. This implicit page cache increases
complexity while does not increase performance in real world cases.
This patch removes that feature.
---
As discussed in the other patchset...
http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/215802

...I did some performance measurements:

Time to "Starting kernel ..."
- without bad block scan & with UBIFS fastmap: 2.02s
- with bad block scan & with UBIFS fastmap: 3.99s
- without bad block scan & without UBIFS fastmap: 4.42s
- with bad block scan & without UBIFS fastmap: 6.38s

Without page cache (with this patch applied):
Time to "Starting kernel ..."
- without bad block scan & with UBIFS fastmap: 2.02s
- with bad block scan & with UBIFS fastmap: 4.01s
- without bad block scan & without UBIFS fastmap: 4.41s
- with bad block scan & without UBIFS fastmap: 6.39s

I then checked whether that reread occurs at least once with a
printf in the "Already read?" if. It looks like this access
pattern does not occure (anymore?) during a normal boot.

Hence I now vote to get rid of it too...

 drivers/mtd/nand/vf610_nfc.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

Comments

Bill Pringlemeir April 2, 2015, 8:30 p.m. UTC | #1
On  2 Apr 2015, stefan@agner.ch wrote:

> To improve performance we remember the current page in the buffer
> and avoid reading it twice. This implicit page cache increases
> complexity while does not increase performance in real world cases.
> This patch removes that feature.
> ---
> As discussed in the other patchset...
> http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/215802

> ...I did some performance measurements:

> Time to "Starting kernel ..."
> - without bad block scan & with UBIFS fastmap: 2.02s
> - with bad block scan & with UBIFS fastmap: 3.99s
> - without bad block scan & without UBIFS fastmap: 4.42s
> - with bad block scan & without UBIFS fastmap: 6.38s

> Without page cache (with this patch applied):
> Time to "Starting kernel ..."
> - without bad block scan & with UBIFS fastmap: 2.02s
> - with bad block scan & with UBIFS fastmap: 4.01s
> - without bad block scan & without UBIFS fastmap: 4.41s
> - with bad block scan & without UBIFS fastmap: 6.39s

[snip]

I also measured 'write performance' with the mtd_speedtest (performing
similar patch to the Linux driver) and I see no difference.  I think a
write benchmark is more appropriate to test this functionality?  While
at least it seems that neither read nor write is affected by the
simplification.

Fwiw,
Bill Pringlemeir.
Stefan Agner April 7, 2015, 1:14 p.m. UTC | #2
On 2015-04-02 22:30, Bill Pringlemeir wrote:
> On  2 Apr 2015, stefan@agner.ch wrote:
> 
>> To improve performance we remember the current page in the buffer
>> and avoid reading it twice. This implicit page cache increases
>> complexity while does not increase performance in real world cases.
>> This patch removes that feature.
>> ---
>> As discussed in the other patchset...
>> http://thread.gmane.org/gmane.comp.boot-loaders.u-boot/215802
> 
>> ...I did some performance measurements:
> 
>> Time to "Starting kernel ..."
>> - without bad block scan & with UBIFS fastmap: 2.02s
>> - with bad block scan & with UBIFS fastmap: 3.99s
>> - without bad block scan & without UBIFS fastmap: 4.42s
>> - with bad block scan & without UBIFS fastmap: 6.38s
> 
>> Without page cache (with this patch applied):
>> Time to "Starting kernel ..."
>> - without bad block scan & with UBIFS fastmap: 2.02s
>> - with bad block scan & with UBIFS fastmap: 4.01s
>> - without bad block scan & without UBIFS fastmap: 4.41s
>> - with bad block scan & without UBIFS fastmap: 6.39s
> 
> [snip]
> 
> I also measured 'write performance' with the mtd_speedtest (performing
> similar patch to the Linux driver) and I see no difference.  I think a
> write benchmark is more appropriate to test this functionality?  While
> at least it seems that neither read nor write is affected by the
> simplification.

On U-Boot, I just benchmarked the overall boot time since this is most
important for us. I plan to (re-)integrate the changes into the Linux
driver and check the performance again later this week.

Thanks for for the write test. So I can take this as a Ack?

I will send all the NFC changes in one patchset as v2 probably later
today.

--
Stefan
Bill Pringlemeir April 7, 2015, 2:24 p.m. UTC | #3
On  7 Apr 2015, stefan@agner.ch wrote:
> On 2015-04-02 22:30, Bill Pringlemeir wrote:
>> On  2 Apr 2015, stefan@agner.ch wrote:

>> [snip]

>> I also measured 'write performance' with the mtd_speedtest
>> (performing similar patch to the Linux driver) and I see no
>> difference.  I think a write benchmark is more appropriate to test
>> this functionality?  While at least it seems that neither read nor
>> write is affected by the simplification.

> On U-Boot, I just benchmarked the overall boot time since this is most
> important for us. I plan to (re-)integrate the changes into the Linux
> driver and check the performance again later this week.

> Thanks for for the write test. So I can take this as a Ack?

Of course, if you want it,

Acked-by: Bill Pringlemeir <bpringlemeir@nbsps.com>

The OOB patch also significantly decreases UbiFS mounting time in Linux.
I load Linux itself via tftp/network and not using u-boot with nand.  I
guess I should try that.

> I will send all the NFC changes in one patchset as v2 probably later
> today.
Stefan Agner April 7, 2015, 3:09 p.m. UTC | #4
On 2015-04-07 16:24, Bill Pringlemeir wrote:
> On  7 Apr 2015, stefan@agner.ch wrote:
>> On 2015-04-02 22:30, Bill Pringlemeir wrote:
>>> On  2 Apr 2015, stefan@agner.ch wrote:
> 
>>> [snip]
> 
>>> I also measured 'write performance' with the mtd_speedtest
>>> (performing similar patch to the Linux driver) and I see no
>>> difference.  I think a write benchmark is more appropriate to test
>>> this functionality?  While at least it seems that neither read nor
>>> write is affected by the simplification.
> 
>> On U-Boot, I just benchmarked the overall boot time since this is most
>> important for us. I plan to (re-)integrate the changes into the Linux
>> driver and check the performance again later this week.
> 
>> Thanks for for the write test. So I can take this as a Ack?
> 
> Of course, if you want it,
> 
> Acked-by: Bill Pringlemeir <bpringlemeir@nbsps.com>

Thx


> The OOB patch also significantly decreases UbiFS mounting time in Linux.
> I load Linux itself via tftp/network and not using u-boot with nand.  I
> guess I should try that.

Is it UBIFS mounting time or is it bad block scanning? Afaik, it should
speed up the latter significantly, but I don't see why it should speed
up the former.

--
Stefan
Bill Pringlemeir April 7, 2015, 4:48 p.m. UTC | #5
On  7 Apr 2015, stefan@agner.ch wrote:

> On 2015-04-07 16:24, Bill Pringlemeir wrote:

>> The OOB patch also significantly decreases UbiFS mounting time in
>> Linux.  I load Linux itself via tftp/network and not using u-boot
>> with nand.  I guess I should try that.

> Is it UBIFS mounting time or is it bad block scanning? Afaik, it
> should speed up the latter significantly, but I don't see why it
> should speed up the former.

It is both,

* no changes.

[0.840632] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xca
[0.964838] 0x000001040000-0x000010000000 : "root"
.124s

base ubi0 mount is .833585s

* improved READ_OOB

[0.638869] nand: device found, Manufacturer ID: 0x2c, Chip ID: 0xca
[0.707994] 0x000001040000-0x000010000000 : "root"
.069s

base ubi0 mount 1/10s reduction. .738204s


This is for a 256MB device.  The Ubi mount/scan time is not completely
in-significant.

For instance, here is my last run with improved READ_OOB

   [ 0.942538] ubi0: attaching mtd3
   ...
   [ 1.680742] ubi0: background thread "ubi_bgt0d" started, PID 104

This is my 'base ubi0 mount' numbers.

The time is slightly different than what I recorded previously.  I
booted several times without 'READ_OOB' and the times were consistently
'.83xxS'.  It is possible that the initial BBT scan being quicker
altered something; so it is not UBI use of OOB.  I am not sure.  I just
noted that it was ??significantly?? different.

Regards,
Bill Pringlemeir.
diff mbox

Patch

diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c
index d98dd28..fa0bb9d 100644
--- a/drivers/mtd/nand/vf610_nfc.c
+++ b/drivers/mtd/nand/vf610_nfc.c
@@ -147,7 +147,6 @@  struct vf610_nfc {
 	uint               column;
 	int                spareonly;
 	int		   page_sz;
-	int                page;
 	/* Status and ID are in alternate locations. */
 	int                alt_buf;
 #define ALT_BUF_ID   1
@@ -347,7 +346,6 @@  static void vf610_nfc_command(struct mtd_info *mtd, unsigned command,
 
 	switch (command) {
 	case NAND_CMD_PAGEPROG:
-		nfc->page = -1;
 		vf610_nfc_transfer_size(nfc->regs, nfc->page_sz);
 		vf610_nfc_send_commands(nfc->regs, NAND_CMD_SEQIN,
 					command, PROGRAM_PAGE_CMD_CODE);
@@ -367,10 +365,6 @@  static void vf610_nfc_command(struct mtd_info *mtd, unsigned command,
 	case NAND_CMD_SEQIN: /* Pre-read for partial writes. */
 	case NAND_CMD_READ0:
 		column = 0;
-		/* Already read? */
-		if (nfc->page == page)
-			return;
-		nfc->page = page;
 		vf610_nfc_transfer_size(nfc->regs, nfc->page_sz);
 		vf610_nfc_send_commands(nfc->regs, NAND_CMD_READ0,
 					NAND_CMD_READSTART, READ_PAGE_CMD_CODE);
@@ -378,7 +372,6 @@  static void vf610_nfc_command(struct mtd_info *mtd, unsigned command,
 		break;
 
 	case NAND_CMD_ERASE1:
-		nfc->page = -1;
 		vf610_nfc_transfer_size(nfc->regs, 0);
 		vf610_nfc_send_commands(nfc->regs, command,
 					NAND_CMD_ERASE2, ERASE_CMD_CODE);
@@ -532,10 +525,8 @@  static inline int vf610_nfc_correct_data(struct mtd_info *mtd, u_char *dat)
 	flip = count_written_bits(dat, nfc->chip.ecc.size, ecc_count);
 
 	/* ECC failed. */
-	if (flip > ecc_count) {
-		nfc->page = -1;
+	if (flip > ecc_count)
 		return -1;
-	}
 
 	/* Erased page. */
 	memset(dat, 0xff, nfc->chip.ecc.size);