
ext4: allow inode_readahead_blks=0 (linux-2.6.37)

Message ID 20110208063925.GA13619@lw.yar.ru
State Accepted, archived

Commit Message

Alexander V. Lukyanov Feb. 8, 2011, 6:39 a.m. UTC
Hello!

I cannot disable the inode read-ahead feature of ext4 (on 2.6.37):

# echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks
bash: echo: write error: Invalid argument

On a server with lots of small files and random access, this read-ahead makes
performance worse, and I'd like to disable it. I work around the problem
by using a value of 1, but that still reads an extra block.

This patch fixes the problem by checking for zero explicitly.


Signed-off-by: Alexander V. Lukyanov <lav@netis.ru>

Comments

Theodore Ts'o Feb. 22, 2011, 2:32 a.m. UTC | #1
On Tue, Feb 08, 2011 at 09:39:25AM +0300, Alexander V. Lukyanov wrote:
> Hello!
> 
> I cannot disable the inode read-ahead feature of ext4 (on 2.6.37):
>
> # echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks
> bash: echo: write error: Invalid argument
>
> On a server with lots of small files and random access, this read-ahead makes
> performance worse, and I'd like to disable it. I work around the problem
> by using a value of 1, but that still reads an extra block.

So I'm curious --- have you actually benchmarked a performance
decrease?  What sort of hardware are you using?

The readahead should be changing a 4k read to an 8k read with a value
of 1, which shouldn't make much of a difference to an HDD.

I can apply this patch, but is it really making a difference for you?

                                        - Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexander V. Lukyanov Feb. 24, 2011, 8:18 a.m. UTC | #2
On Mon, Feb 21, 2011 at 09:32:11PM -0500, Ted Ts'o wrote:
> On Tue, Feb 08, 2011 at 09:39:25AM +0300, Alexander V. Lukyanov wrote:
> > Hello!
> >
> > I cannot disable the inode read-ahead feature of ext4 (on 2.6.37):
> >
> > # echo 0 > /sys/fs/ext4/sda2/inode_readahead_blks
> > bash: echo: write error: Invalid argument
> >
> > On a server with lots of small files and random access, this read-ahead makes
> > performance worse, and I'd like to disable it. I work around the problem
> > by using a value of 1, but that still reads an extra block.
>
> So I'm curious --- have you actually benchmarked a performance
> decrease?  What sort of hardware are you using?

Yes, with the default value of inode_readahead_blks the load average went
from 4 to 30 (if I remember correctly). The problem was the increased load
on the HDDs.

The hardware is: Core 2 Duo CPU, 4GB RAM, 4x80GB SATA disks without NCQ,
with the load evenly distributed across the disks. At that time each disk
contained 1 million files, randomly accessed for read/create-write,
at 10MB/s read and 10MB/s write (combined rate of all 4 disks).

> The readahead should be changing a 4k read to an 8k read with a value
> of 1, which shouldn't make much of a difference to an HDD.

Sure, with inode_readahead_blks=1 it works acceptably. But I'd like
to disable the inode read-ahead completely.

> I can apply this patch, but is it really making a difference for you?

I think it is logical to be able to disable an unneeded feature.
Besides, there is already code that checks s_inode_readahead_blks != 0
(fs/ext4/inode.c:4737):

                /*
                 * If we need to do any I/O, try to pre-readahead extra
                 * blocks from the inode table.
                 */
                if (EXT4_SB(sb)->s_inode_readahead_blks) {
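
To illustrate why a guard like this pairs naturally with a power-of-2
restriction, here is a minimal user-space sketch. It assumes the value is
used as an alignment mask, which is the usual reason for such a restriction;
the helper name ra_window_start and its signature are hypothetical, and this
is not the kernel code. It deliberately does not model how many blocks are
actually submitted for readahead.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration only.
 * Computes the first block of the naturally aligned window of ra_blks
 * blocks that contains `block`. ra_blks is assumed to be 0 or a power of 2.
 * Returns false when ra_blks is 0, meaning readahead is disabled and no
 * window exists.
 */
static bool ra_window_start(uint64_t block, uint32_t ra_blks, uint64_t *start)
{
	/* Without this guard, ra_blks - 1 would wrap and the mask would be 0. */
	if (ra_blks == 0)
		return false;

	/* For a power of 2, ~(ra_blks - 1) clears the low-order bits. */
	*start = block & ~((uint64_t)ra_blks - 1);
	return true;
}
```

With ra_blks set to 8, block 1003 maps to a window starting at block 1000;
with ra_blks set to 1 the window starts at the block itself; with 0 the
helper reports that there is nothing to read ahead.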

--
   Alexander.

Patch

--- a/fs/ext4/super.c.0	2010-11-16 10:48:33.418629215 +0300
+++ b/fs/ext4/super.c	2010-11-16 10:46:07.739753246 +0300
@@ -1657,7 +1657,7 @@  set_qf_format:
 				return 0;
 			if (option < 0 || option > (1 << 30))
 				return 0;
-			if (!is_power_of_2(option)) {
+			if (option && !is_power_of_2(option)) {
 				ext4_msg(sb, KERN_ERR,
 					 "EXT4-fs: inode_readahead_blks"
 					 " must be a power of 2");
@@ -2274,7 +2274,7 @@  static ssize_t inode_readahead_blks_stor
 	if (parse_strtoul(buf, 0x40000000, &t))
 		return -EINVAL;
 
-	if (!is_power_of_2(t))
+	if (t && !is_power_of_2(t))
 		return -EINVAL;
 
 	sbi->s_inode_readahead_blks = t;
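
The effect of both hunks can be sketched in user space as follows. This is
an illustration, not the kernel code: the names ra_blks_valid_before and
ra_blks_valid_after are hypothetical, is_power_of_2 is reimplemented locally
with the usual bit test, and the upper bound assumes that values above
0x40000000 (1 << 30) are rejected, as the surrounding context lines suggest.

```c
#include <stdbool.h>

/* Upper bound taken from the patch context: 0x40000000 == 1 << 30. */
#define RA_BLKS_MAX 0x40000000UL

/* Zero is not a power of two, so this returns false for n == 0. */
static bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical name: the acceptance test before the patch. */
static bool ra_blks_valid_before(unsigned long t)
{
	return t <= RA_BLKS_MAX && is_power_of_2(t);
}

/* Hypothetical name: the acceptance test after the patch; 0 means "off". */
static bool ra_blks_valid_after(unsigned long t)
{
	return t <= RA_BLKS_MAX && (t == 0 || is_power_of_2(t));
}
```

The only input whose outcome changes is 0: it was rejected before and is
accepted afterwards. Values such as 3 or 48 are still refused, and powers of
two up to the limit are still accepted.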