[0/19,v3] ext4: Fix transaction overflow due to revoke descriptors

Message ID 20191003215523.7313-1-jack@suse.cz

Jan Kara Oct. 3, 2019, 10:05 p.m. UTC
Hello,

Here is v3 of this series with a couple more bugs fixed. Now all the failing
tests Ted highlighted pass for me.

Changes since v2:
* Fixed bug in revoke credit estimates for extent freeing in bigalloc
  filesystems
* Fixed bug in xattr code treating positive return of
  ext4_journal_ensure_credits() as error
* Fixed preexisting bug in ext4_evict_inode() where we reserved too few credits
* Added trace point to jbd2_journal_restart()
* Fixed some kernel-doc bugs
* Rebased on top of 5.4-rc1

Changes since v1:
* Reordered some patches to reduce code churn
* The computation in jbd2_revoke_descriptors_per_block() was done too early -
  moved it to after the journal superblock is loaded so that the feature
  checking actually works.
* Made sure nobody outside JBD2 uses handle->h_buffer_credits since it now
  also contains credits for revoke descriptors, which was confusing some users.
* Updated cover letter with more details about reproducer

Original cover letter:

I've recently got a bug report where a JBD2 assertion failed due to a
transaction commit running out of journal space. After closer inspection of
the crash dump it seems that the problem is that there were too many
journal descriptor blocks (more than the max_transaction_size >> 5 + 32 we
estimate in jbd2_log_space_left()) due to descriptor blocks with revoke
records. In fact the estimate of the number of descriptor blocks looks
pretty arbitrary and many more descriptor blocks can be needed for
revoke records. We need one revoke record for every metadata block freed.
So in the worst case (1k blocksize, 64-bit journal feature enabled,
checksumming enabled) we fit 125 revoke records in one descriptor block. In
common cases it's about 500 revoke records per descriptor block. Now when
we free large directories or large files with data journalling enabled, we can
have *lots* of blocks to revoke - with extent mapped files easily millions
in a single transaction, which can mean 10k descriptor blocks - clearly more
than the estimate of 128 descriptor blocks per transaction ;)

This patch series aims at fixing the problem by accounting descriptor blocks
into transaction credits and reserving an appropriate amount of credits for
revoke descriptors on transaction handle start. Similarly to normal
transaction credits, the filesystem has to provide an estimate of the number
of blocks it is going to revoke using the transaction handle so that credits
for revoke descriptors can be reserved.

The series has survived fstests in a couple of configurations and also a
stress test like:
  Create filesystem with 1KB blocksize and journal size 32MB
  Mount the filesystem with -o nodelalloc
  for (( i = 0; i < 4; i++ )); do
    dd if=/dev/zero of=file$i bs=1M count=2048 conv=fsync
    chattr +j file$i
  done
  for (( i = 0; i < 4; i++ )); do
    rm file$i&
  done

which reliably triggers the assertion failure in JBD2 on unpatched kernel.

Review and comments are welcome :).

								Honza
Previous versions:
Link: http://lore.kernel.org/r/20190927111536.16455-1-jack@suse.cz
Link: http://lore.kernel.org/r/20190930103544.11479-1-jack@suse.cz

Comments

Theodore Ts'o Oct. 19, 2019, 7:19 p.m. UTC | #1
Hi Jan,

I've tried applying this patch set against 5.4-rc3, and I'm finding an
easily reproducible failure using:

	kvm-xfstests -c ext3conv ext4/039

It is the BUG_ON in fs/jbd2/commit.c, around line 570:

	J_ASSERT(commit_transaction->t_nr_buffers <=
		 atomic_read(&commit_transaction->t_outstanding_credits));

The failure (with the obvious debugging printk added) is:

ext4/039		[15:13:16][    6.747101] run fstests ext4/039 at 2019-10-19 15:13:16
[    7.018766] Mounted ext4 file system at /vdc supports timestamps until 2038 (0x7fffffff)
[    8.227631] JBD2: t_nr_buffers 226, t_outstanding_credits=223
[    8.229215] ------------[ cut here ]------------
[    8.230249] kernel BUG at fs/jbd2/commit.c:573!
     	       ...

The full log is attached (although the stack trace isn't terribly
interesting, since this is being run out of kjournald2).

						- Ted
SeaBIOS (version 1.12.0-1)
Booting from ROM..
KERNEL: kernel	5.4.0-rc3-xfstests-00022-g6db1a59f1c6a-dirty #1234 SMP Sat Oct 19 15:10:53 EDT 2019 x86_64
FSTESTVER: blktests	667d741 (Wed, 4 Sep 2019 10:49:18 -0700)
FSTESTVER: e2fsprogs	v1.45.4-15-g4b4f7b35 (Wed, 9 Oct 2019 20:25:01 -0400)
FSTESTVER: fio		fio-3.15 (Fri, 12 Jul 2019 10:40:36 -0600)
FSTESTVER: fsverity	2151209 (Fri, 28 Jun 2019 14:34:41 -0700)
FSTESTVER: ima-evm-utils	0267fa1 (Mon, 3 Dec 2018 06:11:35 -0500)
FSTESTVER: nvme-cli	v1.9 (Thu, 15 Aug 2019 13:14:59 -0600)
FSTESTVER: quota		6e63107 (Thu, 15 Aug 2019 11:23:55 +0200)
FSTESTVER: util-linux	v2.33.2 (Tue, 9 Apr 2019 14:58:07 +0200)
FSTESTVER: xfsprogs	v5.3.0-rc1-8-g7aaa32db (Thu, 19 Sep 2019 13:21:52 -0400)
FSTESTVER: xfstests	linux-v3.8-2563-g45bd2a28 (Mon, 14 Oct 2019 08:11:38 -0400)
FSTESTVER: xfstests-bld	5e2a748 (Mon, 14 Oct 2019 09:36:05 -0400)
FSTESTCFG: "ext3conv"
FSTESTSET: "ext4/039"
FSTESTEXC: ""
FSTESTOPT: "aex"
MNTOPTS: ""
CPUS: "2"
MEM: "1966.97"
              total        used        free      shared  buff/cache   available
Mem:           1966          96        1778           8          92        1830
Swap:             0           0           0
[    5.146878] NFSD: Using /var/lib/nfs/v4recovery as the NFSv4 state recovery directory
[    5.148619] NFSD: Using legacy client tracking operations.
BEGIN TEST ext3conv (1 test): Ext4 4k block w/nodelalloc and no flex_bg Sat Oct 19 15:13:15 EDT 2019
DEVICE: /dev/vdd
EXT_MKFS_OPTIONS: -O ^flex_bg
EXT_MOUNT_OPTIONS: -o block_validity,nodelalloc
FSTYP         -- ext4
PLATFORM      -- Linux/x86_64 kvm-xfstests 5.4.0-rc3-xfstests-00022-g6db1a59f1c6a-dirty #1234 SMP Sat Oct 19 15:10:53 EDT 2019
MKFS_OPTIONS  -- -q -O ^flex_bg /dev/vdc
MOUNT_OPTIONS -- -o acl,user_xattr -o block_validity,nodelalloc /dev/vdc /vdc

ext4/039		[15:13:16][    6.747101] run fstests ext4/039 at 2019-10-19 15:13:16
[    7.018766] Mounted ext4 file system at /vdc supports timestamps until 2038 (0x7fffffff)
[    8.227631] JBD2: t_nr_buffers 226, t_outstanding_credits=223
[    8.229215] ------------[ cut here ]------------
[    8.230249] kernel BUG at fs/jbd2/commit.c:573!
[    8.231231] invalid opcode: 0000 [#1] SMP NOPTI
[    8.232223] CPU: 1 PID: 1384 Comm: jbd2/vdc-8 Not tainted 5.4.0-rc3-xfstests-00022-g6db1a59f1c6a-dirty #1234
[    8.234303] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[    8.236124] RIP: 0010:jbd2_journal_commit_transaction+0x1565/0x1f35
[    8.237489] Code: f0 fe ff ff 48 c7 c2 b8 52 3a 8c be 4e 00 00 00 48 c7 c7 3f 06 3f 8c c6 05 a8 c5 5a 01 01 e8 8f 15 d4 ff e9 cc fe ff ff 0f 0b <0f> 0b 0f 0b 0f 0b e8 10 a0 d5 ff 85 c0 0f 85 15 fb ff ff 48 c7 c2
[    8.241449] RSP: 0018:ffffaa64c22cbcf0 EFLAGS: 00010283
[    8.242581] RAX: 00000000000000df RBX: ffff8ed0b8b7f028 RCX: 0000000000000000
[    8.244138] RDX: 0000000000000000 RSI: 00000000000000e2 RDI: ffff8ed0bdbd6608
[    8.245664] RBP: ffffaa64c22cbe80 R08: ffff8ed0bdbd6608 R09: 0000000000000000
[    8.247029] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    8.248023] R13: ffff8ed0b52045a0 R14: ffff8ed0b5204540 R15: ffff8ed0b8b7f000
[    8.249042] FS:  0000000000000000(0000) GS:ffff8ed0bda00000(0000) knlGS:0000000000000000
[    8.250191] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.251007] CR2: 00005601514547c0 CR3: 00000000722c4006 CR4: 0000000000360ee0
[    8.252007] Call Trace:
[    8.252356]  ? __lock_acquire+0x24a/0xf80
[    8.252946]  ? __lock_acquired+0x1eb/0x310
[    8.253574]  ? kjournald2+0xe3/0x3a0
[    8.254108]  kjournald2+0xe3/0x3a0
[    8.254567]  ? replenish_dl_entity.cold+0x1d/0x1d
[    8.255364]  ? __jbd2_debug+0x50/0x50
[    8.255886]  kthread+0x126/0x140
[    8.256408]  ? kthread_delayed_work_timer_fn+0xa0/0xa0
[    8.257324]  ret_from_fork+0x3a/0x50
[    8.257916] ---[ end trace 9acad1489f655cc4 ]---
[    8.258799] RIP: 0010:jbd2_journal_commit_transaction+0x1565/0x1f35
[    8.259742] Code: f0 fe ff ff 48 c7 c2 b8 52 3a 8c be 4e 00 00 00 48 c7 c7 3f 06 3f 8c c6 05 a8 c5 5a 01 01 e8 8f 15 d4 ff e9 cc fe ff ff 0f 0b <0f> 0b 0f 0b 0f 0b e8 10 a0 d5 ff 85 c0 0f 85 15 fb ff ff 48 c7 c2
[    8.264348] RSP: 0018:ffffaa64c22cbcf0 EFLAGS: 00010283
[    8.265803] RAX: 00000000000000df RBX: ffff8ed0b8b7f028 RCX: 0000000000000000
[    8.267610] RDX: 0000000000000000 RSI: 00000000000000e2 RDI: ffff8ed0bdbd6608
[    8.269498] RBP: ffffaa64c22cbe80 R08: ffff8ed0bdbd6608 R09: 0000000000000000
[    8.271295] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    8.273021] R13: ffff8ed0b52045a0 R14: ffff8ed0b5204540 R15: ffff8ed0b8b7f000
[    8.274792] FS:  0000000000000000(0000) GS:ffff8ed0bda00000(0000) knlGS:0000000000000000
[    8.276789] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.278152] CR2: 00005601514547c0 CR3: 00000000722c4006 CR4: 0000000000360ee0
[    8.279796] ------------[ cut here ]------------
[    8.280909] WARNING: CPU: 1 PID: 1384 at kernel/exit.c:723 do_exit+0x47/0xb70
[    8.282588] CPU: 1 PID: 1384 Comm: jbd2/vdc-8 Tainted: G      D           5.4.0-rc3-xfstests-00022-g6db1a59f1c6a-dirty #1234
[    8.284979] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[    8.286909] RIP: 0010:do_exit+0x47/0xb70
[    8.287809] Code: 00 00 48 89 44 24 38 31 c0 65 48 8b 1c 25 c0 5d 01 00 48 8b 83 20 12 00 00 48 85 c0 74 0e 48 8b 10 48 39 d0 0f 84 71 04 00 00 <0f> 0b 65 44 8b 25 07 fa 15 75 41 81 e4 00 ff 1f 00 44 89 64 24 0c
[    8.290621] RSP: 0018:ffffaa64c22cbee0 EFLAGS: 00010216
[    8.291398] RAX: ffffaa64c22cbdc0 RBX: ffff8ed0b5aa6400 RCX: 0000000000000000
[    8.292582] RDX: ffff8ed0ba5e4548 RSI: 0000000000000000 RDI: 000000000000000b
[    8.293818] RBP: 000000000000000b R08: 0000000000000000 R09: 0000000000000000
[    8.295187] R10: 0000000000000008 R11: ffffaa64c22cba1d R12: 000000000000000b
[    8.296506] R13: 0000000000000002 R14: 0000000000000006 R15: ffff8ed0b5aa6400
[    8.297843] FS:  0000000000000000(0000) GS:ffff8ed0bda00000(0000) knlGS:0000000000000000
[    8.299363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    8.300503] CR2: 00005601514547c0 CR3: 00000000722c4006 CR4: 0000000000360ee0
[    8.301841] Call Trace:
[    8.302275]  ? __jbd2_debug+0x50/0x50
[    8.303003]  ? kthread+0x126/0x140
[    8.303754]  rewind_stack_do_exit+0x17/0x20
[    8.304485] irq event stamp: 134503
[    8.305232] hardirqs last  enabled at (134503): [<ffffffff8ae017ea>] trace_hardirqs_on_thunk+0x1a/0x20
[    8.306951] hardirqs last disabled at (134501): [<ffffffff8bc002b0>] __do_softirq+0x2b0/0x42a
[    8.308579] softirqs last  enabled at (134502): [<ffffffff8bc0032a>] __do_softirq+0x32a/0x42a
[    8.310100] softirqs last disabled at (134495): [<ffffffff8aeb84d3>] irq_exit+0xb3/0xc0
[    8.311524] ---[ end trace 9acad1489f655cc5 ]---
QEMU: Terminated
Jan Kara Oct. 24, 2019, 1:09 p.m. UTC | #2
On Sat 19-10-19 15:19:33, Theodore Y. Ts'o wrote:
> Hi Jan,
> 
> I've tried applying this patch set against 5.4-rc3, and I'm finding an
> easily reproducible failure using:
> 
> 	kvm-xfstests -c ext3conv ext4/039
> 
> It is the BUG_ON in fs/jbd2/commit.c, around line 570:
> 
> 	J_ASSERT(commit_transaction->t_nr_buffers <=
> 		 atomic_read(&commit_transaction->t_outstanding_credits));
> 
> The failure (with the obvious debugging printk added) is:
> 
> ext4/039		[15:13:16][    6.747101] run fstests ext4/039 at 2019-10-19 15:13:16
> [    7.018766] Mounted ext4 file system at /vdc supports timestamps until 2038 (0x7fffffff)
> [    8.227631] JBD2: t_nr_buffers 226, t_outstanding_credits=223
> [    8.229215] ------------[ cut here ]------------
> [    8.230249] kernel BUG at fs/jbd2/commit.c:573!
>      	       ...
> 
> The full log is attached (although the stack trace isn't terribly
> interesting, since this is being run out of kjournald2).

Thanks! Somehow this escaped my testing although I thought I had run the
ext3 configuration... Anyway, we are reserving too little space in this
case - with some debugging added:

[   80.296029] t_buffers: 222, t_outstanding_credits: 219,
t_revoke_written: 23, t_revoke_reserved: 12, t_revoke_records_written
11432, t_revoke_records_reserved 11432, revokes_per_block: 1020

Which is really puzzling because it would suggest that revokes_per_block is
actually wrong. Digging more into this.

								Honza
Jan Kara Oct. 24, 2019, 3:12 p.m. UTC | #3
On Thu 24-10-19 15:09:08, Jan Kara wrote:
> On Sat 19-10-19 15:19:33, Theodore Y. Ts'o wrote:
> > Hi Jan,
> > 
> > I've tried applying this patch set against 5.4-rc3, and I'm finding an
> > easily reproducible failure using:
> > 
> > 	kvm-xfstests -c ext3conv ext4/039
> > 
> > It is the BUG_ON in fs/jbd2/commit.c, around line 570:
> > 
> > 	J_ASSERT(commit_transaction->t_nr_buffers <=
> > 		 atomic_read(&commit_transaction->t_outstanding_credits));
> > 
> > The failure (with the obvious debugging printk added) is:
> > 
> > ext4/039		[15:13:16][    6.747101] run fstests ext4/039 at 2019-10-19 15:13:16
> > [    7.018766] Mounted ext4 file system at /vdc supports timestamps until 2038 (0x7fffffff)
> > [    8.227631] JBD2: t_nr_buffers 226, t_outstanding_credits=223
> > [    8.229215] ------------[ cut here ]------------
> > [    8.230249] kernel BUG at fs/jbd2/commit.c:573!
> >      	       ...
> > 
> > The full log is attached (although the stack trace isn't terribly
> > interesting, since this is being run out of kjournald2).
> 
> Thanks! Somehow this escaped my testing although I thought I had run the
> ext3 configuration... Anyway, we are reserving too little space in this
> case - with some debugging added:
> 
> [   80.296029] t_buffers: 222, t_outstanding_credits: 219,
> t_revoke_written: 23, t_revoke_reserved: 12, t_revoke_records_written
> 11432, t_revoke_records_reserved 11432, revokes_per_block: 1020
> 
> Which is really puzzling because it would suggest that revokes_per_block is
> actually wrong. Digging more into this.

Yeah, ext4 was updating journal features in this case but
j_revoke_records_per_block didn't get properly updated. Fixed now.

								Honza
Theodore Ts'o Nov. 4, 2019, 3:32 a.m. UTC | #4
Hi Jan,

I believe that I'm waiting for the v4 version of this series with some
pending fixes that you are planning on making.  Is that correct?

Thanks!!

					- Ted
Jan Kara Nov. 4, 2019, 11:22 a.m. UTC | #5
Hi Ted!

On Sun 03-11-19 22:32:52, Theodore Y. Ts'o wrote:
> I believe that I'm waiting for the v4 version of this series with some
> pending fixes that you are planning on making.  Is that correct?

Ah, good that you pinged me because I have the series ready but I was
waiting for your answers to some explanations... In particular discussion
around patch 3 (move iput() outside of transaction), patch 15 (dropping of
j_state_lock around t_tid load), patch 18 (possible large overreservation
of descriptor blocks due to rounding), and 21 (change of on-disk format for
revoke descriptors).

Out of these I find the overreservation due to rounding the most serious
and easy enough to handle, so I'll fix that and then resend the series
unless you raise an objection in some other case as well.

								Honza
Theodore Ts'o Nov. 4, 2019, 1:09 p.m. UTC | #6
On Mon, Nov 04, 2019 at 12:22:32PM +0100, Jan Kara wrote:
> Hi Ted!
> 
> On Sun 03-11-19 22:32:52, Theodore Y. Ts'o wrote:
> > I believe that I'm waiting for the v4 version of this series with some
> > pending fixes that you are planning on making.  Is that correct?
> 
> Ah, good that you pinged me because I have the series ready but I was
> waiting for your answers to some explanations... In particular discussion
> around patch 3 (move iput() outside of transaction), patch 15 (dropping of
> j_state_lock around t_tid load), patch 18 (possible large overreservation
> of descriptor blocks due to rounding), and 21 (change of on-disk format for
> revoke descriptors).

Sorry, I didn't comment because I accepted your arguments; but I guess
I should have said so explicitly.  I just replied to those threads.

> Out of these I probably find the overreservation due to rounding the most
> serious and easy enough to handle so I'll fix that and then resend the
> series unless you raise your objection also in some other case.

Great!

					- Ted
Theodore Ts'o Nov. 5, 2019, 9:04 p.m. UTC | #7
On Tue, Nov 05, 2019 at 05:44:06PM +0100, Jan Kara wrote:
> Hello,
> 
> Here is v3 of this series with a couple more bugs fixed. Now all the
> failing tests Ted highlighted pass for me.

Thanks, I've applied this to the ext4 git tree.  Thanks for your work
on this patch series!

				- Ted
Theodore Ts'o Nov. 14, 2019, 5:26 a.m. UTC | #8
On Wed, Nov 13, 2019 at 10:45:45AM +0100, Jan Kara wrote:
> Thanks for the heads up! I didn't do any performance testing with the jbd2
> changes specifically and our internal performance testing grid only checks
> Linus' kernel so it didn't yet even try to run that code. I'll queue some
> sqlite insert tests internally with my changes to see whether I'm able to
> reproduce. I don't have NVME disks available quickly but I guess SATA SSD
> could do the job as well...

Sorry, false alarm.  What Phoronix was testing was 5.3 versus 5.4-rcX,
using Ubuntu's bleeding-edge kernels.  It wouldn't have any of the
ext4 patches we have queued for the *next* merge window.

That being said, I wasn't able to reproduce the performance delta using
upstream kernels, running on a Google Compute Engine VM, machtype
n1-highcpu-8, using a GCE Local SSD (SCSI-attached) for the first
benchmark, which I believe was the pts/sqlite benchmark using a thread
count of 1:

     Phoronix Test Suite 9.0.1
     SQLite 3.30.1
     Threads / Copies: 1
     Seconds < Lower Is Better
     5.3.0 ..................... 225 |===========================================
     5.4.0-rc3 ................. 224 |==========================================
     5.4-rc3-80-gafb2442fa429 .. 227 |===========================================
     5.4.0-rc7 ................. 223 |==========================================

     Processor: Intel Xeon (4 Cores / 8 Threads), Chipset: Intel 440FX
     82441FX PMC, Memory: 1 x 7373 MB RAM, Disk: 11GB PersistentDisk +
     403GB EphemeralDisk, Network: Red Hat Virtio device

     OS: Debian 10, Kernel: 5.4.0-rc3-xfstests (x86_64) 20191113, Compiler:
     GCC 8.3.0, File-System: ext4, System Layer: KVM

This was done using an extension to a gce-xfstests test appliance, to
which I hope to be adding an automation engine where it will kexec
into a series of kernels, run the benchmarks and then spit out the
report somewhere.  For now, the benchmarks are run manually.

(Adding commentary and click-baity titles is left as an exercise to
the reader.  :-)

						- Ted
						
P.S.  For all that I like to make snarky comments about Phoronix.com,
I have to admit Michael Larabel has done a pretty good job with his
performance test engine.  I probably would have chosen a different
implementation than PHP, and I'd have added an explicit way to specify
the file system to be tested other than mounting it on top of
/var/lib/phoronix-test-suite, and at least have the option of placing
the benchmarks' build trees and binaries in a different location than
the file system under test.

But that being said, he's collecting a decent set of benchmark tools,
and it is pretty cool that it has an automated way of collecting the
benchmark results, including the pretty graphs suitable for web
articles and conference slide decks ("and now, we turn to the rigged
benchmarks section of the presentation designed to show my new feature
in the best possible light...").
Jan Kara Nov. 14, 2019, 8:49 a.m. UTC | #9
On Thu 14-11-19 00:26:52, Theodore Y. Ts'o wrote:
> On Wed, Nov 13, 2019 at 10:45:45AM +0100, Jan Kara wrote:
> > Thanks for the heads up! I didn't do any performance testing with the jbd2
> > changes specifically and our internal performance testing grid only checks
> > Linus' kernel so it didn't yet even try to run that code. I'll queue some
> > sqlite insert tests internally with my changes to see whether I'm able to
> > reproduce. I don't have NVME disks available quickly but I guess SATA SSD
> > could do the job as well...
> 
> Sorry, false alarm.  What Phoronix was testing was 5.3 versus 5.4-rcX,
> using Ubuntu's bleeding-edge kernels.  It wouldn't have any of the
> ext4 patches we have queued for the *next* merge window.

OK, thanks for looking! I've run some tests on my test setup anyway...

> That being said, I wasn't able to reproduce the performance delta using
> upstream kernels, running on a Google Compute Engine VM, machtype
> n1-highcpu-8, using a GCE Local SSD (SCSI-attached) for the first
> benchmark, which I believe was the pts/sqlite benchmark using a thread
> count of 1:
> 
>      Phoronix Test Suite 9.0.1
>      SQLite 3.30.1
>      Threads / Copies: 1
>      Seconds < Lower Is Better
>      5.3.0 ..................... 225 |===========================================
>      5.4.0-rc3 ................. 224 |==========================================
>      5.4-rc3-80-gafb2442fa429 .. 227 |===========================================
>      5.4.0-rc7 ................. 223 |==========================================
> 
>      Processor: Intel Xeon (4 Cores / 8 Threads), Chipset: Intel 440FX
>      82441FX PMC, Memory: 1 x 7373 MB RAM, Disk: 11GB PersistentDisk +
>      403GB EphemeralDisk, Network: Red Hat Virtio device
> 
>      OS: Debian 10, Kernel: 5.4.0-rc3-xfstests (x86_64) 20191113, Compiler:
>      GCC 8.3.0, File-System: ext4, System Layer: KVM
> 
> This was done using an extension to a gce-xfstests test appliance, to
> which I hope to be adding an automation engine where it will kexec
> into a series of kernels, run the benchmarks and then spit out the
> report somewhere.  For now, the benchmarks are run manually.
> 
> (Adding commentary and click-baity titles is left as an exercise to
> the reader.  :-)
> 
> 						- Ted
> 						
> P.S.  For all that I like to make snarky comments about Phoronix.com,
> I have to admit Michael Larabel has done a pretty good job with his
> performance test engine.  I probably would have chosen a different
> implementation than PHP, and I'd have added an explicit way to specify
> the file system to be tested other than mounting it on top of
> /var/lib/phoronix-test-suite, and at least have the option of placing
> the benchmarks' build trees and binaries in a different location than
> the file system under test.
> 
> But that being said, he's collecting a decent set of benchmark tools,
> and it is pretty cool that it has an automated way of collecting the
> benchmark results, including the pretty graphs suitable for web
> articles and conference slide decks ("and now, we turn to the rigged
> benchmarks section of the presentation designed to show my new feature
> in the best possible light...").

Let me make a small marketing pitch for mmtests [1] :) For me, running the
test is just:
  * Boot the right kernel on the machine
  * Run:
   ./run-mmtests.sh -c configs/config-db-sqlite-insert-medium-ext4 \
      --no-monitor Whatever_run_name_1

Now the config file already has the proper partition, fstype, mkfs options,
etc. configured, so it's a bit of cheating, but still :). And when I have
data for both kernels, I do:
  cd work/log
  ../../compare_kernels.sh

and get a table with the comparison of the two benchmarking runs with
averages, standard deviations, percentiles, and other more advanced
statistical stuff to distinguish signal from noise. We also have support
for gathering various monitoring while the test is running (turbostat,
iostat, vmstat, ...) and graphing all the results (although the graphs are
more aimed at quick analysis of what's going on rather than at presenting
results to a public).

So for this campaign I've compared "5.3+some SUSE patches" to "5.4-rc7+your
'dev' branch". And the results look like:

sqlite
                              5.3-SUSE                5.4-rc7
                                                     ext4-dev
Min       Trans     2181.67 (   0.00%)     2412.72 (  10.59%)
Hmean     Trans     2399.39 (   0.00%)     2602.73 *   8.47%*
Stddev    Trans      172.15 (   0.00%)      141.61 (  17.74%)
CoeffVar  Trans        7.14 (   0.00%)        5.43 (  24.00%)
Max       Trans     2671.84 (   0.00%)     3027.81 (  13.32%)
...

These are Trans/Sec values, so there's actually a small improvement on this
machine. But it's somewhat difficult to tell because the benchmark variation
is rather high (likely due to the powersave cpufreq governor, if I were to
guess).

								Honza

[1] git://github.com/gormanm/mmtests