diff mbox series

ext4/jbd2: drop jbd2_transaction_committed()

Message ID 20240513072119.2335346-1-yi.zhang@huaweicloud.com
State New
Series ext4/jbd2: drop jbd2_transaction_committed()

Commit Message

Zhang Yi May 13, 2024, 7:21 a.m. UTC
From: Zhang Yi <yi.zhang@huawei.com>

jbd2_transaction_committed() is used to check whether a transaction with
the given tid has already committed. It holds j_state_lock in read mode
and checks the tids of the currently running and committing transactions,
but taking j_state_lock is expensive.

The sequence number of the most recently committed transaction is
already stored in journal->j_commit_sequence, so we can do this check by
comparing it with the given tid instead. If the given tid is not larger
than j_commit_sequence, the given transaction has been committed. That
way we can drop the expensive lock and achieve about 10% ~ 20%
performance gains in concurrent DIOs on my virtual machine with a 100G
ramdisk.

fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
    -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
    -group_reporting

Before:
  overwrite       IOPS=88.2k, BW=344MiB/s
  read            IOPS=95.7k, BW=374MiB/s
  rand overwrite  IOPS=98.7k, BW=386MiB/s
  randread        IOPS=102k, BW=397MiB/s

After:
  overwrite:      IOPS=105k, BW=410MiB/s
  read:           IOPS=112k, BW=436MiB/s
  rand overwrite: IOPS=104k, BW=404MiB/s
  randread:       IOPS=111k, BW=432MiB/s

CC: Dave Chinner <david@fromorbit.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Link: https://lore.kernel.org/linux-ext4/493ab4c5-505c-a351-eefa-7d2677cdf800@huaweicloud.com/T/#m6a14df5d085527a188c5a151191e87a3252dc4e2
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/inode.c      |  4 ++--
 fs/jbd2/journal.c    | 17 -----------------
 include/linux/jbd2.h |  1 -
 3 files changed, 2 insertions(+), 20 deletions(-)

Comments

Jan Kara May 15, 2024, 12:25 a.m. UTC | #1
On Mon 13-05-24 15:21:19, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> jbd2_transaction_committed() is used to check whether a transaction with
> the given tid has already committed. It holds j_state_lock in read mode
> and checks the tids of the currently running and committing transactions,
> but taking j_state_lock is expensive.
> 
> The sequence number of the most recently committed transaction is
> already stored in journal->j_commit_sequence, so we can do this check by
> comparing it with the given tid instead. If the given tid is not larger
> than j_commit_sequence, the given transaction has been committed. That
> way we can drop the expensive lock and achieve about 10% ~ 20%
> performance gains in concurrent DIOs on my virtual machine with a 100G
> ramdisk.
> 
> fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
>     -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
>     -group_reporting
> 
> Before:
>   overwrite       IOPS=88.2k, BW=344MiB/s
>   read            IOPS=95.7k, BW=374MiB/s
>   rand overwrite  IOPS=98.7k, BW=386MiB/s
>   randread        IOPS=102k, BW=397MiB/s
> 
> After:
>   overwrite:      IOPS=105k, BW=410MiB/s
>   read:           IOPS=112k, BW=436MiB/s
>   rand overwrite: IOPS=104k, BW=404MiB/s
>   randread:       IOPS=111k, BW=432MiB/s
> 
> CC: Dave Chinner <david@fromorbit.com>
> Suggested-by: Dave Chinner <david@fromorbit.com>
> Link: https://lore.kernel.org/linux-ext4/493ab4c5-505c-a351-eefa-7d2677cdf800@huaweicloud.com/T/#m6a14df5d085527a188c5a151191e87a3252dc4e2
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>

I agree this is a workable solution and the performance benefits are nice.
But I have some comments regarding the implementation:

> @@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
>  	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>  
>  	if (journal) {
> -		if (jbd2_transaction_committed(journal,
> -			EXT4_I(inode)->i_datasync_tid))
> +		if (tid_geq(journal->j_commit_sequence,
> +			    EXT4_I(inode)->i_datasync_tid))

Please leave the helper jbd2_transaction_committed(), just make the
implementation more efficient. Also accessing j_commit_sequence without any
lock is theoretically problematic wrt compiler optimization. You should have
READ_ONCE() there and the places modifying j_commit_sequence need to use
WRITE_ONCE().

								Honza
Zhang Yi May 16, 2024, 8:27 a.m. UTC | #2
On 2024/5/15 8:25, Jan Kara wrote:
> On Mon 13-05-24 15:21:19, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> jbd2_transaction_committed() is used to check whether a transaction with
>> the given tid has already committed. It holds j_state_lock in read mode
>> and checks the tids of the currently running and committing transactions,
>> but taking j_state_lock is expensive.
>>
>> The sequence number of the most recently committed transaction is
>> already stored in journal->j_commit_sequence, so we can do this check by
>> comparing it with the given tid instead. If the given tid is not larger
>> than j_commit_sequence, the given transaction has been committed. That
>> way we can drop the expensive lock and achieve about 10% ~ 20%
>> performance gains in concurrent DIOs on my virtual machine with a 100G
>> ramdisk.
>>
>> fio -filename=/mnt/foo -direct=1 -iodepth=10 -rw=$rw -ioengine=libaio \
>>     -bs=4k -size=10G -numjobs=10 -runtime=60 -overwrite=1 -name=test \
>>     -group_reporting
>>
>> Before:
>>   overwrite       IOPS=88.2k, BW=344MiB/s
>>   read            IOPS=95.7k, BW=374MiB/s
>>   rand overwrite  IOPS=98.7k, BW=386MiB/s
>>   randread        IOPS=102k, BW=397MiB/s
>>
>> After:
>>   overwrite:      IOPS=105k, BW=410MiB/s
>>   read:           IOPS=112k, BW=436MiB/s
>>   rand overwrite: IOPS=104k, BW=404MiB/s
>>   randread:       IOPS=111k, BW=432MiB/s
>>
>> CC: Dave Chinner <david@fromorbit.com>
>> Suggested-by: Dave Chinner <david@fromorbit.com>
>> Link: https://lore.kernel.org/linux-ext4/493ab4c5-505c-a351-eefa-7d2677cdf800@huaweicloud.com/T/#m6a14df5d085527a188c5a151191e87a3252dc4e2
>> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> 
> I agree this is a workable solution and the performance benefits are nice.
> But I have some comments regarding the implementation:
> 
>> @@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
>>  	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
>>  
>>  	if (journal) {
>> -		if (jbd2_transaction_committed(journal,
>> -			EXT4_I(inode)->i_datasync_tid))
>> +		if (tid_geq(journal->j_commit_sequence,
>> +			    EXT4_I(inode)->i_datasync_tid))
> 
> Please leave the helper jbd2_transaction_committed(), just make the
> implementation more efficient. 

Sure.

> Also accessing j_commit_sequence without any
> lock is theoretically problematic wrt compiler optimization. You should have
> READ_ONCE() there and the places modifying j_commit_sequence need to use
> WRITE_ONCE().
> 

Thanks for pointing this out, but I'm not sure we really need READ_ONCE()
here. IIUC, with READ_ONCE() we are guaranteed to read the latest
j_commit_sequence; without it, there is a window (which might become larger)
in which we read a stale value and jbd2_transaction_committed() returns
false even though the given transaction has just committed. But that window
exists anyway, so it does not look like a big problem. Is that right?

Thanks,
Yi.
Jan Kara May 20, 2024, 8:49 a.m. UTC | #3
On Thu 16-05-24 16:27:25, Zhang Yi wrote:
> On 2024/5/15 8:25, Jan Kara wrote:
> > On Mon 13-05-24 15:21:19, Zhang Yi wrote:
> > Also accessing j_commit_sequence without any
> > lock is theoretically problematic wrt compiler optimization. You should have
> > READ_ONCE() there and the places modifying j_commit_sequence need to use
> > WRITE_ONCE().
> > 
> 
> Thanks for pointing this out, but I'm not sure we really need READ_ONCE()
> here. IIUC, with READ_ONCE() we are guaranteed to read the latest
> j_commit_sequence; without it, there is a window (which might become larger)
> in which we read a stale value and jbd2_transaction_committed() returns
> false even though the given transaction has just committed. But that window
> exists anyway, so it does not look like a big problem. Is that right?

Well, all accesses to any memory should use READ_ONCE(), be protected by a
lock, or use types that handle atomicity on the assembly level (like
atomic_t, or atomic bit operations and similar). Otherwise the compiler is
free to assume the underlying memory cannot change and generate potentially
invalid code. In this case, I don't think any compiler will realistically
do it, but it is still good practice and it also saves us from KCSAN
warnings. If you want to know more details about possible problems, see

  tools/memory-model/Documentation/explanation.txt

chapter "PLAIN ACCESSES AND DATA RACES".

								Honza
Zhang Yi May 20, 2024, 12:39 p.m. UTC | #4
On 2024/5/20 16:49, Jan Kara wrote:
> On Thu 16-05-24 16:27:25, Zhang Yi wrote:
>> On 2024/5/15 8:25, Jan Kara wrote:
>>> On Mon 13-05-24 15:21:19, Zhang Yi wrote:
>>> Also accessing j_commit_sequence without any
>>> lock is theoretically problematic wrt compiler optimization. You should have
>>> READ_ONCE() there and the places modifying j_commit_sequence need to use
>>> WRITE_ONCE().
>>>
>>
>> Thanks for pointing this out, but I'm not sure we really need READ_ONCE()
>> here. IIUC, with READ_ONCE() we are guaranteed to read the latest
>> j_commit_sequence; without it, there is a window (which might become larger)
>> in which we read a stale value and jbd2_transaction_committed() returns
>> false even though the given transaction has just committed. But that window
>> exists anyway, so it does not look like a big problem. Is that right?
> 
> Well, all accesses to any memory should use READ_ONCE(), be protected by a
> lock, or use types that handle atomicity on the assembly level (like
> atomic_t, or atomic bit operations and similar). Otherwise the compiler is
> free to assume the underlying memory cannot change and generate potentially
> invalid code. In this case, I don't think any compiler will realistically
> do it, but it is still good practice and it also saves us from KCSAN
> warnings. If you want to know more details about possible problems, see
> 
>   tools/memory-model/Documentation/explanation.txt
> 
> chapter "PLAIN ACCESSES AND DATA RACES".
> 

Sure, this document is really helpful. I'll add READ_ONCE() and
WRITE_ONCE() here, thanks a lot.

Yi.
kernel test robot May 27, 2024, 6:37 a.m. UTC | #5
Hello,

kernel test robot noticed a 674.0% improvement of stress-ng.fiemap.ops_per_sec on:


commit: 8a7952dd64b38f3ac2754ca550b52ac3ca921c1a ("[PATCH] ext4/jbd2: drop jbd2_transaction_committed()")
url: https://github.com/intel-lab-lkp/linux/commits/Zhang-Yi/ext4-jbd2-drop-jbd2_transaction_committed/20240513-153311
base: https://git.kernel.org/cgit/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/all/20240513072119.2335346-1-yi.zhang@huaweicloud.com/
patch subject: [PATCH] ext4/jbd2: drop jbd2_transaction_committed()

testcase: stress-ng
test machine: 64 threads 2 sockets Intel(R) Xeon(R) Gold 6346 CPU @ 3.10GHz (Ice Lake) with 256G memory
parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	test: fiemap
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240527/202405271406.9b9763f6-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-13/performance/1HDD/ext4/x86_64-rhel-8.3/100%/debian-12-x86_64-20240206.cgz/lkp-icl-2sp8/fiemap/stress-ng/60s

commit: 
  26770a717c ("jbd2: add prefix 'jbd2' for 'shrink_type'")
  8a7952dd64 ("ext4/jbd2: drop jbd2_transaction_committed()")

26770a717cac5704 8a7952dd64b38f3ac2754ca550b 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.13 ±  5%     +52.7%       1.73 ±  3%  iostat.cpu.user
      0.01 ± 47%      +0.0        0.04 ± 44%  mpstat.cpu.all.iowait%
      0.02 ±  4%      -0.0        0.01 ± 15%  mpstat.cpu.all.soft%
      1.15 ±  5%      +0.6        1.75 ±  3%  mpstat.cpu.all.usr%
    100427 ± 42%     -59.1%      41106 ± 66%  numa-vmstat.node0.nr_anon_pages
    149979 ± 34%    +205.2%     457720 ± 31%  numa-vmstat.node1.nr_inactive_anon
    149980 ± 34%    +205.2%     457720 ± 31%  numa-vmstat.node1.nr_zone_inactive_anon
    402376 ± 42%     -59.1%     164453 ± 66%  numa-meminfo.node0.AnonPages
    424656 ± 39%     -57.6%     180029 ± 58%  numa-meminfo.node0.AnonPages.max
    620453 ± 33%    +198.4%    1851452 ± 30%  numa-meminfo.node1.Inactive
    601997 ± 34%    +204.4%    1832749 ± 31%  numa-meminfo.node1.Inactive(anon)
      2387 ±  2%     +12.8%       2693 ±  4%  vmstat.io.bo
     10.17 ± 13%     +83.8%      18.69 ±  3%  vmstat.procs.b
    201402 ±  9%    +631.9%    1474068 ±  6%  vmstat.system.cs
    167899            +5.5%     177141        vmstat.system.in
    347.50 ± 16%     +53.4%     533.00 ± 10%  perf-c2c.DRAM.local
      4232 ± 48%    +166.1%      11262 ± 24%  perf-c2c.DRAM.remote
     23484 ± 10%    +119.8%      51621 ±  7%  perf-c2c.HITM.local
      3377 ± 56%    +179.0%       9424 ± 27%  perf-c2c.HITM.remote
     26861 ±  7%    +127.3%      61046 ±  5%  perf-c2c.HITM.total
   7612782 ±  7%    +676.3%   59099460 ±  6%  stress-ng.fiemap.ops
    124734 ±  8%    +674.0%     965466 ±  6%  stress-ng.fiemap.ops_per_sec
  12752270 ± 10%    +643.8%   94847653 ±  5%  stress-ng.time.involuntary_context_switches
     19.02 ±  6%    +124.6%      42.70 ±  5%  stress-ng.time.user_time
     62295           +17.8%      73411 ±  4%  stress-ng.time.voluntary_context_switches
    262799 ±  5%     +90.1%     499700 ± 43%  meminfo.Active
   3750501           +38.7%    5203048 ±  2%  meminfo.Cached
   3989040           +36.2%    5432649 ±  2%  meminfo.Committed_AS
   1154859 ±  3%    +105.8%    2376633 ± 12%  meminfo.Inactive
   1118790 ±  3%    +109.2%    2340758 ± 12%  meminfo.Inactive(anon)
    362491 ±  7%    +153.0%     917217 ± 19%  meminfo.Mapped
   5902699           +25.8%    7426870        meminfo.Memused
    546333 ±  3%    +265.7%    1998111 ±  5%  meminfo.Shmem
   5963280           +25.0%    7456799        meminfo.max_used_kB
      0.92 ± 16%     -21.5%       0.72 ±  8%  sched_debug.cfs_rq:/.h_nr_running.stddev
    809.29 ±  9%     -15.5%     683.50 ±  7%  sched_debug.cfs_rq:/.runnable_avg.stddev
    567.16 ± 10%    +123.7%       1268 ±  9%  sched_debug.cfs_rq:/.util_est.avg
      2015 ±  8%     +41.9%       2859 ± 11%  sched_debug.cfs_rq:/.util_est.max
      0.75 ±223%  +23377.8%     176.08 ± 20%  sched_debug.cfs_rq:/.util_est.min
    447.08 ±  8%     +19.1%     532.35 ± 12%  sched_debug.cfs_rq:/.util_est.stddev
      0.92 ± 16%     -22.1%       0.71 ±  8%  sched_debug.cpu.nr_running.stddev
    101043 ±  9%    +621.7%     729210 ±  5%  sched_debug.cpu.nr_switches.avg
    176826 ± 18%    +423.9%     926474 ±  5%  sched_debug.cpu.nr_switches.max
     17586 ± 35%    +762.4%     151656 ± 44%  sched_debug.cpu.nr_switches.min
     54968 ± 44%    +170.9%     148885 ±  5%  sched_debug.cpu.nr_switches.stddev
      5712            +5.1%       6002 ±  2%  proc-vmstat.nr_active_file
    939938           +38.6%    1302470 ±  2%  proc-vmstat.nr_file_pages
    280225 ±  3%    +108.7%     584958 ± 12%  proc-vmstat.nr_inactive_anon
     91383 ±  8%    +149.7%     228197 ± 19%  proc-vmstat.nr_mapped
    136324 ±  3%    +265.7%     498577 ±  5%  proc-vmstat.nr_shmem
     25123            +2.8%      25834        proc-vmstat.nr_slab_reclaimable
      5712            +5.1%       6002 ±  2%  proc-vmstat.nr_zone_active_file
    280225 ±  3%    +108.7%     584958 ± 12%  proc-vmstat.nr_zone_inactive_anon
     15324 ± 30%    +359.4%      70400 ±  9%  proc-vmstat.numa_hint_faults
     10098 ± 24%    +283.5%      38729 ± 19%  proc-vmstat.numa_hint_faults_local
    604637           +86.4%    1127295 ±  4%  proc-vmstat.numa_hit
    538359           +97.1%    1061202 ±  4%  proc-vmstat.numa_local
    532388           +13.4%     603511 ±  3%  proc-vmstat.numa_pte_updates
    723145           +72.1%    1244321 ±  3%  proc-vmstat.pgalloc_normal
    383685 ±  2%     +21.4%     465726 ±  2%  proc-vmstat.pgfault
    152450           +13.4%     172900 ±  4%  proc-vmstat.pgpgout
 2.951e+09 ±  6%    +571.9%  1.983e+10 ±  2%  perf-stat.i.branch-instructions
      0.64 ±  4%      -0.2        0.40 ±  2%  perf-stat.i.branch-miss-rate%
  19046341 ±  4%    +291.2%   74517781        perf-stat.i.branch-misses
   9666270 ± 31%    +318.8%   40484483 ± 19%  perf-stat.i.cache-misses
  70884859 ±  6%    +367.8%  3.316e+08 ±  2%  perf-stat.i.cache-references
    213832 ±  9%    +623.2%    1546476 ±  6%  perf-stat.i.context-switches
     15.20 ±  7%     -84.9%       2.30 ±  2%  perf-stat.i.cpi
    414.63 ±  5%     +10.9%     459.73 ±  8%  perf-stat.i.cpu-migrations
     26710 ± 31%     -77.7%       5960 ± 22%  perf-stat.i.cycles-between-cache-misses
 1.501e+10 ±  6%    +556.2%  9.849e+10 ±  2%  perf-stat.i.instructions
      0.07 ±  6%    +515.9%       0.44 ±  2%  perf-stat.i.ipc
      3.33 ±  9%    +626.6%      24.22 ±  6%  perf-stat.i.metric.K/sec
      4908 ±  3%     +26.6%       6216 ±  3%  perf-stat.i.minor-faults
      4908 ±  3%     +26.6%       6216 ±  3%  perf-stat.i.page-faults
      0.65 ±  5%      -0.3        0.38 ±  2%  perf-stat.overall.branch-miss-rate%
     15.14 ±  6%     -84.8%       2.30 ±  2%  perf-stat.overall.cpi
     25496 ± 29%     -77.2%       5824 ± 21%  perf-stat.overall.cycles-between-cache-misses
      0.07 ±  6%    +556.1%       0.44 ±  2%  perf-stat.overall.ipc
   2.9e+09 ±  6%    +572.3%   1.95e+10 ±  2%  perf-stat.ps.branch-instructions
  18850245 ±  4%    +289.0%   73330498        perf-stat.ps.branch-misses
   9571442 ± 31%    +315.9%   39805993 ± 19%  perf-stat.ps.cache-misses
  70175556 ±  6%    +365.5%  3.267e+08 ±  2%  perf-stat.ps.cache-references
    205492 ±  9%    +636.9%    1514212 ±  6%  perf-stat.ps.context-switches
    405.62 ±  5%     +10.8%     449.30 ±  8%  perf-stat.ps.cpu-migrations
 1.475e+10 ±  6%    +556.5%  9.684e+10 ±  2%  perf-stat.ps.instructions
      4823 ±  3%     +25.8%       6069 ±  4%  perf-stat.ps.minor-faults
      4823 ±  3%     +25.8%       6069 ±  4%  perf-stat.ps.page-faults
 9.213e+11 ±  6%    +555.4%  6.038e+12 ±  2%  perf-stat.total.instructions
     85.73           -85.7        0.00        perf-profile.calltrace.cycles-pp.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
     86.11           -84.1        1.99 ±  2%  perf-profile.calltrace.cycles-pp.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl
     50.51 ±  3%     -50.5        0.00        perf-profile.calltrace.cycles-pp._raw_read_lock.jbd2_transaction_committed.ext4_set_iomap.ext4_iomap_begin_report.iomap_iter
     96.50           -12.4       84.06        perf-profile.calltrace.cycles-pp.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
     97.49            -5.6       91.94        perf-profile.calltrace.cycles-pp.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
     98.08            -1.5       96.62        perf-profile.calltrace.cycles-pp.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
     98.13            -1.1       96.99        perf-profile.calltrace.cycles-pp.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.14            -1.1       97.05        perf-profile.calltrace.cycles-pp.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.18            -1.1       97.12        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
     98.19            -1.0       97.14        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.ioctl
     98.22            -0.9       97.31        perf-profile.calltrace.cycles-pp.ioctl
      0.62 ±  9%      +0.3        0.94        perf-profile.calltrace.cycles-pp.__schedule.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.62 ±  9%      +0.3        0.96        perf-profile.calltrace.cycles-pp.schedule.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.70 ±  9%      +0.5        1.18        perf-profile.calltrace.cycles-pp.__x64_sys_sched_yield.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.87 ±  8%      +0.7        1.58        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.87 ±  8%      +0.7        1.59        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__sched_yield
      0.00            +1.0        1.00 ±  4%  perf-profile.calltrace.cycles-pp.ext4_sb_block_valid.__check_block_validity.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      0.94 ±  8%      +1.0        1.98 ±  2%  perf-profile.calltrace.cycles-pp.__sched_yield
      0.00            +1.1        1.10 ±  2%  perf-profile.calltrace.cycles-pp._copy_to_user.fiemap_fill_next_extent.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
      0.00            +1.3        1.35 ±  8%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks
      0.00            +1.5        1.48 ±  2%  perf-profile.calltrace.cycles-pp.__check_block_validity.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
      0.00            +1.6        1.58 ±  7%  perf-profile.calltrace.cycles-pp._raw_spin_lock.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report
      0.00            +3.2        3.22 ±  2%  perf-profile.calltrace.cycles-pp.fiemap_fill_next_extent.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl.do_syscall_64
      4.96 ±  6%      +4.8        9.72 ±  2%  perf-profile.calltrace.cycles-pp._raw_read_lock.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      0.78 ±  7%      +5.4        6.22 ±  2%  perf-profile.calltrace.cycles-pp.iomap_iter_advance.iomap_iter.iomap_fiemap.do_vfs_ioctl.__x64_sys_ioctl
      0.62 ±  6%     +50.6       51.27        perf-profile.calltrace.cycles-pp.percpu_counter_add_batch.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter
      9.55 ±  7%     +66.3       75.88        perf-profile.calltrace.cycles-pp.ext4_es_lookup_extent.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap
     10.12 ±  7%     +69.9       79.98        perf-profile.calltrace.cycles-pp.ext4_map_blocks.ext4_iomap_begin_report.iomap_iter.iomap_fiemap.do_vfs_ioctl
     85.84           -85.8        0.00        perf-profile.children.cycles-pp.jbd2_transaction_committed
     86.14           -84.0        2.10 ±  2%  perf-profile.children.cycles-pp.ext4_set_iomap
     55.59 ±  3%     -45.8        9.83 ±  2%  perf-profile.children.cycles-pp._raw_read_lock
     96.56           -12.1       84.50        perf-profile.children.cycles-pp.ext4_iomap_begin_report
     97.54            -5.3       92.26        perf-profile.children.cycles-pp.iomap_iter
     98.11            -1.3       96.84        perf-profile.children.cycles-pp.iomap_fiemap
     98.14            -1.1       97.00        perf-profile.children.cycles-pp.do_vfs_ioctl
     98.14            -1.1       97.06        perf-profile.children.cycles-pp.__x64_sys_ioctl
     98.22            -0.9       97.34        perf-profile.children.cycles-pp.ioctl
     99.42            -0.7       98.76        perf-profile.children.cycles-pp.do_syscall_64
     99.43            -0.6       98.81        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.39 ± 21%      -0.1        0.25 ±  6%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.14 ± 24%      -0.1        0.03 ±105%  perf-profile.children.cycles-pp.build_id__mark_dso_hit
      0.19 ±  5%      +0.0        0.21 ±  6%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.08 ± 13%      +0.0        0.11        perf-profile.children.cycles-pp.switch_fpu_return
      0.09 ±  7%      +0.0        0.12 ±  5%  perf-profile.children.cycles-pp.update_process_times
      0.11 ±  5%      +0.0        0.14 ±  6%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.10 ±  5%      +0.0        0.14 ±  5%  perf-profile.children.cycles-pp.tick_nohz_handler
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.__fdget
      0.00            +0.1        0.05 ±  7%  perf-profile.children.cycles-pp.__update_load_avg_se
      0.00            +0.1        0.06 ±  9%  perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.___perf_sw_event
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.__dequeue_entity
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.__switch_to_asm
      0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.pick_eevdf
      0.00            +0.1        0.07        perf-profile.children.cycles-pp.__switch_to
      0.00            +0.1        0.08 ±  6%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +0.1        0.08 ±  4%  perf-profile.children.cycles-pp.rseq_ip_fixup
      0.00            +0.1        0.09 ±  4%  perf-profile.children.cycles-pp.restore_fpregs_from_fpstate
      0.27 ± 22%      +0.1        0.37 ±  9%  perf-profile.children.cycles-pp.perf_session__process_events
      0.27 ± 22%      +0.1        0.37 ±  9%  perf-profile.children.cycles-pp.record__finish_output
      0.26 ± 25%      +0.1        0.36 ±  9%  perf-profile.children.cycles-pp.reader__read_event
      0.00            +0.1        0.11 ±  3%  perf-profile.children.cycles-pp.set_next_entity
      0.00            +0.1        0.12 ±  4%  perf-profile.children.cycles-pp.update_load_avg
      0.08 ±  6%      +0.1        0.20        perf-profile.children.cycles-pp.do_sched_yield
      0.00            +0.1        0.12 ±  4%  perf-profile.children.cycles-pp.put_prev_entity
      0.02 ±141%      +0.1        0.15 ±  2%  perf-profile.children.cycles-pp.update_curr
      0.09 ± 36%      +0.1        0.23 ± 18%  perf-profile.children.cycles-pp.process_simple
      0.00            +0.1        0.15 ±  3%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      0.07 ± 57%      +0.1        0.22 ± 19%  perf-profile.children.cycles-pp.queue_event
      0.00            +0.1        0.15 ±  4%  perf-profile.children.cycles-pp.clear_bhb_loop
      0.00            +0.1        0.15 ±  4%  perf-profile.children.cycles-pp.stress_fiemap
      0.07 ± 57%      +0.2        0.22 ± 19%  perf-profile.children.cycles-pp.ordered_events__queue
      0.00            +0.2        0.16 ±  2%  perf-profile.children.cycles-pp.yield_task_fair
      0.00            +0.2        0.16 ±  3%  perf-profile.children.cycles-pp.ext4_inode_block_valid
      0.20 ±  5%      +0.2        0.41        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      0.09 ± 10%      +0.3        0.38 ±  2%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.62 ±  9%      +0.3        0.95        perf-profile.children.cycles-pp.__schedule
      0.63 ± 10%      +0.3        0.97        perf-profile.children.cycles-pp.schedule
      0.06 ± 11%      +0.4        0.48 ±  3%  perf-profile.children.cycles-pp.iomap_to_fiemap
      0.70 ±  9%      +0.5        1.18        perf-profile.children.cycles-pp.__x64_sys_sched_yield
      0.13 ±  8%      +0.9        1.05 ±  3%  perf-profile.children.cycles-pp.ext4_sb_block_valid
      0.94 ±  8%      +1.1        2.03 ±  2%  perf-profile.children.cycles-pp.__sched_yield
      0.16 ±  7%      +1.1        1.27 ±  2%  perf-profile.children.cycles-pp._copy_to_user
      0.20 ±  8%      +1.4        1.59 ±  2%  perf-profile.children.cycles-pp.__check_block_validity
      0.20 ±  7%      +1.4        1.64 ±  7%  perf-profile.children.cycles-pp._raw_spin_lock
      0.41 ±  7%      +2.9        3.30 ±  2%  perf-profile.children.cycles-pp.fiemap_fill_next_extent
      0.80 ±  7%      +5.6        6.35 ±  2%  perf-profile.children.cycles-pp.iomap_iter_advance
      0.64 ±  7%     +50.8       51.40        perf-profile.children.cycles-pp.percpu_counter_add_batch
      9.60 ±  7%     +66.7       76.28        perf-profile.children.cycles-pp.ext4_es_lookup_extent
     10.16 ±  7%     +70.2       80.39        perf-profile.children.cycles-pp.ext4_map_blocks
     55.43 ±  3%     -45.7        9.70 ±  2%  perf-profile.self.cycles-pp._raw_read_lock
     34.41 ±  6%     -34.4        0.00        perf-profile.self.cycles-pp.jbd2_transaction_committed
      0.10 ± 10%      -0.0        0.07 ±  5%  perf-profile.self.cycles-pp.prepare_task_switch
      0.08 ± 11%      +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.__schedule
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.___perf_sw_event
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.__update_load_avg_se
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.update_curr
      0.00            +0.1        0.05 ±  7%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.00            +0.1        0.06 ±  9%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00            +0.1        0.06 ± 13%  perf-profile.self.cycles-pp.__sched_yield
      0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.__switch_to_asm
      0.00            +0.1        0.07        perf-profile.self.cycles-pp.__switch_to
      0.20 ±  7%      +0.1        0.28 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock
      0.00            +0.1        0.09 ±  4%  perf-profile.self.cycles-pp.restore_fpregs_from_fpstate
      0.00            +0.1        0.11 ±  4%  perf-profile.self.cycles-pp.ext4_inode_block_valid
      0.00            +0.1        0.13 ±  5%  perf-profile.self.cycles-pp.stress_fiemap
      0.07 ± 57%      +0.1        0.22 ± 19%  perf-profile.self.cycles-pp.queue_event
      0.00            +0.1        0.15 ±  4%  perf-profile.self.cycles-pp.clear_bhb_loop
      0.05 ± 45%      +0.3        0.38 ±  2%  perf-profile.self.cycles-pp.__check_block_validity
      0.03 ± 70%      +0.3        0.38 ±  3%  perf-profile.self.cycles-pp.iomap_to_fiemap
      0.13 ±  7%      +0.9        1.00 ±  4%  perf-profile.self.cycles-pp.ext4_sb_block_valid
      0.13 ±  7%      +0.9        1.05 ±  2%  perf-profile.self.cycles-pp.iomap_fiemap
      0.16 ±  7%      +1.1        1.24 ±  2%  perf-profile.self.cycles-pp._copy_to_user
      0.19 ±  7%      +1.3        1.52 ±  2%  perf-profile.self.cycles-pp.iomap_iter
      0.29 ± 11%      +1.6        1.88 ±  2%  perf-profile.self.cycles-pp.ext4_set_iomap
      0.25 ±  8%      +1.8        2.04 ±  2%  perf-profile.self.cycles-pp.fiemap_fill_next_extent
      0.27 ±  8%      +1.9        2.21 ±  2%  perf-profile.self.cycles-pp.ext4_iomap_begin_report
      0.38 ± 13%      +2.3        2.70 ±  4%  perf-profile.self.cycles-pp.ext4_map_blocks
      0.78 ±  7%      +5.4        6.23 ±  2%  perf-profile.self.cycles-pp.iomap_iter_advance
      3.98 ±  8%     +11.1       15.12 ±  2%  perf-profile.self.cycles-pp.ext4_es_lookup_extent
      0.56 ±  7%     +48.9       49.48        perf-profile.self.cycles-pp.percpu_counter_add_batch




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Patch

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 537803250ca9..e8e2865bf9ac 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3199,8 +3199,8 @@ static bool ext4_inode_datasync_dirty(struct inode *inode)
 	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
 
 	if (journal) {
-		if (jbd2_transaction_committed(journal,
-			EXT4_I(inode)->i_datasync_tid))
+		if (tid_geq(journal->j_commit_sequence,
+			    EXT4_I(inode)->i_datasync_tid))
 			return false;
 		if (test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT))
 			return !list_empty(&EXT4_I(inode)->i_fc_list);
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index b6c114c11b97..73737cd1106f 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -786,23 +786,6 @@ int jbd2_fc_end_commit_fallback(journal_t *journal)
 }
 EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
 
-/* Return 1 when transaction with given tid has already committed. */
-int jbd2_transaction_committed(journal_t *journal, tid_t tid)
-{
-	int ret = 1;
-
-	read_lock(&journal->j_state_lock);
-	if (journal->j_running_transaction &&
-	    journal->j_running_transaction->t_tid == tid)
-		ret = 0;
-	if (journal->j_committing_transaction &&
-	    journal->j_committing_transaction->t_tid == tid)
-		ret = 0;
-	read_unlock(&journal->j_state_lock);
-	return ret;
-}
-EXPORT_SYMBOL(jbd2_transaction_committed);
-
 /*
  * When this function returns the transaction corresponding to tid
  * will be completed.  If the transaction has currently running, start
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 971f3e826e15..e15ae324169d 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1643,7 +1643,6 @@ extern void	jbd2_clear_buffer_revoked_flags(journal_t *journal);
 int jbd2_log_start_commit(journal_t *journal, tid_t tid);
 int jbd2_journal_start_commit(journal_t *journal, tid_t *tid);
 int jbd2_log_wait_commit(journal_t *journal, tid_t tid);
-int jbd2_transaction_committed(journal_t *journal, tid_t tid);
 int jbd2_complete_transaction(journal_t *journal, tid_t tid);
 int jbd2_log_do_checkpoint(journal_t *journal);
 int jbd2_trans_will_send_data_barrier(journal_t *journal, tid_t tid);