Message ID | 20191010201940.30382-1-halves@canonical.com |
---|---|
Headers | show |
Series | Fix bcache performance degradation when querying priority_stats | expand |
On 10/10/19 1:19 PM, Heitor Alves de Siqueira wrote: > BugLink: https://bugs.launchpad.net/bugs/1840043 > > [Impact] > Querying bcache's priority_stats attribute in sysfs causes severe > performance degradation for read/write workloads and occasional system > stalls > > [Test Case] > Note: As the sorting step has the most noticeable performance impact, > the test case below pins a workload and the sysfs query to the same CPU. > CPU contention issues still occur without any pinning, this just removes > the scheduling factor of landing in different CPUs and affecting > different tasks. > > 1) Start a read/write workload on the bcache device with e.g. fio or dd, > pinned to a certain CPU: > # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress > > 2) Start a sysfs query loop for the priority_stats attribute pinned to > the same CPU: > # for i in {1..100000}; do taskset 0x10 cat > # /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done > > 3) Monitor the read/write workload for any performance impact > > [Fix] > To fix CPU contention and performance impact, a cond_resched() call is > introduced in the priority_stats sort comparison. > > [Regression Potential] > Regression potential is low, as the change is confined to the > priority_stats sysfs query. In cases where frequent queries to bcache > priority_stats take place (e.g. node_exporter), the impact should be > more noticeable as those could now take a bit longer to complete. A > regression due to this patch would most likely show up as a performance > degradation in bcache-focused workloads. > > Shile Zhang (1): > bcache: add cond_resched() in __bch_cache_cmp() > > drivers/md/bcache/sysfs.c | 1 + > 1 file changed, 1 insertion(+) > Acked-by: Connor Kuehl <connor.kuehl@canonical.com>
On 10.10.19 22:19, Heitor Alves de Siqueira wrote: > BugLink: https://bugs.launchpad.net/bugs/1840043 > > [Impact] > Querying bcache's priority_stats attribute in sysfs causes severe > performance degradation for read/write workloads and occasional system > stalls > > [Test Case] > Note: As the sorting step has the most noticeable performance impact, > the test case below pins a workload and the sysfs query to the same CPU. > CPU contention issues still occur without any pinning, this just removes > the scheduling factor of landing in different CPUs and affecting > different tasks. > > 1) Start a read/write workload on the bcache device with e.g. fio or dd, > pinned to a certain CPU: > # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress > > 2) Start a sysfs query loop for the priority_stats attribute pinned to > the same CPU: > # for i in {1..100000}; do taskset 0x10 cat > # /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done > > 3) Monitor the read/write workload for any performance impact > > [Fix] > To fix CPU contention and performance impact, a cond_resched() call is > introduced in the priority_stats sort comparison. > > [Regression Potential] > Regression potential is low, as the change is confined to the > priority_stats sysfs query. In cases where frequent queries to bcache > priority_stats take place (e.g. node_exporter), the impact should be > more noticeable as those could now take a bit longer to complete. A > regression due to this patch would most likely show up as a performance > degradation in bcache-focused workloads. > > Shile Zhang (1): > bcache: add cond_resched() in __bch_cache_cmp() > > drivers/md/bcache/sysfs.c | 1 + > 1 file changed, 1 insertion(+) > Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
On 10.10.19 22:19, Heitor Alves de Siqueira wrote: > BugLink: https://bugs.launchpad.net/bugs/1840043 > > [Impact] > Querying bcache's priority_stats attribute in sysfs causes severe > performance degradation for read/write workloads and occasional system > stalls > > [Test Case] > Note: As the sorting step has the most noticeable performance impact, > the test case below pins a workload and the sysfs query to the same CPU. > CPU contention issues still occur without any pinning, this just removes > the scheduling factor of landing in different CPUs and affecting > different tasks. > > 1) Start a read/write workload on the bcache device with e.g. fio or dd, > pinned to a certain CPU: > # taskset 0x10 dd if=/dev/zero of=/dev/bcache0 bs=4k status=progress > > 2) Start a sysfs query loop for the priority_stats attribute pinned to > the same CPU: > # for i in {1..100000}; do taskset 0x10 cat > # /sys/fs/bcache/*/cache0/priority_stats > /dev/null; done > > 3) Monitor the read/write workload for any performance impact > > [Fix] > To fix CPU contention and performance impact, a cond_resched() call is > introduced in the priority_stats sort comparison. > > [Regression Potential] > Regression potential is low, as the change is confined to the > priority_stats sysfs query. In cases where frequent queries to bcache > priority_stats take place (e.g. node_exporter), the impact should be > more noticeable as those could now take a bit longer to complete. A > regression due to this patch would most likely show up as a performance > degradation in bcache-focused workloads. > > Shile Zhang (1): > bcache: add cond_resched() in __bch_cache_cmp() > > drivers/md/bcache/sysfs.c | 1 + > 1 file changed, 1 insertion(+) > Applied to xenial, bionic, disco and eoan master-next branches. Thanks, Kleber