Message ID | 20210413125533.217440-1-pbonzini@redhat.com |
---|---|
State | New |
Series | ratelimit: protect with a mutex |
On Tue, Apr 13, 2021 at 02:55:33PM +0200, Paolo Bonzini wrote:
> Right now, rate limiting is protected by the AioContext mutex, which is
> taken for example both by the block jobs and by qmp_block_job_set_speed
> (via find_block_job).
>
> We would like to remove the dependency of block layer code on the
> AioContext mutex, since most drivers and the core I/O code are already
> not relying on it. However, there is no existing lock that can easily
> be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
> especially because the latter might run in coroutine context (and
> therefore under a CoMutex) but the former will not.
>
> Since concurrent calls to ratelimit_calculate_delay are not possible,
> one idea could be to use a seqlock to get a snapshot of slice_ns and
> slice_quota. But for now keep it simple, and just add a mutex to the
> RateLimit struct; block jobs are generally not performance critical to
> the point of optimizing the clock cycles spent in synchronization.
>
> This also requires the introduction of init/destroy functions, so
> add them to the two users of ratelimit.h.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/block-copy.c       |  2 ++
>  blockjob.c               |  3 +++
>  include/qemu/ratelimit.h | 14 ++++++++++++++
>  3 files changed, 19 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
On 13/04/2021 14:55, Paolo Bonzini wrote:
> Right now, rate limiting is protected by the AioContext mutex, which is
> taken for example both by the block jobs and by qmp_block_job_set_speed
> (via find_block_job).
>
> We would like to remove the dependency of block layer code on the
> AioContext mutex, since most drivers and the core I/O code are already
> not relying on it. However, there is no existing lock that can easily
> be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
> especially because the latter might run in coroutine context (and
> therefore under a CoMutex) but the former will not.
>
> Since concurrent calls to ratelimit_calculate_delay are not possible,
> one idea could be to use a seqlock to get a snapshot of slice_ns and
> slice_quota. But for now keep it simple, and just add a mutex to the
> RateLimit struct; block jobs are generally not performance critical to
> the point of optimizing the clock cycles spent in synchronization.
>
> This also requires the introduction of init/destroy functions, so
> add them to the two users of ratelimit.h.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/block-copy.c       |  2 ++
>  blockjob.c               |  3 +++
>  include/qemu/ratelimit.h | 14 ++++++++++++++
>  3 files changed, 19 insertions(+)
>
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 39ae481c8b..9b4af00614 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -230,6 +230,7 @@ void block_copy_state_free(BlockCopyState *s)
>          return;
>      }
>
> +    ratelimit_destroy(&s->rate_limit);
>      bdrv_release_dirty_bitmap(s->copy_bitmap);
>      shres_destroy(s->mem);
>      g_free(s);
> @@ -289,6 +290,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
>          s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
>      }
>
> +    ratelimit_init(&s->rate_limit);
>      QLIST_INIT(&s->tasks);
>      QLIST_INIT(&s->calls);
>
> diff --git a/blockjob.c b/blockjob.c
> index 207e8c7fd9..46f15befe8 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -87,6 +87,7 @@ void block_job_free(Job *job)
>
>      block_job_remove_all_bdrv(bjob);
>      blk_unref(bjob->blk);
> +    ratelimit_destroy(&bjob->limit);
>      error_free(bjob->blocker);
>  }
>
> @@ -435,6 +436,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
>      assert(job->job.driver->free == &block_job_free);
>      assert(job->job.driver->user_resume == &block_job_user_resume);
>
> +    ratelimit_init(&job->limit);
> +
>      job->blk = blk;
>
>      job->finalize_cancelled_notifier.notify = block_job_event_cancelled;
> diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
> index 01da8d63f1..003ea6d5a3 100644
> --- a/include/qemu/ratelimit.h
> +++ b/include/qemu/ratelimit.h
> @@ -14,9 +14,11 @@
>  #ifndef QEMU_RATELIMIT_H
>  #define QEMU_RATELIMIT_H
>
> +#include "qemu/lockable.h"
>  #include "qemu/timer.h"
>
>  typedef struct {
> +    QemuMutex lock;
>      int64_t slice_start_time;
>      int64_t slice_end_time;
>      uint64_t slice_quota;
> @@ -40,6 +42,7 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
>      int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>      double delay_slices;
>
> +    QEMU_LOCK_GUARD(&limit->lock);
>      assert(limit->slice_quota && limit->slice_ns);
>
>      if (limit->slice_end_time < now) {
> @@ -65,9 +68,20 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
>      return limit->slice_end_time - now;
>  }
>
> +static inline void ratelimit_init(RateLimit *limit)
> +{
> +    qemu_mutex_init(&limit->lock);
> +}
> +
> +static inline void ratelimit_destroy(RateLimit *limit)
> +{
> +    qemu_mutex_destroy(&limit->lock);
> +}
> +
>  static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed,
>                                         uint64_t slice_ns)
>  {
> +    QEMU_LOCK_GUARD(&limit->lock);
>      limit->slice_ns = slice_ns;
>      limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1);
>  }
>

Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
13.04.2021 15:55, Paolo Bonzini wrote:
> Right now, rate limiting is protected by the AioContext mutex, which is
> taken for example both by the block jobs and by qmp_block_job_set_speed
> (via find_block_job).
>
> We would like to remove the dependency of block layer code on the
> AioContext mutex, since most drivers and the core I/O code are already
> not relying on it. However, there is no existing lock that can easily
> be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
> especially because the latter might run in coroutine context (and
> therefore under a CoMutex) but the former will not.
>
> Since concurrent calls to ratelimit_calculate_delay are not possible,
> one idea could be to use a seqlock to get a snapshot of slice_ns and
> slice_quota. But for now keep it simple, and just add a mutex to the
> RateLimit struct; block jobs are generally not performance critical to
> the point of optimizing the clock cycles spent in synchronization.
>
> This also requires the introduction of init/destroy functions, so
> add them to the two users of ratelimit.h.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
diff --git a/block/block-copy.c b/block/block-copy.c
index 39ae481c8b..9b4af00614 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -230,6 +230,7 @@ void block_copy_state_free(BlockCopyState *s)
         return;
     }
 
+    ratelimit_destroy(&s->rate_limit);
     bdrv_release_dirty_bitmap(s->copy_bitmap);
     shres_destroy(s->mem);
     g_free(s);
@@ -289,6 +290,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         s->copy_size = MAX(s->cluster_size, BLOCK_COPY_MAX_BUFFER);
     }
 
+    ratelimit_init(&s->rate_limit);
     QLIST_INIT(&s->tasks);
     QLIST_INIT(&s->calls);
 
diff --git a/blockjob.c b/blockjob.c
index 207e8c7fd9..46f15befe8 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -87,6 +87,7 @@ void block_job_free(Job *job)
 
     block_job_remove_all_bdrv(bjob);
     blk_unref(bjob->blk);
+    ratelimit_destroy(&bjob->limit);
     error_free(bjob->blocker);
 }
 
@@ -435,6 +436,8 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
     assert(job->job.driver->free == &block_job_free);
     assert(job->job.driver->user_resume == &block_job_user_resume);
 
+    ratelimit_init(&job->limit);
+
     job->blk = blk;
 
     job->finalize_cancelled_notifier.notify = block_job_event_cancelled;
diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
index 01da8d63f1..003ea6d5a3 100644
--- a/include/qemu/ratelimit.h
+++ b/include/qemu/ratelimit.h
@@ -14,9 +14,11 @@
 #ifndef QEMU_RATELIMIT_H
 #define QEMU_RATELIMIT_H
 
+#include "qemu/lockable.h"
 #include "qemu/timer.h"
 
 typedef struct {
+    QemuMutex lock;
     int64_t slice_start_time;
     int64_t slice_end_time;
     uint64_t slice_quota;
@@ -40,6 +42,7 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
     int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
     double delay_slices;
 
+    QEMU_LOCK_GUARD(&limit->lock);
     assert(limit->slice_quota && limit->slice_ns);
 
     if (limit->slice_end_time < now) {
@@ -65,9 +68,20 @@ static inline int64_t ratelimit_calculate_delay(RateLimit *limit, uint64_t n)
     return limit->slice_end_time - now;
 }
 
+static inline void ratelimit_init(RateLimit *limit)
+{
+    qemu_mutex_init(&limit->lock);
+}
+
+static inline void ratelimit_destroy(RateLimit *limit)
+{
+    qemu_mutex_destroy(&limit->lock);
+}
+
 static inline void ratelimit_set_speed(RateLimit *limit, uint64_t speed,
                                        uint64_t slice_ns)
 {
+    QEMU_LOCK_GUARD(&limit->lock);
     limit->slice_ns = slice_ns;
     limit->slice_quota = MAX(((double)speed * slice_ns) / 1000000000ULL, 1);
 }
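For readers unfamiliar with the scoped-lock pattern used in the hunks above: as far as I understand it, QEMU_LOCK_GUARD takes the lock and releases it automatically when the enclosing scope exits, which is why ratelimit_calculate_delay needs no explicit unlock on its several return paths. The following is only a rough sketch of that idea using plain pthreads and the GCC/Clang cleanup attribute. DemoLimit, DEMO_LOCK_GUARD and the demo_* helpers are hypothetical names invented for this illustration and are not QEMU's implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for RateLimit; only two of its fields are kept. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t slice_quota;
    uint64_t slice_ns;
} DemoLimit;

/* Lock the mutex and hand it back so it can initialize the guard variable. */
static inline pthread_mutex_t *demo_guard_acquire(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    (void)rc;
    return m;
}

/* Cleanup handler: the compiler passes a pointer to the guard variable. */
static inline void demo_guard_release(pthread_mutex_t **guard)
{
    if (*guard) {
        pthread_mutex_unlock(*guard);
    }
}

/*
 * Declares a scoped guard. The fixed variable name means at most one
 * guard per scope in this sketch.
 */
#define DEMO_LOCK_GUARD(m)                                      \
    __attribute__((cleanup(demo_guard_release), unused))        \
    pthread_mutex_t *demo_guard_ = demo_guard_acquire(m)

/* Reads the quota under the lock; both return paths drop the lock. */
static inline uint64_t demo_get_quota(DemoLimit *limit)
{
    DEMO_LOCK_GUARD(&limit->lock);
    if (limit->slice_ns == 0) {
        return 0;   /* early return: the cleanup handler still unlocks */
    }
    return limit->slice_quota;
}
```

The cleanup attribute is a compiler extension, so this sketch assumes GCC or Clang.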
Right now, rate limiting is protected by the AioContext mutex, which is
taken for example both by the block jobs and by qmp_block_job_set_speed
(via find_block_job).

We would like to remove the dependency of block layer code on the
AioContext mutex, since most drivers and the core I/O code are already
not relying on it. However, there is no existing lock that can easily
be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
especially because the latter might run in coroutine context (and
therefore under a CoMutex) but the former will not.

Since concurrent calls to ratelimit_calculate_delay are not possible,
one idea could be to use a seqlock to get a snapshot of slice_ns and
slice_quota. But for now keep it simple, and just add a mutex to the
RateLimit struct; block jobs are generally not performance critical to
the point of optimizing the clock cycles spent in synchronization.

This also requires the introduction of init/destroy functions, so
add them to the two users of ratelimit.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-copy.c       |  2 ++
 blockjob.c               |  3 +++
 include/qemu/ratelimit.h | 14 ++++++++++++++
 3 files changed, 19 insertions(+)
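
To make the init/set_speed/destroy lifecycle described above concrete, here is a minimal standalone sketch that mirrors the slice_quota arithmetic from the patch, with a pthread mutex standing in for QemuMutex. SketchLimit and the sketch_* functions are hypothetical names made up for this example. Unlike ratelimit_init in the patch, sketch_init also zeroes the fields, which is an assumption added so the sketch does not depend on a zeroing allocator.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for RateLimit, reduced to the fields set_speed writes. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t slice_quota;
    uint64_t slice_ns;
} SketchLimit;

/* Must run before any other call, like ratelimit_init in the patch. */
static inline void sketch_init(SketchLimit *limit)
{
    int rc = pthread_mutex_init(&limit->lock, NULL);
    assert(rc == 0);
    (void)rc;
    limit->slice_quota = 0;   /* assumption: the patch leaves fields alone */
    limit->slice_ns = 0;
}

/* Must run once no thread can touch the limit any more. */
static inline void sketch_destroy(SketchLimit *limit)
{
    pthread_mutex_destroy(&limit->lock);
}

/* speed is in units per second; the quota is the budget for one slice. */
static inline void sketch_set_speed(SketchLimit *limit, uint64_t speed,
                                    uint64_t slice_ns)
{
    double quota = ((double)speed * slice_ns) / 1000000000ULL;

    pthread_mutex_lock(&limit->lock);
    limit->slice_ns = slice_ns;
    /* Never let the quota drop to zero, matching MAX(..., 1) in the patch. */
    limit->slice_quota = quota > 1 ? (uint64_t)quota : 1;
    pthread_mutex_unlock(&limit->lock);
}

/* Reader side: take the same lock so both fields are seen consistently. */
static inline uint64_t sketch_get_quota(SketchLimit *limit)
{
    uint64_t quota;

    pthread_mutex_lock(&limit->lock);
    quota = limit->slice_quota;
    pthread_mutex_unlock(&limit->lock);
    return quota;
}
```

With a speed of 1048576 bytes per second and a 100 ms slice (100000000 ns), the quota comes out as 104857 bytes per slice after truncation; a speed of 1 byte per second is clamped to a quota of 1.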