Message ID: 20190108170655.29766-9-vsementsov@virtuozzo.com
State: New
Series: qcow2: encryption threads
On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Encryption will be done in threads, to take benefit of it, we should
> move it out of the lock first.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2.c | 35 +++++++++++++++++++++--------------
>  1 file changed, 21 insertions(+), 14 deletions(-)
>
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d6ef606d89..76d3715350 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>          ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>                                           &cluster_offset, &l2meta);
>          if (ret < 0) {
> -            goto fail;
> +            goto out_locked;
>          }
>
>          assert((cluster_offset & 511) == 0);
>
> +        ret = qcow2_pre_write_overlap_check(bs, 0,
> +                                            cluster_offset + offset_in_cluster,
> +                                            cur_bytes);
> +        if (ret < 0) {
> +            goto out_locked;
> +        }
> +
> +        qemu_co_mutex_unlock(&s->lock);

The usage of lock() and unlock() functions inside and outside of the
loop, plus the two 'locked' and 'unlocked' exit paths, is starting to
make things a bit messy.

Having a look at the code, it seems that there are only three parts in
the whole function that need to have the lock held:

one:
        ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                         &cluster_offset, &l2meta);
        /* ... */
        ret = qcow2_pre_write_overlap_check(bs, 0,
                                            cluster_offset +
                                            offset_in_cluster,
                                            cur_bytes);

two:
        ret = qcow2_handle_l2meta(bs, &l2meta, true);

three:
        qcow2_handle_l2meta(bs, &l2meta, false);

So I wonder if it's perhaps simpler to add lock/unlock calls around
those blocks?

Berto
18.01.2019 12:51, Alberto Garcia wrote:
> On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>> Encryption will be done in threads, to take benefit of it, we should
>> move it out of the lock first.
> [...]
> The usage of lock() and unlock() functions inside and outside of the
> loop plus the two 'locked' and 'unlocked' exit paths is starting to make
> things a bit more messy.
> [...]
> So I wonder if it's perhaps simpler to add lock/unlock calls around
> those blocks?

This means that we'll unlock after qcow2_handle_l2meta and then
immediately lock on the next iteration before qcow2_alloc_cluster_offset,
so the current code avoids that extra unlock/lock.
On Fri 18 Jan 2019 12:37:42 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> 18.01.2019 12:51, Alberto Garcia wrote:
>> [...]
>> So I wonder if it's perhaps simpler to add lock/unlock calls around
>> those blocks?
>
> this means, that we'll unlock after qcow2_handle_l2meta and then
> immediately lock on next iteration before qcow2_alloc_cluster_offset,
> so current code avoids this extra unlock/lock..

I don't have a very strong opinion, but maybe it's worth having if it
makes the code easier to read and maintain.

Berto
18.01.2019 17:39, Alberto Garcia wrote:
> On Fri 18 Jan 2019 12:37:42 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>> this means, that we'll unlock after qcow2_handle_l2meta and then
>> immediately lock on next iteration before qcow2_alloc_cluster_offset,
>> so current code avoids this extra unlock/lock..
>
> I don't have a very strong opinion, but maybe it's worth having if it
> makes the code easier to read and maintain.

Intuitively I think that the lock optimization is worth this extra
complexity, so I'd prefer to leave the logic as is, at least in this
series.
diff --git a/block/qcow2.c b/block/qcow2.c
index d6ef606d89..76d3715350 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &cluster_offset, &l2meta);
         if (ret < 0) {
-            goto fail;
+            goto out_locked;
         }

         assert((cluster_offset & 511) == 0);

+        ret = qcow2_pre_write_overlap_check(bs, 0,
+                                            cluster_offset + offset_in_cluster,
+                                            cur_bytes);
+        if (ret < 0) {
+            goto out_locked;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);

@@ -2093,7 +2102,7 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                                       * s->cluster_size);
                 if (cluster_data == NULL) {
                     ret = -ENOMEM;
-                    goto fail;
+                    goto out_unlocked;
                 }
             }

@@ -2108,40 +2117,34 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                       cluster_data, cur_bytes, NULL) < 0) {
                 ret = -EIO;
-                goto fail;
+                goto out_unlocked;
             }

             qemu_iovec_reset(&hd_qiov);
             qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
         }

-        ret = qcow2_pre_write_overlap_check(bs, 0,
-                cluster_offset + offset_in_cluster, cur_bytes);
-        if (ret < 0) {
-            goto fail;
-        }
-
         /* If we need to do COW, check if it's possible to merge the
          * writing of the guest data together with that of the COW regions.
          * If it's not possible (or not necessary) then write the
          * guest data now. */
         if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
-            qemu_co_mutex_unlock(&s->lock);
             BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
             trace_qcow2_writev_data(qemu_coroutine_self(),
                                     cluster_offset + offset_in_cluster);
             ret = bdrv_co_pwritev(bs->file,
                                   cluster_offset + offset_in_cluster,
                                   cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
-                goto fail;
+                goto out_unlocked;
             }
         }

+        qemu_co_mutex_lock(&s->lock);
+
         ret = qcow2_handle_l2meta(bs, &l2meta, true);
         if (ret) {
-            goto fail;
+            goto out_locked;
         }

         bytes -= cur_bytes;
@@ -2150,8 +2153,12 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
     }
     ret = 0;
+    goto out_locked;

-fail:
+out_unlocked:
+    qemu_co_mutex_lock(&s->lock);
+
+out_locked:
     qcow2_handle_l2meta(bs, &l2meta, false);

     qemu_co_mutex_unlock(&s->lock);
Encryption will be done in threads; to benefit from that, we should
move it out of the lock first.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)