
[v3,8/9] qcow2: bdrv_co_pwritev: move encryption code out of the lock

Message ID 20190108170655.29766-9-vsementsov@virtuozzo.com
State New
Series qcow2: encryption threads

Commit Message

Vladimir Sementsov-Ogievskiy Jan. 8, 2019, 5:06 p.m. UTC
Encryption will be done in threads; to benefit from that, we first need to
move it out of the lock.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/qcow2.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

Comments

Alberto Garcia Jan. 18, 2019, 9:51 a.m. UTC | #1
On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> Encryption will be done in threads, to take benefit of it, we should
> move it out of the lock first.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/qcow2.c | 35 +++++++++++++++++++++--------------
>  1 file changed, 21 insertions(+), 14 deletions(-)
>
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d6ef606d89..76d3715350 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>          ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>                                           &cluster_offset, &l2meta);
>          if (ret < 0) {
> -            goto fail;
> +            goto out_locked;
>          }
>  
>          assert((cluster_offset & 511) == 0);
>  
> +        ret = qcow2_pre_write_overlap_check(bs, 0,
> +                                            cluster_offset + offset_in_cluster,
> +                                            cur_bytes);
> +        if (ret < 0) {
> +            goto out_locked;
> +        }
> +
> +        qemu_co_mutex_unlock(&s->lock);

The use of lock() and unlock() calls both inside and outside the loop,
plus the two 'locked' and 'unlocked' exit paths, is starting to make
things a bit messy.

Looking at the code, it seems there are only three parts in the whole
function that need the lock held:

one:
   ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                    &cluster_offset, &l2meta);
   /* ... */
   ret = qcow2_pre_write_overlap_check(bs, 0,
                                       cluster_offset +
                                       offset_in_cluster,
                                       cur_bytes);

two:

   ret = qcow2_handle_l2meta(bs, &l2meta, true);


three:
   qcow2_handle_l2meta(bs, &l2meta, false);


So I wonder if it's perhaps simpler to add lock/unlock calls around
those blocks?

Berto
Vladimir Sementsov-Ogievskiy Jan. 18, 2019, 11:37 a.m. UTC | #2
18.01.2019 12:51, Alberto Garcia wrote:
> On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>> Encryption will be done in threads, to take benefit of it, we should
>> move it out of the lock first.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/qcow2.c | 35 +++++++++++++++++++++--------------
>>   1 file changed, 21 insertions(+), 14 deletions(-)
>>
>> diff --git a/block/qcow2.c b/block/qcow2.c
>> index d6ef606d89..76d3715350 100644
>> --- a/block/qcow2.c
>> +++ b/block/qcow2.c
>> @@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>>           ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>>                                            &cluster_offset, &l2meta);
>>           if (ret < 0) {
>> -            goto fail;
>> +            goto out_locked;
>>           }
>>   
>>           assert((cluster_offset & 511) == 0);
>>   
>> +        ret = qcow2_pre_write_overlap_check(bs, 0,
>> +                                            cluster_offset + offset_in_cluster,
>> +                                            cur_bytes);
>> +        if (ret < 0) {
>> +            goto out_locked;
>> +        }
>> +
>> +        qemu_co_mutex_unlock(&s->lock);
> 
> The usage of lock() and unlock() functions inside and outside of the
> loop plus the two 'locked' and 'unlocked' exit paths is starting to make
> things a bit more messy.
> 
> Having a look at the code it seems that there's only these three parts
> in the whole function that need to have the lock held:
> 
> one:
>     ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>                                      &cluster_offset, &l2meta);
>     /* ... */
>     ret = qcow2_pre_write_overlap_check(bs, 0,
>                                         cluster_offset +
>                                         offset_in_cluster,
>                                         cur_bytes);
> 
> two:
> 
>     ret = qcow2_handle_l2meta(bs, &l2meta, true);
> 
> 
> three:
>     qcow2_handle_l2meta(bs, &l2meta, false);
> 
> 
> So I wonder if it's perhaps simpler to add lock/unlock calls around
> those blocks?

This means we would unlock after qcow2_handle_l2meta and then immediately
re-lock on the next iteration before qcow2_alloc_cluster_offset; the
current code avoids that extra unlock/lock pair.


> 
> Berto
>
Alberto Garcia Jan. 18, 2019, 2:39 p.m. UTC | #3
On Fri 18 Jan 2019 12:37:42 PM CET, Vladimir Sementsov-Ogievskiy wrote:
> 18.01.2019 12:51, Alberto Garcia wrote:
>> On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>>> Encryption will be done in threads, to take benefit of it, we should
>>> move it out of the lock first.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   block/qcow2.c | 35 +++++++++++++++++++++--------------
>>>   1 file changed, 21 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>> index d6ef606d89..76d3715350 100644
>>> --- a/block/qcow2.c
>>> +++ b/block/qcow2.c
>>> @@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>>>           ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>>>                                            &cluster_offset, &l2meta);
>>>           if (ret < 0) {
>>> -            goto fail;
>>> +            goto out_locked;
>>>           }
>>>   
>>>           assert((cluster_offset & 511) == 0);
>>>   
>>> +        ret = qcow2_pre_write_overlap_check(bs, 0,
>>> +                                            cluster_offset + offset_in_cluster,
>>> +                                            cur_bytes);
>>> +        if (ret < 0) {
>>> +            goto out_locked;
>>> +        }
>>> +
>>> +        qemu_co_mutex_unlock(&s->lock);
>> 
>> The usage of lock() and unlock() functions inside and outside of the
>> loop plus the two 'locked' and 'unlocked' exit paths is starting to make
>> things a bit more messy.
>> 
>> Having a look at the code it seems that there's only these three parts
>> in the whole function that need to have the lock held:
>> 
>> one:
>>     ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>>                                      &cluster_offset, &l2meta);
>>     /* ... */
>>     ret = qcow2_pre_write_overlap_check(bs, 0,
>>                                         cluster_offset +
>>                                         offset_in_cluster,
>>                                         cur_bytes);
>> 
>> two:
>> 
>>     ret = qcow2_handle_l2meta(bs, &l2meta, true);
>> 
>> 
>> three:
>>     qcow2_handle_l2meta(bs, &l2meta, false);
>> 
>> 
>> So I wonder if it's perhaps simpler to add lock/unlock calls around
>> those blocks?
>
> this means, that we'll unlock after qcow2_handle_l2meta and then
> immediately lock on next iteration before qcow2_alloc_cluster_offset,
> so current code avoids this extra unlock/lock..

I don't have a very strong opinion, but the extra unlock/lock may be
worth having if it makes the code easier to read and maintain.

Berto
Vladimir Sementsov-Ogievskiy Jan. 18, 2019, 5:15 p.m. UTC | #4
18.01.2019 17:39, Alberto Garcia wrote:
> On Fri 18 Jan 2019 12:37:42 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>> 18.01.2019 12:51, Alberto Garcia wrote:
>>> On Tue 08 Jan 2019 06:06:54 PM CET, Vladimir Sementsov-Ogievskiy wrote:
>>>> Encryption will be done in threads, to take benefit of it, we should
>>>> move it out of the lock first.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>>> ---
>>>>    block/qcow2.c | 35 +++++++++++++++++++++--------------
>>>>    1 file changed, 21 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/block/qcow2.c b/block/qcow2.c
>>>> index d6ef606d89..76d3715350 100644
>>>> --- a/block/qcow2.c
>>>> +++ b/block/qcow2.c
>>>> @@ -2077,11 +2077,20 @@ static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
>>>>            ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>>>>                                             &cluster_offset, &l2meta);
>>>>            if (ret < 0) {
>>>> -            goto fail;
>>>> +            goto out_locked;
>>>>            }
>>>>    
>>>>            assert((cluster_offset & 511) == 0);
>>>>    
>>>> +        ret = qcow2_pre_write_overlap_check(bs, 0,
>>>> +                                            cluster_offset + offset_in_cluster,
>>>> +                                            cur_bytes);
>>>> +        if (ret < 0) {
>>>> +            goto out_locked;
>>>> +        }
>>>> +
>>>> +        qemu_co_mutex_unlock(&s->lock);
>>>
>>> The usage of lock() and unlock() functions inside and outside of the
>>> loop plus the two 'locked' and 'unlocked' exit paths is starting to make
>>> things a bit more messy.
>>>
>>> Having a look at the code it seems that there's only these three parts
>>> in the whole function that need to have the lock held:
>>>
>>> one:
>>>      ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
>>>                                       &cluster_offset, &l2meta);
>>>      /* ... */
>>>      ret = qcow2_pre_write_overlap_check(bs, 0,
>>>                                          cluster_offset +
>>>                                          offset_in_cluster,
>>>                                          cur_bytes);
>>>
>>> two:
>>>
>>>      ret = qcow2_handle_l2meta(bs, &l2meta, true);
>>>
>>>
>>> three:
>>>      qcow2_handle_l2meta(bs, &l2meta, false);
>>>
>>>
>>> So I wonder if it's perhaps simpler to add lock/unlock calls around
>>> those blocks?
>>
>> this means, that we'll unlock after qcow2_handle_l2meta and then
>> immediately lock on next iteration before qcow2_alloc_cluster_offset,
>> so current code avoids this extra unlock/lock..
> 
> I don't have a very strong opinion, but maybe it's worth having if it
> makes the code easier to read and maintain.
> 

Intuitively, I think the lock optimization is worth the added complexity,
so I'd prefer to leave this logic as is, at least in this series.

Patch

diff --git a/block/qcow2.c b/block/qcow2.c
index d6ef606d89..76d3715350 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2077,11 +2077,20 @@  static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
                                          &cluster_offset, &l2meta);
         if (ret < 0) {
-            goto fail;
+            goto out_locked;
         }
 
         assert((cluster_offset & 511) == 0);
 
+        ret = qcow2_pre_write_overlap_check(bs, 0,
+                                            cluster_offset + offset_in_cluster,
+                                            cur_bytes);
+        if (ret < 0) {
+            goto out_locked;
+        }
+
+        qemu_co_mutex_unlock(&s->lock);
+
         qemu_iovec_reset(&hd_qiov);
         qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
 
@@ -2093,7 +2102,7 @@  static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                                    * s->cluster_size);
                 if (cluster_data == NULL) {
                     ret = -ENOMEM;
-                    goto fail;
+                    goto out_unlocked;
                 }
             }
 
@@ -2108,40 +2117,34 @@  static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
                                       cluster_data,
                                       cur_bytes, NULL) < 0) {
                 ret = -EIO;
-                goto fail;
+                goto out_unlocked;
             }
 
             qemu_iovec_reset(&hd_qiov);
             qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
         }
 
-        ret = qcow2_pre_write_overlap_check(bs, 0,
-                cluster_offset + offset_in_cluster, cur_bytes);
-        if (ret < 0) {
-            goto fail;
-        }
-
         /* If we need to do COW, check if it's possible to merge the
          * writing of the guest data together with that of the COW regions.
          * If it's not possible (or not necessary) then write the
          * guest data now. */
         if (!merge_cow(offset, cur_bytes, &hd_qiov, l2meta)) {
-            qemu_co_mutex_unlock(&s->lock);
             BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
             trace_qcow2_writev_data(qemu_coroutine_self(),
                                     cluster_offset + offset_in_cluster);
             ret = bdrv_co_pwritev(bs->file,
                                   cluster_offset + offset_in_cluster,
                                   cur_bytes, &hd_qiov, 0);
-            qemu_co_mutex_lock(&s->lock);
             if (ret < 0) {
-                goto fail;
+                goto out_unlocked;
             }
         }
 
+        qemu_co_mutex_lock(&s->lock);
+
         ret = qcow2_handle_l2meta(bs, &l2meta, true);
         if (ret) {
-            goto fail;
+            goto out_locked;
         }
 
         bytes -= cur_bytes;
@@ -2150,8 +2153,12 @@  static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
         trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
     }
     ret = 0;
+    goto out_locked;
 
-fail:
+out_unlocked:
+    qemu_co_mutex_lock(&s->lock);
+
+out_locked:
     qcow2_handle_l2meta(bs, &l2meta, false);
 
     qemu_co_mutex_unlock(&s->lock);