diff mbox series

[v2,5/7] block/block-copy: block_copy(): add timeout_ns parameter

Message ID 20220401091920.287612-6-vsementsov@openvz.org
State New
Headers show
Series copy-before-write: on-cbw-error and cbw-timeout | expand

Commit Message

Vladimir Sementsov-Ogievskiy April 1, 2022, 9:19 a.m. UTC
Add possibility to limit block_copy() call in time. To be used in the
next commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
---
 block/block-copy.c         | 26 +++++++++++++++++++-------
 block/copy-before-write.c  |  2 +-
 include/block/block-copy.h |  2 +-
 3 files changed, 21 insertions(+), 9 deletions(-)

Comments

Hanna Czenczek April 1, 2022, 1:16 p.m. UTC | #1
On 01.04.22 11:19, Vladimir Sementsov-Ogievskiy wrote:
> Add possibility to limit block_copy() call in time. To be used in the
> next commit.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
> ---
>   block/block-copy.c         | 26 +++++++++++++++++++-------
>   block/copy-before-write.c  |  2 +-
>   include/block/block-copy.h |  2 +-
>   3 files changed, 21 insertions(+), 9 deletions(-)
>
> diff --git a/block/block-copy.c b/block/block-copy.c
> index ec46775ea5..b47cb188dd 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c

[...]

> @@ -894,12 +902,16 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
>           .max_workers = BLOCK_COPY_MAX_WORKERS,
>       };
>   
> -    return block_copy_common(&call_state);
> -}
> +    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, timeout_ns,
> +                          g_free);

A direct path for timeout_ns == 0 might still be nice to have.

> +    if (ret < 0) {
> +        /* Timeout. call_state will be freed by running coroutine. */

Maybe assert(ret == -ETIMEDOUT);?

> +        return ret;

If I’m right in understanding how qemu_co_timeout() works, 
block_copy_common() will continue to run here.  Shouldn’t we at least 
cancel it by setting call_state->cancelled to true?

(Besides this, I think that letting block_copy_common() running in the 
background should be OK.  I’m not sure what the implications are if we 
do cancel the call here, while on-cbw-error is break-guest-write, 
though.  Should be fine, I guess, because block_copy_common() will still 
correctly keep track of what it has successfully copied and what it hasn’t?)

> +    }
>   
> -static void coroutine_fn block_copy_async_co_entry(void *opaque)
> -{
> -    block_copy_common(opaque);
> +    ret = call_state->ret;
> +
> +    return ret;

But here we still need to free call_state, right?

>   }
>   
>   BlockCopyCallState *block_copy_async(BlockCopyState *s,
Hanna Czenczek April 1, 2022, 1:22 p.m. UTC | #2
On 01.04.22 15:16, Hanna Reitz wrote:
> On 01.04.22 11:19, Vladimir Sementsov-Ogievskiy wrote:
>> Add possibility to limit block_copy() call in time. To be used in the
>> next commit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block/block-copy.c         | 26 +++++++++++++++++++-------
>>   block/copy-before-write.c  |  2 +-
>>   include/block/block-copy.h |  2 +-
>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index ec46775ea5..b47cb188dd 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
>
> [...]
>
>> @@ -894,12 +902,16 @@ int coroutine_fn block_copy(BlockCopyState *s, 
>> int64_t start, int64_t bytes,
>>           .max_workers = BLOCK_COPY_MAX_WORKERS,
>>       };
>>   -    return block_copy_common(&call_state);
>> -}
>> +    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, 
>> timeout_ns,
>> +                          g_free);
>
> A direct path for timeout_ns == 0 might still be nice to have.

Ah, never mind, just saw that qemu_co_timeout() itself has a direct path 
for this.  Hadn’t noticed that before.
Vladimir Sementsov-Ogievskiy April 1, 2022, 4:08 p.m. UTC | #3
01.04.2022 16:16, Hanna Reitz wrote:
> On 01.04.22 11:19, Vladimir Sementsov-Ogievskiy wrote:
>> Add possibility to limit block_copy() call in time. To be used in the
>> next commit.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>> ---
>>   block/block-copy.c         | 26 +++++++++++++++++++-------
>>   block/copy-before-write.c  |  2 +-
>>   include/block/block-copy.h |  2 +-
>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>
>> diff --git a/block/block-copy.c b/block/block-copy.c
>> index ec46775ea5..b47cb188dd 100644
>> --- a/block/block-copy.c
>> +++ b/block/block-copy.c
> 
> [...]
> 
>> @@ -894,12 +902,16 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
>>           .max_workers = BLOCK_COPY_MAX_WORKERS,
>>       };
>> -    return block_copy_common(&call_state);
>> -}
>> +    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, timeout_ns,
>> +                          g_free);
> 
> A direct path for timeout_ns == 0 might still be nice to have.
> 
>> +    if (ret < 0) {
>> +        /* Timeout. call_state will be freed by running coroutine. */
> 
> Maybe assert(ret == -ETIMEDOUT);?

OK

> 
>> +        return ret;
> 
> If I’m right in understanding how qemu_co_timeout() works, block_copy_common() will continue to run here.  Shouldn’t we at least cancel it by setting call_state->cancelled to true?

Agree

> 
> (Besides this, I think that letting block_copy_common() running in the background should be OK.  I’m not sure what the implications are if we do cancel the call here, while on-cbw-error is break-guest-write, though.  Should be fine, I guess, because block_copy_common() will still correctly keep track of what it has successfully copied and what it hasn’t?)

Hmm. I now think, that we should at least wait for such cancelled background requests before block_copy_state_free in cbw_close(). But in "[PATCH v5 00/45] Transactional block-graph modifying API" I want to detach children from CBW filter before calling .close().. So, possible solution is to wait for all cancelled requests on .bdrv_co_drain_begin().

Or alternatively, may be just increase bs->in_flight for CBW filter for each background cancelled request? And decrease when it finish. For this we should add a kind of callback to be called when timed-out coroutine entry finish.

> 
>> +    }
>> -static void coroutine_fn block_copy_async_co_entry(void *opaque)
>> -{
>> -    block_copy_common(opaque);
>> +    ret = call_state->ret;
>> +
>> +    return ret;
> 
> But here we still need to free call_state, right?
> 
>>   }
>>   BlockCopyCallState *block_copy_async(BlockCopyState *s,
>
Hanna Czenczek April 4, 2022, 2:39 p.m. UTC | #4
On 01.04.22 18:08, Vladimir Sementsov-Ogievskiy wrote:
> 01.04.2022 16:16, Hanna Reitz wrote:
>> On 01.04.22 11:19, Vladimir Sementsov-Ogievskiy wrote:
>>> Add possibility to limit block_copy() call in time. To be used in the
>>> next commit.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>>> ---
>>>   block/block-copy.c         | 26 +++++++++++++++++++-------
>>>   block/copy-before-write.c  |  2 +-
>>>   include/block/block-copy.h |  2 +-
>>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index ec46775ea5..b47cb188dd 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>
>> [...]
>>
>>> @@ -894,12 +902,16 @@ int coroutine_fn block_copy(BlockCopyState *s, 
>>> int64_t start, int64_t bytes,
>>>           .max_workers = BLOCK_COPY_MAX_WORKERS,
>>>       };
>>> -    return block_copy_common(&call_state);
>>> -}
>>> +    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, 
>>> timeout_ns,
>>> +                          g_free);
>>
>> A direct path for timeout_ns == 0 might still be nice to have.
>>
>>> +    if (ret < 0) {
>>> +        /* Timeout. call_state will be freed by running coroutine. */
>>
>> Maybe assert(ret == -ETIMEDOUT);?
>
> OK
>
>>
>>> +        return ret;
>>
>> If I’m right in understanding how qemu_co_timeout() works, 
>> block_copy_common() will continue to run here.  Shouldn’t we at least 
>> cancel it by setting call_state->cancelled to true?
>
> Agree
>
>>
>> (Besides this, I think that letting block_copy_common() running in 
>> the background should be OK.  I’m not sure what the implications are 
>> if we do cancel the call here, while on-cbw-error is 
>> break-guest-write, though.  Should be fine, I guess, because 
>> block_copy_common() will still correctly keep track of what it has 
>> successfully copied and what it hasn’t?)
>
> Hmm. I now think, that we should at least wait for such cancelled 
> background requests before block_copy_state_free in cbw_close(). But 
> in "[PATCH v5 00/45] Transactional block-graph modifying API" I want 
> to detach children from CBW filter before calling .close().. So, 
> possible solution is to wait for all cancelled requests on 
> .bdrv_co_drain_begin().
>
> Or alternatively, may be just increase bs->in_flight for CBW filter 
> for each background cancelled request? And decrease when it finish. 
> For this we should add a kind of callback to be called when timed-out 
> coroutine entry finish.

in_flight sounds good to me.  That would automatically work for 
draining, right?
Vladimir Sementsov-Ogievskiy April 5, 2022, 11:33 a.m. UTC | #5
04.04.2022 17:39, Hanna Reitz wrote:
> On 01.04.22 18:08, Vladimir Sementsov-Ogievskiy wrote:
>> 01.04.2022 16:16, Hanna Reitz wrote:
>>> On 01.04.22 11:19, Vladimir Sementsov-Ogievskiy wrote:
>>>> Add possibility to limit block_copy() call in time. To be used in the
>>>> next commit.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
>>>> ---
>>>>   block/block-copy.c         | 26 +++++++++++++++++++-------
>>>>   block/copy-before-write.c  |  2 +-
>>>>   include/block/block-copy.h |  2 +-
>>>>   3 files changed, 21 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>>> index ec46775ea5..b47cb188dd 100644
>>>> --- a/block/block-copy.c
>>>> +++ b/block/block-copy.c
>>>
>>> [...]
>>>
>>>> @@ -894,12 +902,16 @@ int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
>>>>           .max_workers = BLOCK_COPY_MAX_WORKERS,
>>>>       };
>>>> -    return block_copy_common(&call_state);
>>>> -}
>>>> +    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, timeout_ns,
>>>> +                          g_free);
>>>
>>> A direct path for timeout_ns == 0 might still be nice to have.
>>>
>>>> +    if (ret < 0) {
>>>> +        /* Timeout. call_state will be freed by running coroutine. */
>>>
>>> Maybe assert(ret == -ETIMEDOUT);?
>>
>> OK
>>
>>>
>>>> +        return ret;
>>>
>>> If I’m right in understanding how qemu_co_timeout() works, block_copy_common() will continue to run here.  Shouldn’t we at least cancel it by setting call_state->cancelled to true?
>>
>> Agree
>>
>>>
>>> (Besides this, I think that letting block_copy_common() running in the background should be OK.  I’m not sure what the implications are if we do cancel the call here, while on-cbw-error is break-guest-write, though.  Should be fine, I guess, because block_copy_common() will still correctly keep track of what it has successfully copied and what it hasn’t?)
>>
>> Hmm. I now think, that we should at least wait for such cancelled background requests before block_copy_state_free in cbw_close(). But in "[PATCH v5 00/45] Transactional block-graph modifying API" I want to detach children from CBW filter before calling .close().. So, possible solution is to wait for all cancelled requests on .bdrv_co_drain_begin().
>>
>> Or alternatively, may be just increase bs->in_flight for CBW filter for each background cancelled request? And decrease when it finish. For this we should add a kind of callback to be called when timed-out coroutine entry finish.
> 
> in_flight sounds good to me.  That would automatically work for draining, right?
> 
> 

Yes it should
Vladimir Sementsov-Ogievskiy April 6, 2022, 4:10 p.m. UTC | #6
01.04.2022 16:16, Hanna Reitz wrote:
>> -static void coroutine_fn block_copy_async_co_entry(void *opaque)
>> -{
>> -    block_copy_common(opaque);
>> +    ret = call_state->ret;
>> +
>> +    return ret;
> 
> But here we still need to free call_state, right?

Right, will fix.
diff mbox series

Patch

diff --git a/block/block-copy.c b/block/block-copy.c
index ec46775ea5..b47cb188dd 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -883,10 +883,18 @@  static int coroutine_fn block_copy_common(BlockCopyCallState *call_state)
     return ret;
 }
 
+static void coroutine_fn block_copy_async_co_entry(void *opaque)
+{
+    block_copy_common(opaque);
+}
+
 int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
-                            bool ignore_ratelimit)
+                            bool ignore_ratelimit, uint64_t timeout_ns)
 {
-    BlockCopyCallState call_state = {
+    int ret;
+    BlockCopyCallState *call_state = g_new(BlockCopyCallState, 1);
+
+    *call_state = (BlockCopyCallState) {
         .s = s,
         .offset = start,
         .bytes = bytes,
@@ -894,12 +902,16 @@  int coroutine_fn block_copy(BlockCopyState *s, int64_t start, int64_t bytes,
         .max_workers = BLOCK_COPY_MAX_WORKERS,
     };
 
-    return block_copy_common(&call_state);
-}
+    ret = qemu_co_timeout(block_copy_async_co_entry, call_state, timeout_ns,
+                          g_free);
+    if (ret < 0) {
+        /* Timeout. call_state will be freed by running coroutine. */
+        return ret;
+    }
 
-static void coroutine_fn block_copy_async_co_entry(void *opaque)
-{
-    block_copy_common(opaque);
+    ret = call_state->ret;
+
+    return ret;
 }
 
 BlockCopyCallState *block_copy_async(BlockCopyState *s,
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 0614c3d08b..7ef3f9f4c1 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -107,7 +107,7 @@  static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
     off = QEMU_ALIGN_DOWN(offset, cluster_size);
     end = QEMU_ALIGN_UP(offset + bytes, cluster_size);
 
-    ret = block_copy(s->bcs, off, end - off, true);
+    ret = block_copy(s->bcs, off, end - off, true, 0);
     if (ret < 0 && s->on_cbw_error == ON_CBW_ERROR_BREAK_GUEST_WRITE) {
         return ret;
     }
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 68bbd344b2..1c9616cdee 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -40,7 +40,7 @@  int64_t block_copy_reset_unallocated(BlockCopyState *s,
                                      int64_t offset, int64_t *count);
 
 int coroutine_fn block_copy(BlockCopyState *s, int64_t offset, int64_t bytes,
-                            bool ignore_ratelimit);
+                            bool ignore_ratelimit, uint64_t timeout_ns);
 
 /*
  * Run block-copy in a coroutine, create corresponding BlockCopyCallState