[V2,RESEND] block/replication.c: Fix crash issue after failover

Message ID 20190621062843.1605-1-chen.zhang@intel.com
State New

Commit Message

Zhang, Chen June 21, 2019, 6:28 a.m. UTC
From: Zhang Chen <chen.zhang@intel.com>

If we try to close replication after failover, it will crash here.
So we need to check the block job on the active disk before cancelling the job.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 block/replication.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

John Snow June 26, 2019, 9:41 p.m. UTC | #1
On 6/21/19 2:28 AM, Zhang Chen wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> If we try to close replication after failover, it will crash here.
> So we need to check the block job on the active disk before cancelling the job.
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> ---
>  block/replication.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/block/replication.c b/block/replication.c
> index b41bc507c0..a68bc7e986 100644
> --- a/block/replication.c
> +++ b/block/replication.c
> @@ -149,7 +149,9 @@ static void replication_close(BlockDriverState *bs)
>          replication_stop(s->rs, false, NULL);
>      }
>      if (s->stage == BLOCK_REPLICATION_FAILOVER) {
> -        job_cancel_sync(&s->commit_job->job);
> +        if (s->commit_job) {
> +            job_cancel_sync(&s->commit_job->job);
> +        }
>      }
>  
>      if (s->mode == REPLICATION_MODE_SECONDARY) {
> 

I actually don't understand this right away.

The only place I see that sets commit_job is replication_stop, which
sets it immediately after s->stage = BLOCK_REPLICATION_FAILOVER.

So if we're here in replication_close, shouldn't we have a valid job object?

...unless we never succeeded in launching this commit job, but then
don't we have worse problems?

...Or, perhaps the job actually finished, but then we never cleared the
job variable in replication_done, but then I don't see why this if
statement would actually help us.

Can you share some details of the crash to help me understand it, and
why this patch helps?

--js
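
The cases raised in the review above can be sketched with a small toy
model. This is only an illustration under stated assumptions: ToyJob,
ToyState, toy_close and toy_job_done are hypothetical stand-ins, not
QEMU's real types or functions, and the model makes no claim about what
actually caused the reported crash. It shows that a NULL check only
helps when the pointer really is NULL while the stage is FAILOVER
(the job was never attached, or a completion path cleared it). It
cannot detect a stale pointer left behind by a finished job.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not QEMU's types. */
typedef struct ToyJob {
    bool cancelled;
} ToyJob;

typedef enum { TOY_STAGE_NONE, TOY_STAGE_FAILOVER } ToyStage;

typedef struct ToyState {
    ToyStage stage;
    ToyJob *commit_job; /* NULL until a job is attached */
} ToyState;

/*
 * Mirrors the shape of the guarded cancel in the patch: the job is
 * only touched when the pointer is non-NULL.
 * Returns true if a job was cancelled, false if there was nothing to do.
 */
static bool toy_close(ToyState *s)
{
    assert(s != NULL);
    if (s->stage == TOY_STAGE_FAILOVER && s->commit_job) {
        s->commit_job->cancelled = true;
        return true;
    }
    return false;
}

/*
 * Assumed completion path that clears the pointer. If a real
 * completion path freed the job without clearing the pointer, the
 * NULL check in toy_close() would pass and the stale pointer would
 * still be dereferenced, which is the reviewer's concern.
 */
static void toy_job_done(ToyState *s)
{
    assert(s != NULL);
    s->commit_job = NULL;
}
```

In this model, calling toy_close() on a state in TOY_STAGE_FAILOVER with
commit_job == NULL is a harmless no-op, and calling it after
toy_job_done() is likewise safe. Neither property would hold for a
dangling pointer, so the guard is sufficient only if every path that
ends the job also resets the pointer.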
Patch

diff --git a/block/replication.c b/block/replication.c
index b41bc507c0..a68bc7e986 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -149,7 +149,9 @@  static void replication_close(BlockDriverState *bs)
         replication_stop(s->rs, false, NULL);
     }
     if (s->stage == BLOCK_REPLICATION_FAILOVER) {
-        job_cancel_sync(&s->commit_job->job);
+        if (s->commit_job) {
+            job_cancel_sync(&s->commit_job->job);
+        }
     }
 
     if (s->mode == REPLICATION_MODE_SECONDARY) {