Message ID | 20110719115511.609C553E5@gandalf.tls.msk.ru |
---|---|
State | New |
Headers | show |
On 2011-07-19 13:46, Michael Tokarev wrote: > If we do, it results in double monitor_resume() (second being called > from migrate_fd_cleanup() anyway) and monitor suspend count becoming > negative. > > Cc'ing people from `git blame' list for the lines in question: the > change fixes the problem but I'm not sure what the original intention > of this code was in this place. Unfortunately noone replied to two > my attempts to raise this issue. > > Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru> > --- > migration.c | 3 --- > 1 files changed, 0 insertions(+), 3 deletions(-) > > diff --git a/migration.c b/migration.c > index af3a1f2..115588c 100644 > --- a/migration.c > +++ b/migration.c > @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) > if (ret == -EAGAIN) { > qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); > } else if (ret < 0) { > - if (s->mon) { > - monitor_resume(s->mon); > - } > s->state = MIG_STATE_ERROR; > notifier_list_notify(&migration_state_notifiers); > } Looks reasonable to me, but Marcelo should comment on this, specifically which scenario once required the resume. Jan
On Tue, Jul 19, 2011 at 11:48:22PM +0200, Jan Kiszka wrote: > On 2011-07-19 13:46, Michael Tokarev wrote: > > If we do, it results in double monitor_resume() (second being called > > from migrate_fd_cleanup() anyway) and monitor suspend count becoming > > negative. > > > > Cc'ing people from `git blame' list for the lines in question: the > > change fixes the problem but I'm not sure what the original intention > > of this code was in this place. Unfortunately noone replied to two > > my attempts to raise this issue. > > > > Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru> > > --- > > migration.c | 3 --- > > 1 files changed, 0 insertions(+), 3 deletions(-) > > > > diff --git a/migration.c b/migration.c > > index af3a1f2..115588c 100644 > > --- a/migration.c > > +++ b/migration.c > > @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) > > if (ret == -EAGAIN) { > > qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); > > } else if (ret < 0) { > > - if (s->mon) { > > - monitor_resume(s->mon); > > - } > > s->state = MIG_STATE_ERROR; > > notifier_list_notify(&migration_state_notifiers); > > } > > Looks reasonable to me, but Marcelo should comment on this, specifically > which scenario once required the resume. > > Jan If the monitor was suspended (migrate without -d), then this path must resume. Should record that somewhere and check here.
On 2011-07-20 18:34, Marcelo Tosatti wrote: > On Tue, Jul 19, 2011 at 11:48:22PM +0200, Jan Kiszka wrote: >> On 2011-07-19 13:46, Michael Tokarev wrote: >>> If we do, it results in double monitor_resume() (second being called >>> from migrate_fd_cleanup() anyway) and monitor suspend count becoming >>> negative. >>> >>> Cc'ing people from `git blame' list for the lines in question: the >>> change fixes the problem but I'm not sure what the original intention >>> of this code was in this place. Unfortunately noone replied to two >>> my attempts to raise this issue. >>> >>> Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru> >>> --- >>> migration.c | 3 --- >>> 1 files changed, 0 insertions(+), 3 deletions(-) >>> >>> diff --git a/migration.c b/migration.c >>> index af3a1f2..115588c 100644 >>> --- a/migration.c >>> +++ b/migration.c >>> @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) >>> if (ret == -EAGAIN) { >>> qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); >>> } else if (ret < 0) { >>> - if (s->mon) { >>> - monitor_resume(s->mon); >>> - } >>> s->state = MIG_STATE_ERROR; >>> notifier_list_notify(&migration_state_notifiers); >>> } >> >> Looks reasonable to me, but Marcelo should comment on this, specifically >> which scenario once required the resume. >> >> Jan > > If the monitor was suspended (migrate without -d), then this path must > resume. Should record that somewhere and check here. It's clear that we need to resume. The question is in which case migrate_fd_cleanup may not be called after an error in migrate_fd_put_buffer. Jan
So, can we decide on this somehow? I don't see a code path where we don't call monitor_resume at the end, so the "intermediate" monitor_resume can be dropped. This way we fix real bug. If there will be other problem from that, it can be fixed later - this will mean that code path is found... Should I resend the initial patch again? Thanks, /mjt 21.07.2011 02:06, Jan Kiszka wriote: > On 2011-07-20 18:34, Marcelo Tosatti wrote: >> On Tue, Jul 19, 2011 at 11:48:22PM +0200, Jan Kiszka wrote: >>> On 2011-07-19 13:46, Michael Tokarev wrote: >>>> If we do, it results in double monitor_resume() (second being called >>>> from migrate_fd_cleanup() anyway) and monitor suspend count becoming >>>> negative. >>>> >>>> Cc'ing people from `git blame' list for the lines in question: the >>>> change fixes the problem but I'm not sure what the original intention >>>> of this code was in this place. Unfortunately noone replied to two >>>> my attempts to raise this issue. >>>> >>>> Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru> >>>> --- >>>> migration.c | 3 --- >>>> 1 files changed, 0 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/migration.c b/migration.c >>>> index af3a1f2..115588c 100644 >>>> --- a/migration.c >>>> +++ b/migration.c >>>> @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) >>>> if (ret == -EAGAIN) { >>>> qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); >>>> } else if (ret < 0) { >>>> - if (s->mon) { >>>> - monitor_resume(s->mon); >>>> - } >>>> s->state = MIG_STATE_ERROR; >>>> notifier_list_notify(&migration_state_notifiers); >>>> } >>> >>> Looks reasonable to me, but Marcelo should comment on this, specifically >>> which scenario once required the resume. >>> >>> Jan >> >> If the monitor was suspended (migrate without -d), then this path must >> resume. Should record that somewhere and check here. > > It's clear that we need to resume. The question is in which case > migrate_fd_cleanup may not be called after an error in > migrate_fd_put_buffer. > > Jan >
On 2011-08-03 09:38, Michael Tokarev wrote: > So, can we decide on this somehow? I don't see a code > path where we don't call monitor_resume at the end, > so the "intermediate" monitor_resume can be dropped. > This way we fix real bug. If there will be other > problem from that, it can be fixed later - this will > mean that code path is found... I do not see any reason for the spurious resume as well, so you may add my Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> > > Should I resend the initial patch again? May help to finally trigger the merge. Direct it to Luiz Capitulino as the subsystem maintainer. Jan
diff --git a/migration.c b/migration.c index af3a1f2..115588c 100644 --- a/migration.c +++ b/migration.c @@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size) if (ret == -EAGAIN) { qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s); } else if (ret < 0) { - if (s->mon) { - monitor_resume(s->mon); - } s->state = MIG_STATE_ERROR; notifier_list_notify(&migration_state_notifiers); }
If we do, it results in double monitor_resume() (second being called from migrate_fd_cleanup() anyway) and monitor suspend count becoming negative. Cc'ing people from `git blame' list for the lines in question: the change fixes the problem but I'm not sure what the original intention of this code was in this place. Unfortunately noone replied to two my attempts to raise this issue. Signed-Off-By: Michael Tokarev <mjt@tls.msk.ru> --- migration.c | 3 --- 1 files changed, 0 insertions(+), 3 deletions(-)