Message ID | 1670228615-2684-1-git-send-email-baiyw2@chinatelecom.cn |
---|---|
State | New |
Series | hw/rtc: fix crash caused by lost_clock >= 0 assertion |
On Mon, Dec 05, 2022 at 04:23:35PM +0800, Yaowei Bai wrote:
> In our production environment a guest crashed with this log:
>
> qemu-kvm: /home/abuild/rpmbuild/BUILD/qemu-5.0.0/hw/rtc/mc146818rtc.c:201: periodic_timer_update: Assertion `lost_clock >= 0' failed.
> 2022-09-26 10:00:28.747+0000: shutting down, reason=crashed
>
> This happened after the host synced with the NTP server, whose time
> we had adjusted backward because it had mistakenly run ahead of real
> time. Other people also have this problem:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=2054781
>
> After the host adjusted its time backward, the guest reconfigured the
> period, which makes cur_clock smaller than last_periodic_clock in the
> periodic_timer_update function. However, the code assumes that cur_clock
> is bigger than last_periodic_clock, which is not true in the situation
> above. So we need to make this explicit by introducing an if statement.
> With this patch we handle this crash situation by just resetting the
> next_periodic_time.
>
> Signed-off-by: Yaowei Bai <baiyw2@chinatelecom.cn>

Hmm, not sure this is a good fix. Paolo, what's your take?

> ---
>  hw/rtc/mc146818rtc.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c
> index 1ebb412..a397949 100644
> --- a/hw/rtc/mc146818rtc.c
> +++ b/hw/rtc/mc146818rtc.c
> @@ -199,7 +199,9 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period, bo
>              next_periodic_clock = muldiv64(s->next_periodic_time,
>                                      RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
>              last_periodic_clock = next_periodic_clock - old_period;
> -            lost_clock = cur_clock - last_periodic_clock;
> +            if (cur_clock > last_periodic_clock) {
> +                lost_clock = cur_clock - last_periodic_clock;
> +            }
>              assert(lost_clock >= 0);
>          }
>
> --
> 2.7.4
diff --git a/hw/rtc/mc146818rtc.c b/hw/rtc/mc146818rtc.c
index 1ebb412..a397949 100644
--- a/hw/rtc/mc146818rtc.c
+++ b/hw/rtc/mc146818rtc.c
@@ -199,7 +199,9 @@ periodic_timer_update(RTCState *s, int64_t current_time, uint32_t old_period, bo
             next_periodic_clock = muldiv64(s->next_periodic_time,
                                     RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
             last_periodic_clock = next_periodic_clock - old_period;
-            lost_clock = cur_clock - last_periodic_clock;
+            if (cur_clock > last_periodic_clock) {
+                lost_clock = cur_clock - last_periodic_clock;
+            }
             assert(lost_clock >= 0);
         }
In our production environment a guest crashed with this log:

qemu-kvm: /home/abuild/rpmbuild/BUILD/qemu-5.0.0/hw/rtc/mc146818rtc.c:201: periodic_timer_update: Assertion `lost_clock >= 0' failed.
2022-09-26 10:00:28.747+0000: shutting down, reason=crashed

This happened after the host synced with the NTP server, whose time
we had adjusted backward because it had mistakenly run ahead of real
time. Other people also have this problem:

https://bugzilla.redhat.com/show_bug.cgi?id=2054781

After the host adjusted its time backward, the guest reconfigured the
period, which makes cur_clock smaller than last_periodic_clock in the
periodic_timer_update function. However, the code assumes that cur_clock
is bigger than last_periodic_clock, which is not true in the situation
above. So we need to make this explicit by introducing an if statement.
With this patch we handle this crash situation by just resetting the
next_periodic_time.

Signed-off-by: Yaowei Bai <baiyw2@chinatelecom.cn>
---
 hw/rtc/mc146818rtc.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
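
For readers following the arithmetic, here is a minimal standalone sketch of the guarded computation described above. It is not the QEMU source: scale_u64() is a hypothetical stand-in for muldiv64(), compute_lost_clock() is a hypothetical helper, and the values of RTC_CLOCK_RATE (32768) and NANOSECONDS_PER_SECOND are assumed here. It also assumes lost_clock starts at 0, so a backward step leaves it at 0 instead of going negative. It only illustrates the behaviour of the proposed guard and does not settle whether this is the right fix.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for this sketch only. */
#define RTC_CLOCK_RATE 32768
#define NANOSECONDS_PER_SECOND 1000000000LL

/*
 * Hypothetical stand-in for muldiv64(): computes a * b / c (c > 0),
 * splitting the division so the intermediate product does not overflow
 * for the nanosecond values used here.
 */
static uint64_t scale_u64(uint64_t a, uint32_t b, uint32_t c)
{
    return (a / c) * b + (a % c) * b / c;
}

/*
 * Hypothetical helper mirroring the patched hunk: converts both
 * nanosecond timestamps to RTC ticks and returns the number of ticks
 * elapsed since the last periodic tick, or 0 if the clock went backward.
 */
static int64_t compute_lost_clock(int64_t current_time_ns,
                                  int64_t next_periodic_time_ns,
                                  uint32_t old_period)
{
    int64_t cur_clock = (int64_t)scale_u64((uint64_t)current_time_ns,
                                           RTC_CLOCK_RATE,
                                           NANOSECONDS_PER_SECOND);
    int64_t next_periodic_clock =
        (int64_t)scale_u64((uint64_t)next_periodic_time_ns,
                           RTC_CLOCK_RATE, NANOSECONDS_PER_SECOND);
    int64_t last_periodic_clock = next_periodic_clock - old_period;
    int64_t lost_clock = 0; /* assumed initial value */

    /* Guard from the patch: only subtract when time moved forward. */
    if (cur_clock > last_periodic_clock) {
        lost_clock = cur_clock - last_periodic_clock;
    }
    assert(lost_clock >= 0);
    return lost_clock;
}
```

With next_periodic_time at 10 s and a 32-tick period, a current_time of 10 s gives lost_clock = 32. A current_time of 9 s (the clock stepped back) gives 0, where the unguarded subtraction would have produced a negative value and tripped the assertion.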