Message ID | 1431077706-3560-1-git-send-email-adrianhuang0701@gmail.com |
---|---|
State | Superseded |
Headers | show |
Hi, On 08/05/2015 at 17:35:06 +0800, Adrian Huang wrote : > Steps to reproduce the problem: > 1) Enable RTC wake-up option in BIOS Setup > 2) Issue one of these commands in the OS: "poweroff" > or "shutdown -h now" > 3) System will shut down and then reboot automatically > > Root-cause of the issue: > 1) During the shutdown process, the hwclock utility is used > to save the system clock to hardware clock (RTC). > 2) The hwclock utility invokes ioctl() with RTC_UIE_ON. The > kernel configures the RTC alarm for the periodic interrupt > (every 1 second). > 3) The hwclock uitlity closes the /dev/rtc0 device, and the > kernel disables the RTC alarm irq (AIE bit of Register B) > via ioctl() with RTC_UIE_OFF. But, the configured alarm > time is the current_time + 1. > 4) After the next 1 second is elapsed, the AF (alarm > interrupt flag) of Register C is set. > 5) The S5 handler in BIOS is invoked to configure alarm > registers (enable AIE bit and configure alarm date/time). > But, BIOS does not clear the previous interrupt status > during alarm configuration. Therefore, "AF=AIE=1" causes > the rtc device to trigger an interrupt. > 6) So, the machine reboots automatically right after shutdown. > > This patch restores the configured alarm time (user configures the > time in BIOS Setup) to rtc alarm registers. In some circumstances, > the time of the rtc alarm registers is the past time because > user-space programs (for example: hwclock) may invoke ioctl() with > RTC_UIE_ON. In any case, this patch prevents the AF bit from getting > set to 1. Note, AF=1 will cause the system to reboot after shut down. > Therefore, this patch fixes the issue from occurring. > Actually, I'm not sure that solves your issue. From what I understand, what is happening in step 4 is the bug you want to work around and it happens because AF gets set even if AIE is not set. So, your rtc_alarm_restore() basically calls cmos_set_alarm() which calls cmos_irq_disable() finally this one calls cmos_checkintr() which does read register C and so clears AF. What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly. Is it correctly called at shutdown?
Hi, > Hi, > > On 08/05/2015 at 17:35:06 +0800, Adrian Huang wrote : >> Steps to reproduce the problem: >> 1) Enable RTC wake-up option in BIOS Setup >> 2) Issue one of these commands in the OS: "poweroff" >> or "shutdown -h now" >> 3) System will shut down and then reboot automatically >> >> Root-cause of the issue: >> 1) During the shutdown process, the hwclock utility is used >> to save the system clock to hardware clock (RTC). >> 2) The hwclock utility invokes ioctl() with RTC_UIE_ON. The >> kernel configures the RTC alarm for the periodic interrupt >> (every 1 second). >> 3) The hwclock uitlity closes the /dev/rtc0 device, and the >> kernel disables the RTC alarm irq (AIE bit of Register B) >> via ioctl() with RTC_UIE_OFF. But, the configured alarm >> time is the current_time + 1. >> 4) After the next 1 second is elapsed, the AF (alarm >> interrupt flag) of Register C is set. >> 5) The S5 handler in BIOS is invoked to configure alarm >> registers (enable AIE bit and configure alarm date/time). >> But, BIOS does not clear the previous interrupt status >> during alarm configuration. Therefore, "AF=AIE=1" causes >> the rtc device to trigger an interrupt. >> 6) So, the machine reboots automatically right after shutdown. >> >> This patch restores the configured alarm time (user configures the >> time in BIOS Setup) to rtc alarm registers. In some circumstances, >> the time of the rtc alarm registers is the past time because >> user-space programs (for example: hwclock) may invoke ioctl() with >> RTC_UIE_ON. In any case, this patch prevents the AF bit from getting >> set to 1. Note, AF=1 will cause the system to reboot after shut down. >> Therefore, this patch fixes the issue from occurring. >> > > Actually, I'm not sure that solves your issue. From what I > understand, what is happening in step 4 is the bug you want to work > around and it happens because AF gets set even if AIE is not set. > Yes. > So, your rtc_alarm_restore() basically calls cmos_set_alarm() which > calls cmos_irq_disable() finally this one calls cmos_checkintr() which > does read register C and so clears AF. > The call path is correct. cmos_checkintr() did clear interrupt flags. But, the problem is that AF is set in the next one second. Assume 'n' is the current time. 1) The call path is finished at 'n' (current time). It clears the interrupt flags. 2) At n + 1, AF is set because the HW alarm time is programmed at n +1. This leads to the problem. > What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly. > Is it correctly called at shutdown? > The function cmos_pnp_shutdown() simply returns because cmos_poweroff() always returns 0 (CONFIG_PM and CONFIG_PNP are set). The cmos_do_shutdown() is not invoked at shutdown. We did not get a chance to clear interrupt flag. Here is the callpath: kernel_restart_prepare() -> device_shutdown() -> cmos_pnp_shutdown() [Experiment 1 - SLES11 SP3 with upstream kernel (without my patch)] The revised code in cmos_pnp_shutdown() is shown as follows. It ensures that cmos_do_shutdown() is invoked, which gets the last chance to clear the interrupt flag before powering off the machine. #if 0 if (system_state == SYSTEM_POWER_OFF && !cmos_poweroff(dev)) return; #endif cmos_poweroff(dev); The issue can be still reproduced. This implies that the time cost between hwclock and cmos_do_shutdown() is within one second. So, AF is not set yet. [Experiment 2 - SLES11 SP3 with upstream kernel (without my patch)] Still apply the code snippet in Experiment 1. The system stayed power off when I intentionally added 'sleep 3' command in the end of the 'stop' service of /etc/init.d/boot.clock (This script uses the hwclock utility to set hardware clock to the current system time). This proves that cmos_do_shutdown() clears AF generated by RTC HW. The system always stayed power off with 10-time shutdown tests. [Experiment 3 - SLES11 SP3 with upstream kernel (without my patch)] Do not apply the code snippet in Experiment 1. Also added 'sleep 3' command in the end of the 'stop' service of /etc/init.d/boot.clock. The issue can be reproduced. This proves that cmos_pnp_shutdown() simply returns without invoking cmos_do_shutdown(). So, AF is not cleared. Experiments 1-3 show that the issue happens if the time cost between hwclock and cmos_do_shutdown() is within one second. If we want to clear the interrupt flag for every shutdown, we need to re-write the cmos_pnp_shutdown() to ensure the cmos_do_shutdown() is invoke for every shutdown. Although my patch can work around the issue, kindly let me know if you want to re-work cmos_pnp_shutdown(). We can come out a better way to invoke cmos_do_shutdown() for every shutdown. But, be remember, the issue still happens even though we make sure cmos_do_shutdown() is invoked for every shutdown. The problem is the one second interval. -- Adrian
Hi, On 20/05/2015 at 12:30:27 +0800, Huang Adrian wrote : > > > > Actually, I'm not sure that solves your issue. From what I > > understand, what is happening in step 4 is the bug you want to work > > around and it happens because AF gets set even if AIE is not set. > > > > Yes. > > > So, your rtc_alarm_restore() basically calls cmos_set_alarm() which > > calls cmos_irq_disable() finally this one calls cmos_checkintr() which > > does read register C and so clears AF. > > > > The call path is correct. cmos_checkintr() did clear interrupt flags. But, > the problem is that AF is set in the next one second. > > Assume 'n' is the current time. > > 1) The call path is finished at 'n' (current time). It clears the > interrupt flags. > 2) At n + 1, AF is set because the HW alarm time is programmed at n +1. > > This leads to the problem. > > > What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly. > > Is it correctly called at shutdown? > > > > The function cmos_pnp_shutdown() simply returns because > cmos_poweroff() always returns 0 (CONFIG_PM and CONFIG_PNP are > set). The cmos_do_shutdown() is not invoked at shutdown. We did not > get a chance to clear interrupt flag. Here is the callpath: > > kernel_restart_prepare() -> device_shutdown() -> cmos_pnp_shutdown() > > [Experiment 1 - SLES11 SP3 with upstream kernel (without my patch)] > The revised code in cmos_pnp_shutdown() is shown as follows. It > ensures that cmos_do_shutdown() is invoked, which gets the last chance > to clear the interrupt flag before powering off the machine. > > #if 0 > if (system_state == SYSTEM_POWER_OFF && !cmos_poweroff(dev)) > return; > #endif > cmos_poweroff(dev); > > > The issue can be still reproduced. This implies that the time cost > between hwclock and cmos_do_shutdown() is within one second. > So, AF is not set yet. > > [Experiment 2 - SLES11 SP3 with upstream kernel (without my patch)] > Still apply the code snippet in Experiment 1. The system stayed > power off when I intentionally added 'sleep 3' command in the end of > the 'stop' service of /etc/init.d/boot.clock (This script uses the > hwclock utility to set hardware clock > to the current system time). This proves that cmos_do_shutdown() > clears AF generated by RTC HW. > > The system always stayed power off with 10-time shutdown tests. > > [Experiment 3 - SLES11 SP3 with upstream kernel (without my patch)] > Do not apply the code snippet in Experiment 1. Also added 'sleep 3' > command in the end of the 'stop' service of /etc/init.d/boot.clock. > The issue can be reproduced. This proves that cmos_pnp_shutdown() > simply returns without invoking cmos_do_shutdown(). So, AF is not > cleared. > > Experiments 1-3 show that the issue happens if the time cost > between hwclock and cmos_do_shutdown() is within one second. > If we want to clear the interrupt flag for every shutdown, we need > to re-write the cmos_pnp_shutdown() to ensure the > cmos_do_shutdown() is invoke for every shutdown. > > Although my patch can work around the issue, kindly let me know > if you want to re-work cmos_pnp_shutdown(). We can come out a > better way to invoke cmos_do_shutdown() for every shutdown. But, > be remember, the issue still happens even though we make sure > cmos_do_shutdown() is invoked for every shutdown. The problem > is the one second interval. > That's what I understood. How I see it is the following: your patch is reading and setting the alarm then clearing AF at a later point in the shutdown. This may as well be before that one second has elapsed so I don't feel this is a proper fix. Also, other RTCs don't suffer from that issue and this may actually break some of them. What I would do is ensure that AF gets cleared in cmos_do_shutdown() after setting the alarm to now or now - 1 when it is not used for wakeup/restart. Regards,
Hi On Fri, May 22, 2015 at 6:04 PM, Alexandre Belloni <alexandre.belloni@free-electrons.com> wrote: > Hi, > > That's what I understood. How I see it is the following: your patch is > reading and setting the alarm then clearing AF at a later point in the > shutdown. This may as well be before that one second has elapsed so I > don't feel this is a proper fix. Also, other RTCs don't suffer from that > issue and this may actually break some of them. > > What I would do is ensure that AF gets cleared in cmos_do_shutdown() > after setting the alarm to now or now - 1 when it is not used for > wakeup/restart. > Good suggestion. I've prepared a patch based on your suggestion. It's still under my local test. I'll submit the patch later after testing. Thanks for the suggestion! -- Adrian
diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c index 166fc60..2fe17da 100644 --- a/drivers/rtc/interface.c +++ b/drivers/rtc/interface.c @@ -986,4 +986,25 @@ int rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer *timer) return ret; } +/* rtc_alarm_restore - Restores the alarm time + * @ rtc: rtc device to be used + * + * Kernel interface to restore the alarm time + */ +int rtc_alarm_restore(struct rtc_device *rtc) +{ + struct rtc_wkalrm aie_alarm; + int err; + /* If someone has configured the AIE timer, do nothing. */ + if (rtc->aie_timer.enabled) + return 0; + + /* Read the alarm date/time from aie_timer. */ + err = rtc_read_alarm(rtc, &aie_alarm); + if (err < 0) + return err; + + return __rtc_set_alarm(rtc, &aie_alarm); +} +EXPORT_SYMBOL_GPL(rtc_alarm_restore); diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c index 799c34b..a5ea279 100644 --- a/drivers/rtc/rtc-dev.c +++ b/drivers/rtc/rtc-dev.c @@ -437,6 +437,7 @@ static int rtc_dev_release(struct inode *inode, struct file *file) rtc_dev_ioctl(file, RTC_UIE_OFF, 0); rtc_update_irq_enable(rtc, 0); rtc_irq_set_state(rtc, NULL, 0); + rtc_alarm_restore(rtc); if (rtc->ops->release) rtc->ops->release(rtc->dev.parent); diff --git a/include/linux/rtc.h b/include/linux/rtc.h index 8dcf682..bf945cb 100644 --- a/include/linux/rtc.h +++ b/include/linux/rtc.h @@ -188,6 +188,7 @@ extern int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled); extern int rtc_alarm_irq_enable(struct rtc_device *rtc, unsigned int enabled); extern int rtc_dev_update_irq_enable_emul(struct rtc_device *rtc, unsigned int enabled); +extern int rtc_alarm_restore(struct rtc_device *rtc); void rtc_handle_legacy_irq(struct rtc_device *rtc, int num, int mode); void rtc_aie_update_irq(void *private);