diff mbox

[RFC,2/2] rtc: Restore the RTC alarm time to the configured alarm time in BIOS Setup

Message ID 1431077706-3560-1-git-send-email-adrianhuang0701@gmail.com
State Superseded
Headers show

Commit Message

Huang Adrian May 8, 2015, 9:35 a.m. UTC
Steps to reproduce the problem:
	1) Enable RTC wake-up option in BIOS Setup
	2) Issue one of these commands in the OS: "poweroff" 
	   or "shutdown -h now"
	3) System will shut down and then reboot automatically

Root-cause of the issue:
	1) During the shutdown process, the hwclock utility is used
	   to save the system clock to hardware clock (RTC).
	2) The hwclock utility invokes ioctl() with RTC_UIE_ON. The
	   kernel configures the RTC alarm for the periodic interrupt
	   (every 1 second).
	3) The hwclock uitlity closes the /dev/rtc0 device, and the
	   kernel disables the RTC alarm irq (AIE bit of Register B)
	   via ioctl() with RTC_UIE_OFF. But, the configured alarm
	   time is the current_time + 1.
	4) After the next 1 second is elapsed, the AF (alarm
	   interrupt flag) of Register C is set.
	5) The S5 handler in BIOS is invoked to configure alarm
	   registers (enable AIE bit and configure alarm date/time).
	   But, BIOS does not clear the previous interrupt status
	   during alarm configuration. Therefore, "AF=AIE=1" causes 
	   the rtc device to trigger an interrupt.  
	6) So, the machine reboots automatically right after shutdown.

This patch restores the configured alarm time (user configures the
time in BIOS Setup) to rtc alarm registers. In some circumstances,
the time of the rtc alarm registers is the past time because
user-space programs (for example: hwclock) may invoke ioctl() with
RTC_UIE_ON. In any case, this patch prevents the AF bit from getting
set to 1. Note, AF=1 will cause the system to reboot after shut down.
Therefore, this patch fixes the issue from occurring.

Signed-off-by: Adrian Huang <ahuang12@lenovo.com>
Reviewed-by: Nagananda Chumbalkar <nchumbalkar@lenovo.com>
---
 drivers/rtc/interface.c | 21 +++++++++++++++++++++
 drivers/rtc/rtc-dev.c   |  1 +
 include/linux/rtc.h     |  1 +
 3 files changed, 23 insertions(+)

Comments

Alexandre Belloni May 18, 2015, 10:05 p.m. UTC | #1
Hi,

On 08/05/2015 at 17:35:06 +0800, Adrian Huang wrote :
> Steps to reproduce the problem:
> 	1) Enable RTC wake-up option in BIOS Setup
> 	2) Issue one of these commands in the OS: "poweroff" 
> 	   or "shutdown -h now"
> 	3) System will shut down and then reboot automatically
> 
> Root-cause of the issue:
> 	1) During the shutdown process, the hwclock utility is used
> 	   to save the system clock to hardware clock (RTC).
> 	2) The hwclock utility invokes ioctl() with RTC_UIE_ON. The
> 	   kernel configures the RTC alarm for the periodic interrupt
> 	   (every 1 second).
> 	3) The hwclock uitlity closes the /dev/rtc0 device, and the
> 	   kernel disables the RTC alarm irq (AIE bit of Register B)
> 	   via ioctl() with RTC_UIE_OFF. But, the configured alarm
> 	   time is the current_time + 1.
> 	4) After the next 1 second is elapsed, the AF (alarm
> 	   interrupt flag) of Register C is set.
> 	5) The S5 handler in BIOS is invoked to configure alarm
> 	   registers (enable AIE bit and configure alarm date/time).
> 	   But, BIOS does not clear the previous interrupt status
> 	   during alarm configuration. Therefore, "AF=AIE=1" causes 
> 	   the rtc device to trigger an interrupt.  
> 	6) So, the machine reboots automatically right after shutdown.
> 
> This patch restores the configured alarm time (user configures the
> time in BIOS Setup) to rtc alarm registers. In some circumstances,
> the time of the rtc alarm registers is the past time because
> user-space programs (for example: hwclock) may invoke ioctl() with
> RTC_UIE_ON. In any case, this patch prevents the AF bit from getting
> set to 1. Note, AF=1 will cause the system to reboot after shut down.
> Therefore, this patch fixes the issue from occurring.
> 

Actually, I'm not sure that solves your issue. From what I
understand, what is happening in step 4 is the bug you want to work
around and it happens because AF gets set even if AIE is not set.

So, your rtc_alarm_restore() basically calls cmos_set_alarm() which
calls cmos_irq_disable() finally this one calls cmos_checkintr() which
does read register C and so clears AF.

What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly.
Is it correctly called at shutdown?
Huang Adrian May 20, 2015, 4:30 a.m. UTC | #2
Hi,
> Hi,
>
> On 08/05/2015 at 17:35:06 +0800, Adrian Huang wrote :
>> Steps to reproduce the problem:
>>       1) Enable RTC wake-up option in BIOS Setup
>>       2) Issue one of these commands in the OS: "poweroff"
>>          or "shutdown -h now"
>>       3) System will shut down and then reboot automatically
>>
>> Root-cause of the issue:
>>       1) During the shutdown process, the hwclock utility is used
>>          to save the system clock to hardware clock (RTC).
>>       2) The hwclock utility invokes ioctl() with RTC_UIE_ON. The
>>          kernel configures the RTC alarm for the periodic interrupt
>>          (every 1 second).
>>       3) The hwclock uitlity closes the /dev/rtc0 device, and the
>>          kernel disables the RTC alarm irq (AIE bit of Register B)
>>          via ioctl() with RTC_UIE_OFF. But, the configured alarm
>>          time is the current_time + 1.
>>       4) After the next 1 second is elapsed, the AF (alarm
>>          interrupt flag) of Register C is set.
>>       5) The S5 handler in BIOS is invoked to configure alarm
>>          registers (enable AIE bit and configure alarm date/time).
>>          But, BIOS does not clear the previous interrupt status
>>          during alarm configuration. Therefore, "AF=AIE=1" causes
>>          the rtc device to trigger an interrupt.
>>       6) So, the machine reboots automatically right after shutdown.
>>
>> This patch restores the configured alarm time (user configures the
>> time in BIOS Setup) to rtc alarm registers. In some circumstances,
>> the time of the rtc alarm registers is the past time because
>> user-space programs (for example: hwclock) may invoke ioctl() with
>> RTC_UIE_ON. In any case, this patch prevents the AF bit from getting
>> set to 1. Note, AF=1 will cause the system to reboot after shut down.
>> Therefore, this patch fixes the issue from occurring.
>>
>
> Actually, I'm not sure that solves your issue. From what I
> understand, what is happening in step 4 is the bug you want to work
> around and it happens because AF gets set even if AIE is not set.
>

Yes.

> So, your rtc_alarm_restore() basically calls cmos_set_alarm() which
> calls cmos_irq_disable() finally this one calls cmos_checkintr() which
> does read register C and so clears AF.
>

The call path is correct. cmos_checkintr() did clear interrupt flags. But,
the problem is that AF is set in the next one second.

Assume 'n' is the current time.

1) The call path is finished at 'n' (current time). It clears the
interrupt flags.
2) At n + 1, AF is set because the HW alarm time is programmed at n +1.

This leads to the problem.

> What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly.
> Is it correctly called at shutdown?
>

The function cmos_pnp_shutdown() simply returns because
cmos_poweroff() always returns 0 (CONFIG_PM and CONFIG_PNP are
set). The cmos_do_shutdown() is not invoked at shutdown. We did not
get a chance to clear interrupt flag. Here is the callpath:

kernel_restart_prepare() -> device_shutdown() -> cmos_pnp_shutdown()

[Experiment 1 - SLES11 SP3 with upstream kernel (without my patch)]
The revised code in cmos_pnp_shutdown() is shown as follows. It
ensures that cmos_do_shutdown() is invoked, which gets the last chance
to clear the interrupt flag before powering off the machine.

#if 0
        if (system_state == SYSTEM_POWER_OFF && !cmos_poweroff(dev))
                return;
#endif
        cmos_poweroff(dev);


The issue can be still reproduced. This implies that the time cost
between hwclock and cmos_do_shutdown() is within one second.
So, AF is not set yet.

[Experiment 2 - SLES11 SP3 with upstream kernel (without my patch)]
Still apply the code snippet in Experiment 1. The system stayed
power off when I intentionally added 'sleep 3' command in the end of
the 'stop' service of /etc/init.d/boot.clock (This script uses the
hwclock utility to set hardware clock
to the current system time). This proves that cmos_do_shutdown()
clears AF generated by RTC HW.

The system always stayed power off with 10-time shutdown tests.

[Experiment 3 - SLES11 SP3 with upstream kernel (without my patch)]
Do not apply the code snippet in Experiment 1. Also added 'sleep 3'
command in the end of the 'stop' service of /etc/init.d/boot.clock.
The issue can be reproduced. This proves that cmos_pnp_shutdown()
simply returns without invoking cmos_do_shutdown(). So, AF is not
cleared.

Experiments 1-3 show that the issue happens if the time cost
between hwclock and cmos_do_shutdown() is within one second.
If we want to clear the interrupt flag for every shutdown, we need
to re-write the cmos_pnp_shutdown() to ensure the
cmos_do_shutdown() is invoke for every shutdown.

Although my patch can work around the issue, kindly let me know
if you want to re-work cmos_pnp_shutdown(). We can come out a
better way to invoke cmos_do_shutdown() for every shutdown. But,
be remember, the issue still happens even though we make sure
cmos_do_shutdown() is invoked for every shutdown. The problem
is the one second interval.

-- Adrian
Alexandre Belloni May 22, 2015, 10:04 a.m. UTC | #3
Hi,

On 20/05/2015 at 12:30:27 +0800, Huang Adrian wrote :
> >
> > Actually, I'm not sure that solves your issue. From what I
> > understand, what is happening in step 4 is the bug you want to work
> > around and it happens because AF gets set even if AIE is not set.
> >
> 
> Yes.
> 
> > So, your rtc_alarm_restore() basically calls cmos_set_alarm() which
> > calls cmos_irq_disable() finally this one calls cmos_checkintr() which
> > does read register C and so clears AF.
> >
> 
> The call path is correct. cmos_checkintr() did clear interrupt flags. But,
> the problem is that AF is set in the next one second.
> 
> Assume 'n' is the current time.
> 
> 1) The call path is finished at 'n' (current time). It clears the
> interrupt flags.
> 2) At n + 1, AF is set because the HW alarm time is programmed at n +1.
> 
> This leads to the problem.
> 
> > What I'm not sure is why cmos_do_shutdown() isn't clearing AF properly.
> > Is it correctly called at shutdown?
> >
> 
> The function cmos_pnp_shutdown() simply returns because
> cmos_poweroff() always returns 0 (CONFIG_PM and CONFIG_PNP are
> set). The cmos_do_shutdown() is not invoked at shutdown. We did not
> get a chance to clear interrupt flag. Here is the callpath:
> 
> kernel_restart_prepare() -> device_shutdown() -> cmos_pnp_shutdown()
> 
> [Experiment 1 - SLES11 SP3 with upstream kernel (without my patch)]
> The revised code in cmos_pnp_shutdown() is shown as follows. It
> ensures that cmos_do_shutdown() is invoked, which gets the last chance
> to clear the interrupt flag before powering off the machine.
> 
> #if 0
>         if (system_state == SYSTEM_POWER_OFF && !cmos_poweroff(dev))
>                 return;
> #endif
>         cmos_poweroff(dev);
> 
> 
> The issue can be still reproduced. This implies that the time cost
> between hwclock and cmos_do_shutdown() is within one second.
> So, AF is not set yet.
> 
> [Experiment 2 - SLES11 SP3 with upstream kernel (without my patch)]
> Still apply the code snippet in Experiment 1. The system stayed
> power off when I intentionally added 'sleep 3' command in the end of
> the 'stop' service of /etc/init.d/boot.clock (This script uses the
> hwclock utility to set hardware clock
> to the current system time). This proves that cmos_do_shutdown()
> clears AF generated by RTC HW.
> 
> The system always stayed power off with 10-time shutdown tests.
> 
> [Experiment 3 - SLES11 SP3 with upstream kernel (without my patch)]
> Do not apply the code snippet in Experiment 1. Also added 'sleep 3'
> command in the end of the 'stop' service of /etc/init.d/boot.clock.
> The issue can be reproduced. This proves that cmos_pnp_shutdown()
> simply returns without invoking cmos_do_shutdown(). So, AF is not
> cleared.
> 
> Experiments 1-3 show that the issue happens if the time cost
> between hwclock and cmos_do_shutdown() is within one second.
> If we want to clear the interrupt flag for every shutdown, we need
> to re-write the cmos_pnp_shutdown() to ensure the
> cmos_do_shutdown() is invoke for every shutdown.
> 
> Although my patch can work around the issue, kindly let me know
> if you want to re-work cmos_pnp_shutdown(). We can come out a
> better way to invoke cmos_do_shutdown() for every shutdown. But,
> be remember, the issue still happens even though we make sure
> cmos_do_shutdown() is invoked for every shutdown. The problem
> is the one second interval.
> 

That's what I understood. How I see it is the following: your patch is
reading and setting the alarm then clearing AF at a later point in the
shutdown. This may as well be before that one second has elapsed so I
don't feel this is a proper fix. Also, other RTCs don't suffer from that
issue and this may actually break some of them.

What I would do is ensure that AF gets cleared in cmos_do_shutdown()
after setting the alarm to now or now - 1 when it is not used for
wakeup/restart.

Regards,
Huang Adrian May 28, 2015, 11:16 a.m. UTC | #4
Hi

On Fri, May 22, 2015 at 6:04 PM, Alexandre Belloni
<alexandre.belloni@free-electrons.com> wrote:
> Hi,
>
> That's what I understood. How I see it is the following: your patch is
> reading and setting the alarm then clearing AF at a later point in the
> shutdown. This may as well be before that one second has elapsed so I
> don't feel this is a proper fix. Also, other RTCs don't suffer from that
> issue and this may actually break some of them.
>
> What I would do is ensure that AF gets cleared in cmos_do_shutdown()
> after setting the alarm to now or now - 1 when it is not used for
> wakeup/restart.
>

Good suggestion. I've prepared a patch based on your suggestion. It's
still under my local test. I'll submit the patch later after testing.

Thanks for the suggestion!

-- Adrian
diff mbox

Patch

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index 166fc60..2fe17da 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -986,4 +986,25 @@  int rtc_timer_cancel(struct rtc_device *rtc, struct rtc_timer *timer)
 	return ret;
 }
 
+/* rtc_alarm_restore - Restores the alarm time
+ * @ rtc: rtc device to be used
+ *
+ * Kernel interface to restore the alarm time
+ */
+int rtc_alarm_restore(struct rtc_device *rtc)
+{
+	struct rtc_wkalrm aie_alarm;
+	int err;
 
+	/* If someone has configured the AIE timer, do nothing. */
+	if (rtc->aie_timer.enabled)
+		return 0;
+
+	/* Read the alarm date/time from aie_timer. */
+	err = rtc_read_alarm(rtc, &aie_alarm);
+	if (err < 0)
+		return err;
+
+	return __rtc_set_alarm(rtc, &aie_alarm);
+}
+EXPORT_SYMBOL_GPL(rtc_alarm_restore);
diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c
index 799c34b..a5ea279 100644
--- a/drivers/rtc/rtc-dev.c
+++ b/drivers/rtc/rtc-dev.c
@@ -437,6 +437,7 @@  static int rtc_dev_release(struct inode *inode, struct file *file)
 	rtc_dev_ioctl(file, RTC_UIE_OFF, 0);
 	rtc_update_irq_enable(rtc, 0);
 	rtc_irq_set_state(rtc, NULL, 0);
+	rtc_alarm_restore(rtc);
 
 	if (rtc->ops->release)
 		rtc->ops->release(rtc->dev.parent);
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 8dcf682..bf945cb 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -188,6 +188,7 @@  extern int rtc_update_irq_enable(struct rtc_device *rtc, unsigned int enabled);
 extern int rtc_alarm_irq_enable(struct rtc_device *rtc, unsigned int enabled);
 extern int rtc_dev_update_irq_enable_emul(struct rtc_device *rtc,
 						unsigned int enabled);
+extern int rtc_alarm_restore(struct rtc_device *rtc);
 
 void rtc_handle_legacy_irq(struct rtc_device *rtc, int num, int mode);
 void rtc_aie_update_irq(void *private);