Message ID | 4F8BA1C1.4030804@teksavvy.com |
---|---|
State | Superseded |
Headers | show |
On 12-04-16 12:36 AM, Mark Lord wrote: > Something recent has killed suspend-to-ram on a number of machines here. > The symptom is that they suspend, but immediately wake up and panic, > with just a black screen so no visible messages to go by. > > The patch below works around the issue -- making things work as they used to work. > > +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 > @@ -773,7 +773,7 @@ > if (!rtc->ops || !rtc->ops->alarm_irq_enable) > return; > > - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); > + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U > } > > Last known working kernel was 3.2.11. > The line above got added somewhere between it and 3.2.15, > and is also present (no surprise) in newer kernels. > > The highest kernel I've tested for this is 3.3.2, > which also fails until I nuke the line shown above. > > This is straight x86_64 (Atom) hardware, using rtc-cmos. > I can re-test if anyone has a fix for this. > > Meanwhile, whatever patch put this into -stable probably > ought to be reverted upstream and in -stable as well. Speaking of which -- that batch of RTC updates is riddled with bugs. For example, this beauty from rtc-mpc5121.c in the same update: ... rtc->rtc = rtc_device_register("mpc5200-rtc", &op->dev, &mpc5200_rtc_ops, THIS_MODULE); ... rtc->rtc->uie_unsupported = 1; // <<<< Ooops NULL pointer >>>> if (IS_ERR(rtc->rtc)) { // <<<< this needs to be earlier >>>> err = PTR_ERR(rtc->rtc); goto out_free_irq; } ... Can somebody show me how to identify the commit from the code? I know which lines got changed, but don't know how to find the corresponding commits in -git. And once we do identify the commits, they really need a code review. Thanks.
On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord <kernel@teksavvy.com> wrote: > On 12-04-16 12:36 AM, Mark Lord wrote: >> Something recent has killed suspend-to-ram on a number of machines here. >> The symptom is that they suspend, but immediately wake up and panic, >> with just a black screen so no visible messages to go by. >> >> The patch below works around the issue -- making things work as they used to work. >> >> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >> @@ -773,7 +773,7 @@ >> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >> return; >> >> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >> } >> >> Last known working kernel was 3.2.11. >> The line above got added somewhere between it and 3.2.15, >> and is also present (no surprise) in newer kernels. >> >> The highest kernel I've tested for this is 3.3.2, >> which also fails until I nuke the line shown above. >> >> This is straight x86_64 (Atom) hardware, using rtc-cmos. >> I can re-test if anyone has a fix for this. >> >> Meanwhile, whatever patch put this into -stable probably >> ought to be reverted upstream and in -stable as well. > > > Speaking of which -- that batch of RTC updates is riddled with bugs. > For example, this beauty from rtc-mpc5121.c in the same update: > > ... > rtc->rtc = rtc_device_register("mpc5200-rtc", &op->dev, > &mpc5200_rtc_ops, THIS_MODULE); > ... > > rtc->rtc->uie_unsupported = 1; // <<<< Ooops NULL pointer >>>> > > if (IS_ERR(rtc->rtc)) { // <<<< this needs to be earlier >>>> > err = PTR_ERR(rtc->rtc); > goto out_free_irq; > } > ... > > Can somebody show me how to identify the commit from the code? > I know which lines got changed, but don't know how to find > the corresponding commits in -git. CC'in John. --- commit 4a649903f91232d02284d53724b0a45728111767 Author: John Stultz <john.stultz@linaro.org> Date: Tue Mar 6 17:16:09 2012 -0800 rtc: Provide flag for rtc devices that don't support UIE Richard Weinberger noticed that on some RTC hardware that doesn't support UIE mode, due to coarse granular alarms (like 1minute resolution), the current virtualized RTC support doesn't properly error out when UIE is enabled. Instead the current code queues an alarm for the next second, but it won't fire until up to a miniute later. This patch provides a generic way to flag this sort of hardware and fixes the issue on the mpc5121 where Richard noticed the problem. CC: stable@vger.kernel.org Reported-by: Richard Weinberger <richard@nod.at> Tested-by: Richard Weinberger <richard@nod.at> Signed-off-by: John Stultz <john.stultz@linaro.org>
On Mon, Apr 16, 2012 at 09:55:49AM -0400, Mark Lord wrote: > Can somebody show me how to identify the commit from the code? > I know which lines got changed, but don't know how to find > the corresponding commits in -git. git annotate is usually pretty helpful for this.
On 12-04-16 10:23 AM, richard -rw- weinberger wrote: > On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord <kernel@teksavvy.com> wrote: >> On 12-04-16 12:36 AM, Mark Lord wrote: >>> Something recent has killed suspend-to-ram on a number of machines here. >>> The symptom is that they suspend, but immediately wake up and panic, >>> with just a black screen so no visible messages to go by. >>> >>> The patch below works around the issue -- making things work as they used to work. >>> >>> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >>> @@ -773,7 +773,7 @@ >>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>> return; >>> >>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >>> } How about the line above -- that's the commit that breaks things here. Thanks! >> Speaking of which -- that batch of RTC updates is riddled with bugs. >> For example, this beauty from rtc-mpc5121.c in the same update: >> >> ... >> rtc->rtc = rtc_device_register("mpc5200-rtc", &op->dev, >> &mpc5200_rtc_ops, THIS_MODULE); >> ... >> >> rtc->rtc->uie_unsupported = 1; // <<<< Ooops NULL pointer >>>> >> >> if (IS_ERR(rtc->rtc)) { // <<<< this needs to be earlier >>>> >> err = PTR_ERR(rtc->rtc); >> goto out_free_irq; >> } >> ... .. > --- > commit 4a649903f91232d02284d53724b0a45728111767 > Author: John Stultz <john.stultz@linaro.org> > Date: Tue Mar 6 17:16:09 2012 -0800 > > rtc: Provide flag for rtc devices that don't support UIE .. > CC: stable@vger.kernel.org > Reported-by: Richard Weinberger <richard@nod.at> > Tested-by: Richard Weinberger <richard@nod.at> > Signed-off-by: John Stultz <john.stultz@linaro.org> .. That commit is visibly buggy, but at least the fix is simple enough. It's not what's breaking the systems here though. Thanks again!
On Mon, Apr 16, 2012 at 5:42 PM, Mark Lord <kernel@teksavvy.com> wrote: > On 12-04-16 10:23 AM, richard -rw- weinberger wrote: >> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord <kernel@teksavvy.com> wrote: >>> On 12-04-16 12:36 AM, Mark Lord wrote: >>>> Something recent has killed suspend-to-ram on a number of machines here. >>>> The symptom is that they suspend, but immediately wake up and panic, >>>> with just a black screen so no visible messages to go by. >>>> >>>> The patch below works around the issue -- making things work as they used to work. >>>> >>>> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >>>> @@ -773,7 +773,7 @@ >>>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>>> return; >>>> >>>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >>>> } > > > How about the line above -- that's the commit that breaks things here. Download Linus' GIT tree and use git blame. :-)
On 12-04-16 11:49 AM, richard -rw- weinberger wrote: > On Mon, Apr 16, 2012 at 5:42 PM, Mark Lord <kernel@teksavvy.com> wrote: >> On 12-04-16 10:23 AM, richard -rw- weinberger wrote: >>> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord <kernel@teksavvy.com> wrote: >>>> On 12-04-16 12:36 AM, Mark Lord wrote: >>>>> Something recent has killed suspend-to-ram on a number of machines here. >>>>> The symptom is that they suspend, but immediately wake up and panic, >>>>> with just a black screen so no visible messages to go by. >>>>> >>>>> The patch below works around the issue -- making things work as they used to work. >>>>> >>>>> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >>>>> @@ -773,7 +773,7 @@ >>>>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>>>> return; >>>>> >>>>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>>>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >>>>> } >> >> >> How about the line above -- that's the commit that breaks things here. > > Download Linus' GIT tree and use git blame. :-) Too steep a learning curve for a casual user. But google works: http://www.mail-archive.com/stable@vger.kernel.org/msg04391.html > [ 055/175] rtc: Disable the alarm in the hardware (v2) > > Greg KH > Fri, 30 Mar 2012 14:57:53 -0700 > > 3.3-stable review patch. If anyone has any objections, please let me know. > > ------------------ > > From: Rabin Vincent <rabin.vincent@stericsson.com> > > commit 41c7f7424259ff11009449f87c95656f69f9b186 upstream. > > Currently, the RTC code does not disable the alarm in the hardware. > > This means that after a sequence such as the one below (the files are in the > RTC sysfs), the box will boot up after 2 minutes even though we've > asked for the alarm to be turned off. > > # echo $((`cat since_epoch`)+120) > wakealarm > # echo 0 > wakealarm > # poweroff > > Fix this by disabling the alarm when there are no timers to run. > > The original version of this patch was reverted. This version > disables the irq directly instead of setting a disabled timer > in the future. > > Cc: John Stultz <john.stu...@linaro.org> > Signed-off-by: Rabin Vincent <rabin.vinc...@stericsson.com> > [Merged in the second revision from Rabin] > Signed-off-by: John Stultz <john.stu...@linaro.org> > Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org> > > --- > drivers/rtc/interface.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) > > --- a/drivers/rtc/interface.c > +++ b/drivers/rtc/interface.c > @@ -763,6 +763,14 @@ static int rtc_timer_enqueue(struct rtc_ > return 0; > } > > +static void rtc_alarm_disable(struct rtc_device *rtc) > +{ > + if (!rtc->ops || !rtc->ops->alarm_irq_enable) > + return; > + > + rtc->ops->alarm_irq_enable(rtc->dev.parent, false); > +} > + > /** > * rtc_timer_remove - Removes a rtc_timer from the rtc_device timerqueue > * @rtc rtc device > @@ -784,8 +792,10 @@ static void rtc_timer_remove(struct rtc_ > struct rtc_wkalrm alarm; > int err; > next = timerqueue_getnext(&rtc->timerqueue); > - if (!next) > + if (!next) { > + rtc_alarm_disable(rtc); > return; > + } > alarm.time = rtc_ktime_to_tm(next->expires); > alarm.enabled = 1; > err = __rtc_set_alarm(rtc, &alarm); > @@ -847,7 +857,8 @@ again: > err = __rtc_set_alarm(rtc, &alarm); > if (err == -ETIME) > goto again; > - } > + } else > + rtc_alarm_disable(rtc); > > mutex_unlock(&rtc->ops_lock); > } >
On 04/16/2012 07:23 AM, richard -rw- weinberger wrote: > On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord<kernel@teksavvy.com> wrote: >> Speaking of which -- that batch of RTC updates is riddled with bugs. >> For example, this beauty from rtc-mpc5121.c in the same update: >> >> ... >> rtc->rtc = rtc_device_register("mpc5200-rtc",&op->dev, >> &mpc5200_rtc_ops, THIS_MODULE); >> ... >> >> rtc->rtc->uie_unsupported = 1; //<<<< Ooops NULL pointer>>>> >> >> if (IS_ERR(rtc->rtc)) { //<<<< this needs to be earlier>>>> >> err = PTR_ERR(rtc->rtc); >> goto out_free_irq; >> } >> ... >> >> Thanks for noticing this and CC'ing me. I'll spin up a fix promptly for review. -john
On 04/16/2012 08:57 AM, Mark Lord wrote: > On 12-04-16 11:49 AM, richard -rw- weinberger wrote: >> On Mon, Apr 16, 2012 at 5:42 PM, Mark Lord<kernel@teksavvy.com> wrote: >>> On 12-04-16 10:23 AM, richard -rw- weinberger wrote: >>>> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord<kernel@teksavvy.com> wrote: >>>>> On 12-04-16 12:36 AM, Mark Lord wrote: >>>>>> Something recent has killed suspend-to-ram on a number of machines here. >>>>>> The symptom is that they suspend, but immediately wake up and panic, >>>>>> with just a black screen so no visible messages to go by. >>>>>> >>>>>> The patch below works around the issue -- making things work as they used to work. >>>>>> >>>>>> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >>>>>> @@ -773,7 +773,7 @@ >>>>>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>>>>> return; >>>>>> >>>>>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>>>>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >>>>>> } >>> >>> How about the line above -- that's the commit that breaks things here. >> Download Linus' GIT tree and use git blame. :-) > Too steep a learning curve for a casual user. > But google works: > > http://www.mail-archive.com/stable@vger.kernel.org/msg04391.html Thanks for the report and sorry for the trouble. I'm trying to reproduce this locally. Can you send me your .config? thanks -john
On 04/16/2012 12:45 PM, John Stultz wrote: > On 04/16/2012 08:57 AM, Mark Lord wrote: >> On 12-04-16 11:49 AM, richard -rw- weinberger wrote: >>> On Mon, Apr 16, 2012 at 5:42 PM, Mark Lord<kernel@teksavvy.com> wrote: >>>> On 12-04-16 10:23 AM, richard -rw- weinberger wrote: >>>>> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord<kernel@teksavvy.com> >>>>> wrote: >>>>>> On 12-04-16 12:36 AM, Mark Lord wrote: >>>>>>> Something recent has killed suspend-to-ram on a number of >>>>>>> machines here. >>>>>>> The symptom is that they suspend, but immediately wake up and >>>>>>> panic, >>>>>>> with just a black screen so no visible messages to go by. >>>>>>> >>>>>>> The patch below works around the issue -- making things work as >>>>>>> they used to work. >>>>>>> >>>>>>> +++ linux/drivers/rtc/interface.c 2012-04-16 >>>>>>> 00:09:14.105387382 -0400 >>>>>>> @@ -773,7 +773,7 @@ >>>>>>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>>>>>> return; >>>>>>> >>>>>>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>>>>>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // >>>>>>> Kills suspend on ZBOX HD-ID41U >>>>>>> } >>>> >>>> How about the line above -- that's the commit that breaks things here. >>> Download Linus' GIT tree and use git blame. :-) >> Too steep a learning curve for a casual user. >> But google works: >> >> http://www.mail-archive.com/stable@vger.kernel.org/msg04391.html > > Thanks for the report and sorry for the trouble. I'm trying to > reproduce this locally. Can you send me your .config? Ok, so far I've not been able to reproduce anything like this with my atom x86_64 system (done a number of suspends both with and without RTC alarms queued to wake the system up). Can you provide any more details about how you're triggering suspend when you see the problem? Do you have an RTC alarm set for some future time to wake up the system? I'm just trying to understand when rtc_alarm_disable is being called and causing the trouble in your case. The original related issue with the earlier version of this patch was some hardware would wake up immediately after suspend if the rtc was set in the past (which is what was done to "disable" the alarm). I suspect there is a similar hardware quirk we're dealing with that may require extra logic in the rtc-cmos.c alarm_irq_enable() function. thanks -john
On 12-04-16 03:44 PM, John Stultz wrote: > On 04/16/2012 07:23 AM, richard -rw- weinberger wrote: >> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord<kernel@teksavvy.com> wrote: >>> Speaking of which -- that batch of RTC updates is riddled with bugs. >>> For example, this beauty from rtc-mpc5121.c in the same update: >>> >>> ... >>> rtc->rtc = rtc_device_register("mpc5200-rtc",&op->dev, >>> &mpc5200_rtc_ops, THIS_MODULE); >>> ... >>> >>> rtc->rtc->uie_unsupported = 1; //<<<< Ooops NULL pointer>>>> >>> >>> if (IS_ERR(rtc->rtc)) { //<<<< this needs to be earlier>>>> >>> err = PTR_ERR(rtc->rtc); >>> goto out_free_irq; >>> } >>> ... >>> >>> > > Thanks for noticing this and CC'ing me. I'll spin up a fix promptly for review. > -john Yeah. My apologies for being so harsh there. At least the fix is easy enough. Cheers!
On 12-04-16 05:43 PM, John Stultz wrote: > On 04/16/2012 12:45 PM, John Stultz wrote: >> On 04/16/2012 08:57 AM, Mark Lord wrote: >>> On 12-04-16 11:49 AM, richard -rw- weinberger wrote: >>>> On Mon, Apr 16, 2012 at 5:42 PM, Mark Lord<kernel@teksavvy.com> wrote: >>>>> On 12-04-16 10:23 AM, richard -rw- weinberger wrote: >>>>>> On Mon, Apr 16, 2012 at 3:55 PM, Mark Lord<kernel@teksavvy.com> wrote: >>>>>>> On 12-04-16 12:36 AM, Mark Lord wrote: >>>>>>>> Something recent has killed suspend-to-ram on a number of machines here. >>>>>>>> The symptom is that they suspend, but immediately wake up and panic, >>>>>>>> with just a black screen so no visible messages to go by. >>>>>>>> >>>>>>>> The patch below works around the issue -- making things work as they used to work. >>>>>>>> >>>>>>>> +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 >>>>>>>> @@ -773,7 +773,7 @@ >>>>>>>> if (!rtc->ops || !rtc->ops->alarm_irq_enable) >>>>>>>> return; >>>>>>>> >>>>>>>> - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); >>>>>>>> + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U >>>>>>>> } >>>>> >>>>> How about the line above -- that's the commit that breaks things here. >>>> Download Linus' GIT tree and use git blame. :-) >>> Too steep a learning curve for a casual user. >>> But google works: >>> >>> http://www.mail-archive.com/stable@vger.kernel.org/msg04391.html >> >> Thanks for the report and sorry for the trouble. I'm trying to reproduce this locally. Can you >> send me your .config? > > Ok, so far I've not been able to reproduce anything like this with my atom x86_64 system (done a > number of suspends both with and without RTC alarms queued to wake the system up). Can you provide > any more details about how you're triggering suspend when you see the problem? > Do you have an RTC alarm set for some future time to wake up the system? > I'm just trying to understand when rtc_alarm_disable is being called and causing the trouble in your > case. > > The original related issue with the earlier version of this patch was some hardware would wake up > immediately after suspend if the rtc was set in the past (which is what was done to "disable" the > alarm). I suspect there is a similar hardware quirk we're dealing with that may require extra logic > in the rtc-cmos.c alarm_irq_enable() function. Thanks for looking into it, John. I also spent many more hours digging away at it here today, and I now understand (mostly) what is happening and why. The code above introduces a new access to the RTC that never existed before. For the case where the Alarm has never been enabled by software, I believe the code above will still try to "disable" it. That's the new behaviour we didn't have prior to this patch. And.. on some of the systems I'm testing on, the BIOS setup has the RTC Alarm "enabled", which means "under BIOS control", as opposed to "disabled" which means "under software control". It's the "under BIOS control" systems that the above patch breaks. So I think the code may just need to be slightly more clever, and not disable an Alarm that was never enabled by software in the first place. Cheers Mark
--- linux/drivers/rtc/interface.c.orig 2012-04-16 00:08:47.615389718 -0400 +++ linux/drivers/rtc/interface.c 2012-04-16 00:09:14.105387382 -0400 @@ -773,7 +773,7 @@ if (!rtc->ops || !rtc->ops->alarm_irq_enable) return; - rtc->ops->alarm_irq_enable(rtc->dev.parent, false); + //rtc->ops->alarm_irq_enable(rtc->dev.parent, false); // Kills suspend on ZBOX HD-ID41U } /**