Patchwork dell-laptop: defer dell_rfkill_update to worker thread

login
register
mail settings
Submitter Chase Douglas
Date April 9, 2010, 10:17 p.m.
Message ID <1270851475-22321-1-git-send-email-chase.douglas@canonical.com>
Download mbox | patch
Permalink /patch/49871/
State Accepted
Delegated to: Andy Whitcroft
Headers show

Comments

Chase Douglas - April 9, 2010, 10:17 p.m.
From: Chase Douglas <cndougla@emerald.redvoodoo.org>

dell_rfkill_update fires an SMI, which must occur on cpu 0. Thus, if called
on a different cpu, the task will block. Since it's called in hard irq
context, we must defer this to the worker thread.

It is potentially possible that two rfkill key events could be processed
before the work completes on the first. The second event is dropped with a
KERN_NOTICE message.

BugLink: http://bugs.launchpad.net/bugs/555261

Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
---
 drivers/platform/x86/dell-laptop.c |    9 +++++++--
 1 files changed, 7 insertions(+), 2 deletions(-)
Chase Douglas - April 9, 2010, 11:26 p.m.
Hi all,

If you have anything to note, especially bad behavior, please CC
kernel-team@lists.ubuntu.com as well.

Thanks!

-- Chase

On Fri, Apr 9, 2010 at 6:51 PM, Mario Limonciello
<mario_limonciello@dell.com> wrote:
> Hi Dell-enablement list:
>
> Chase sent a patch to the kernel-team ML that helps to fix an OOPS that
> has been happening when pressing the RFKILL SW with a lot of Dell HW.
>
> I've seen this happen on a variety of Dell HW myself, i'm sure that some
> of you guys have too.  Could you guys help Chase to test it on any
> hardware you've seen this happen?  I'll see if I can too.
>
> Thanks,
>
> On 04/09/2010 05:31 PM, Chase Douglas wrote:
>> On Fri, Apr 9, 2010 at 6:24 PM, Mario Limonciello
>> <mario_limonciello@dell.com> wrote:
>>
>>> Great job with tracking this down!  I've seen this problem tons of times
>>> myself, but was at a loss for why it was actually happening.
>>>
>> Since we're so close to shipping 10.04 it would be great if this could
>> get some extra testing to ensure rfkill switches still work properly
>> and without causing an OOPS. Do you have the time and ability to help
>> with this? If so, you can find my test kernel at
>> http://people.canonical.com/~cndougla/555261/.
Mario Limonciello - April 11, 2010, 1:35 a.m.
Hi Chase:

On my D630, I no longer appear to see the OOP's ending up in dmesg after
a few switches on and off once I loaded your test kernel, so looks good
from my end.

Thanks,

On 04/09/2010 06:26 PM, Chase Douglas wrote:
> Hi all,
>
> If you have anything to note, especially bad behavior, please CC
> kernel-team@lists.ubuntu.com as well.
>
> Thanks!
>
> -- Chase
>
> On Fri, Apr 9, 2010 at 6:51 PM, Mario Limonciello
> <mario_limonciello@dell.com> wrote:
>   
>> Hi Dell-enablement list:
>>
>> Chase sent a patch to the kernel-team ML that helps to fix an OOPS that
>> has been happening when pressing the RFKILL SW with a lot of Dell HW.
>>
>> I've seen this happen on a variety of Dell HW myself, i'm sure that some
>> of you guys have too.  Could you guys help Chase to test it on any
>> hardware you've seen this happen?  I'll see if I can too.
>>
>> Thanks,
>>
>> On 04/09/2010 05:31 PM, Chase Douglas wrote:
>>     
>>> On Fri, Apr 9, 2010 at 6:24 PM, Mario Limonciello
>>> <mario_limonciello@dell.com> wrote:
>>>
>>>       
>>>> Great job with tracking this down!  I've seen this problem tons of times
>>>> myself, but was at a loss for why it was actually happening.
>>>>
>>>>         
>>> Since we're so close to shipping 10.04 it would be great if this could
>>> get some extra testing to ensure rfkill switches still work properly
>>> and without causing an OOPS. Do you have the time and ability to help
>>> with this? If so, you can find my test kernel at
>>> http://people.canonical.com/~cndougla/555261/.
>>>
Stefan Bader - April 12, 2010, 2:20 p.m.
Am I correct in assuming, this is intended for upstreaming? Will you propose
stable when submitting it?
Patch looks sensible in general.

-Stefan

Chase Douglas wrote:
> From: Chase Douglas <cndougla@emerald.redvoodoo.org>
> 
> dell_rfkill_update fires an SMI, which must occur on cpu 0. Thus, if called
> on a different cpu, the task will block. Since it's called in hard irq
> context, we must defer this to the worker thread.
> 
> It is potentially possible that two rfkill key events could be processed
> before the work completes on the first. The second event is dropped with a
> KERN_NOTICE message.
> 
> BugLink: http://bugs.launchpad.net/bugs/555261
> 
> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
> ---
>  drivers/platform/x86/dell-laptop.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/dell-laptop.c b/drivers/platform/x86/dell-laptop.c
> index 8bf7ac7..15d96a0 100644
> --- a/drivers/platform/x86/dell-laptop.c
> +++ b/drivers/platform/x86/dell-laptop.c
> @@ -288,8 +288,10 @@ static const struct rfkill_ops dell_rfkill_ops = {
>  /*
>   * Called for each KEY_WLAN key press event. Note that a physical
>   * rf-kill switch change also causes the BIOS to emit a KEY_WLAN.
> + *
> + * dell_rfkill_set may block, so schedule it on a worker thread.
>   */
> -static void dell_rfkill_update(void)
> +static void dell_rfkill_update(struct work_struct *work)
>  {
>  	hw_switch_status ^= BIT(HW_SWITCH_MASK);
>  	if (wifi_rfkill && (hw_switch_status & BIT(WLAN_SWITCH_MASK))) {
> @@ -307,6 +309,7 @@ static void dell_rfkill_update(void)
>  		dell_rfkill_set((void*)3, rfkill_blocked(wwan_rfkill));
>  	}
>  }
> +DECLARE_WORK(dell_rfkill_update_work, &dell_rfkill_update);
>  
>  static int dell_setup_rfkill(void)
>  {
> @@ -431,7 +434,9 @@ static bool dell_input_filter(struct input_handle *handle, unsigned int type,
>  			     unsigned int code, int value)
>  {
>  	if (type == EV_KEY && code == KEY_WLAN && value == 1) {
> -		dell_rfkill_update();
> +		if (!schedule_work(&dell_rfkill_update_work))
> +			printk(KERN_NOTICE "rfkill switch handling already "
> +					   "scheduled, dropping this event\n");
>  		return 1;
>  	}
>
Chase Douglas - April 12, 2010, 3:08 p.m.
This patch applies on top of UBUNTU SAUCE patches that are not
upstream. I don't know the status of the upstreaming of those patches,
but this should come along if/when that happens. The changes would
likely be too large for -stable, so I don't think there's much to be
gained there.

-- Chase

On Mon, Apr 12, 2010 at 7:20 AM, Stefan Bader
<stefan.bader@canonical.com> wrote:
> Am I correct in assuming, this is intended for upstreaming? Will you propose
> stable when submitting it?
> Patch looks sensible in general.
>
> -Stefan
>
> Chase Douglas wrote:
>> From: Chase Douglas <cndougla@emerald.redvoodoo.org>
>>
>> dell_rfkill_update fires an SMI, which must occur on cpu 0. Thus, if called
>> on a different cpu, the task will block. Since it's called in hard irq
>> context, we must defer this to the worker thread.
>>
>> It is potentially possible that two rfkill key events could be processed
>> before the work completes on the first. The second event is dropped with a
>> KERN_NOTICE message.
>>
>> BugLink: http://bugs.launchpad.net/bugs/555261
>>
>> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
>> ---
>>  drivers/platform/x86/dell-laptop.c |    9 +++++++--
>>  1 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/platform/x86/dell-laptop.c b/drivers/platform/x86/dell-laptop.c
>> index 8bf7ac7..15d96a0 100644
>> --- a/drivers/platform/x86/dell-laptop.c
>> +++ b/drivers/platform/x86/dell-laptop.c
>> @@ -288,8 +288,10 @@ static const struct rfkill_ops dell_rfkill_ops = {
>>  /*
>>   * Called for each KEY_WLAN key press event. Note that a physical
>>   * rf-kill switch change also causes the BIOS to emit a KEY_WLAN.
>> + *
>> + * dell_rfkill_set may block, so schedule it on a worker thread.
>>   */
>> -static void dell_rfkill_update(void)
>> +static void dell_rfkill_update(struct work_struct *work)
>>  {
>>       hw_switch_status ^= BIT(HW_SWITCH_MASK);
>>       if (wifi_rfkill && (hw_switch_status & BIT(WLAN_SWITCH_MASK))) {
>> @@ -307,6 +309,7 @@ static void dell_rfkill_update(void)
>>               dell_rfkill_set((void*)3, rfkill_blocked(wwan_rfkill));
>>       }
>>  }
>> +DECLARE_WORK(dell_rfkill_update_work, &dell_rfkill_update);
>>
>>  static int dell_setup_rfkill(void)
>>  {
>> @@ -431,7 +434,9 @@ static bool dell_input_filter(struct input_handle *handle, unsigned int type,
>>                            unsigned int code, int value)
>>  {
>>       if (type == EV_KEY && code == KEY_WLAN && value == 1) {
>> -             dell_rfkill_update();
>> +             if (!schedule_work(&dell_rfkill_update_work))
>> +                     printk(KERN_NOTICE "rfkill switch handling already "
>> +                                        "scheduled, dropping this event\n");
>>               return 1;
>>       }
>>
>
>
Mario Limonciello - April 12, 2010, 3:13 p.m.
Hi Chase:


On 04/12/2010 10:08 AM, Chase Douglas wrote:
> This patch applies on top of UBUNTU SAUCE patches that are not
> upstream. I don't know the status of the upstreaming of those patches,
> but this should come along if/when that happens. The changes would
> likely be too large for -stable, so I don't think there's much to be
> gained there.
>
>   
An implementation of some of those SAUCE patches has landed upstream,
the remaining ones will need to be evaluated whether still necessary
during Lucid+1.  This other new patch should apply to both scenarios
though in some sense, and should head upstream.
Chase Douglas - April 12, 2010, 3:40 p.m.
This patch applies on top of UBUNTU SAUCE patches that are not
upstream. I don't know the status of the upstreaming of those patches,
but this should come along if/when that happens. The changes would
likely be too large for -stable, so I don't think there's much to be
gained there.

-- Chase

On Mon, Apr 12, 2010 at 7:20 AM, Stefan Bader
<stefan.bader@canonical.com> wrote:
> Am I correct in assuming, this is intended for upstreaming? Will you propose
> stable when submitting it?
> Patch looks sensible in general.
>
> -Stefan
>
> Chase Douglas wrote:
>> From: Chase Douglas <cndougla@emerald.redvoodoo.org>
>>
>> dell_rfkill_update fires an SMI, which must occur on cpu 0. Thus, if called
>> on a different cpu, the task will block. Since it's called in hard irq
>> context, we must defer this to the worker thread.
>>
>> It is potentially possible that two rfkill key events could be processed
>> before the work completes on the first. The second event is dropped with a
>> KERN_NOTICE message.
>>
>> BugLink: http://bugs.launchpad.net/bugs/555261
>>
>> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
>> ---
>>  drivers/platform/x86/dell-laptop.c |    9 +++++++--
>>  1 files changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/platform/x86/dell-laptop.c b/drivers/platform/x86/dell-laptop.c
>> index 8bf7ac7..15d96a0 100644
>> --- a/drivers/platform/x86/dell-laptop.c
>> +++ b/drivers/platform/x86/dell-laptop.c
>> @@ -288,8 +288,10 @@ static const struct rfkill_ops dell_rfkill_ops = {
>>  /*
>>   * Called for each KEY_WLAN key press event. Note that a physical
>>   * rf-kill switch change also causes the BIOS to emit a KEY_WLAN.
>> + *
>> + * dell_rfkill_set may block, so schedule it on a worker thread.
>>   */
>> -static void dell_rfkill_update(void)
>> +static void dell_rfkill_update(struct work_struct *work)
>>  {
>>       hw_switch_status ^= BIT(HW_SWITCH_MASK);
>>       if (wifi_rfkill && (hw_switch_status & BIT(WLAN_SWITCH_MASK))) {
>> @@ -307,6 +309,7 @@ static void dell_rfkill_update(void)
>>               dell_rfkill_set((void*)3, rfkill_blocked(wwan_rfkill));
>>       }
>>  }
>> +DECLARE_WORK(dell_rfkill_update_work, &dell_rfkill_update);
>>
>>  static int dell_setup_rfkill(void)
>>  {
>> @@ -431,7 +434,9 @@ static bool dell_input_filter(struct input_handle *handle, unsigned int type,
>>                            unsigned int code, int value)
>>  {
>>       if (type == EV_KEY && code == KEY_WLAN && value == 1) {
>> -             dell_rfkill_update();
>> +             if (!schedule_work(&dell_rfkill_update_work))
>> +                     printk(KERN_NOTICE "rfkill switch handling already "
>> +                                        "scheduled, dropping this event\n");
>>               return 1;
>>       }
>>
>
>
Matthew Garrett - April 12, 2010, 4:04 p.m.
On Mon, Apr 12, 2010 at 04:20:42PM +0200, Stefan Bader wrote:
> Am I correct in assuming, this is intended for upstreaming? Will you propose
> stable when submitting it?

This ought to be already fixed in upstream. The broken code was never in 
a mainline release. As an aside, it would be nice to get some more 
upstream involvement in this - I've dug out and incorporated the SAUCE 
patches that seem to make sense, but it'd be nice to have them mailed to 
me instead!
Andy Whitcroft - April 13, 2010, 2:29 p.m.
On Fri, Apr 09, 2010 at 06:17:55PM -0400, Chase Douglas wrote:
> From: Chase Douglas <cndougla@emerald.redvoodoo.org>
> 
> dell_rfkill_update fires an SMI, which must occur on cpu 0. Thus, if called
> on a different cpu, the task will block. Since it's called in hard irq
> context, we must defer this to the worker thread.
> 
> It is potentially possible that two rfkill key events could be processed
> before the work completes on the first. The second event is dropped with a
> KERN_NOTICE message.
> 
> BugLink: http://bugs.launchpad.net/bugs/555261
> 
> Signed-off-by: Chase Douglas <chase.douglas@canonical.com>
> ---
>  drivers/platform/x86/dell-laptop.c |    9 +++++++--
>  1 files changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/platform/x86/dell-laptop.c b/drivers/platform/x86/dell-laptop.c
> index 8bf7ac7..15d96a0 100644
> --- a/drivers/platform/x86/dell-laptop.c
> +++ b/drivers/platform/x86/dell-laptop.c
> @@ -288,8 +288,10 @@ static const struct rfkill_ops dell_rfkill_ops = {
>  /*
>   * Called for each KEY_WLAN key press event. Note that a physical
>   * rf-kill switch change also causes the BIOS to emit a KEY_WLAN.
> + *
> + * dell_rfkill_set may block, so schedule it on a worker thread.
>   */
> -static void dell_rfkill_update(void)
> +static void dell_rfkill_update(struct work_struct *work)
>  {
>  	hw_switch_status ^= BIT(HW_SWITCH_MASK);
>  	if (wifi_rfkill && (hw_switch_status & BIT(WLAN_SWITCH_MASK))) {
> @@ -307,6 +309,7 @@ static void dell_rfkill_update(void)
>  		dell_rfkill_set((void*)3, rfkill_blocked(wwan_rfkill));
>  	}
>  }
> +DECLARE_WORK(dell_rfkill_update_work, &dell_rfkill_update);
>  
>  static int dell_setup_rfkill(void)
>  {
> @@ -431,7 +434,9 @@ static bool dell_input_filter(struct input_handle *handle, unsigned int type,
>  			     unsigned int code, int value)
>  {
>  	if (type == EV_KEY && code == KEY_WLAN && value == 1) {
> -		dell_rfkill_update();
> +		if (!schedule_work(&dell_rfkill_update_work))
> +			printk(KERN_NOTICE "rfkill switch handling already "
> +					   "scheduled, dropping this event\n");
>  		return 1;
>  	}
>  

Looks like a fair work-around.  In theory we could loose key-press
effects here which would could lead us to be out of sync with the H/W,
though to do that one would have to hit the key before the handler could
complete -- which I suspect is close to impossible to achieve.  In that
case we get a clear diagnostic, and its a heck of a lot better than a
Panic.

Acked-by: Andy Whitcroft <apw@canonical.com>

-apw
Andy Whitcroft - April 13, 2010, 2:39 p.m.
Applied to Lucid.

-apw
Matthew Garrett - April 13, 2010, 2:58 p.m.
On Tue, Apr 13, 2010 at 03:29:00PM +0100, Andy Whitcroft wrote:

> Looks like a fair work-around.  In theory we could loose key-press
> effects here which would could lead us to be out of sync with the H/W,
> though to do that one would have to hit the key before the handler could
> complete -- which I suspect is close to impossible to achieve.  In that
> case we get a clear diagnostic, and its a heck of a lot better than a
> Panic.

I think this case is fine. The button press updates the hardware, and 
when the update task eventually gets run it'll read the state from the 
hardware, so it doesn't matter how many toggles happen in the 
intervening time.
Andy Whitcroft - April 16, 2010, 2:06 p.m.
On Mon, Apr 12, 2010 at 05:04:38PM +0100, Matthew Garrett wrote:
> On Mon, Apr 12, 2010 at 04:20:42PM +0200, Stefan Bader wrote:
> > Am I correct in assuming, this is intended for upstreaming? Will you propose
> > stable when submitting it?
> 
> This ought to be already fixed in upstream. The broken code was never in 
> a mainline release. As an aside, it would be nice to get some more 
> upstream involvement in this - I've dug out and incorporated the SAUCE 
> patches that seem to make sense, but it'd be nice to have them mailed to 
> me instead!

Hmmm I had assumed that Mario was going to be pushing the whole lot
upstream as well.  I am planning on a sweep of our delta pushing up the
sane bits shortly as well.

-apw
Mario Limonciello - April 16, 2010, 5:11 p.m.
Hi Guys:

On Fri, Apr 16, 2010 at 09:06, Andy Whitcroft <apw@canonical.com> wrote:

> On Mon, Apr 12, 2010 at 05:04:38PM +0100, Matthew Garrett wrote:
> > On Mon, Apr 12, 2010 at 04:20:42PM +0200, Stefan Bader wrote:
> > > Am I correct in assuming, this is intended for upstreaming? Will you
> propose
> > > stable when submitting it?
> >
> > This ought to be already fixed in upstream. The broken code was never in
> > a mainline release. As an aside, it would be nice to get some more
> > upstream involvement in this - I've dug out and incorporated the SAUCE
> > patches that seem to make sense, but it'd be nice to have them mailed to
> > me instead!
>
> Hmmm I had assumed that Mario was going to be pushing the whole lot
> upstream as well.  I am planning on a sweep of our delta pushing up the
> sane bits shortly as well.
>
>
As I recall, during karmic we pulled a patch that didn't make it upstream in
the form we have it, and a bunch of those other patches layered on top of
it. I didn't have any type of notification that a different form had finally
landed until Matthew started pulling some of the patches we had in too.
Chase Douglas - May 17, 2010, 6:52 p.m.
On Fri, Apr 16, 2010 at 10:06 AM, Andy Whitcroft <apw@canonical.com> wrote:
> On Mon, Apr 12, 2010 at 05:04:38PM +0100, Matthew Garrett wrote:
>> On Mon, Apr 12, 2010 at 04:20:42PM +0200, Stefan Bader wrote:
>> > Am I correct in assuming, this is intended for upstreaming? Will you propose
>> > stable when submitting it?
>>
>> This ought to be already fixed in upstream. The broken code was never in
>> a mainline release. As an aside, it would be nice to get some more
>> upstream involvement in this - I've dug out and incorporated the SAUCE
>> patches that seem to make sense, but it'd be nice to have them mailed to
>> me instead!
>
> Hmmm I had assumed that Mario was going to be pushing the whole lot
> upstream as well.  I am planning on a sweep of our delta pushing up the
> sane bits shortly as well.

I went and checked upstream to find out if it was using a work queue
to handle this already as Matthew thought. Of course, he was correct:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=814cb8adbe2fb49302ac65bc31fa749143823860

It looks like our Maverick kernel code for this driver is based on
upstream and not on Lucid, so we should be good going forward.
Hopefully this snafu won't happen again. Matthew, thanks for bringing
this to our attention!

-- Chase

Patch

diff --git a/drivers/platform/x86/dell-laptop.c b/drivers/platform/x86/dell-laptop.c
index 8bf7ac7..15d96a0 100644
--- a/drivers/platform/x86/dell-laptop.c
+++ b/drivers/platform/x86/dell-laptop.c
@@ -288,8 +288,10 @@  static const struct rfkill_ops dell_rfkill_ops = {
 /*
  * Called for each KEY_WLAN key press event. Note that a physical
  * rf-kill switch change also causes the BIOS to emit a KEY_WLAN.
+ *
+ * dell_rfkill_set may block, so schedule it on a worker thread.
  */
-static void dell_rfkill_update(void)
+static void dell_rfkill_update(struct work_struct *work)
 {
 	hw_switch_status ^= BIT(HW_SWITCH_MASK);
 	if (wifi_rfkill && (hw_switch_status & BIT(WLAN_SWITCH_MASK))) {
@@ -307,6 +309,7 @@  static void dell_rfkill_update(void)
 		dell_rfkill_set((void*)3, rfkill_blocked(wwan_rfkill));
 	}
 }
+DECLARE_WORK(dell_rfkill_update_work, &dell_rfkill_update);
 
 static int dell_setup_rfkill(void)
 {
@@ -431,7 +434,9 @@  static bool dell_input_filter(struct input_handle *handle, unsigned int type,
 			     unsigned int code, int value)
 {
 	if (type == EV_KEY && code == KEY_WLAN && value == 1) {
-		dell_rfkill_update();
+		if (!schedule_work(&dell_rfkill_update_work))
+			printk(KERN_NOTICE "rfkill switch handling already "
+					   "scheduled, dropping this event\n");
 		return 1;
 	}