Message ID | 5745D2DD.6080300@nvidia.com |
---|---|
State | Not Applicable, archived |
Headers | show |
On 5/25/2016 12:29 PM, Jon Hunter wrote: > > On 25/05/16 17:10, Jon Hunter wrote: > > ... > >> So power_supply_read_temp() calls ->get_property() and passes the >> power_supply psy struct which is initialised. The problem is that inside >> the bq27xxx driver, this then kicks off the worker thread to update the >> bq27xxx state and when this worker thread runs it attempts to access the >> same psy struct but by dereferencing a pointer to it from the >> bq27xxx_device_info where the pointer has not been initialised yet. >> Therefore, IMO it seems that we should not allow this worker thread to >> start until the registration has completed and hence the pointer is >> initialised. > > Sorry, it is not the actual worker thread that triggers the NULL pointer > deference, but the function bq27xxx_battery_poll() that schedules the > worker thread. Anyway, I still don't see that we need to update the > bq27xxx state during the registration especially seeing as we call > bq27xxx_battery_update() after the registration is complete. It seems > that updating the overall state should be mutually exclusive from > reading the temp. > > Looking at my patch, it does appear that the worker thread which also > calls bq27xxx_battery_update() is still scheduled and so may be it > should be ... > > diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c > index 45f6ebf88df6..1334ed522332 100644 > --- a/drivers/power/bq27xxx_battery.c > +++ b/drivers/power/bq27xxx_battery.c > @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) > container_of(work, struct bq27xxx_device_info, > work.work); > > + if (!di->bat) > + return; > + > bq27xxx_battery_update(di); > > if (poll_interval > 0) { > > I can see that getting the temperature could work. I would point out that I don't see any recent changes to bq27xxx or the power_supply_core that would imply this is a regression. My guess is that up until now, for devices that support the TEMP property, CONFIG_THERMAL isn't been enabled. So here are my thoughts.... we can do 2 things here: 1) patch bq27xxx in some manner that will allow the bq27xxx driver to work report the temp during register (such as above patch). 2) Patch the core to avoid using get_property callback during registration. I think for our immediate concern and crash, #1 is fine. It will work and is fine. I however think this is just a symptom of the larger issue (#2). In this case, the problem we see is that di->bat is used before it is set, and we have a way around it. However, for EVERY device that registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to receive a call with its relative di->bat uninitialized too. I don't know for certain if #2 has caused problems anywhere else, and I would be surprised if it has and hasn't been caught. AS far as this crash is concerned, I think either approach will work. Adding in David, Dmitry, and Sebastian (maintainers) to see if they have a preferred approach. -rhyland
On 25/05/16 17:36, Rhyland Klein wrote: ... > I can see that getting the temperature could work. I would point out > that I don't see any recent changes to bq27xxx or the power_supply_core > that would imply this is a regression. My guess is that up until now, > for devices that support the TEMP property, CONFIG_THERMAL isn't been > enabled. > > So here are my thoughts.... we can do 2 things here: > > 1) patch bq27xxx in some manner that will allow the bq27xxx driver to > work report the temp during register (such as above patch). > 2) Patch the core to avoid using get_property callback during registration. I wonder if #2 will cause other problems for other devices as this really changes the functionality. > I think for our immediate concern and crash, #1 is fine. It will work > and is fine. I however think this is just a symptom of the larger issue > (#2). In this case, the problem we see is that di->bat is used before it > is set, and we have a way around it. However, for EVERY device that > registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to > receive a call with its relative di->bat uninitialized too. I don't think we are understanding each other and I still think that this could be specific to the bq27xxx. And here is why ... The power supply driver is going to receive a call to it's ->get_property() function with a *VALID* pointer to the power_supply structure, *psy. When initialised, di->bat == psy (assuming bq27xxx naming) and so in other words, they point to the same thing. Therefore, in the normal case, you should not need to access di->bat from within ->get_property() because you already have a valid pointer to the power_supply structure, *psy. So the ONLY problem would be IF the ->get_property function calls power_supply_get_drvdata() to get a pointer to the drvdata, *di, and then dereferences and uses the pointer, di->bat, which may NOT be initialised yet. I am wondering how likely this is as it seems a bit daft to do this, unless you are doing something like the bq27xxx driver is attempting to do. Again IMO the problem is related to how the bq27xxx driver has implemented the status update. Cheers Jon
On 5/25/2016 1:26 PM, Jon Hunter wrote: > > On 25/05/16 17:36, Rhyland Klein wrote: > > ... > >> I can see that getting the temperature could work. I would point out >> that I don't see any recent changes to bq27xxx or the power_supply_core >> that would imply this is a regression. My guess is that up until now, >> for devices that support the TEMP property, CONFIG_THERMAL isn't been >> enabled. >> >> So here are my thoughts.... we can do 2 things here: >> >> 1) patch bq27xxx in some manner that will allow the bq27xxx driver to >> work report the temp during register (such as above patch). >> 2) Patch the core to avoid using get_property callback during registration. > > I wonder if #2 will cause other problems for other devices as this > really changes the functionality. > >> I think for our immediate concern and crash, #1 is fine. It will work >> and is fine. I however think this is just a symptom of the larger issue >> (#2). In this case, the problem we see is that di->bat is used before it >> is set, and we have a way around it. However, for EVERY device that >> registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to >> receive a call with its relative di->bat uninitialized too. > > I don't think we are understanding each other and I still think that > this could be specific to the bq27xxx. And here is why ... > > The power supply driver is going to receive a call to it's > ->get_property() function with a *VALID* pointer to the power_supply > structure, *psy. When initialised, di->bat == psy (assuming bq27xxx > naming) and so in other words, they point to the same thing. Therefore, > in the normal case, you should not need to access di->bat from within > ->get_property() because you already have a valid pointer to the > power_supply structure, *psy. > > So the ONLY problem would be IF the ->get_property function calls > power_supply_get_drvdata() to get a pointer to the drvdata, *di, and > then dereferences and uses the pointer, di->bat, which may NOT be > initialised yet. I am wondering how likely this is as it seems a bit > daft to do this, unless you are doing something like the bq27xxx driver > is attempting to do. > > Again IMO the problem is related to how the bq27xxx driver has > implemented the status update. > And you might be completely correct, that is something that can only happen specifically with the bq27xxx driver. In which case, making the fix there should be the fix. I just know from the commit log (and some previous work with power supply drivers) that the case of get_property being called during registration has caused problems before. That's why I am trying to make sure we cover the generic case if it exists. Using scheduled work is common for power_supplies to regularly update their status. As for your proposed patches for bq27xxx, I think the latest one you suggested (@12:36PM EST) with the change for battery_update->battery_poll as well makes the most sense for bq27xxx. I would like to point out though that if we patch this, the cache won't be populated for the first TEMP request, which has the same end effect as the patch I proposed to power_supply_read_temp. I believe both will return 0 for the temp. I think that patch would work just fine in place of what I suggested for this specific crash. -rhyland
On 05/25/2016 09:44 PM, Rhyland Klein wrote: > On 5/25/2016 1:26 PM, Jon Hunter wrote: >> >> On 25/05/16 17:36, Rhyland Klein wrote: >> >> ... >> >>> I can see that getting the temperature could work. I would point out >>> that I don't see any recent changes to bq27xxx or the power_supply_core >>> that would imply this is a regression. My guess is that up until now, >>> for devices that support the TEMP property, CONFIG_THERMAL isn't been >>> enabled. >>> >>> So here are my thoughts.... we can do 2 things here: >>> >>> 1) patch bq27xxx in some manner that will allow the bq27xxx driver to >>> work report the temp during register (such as above patch). >>> 2) Patch the core to avoid using get_property callback during registration. >> >> I wonder if #2 will cause other problems for other devices as this >> really changes the functionality. >> >>> I think for our immediate concern and crash, #1 is fine. It will work >>> and is fine. I however think this is just a symptom of the larger issue >>> (#2). In this case, the problem we see is that di->bat is used before it >>> is set, and we have a way around it. However, for EVERY device that >>> registers and has TEMP prop (and CONFIG_THERMAL enabled) it is going to >>> receive a call with its relative di->bat uninitialized too. >> >> I don't think we are understanding each other and I still think that >> this could be specific to the bq27xxx. And here is why ... >> >> The power supply driver is going to receive a call to it's >> ->get_property() function with a *VALID* pointer to the power_supply >> structure, *psy. When initialised, di->bat == psy (assuming bq27xxx >> naming) and so in other words, they point to the same thing. Therefore, >> in the normal case, you should not need to access di->bat from within >> ->get_property() because you already have a valid pointer to the >> power_supply structure, *psy. >> >> So the ONLY problem would be IF the ->get_property function calls >> power_supply_get_drvdata() to get a pointer to the drvdata, *di, and >> then dereferences and uses the pointer, di->bat, which may NOT be >> initialised yet. I am wondering how likely this is as it seems a bit >> daft to do this, unless you are doing something like the bq27xxx driver >> is attempting to do. >> >> Again IMO the problem is related to how the bq27xxx driver has >> implemented the status update. >> > > And you might be completely correct, that is something that can only > happen specifically with the bq27xxx driver. In which case, making the > fix there should be the fix. I just know from the commit log (and some > previous work with power supply drivers) that the case of get_property > being called during registration has caused problems before. That's why > I am trying to make sure we cover the generic case if it exists. Using > scheduled work is common for power_supplies to regularly update their > status. > > As for your proposed patches for bq27xxx, I think the latest one you > suggested (@12:36PM EST) with the change for > battery_update->battery_poll as well makes the most sense for bq27xxx. I > would like to point out though that if we patch this, the cache won't be > populated for the first TEMP request, which has the same end effect as > the patch I proposed to power_supply_read_temp. I believe both will > return 0 for the temp. > > I think that patch would work just fine in place of what I suggested for > this specific crash. Hello all, Indeed I was struggling with similar issue in bq27x00_battery. The issue was introduced by... me :( when moving the ownership of power supply structure from driver to the core. However IMHO my change exposed the fundamental problem with power supply. Anyway a fix for this issue was: 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on early uevent) AFAIU, this fix no longer fixes all the issues, right? As for the fundamental problem, the power supply core should not call back the driver (get_property()) until the probe ends. Even if the di->bat was initialized, some other fields of driver could not be set yet. In general, the probe did not end so we should avoid calling driver internal functions. In this particular problem: 1. Fix for the driver (!di->bat) is okay... but it won't solve the problem in general. 2. I think the core should handle it somehow... Best regards, Krzysztof -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05/27/2016 10:37 AM, Krzysztof Kozlowski wrote: >> And you might be completely correct, that is something that can only >> happen specifically with the bq27xxx driver. In which case, making the >> fix there should be the fix. I just know from the commit log (and some >> previous work with power supply drivers) that the case of get_property >> being called during registration has caused problems before. That's why >> I am trying to make sure we cover the generic case if it exists. Using >> scheduled work is common for power_supplies to regularly update their >> status. >> >> As for your proposed patches for bq27xxx, I think the latest one you >> suggested (@12:36PM EST) with the change for >> battery_update->battery_poll as well makes the most sense for bq27xxx. I >> would like to point out though that if we patch this, the cache won't be >> populated for the first TEMP request, which has the same end effect as >> the patch I proposed to power_supply_read_temp. I believe both will >> return 0 for the temp. >> >> I think that patch would work just fine in place of what I suggested for >> this specific crash. > > Hello all, > > Indeed I was struggling with similar issue in bq27x00_battery. The issue > was introduced by... me :( when moving the ownership of power supply > structure from driver to the core. However IMHO my change exposed the > fundamental problem with power supply. > > Anyway a fix for this issue was: > 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on > early uevent) > AFAIU, this fix no longer fixes all the issues, right? > > As for the fundamental problem, the power supply core should not call > back the driver (get_property()) until the probe ends. Even if the > di->bat was initialized, some other fields of driver could not be set > yet. In general, the probe did not end so we should avoid calling driver > internal functions. > > In this particular problem: > 1. Fix for the driver (!di->bat) is okay... but it won't solve the > problem in general. > 2. I think the core should handle it somehow... I was thinking about some more generic solutions for that. Few ideas: 1. Split the power_supply_register() into register + manual call to power_supply_changed(). Each driver will have to call the power_supply_changed() when it is ready to do it. After that call, it is expected that driver provides everything for power supply (it can receive callbacks). 2. Since 7f1a57fdd6cb ("power_supply: Fix possible NULL pointer dereference on early uevent") the power_supply_changed() is called from a deferred work. Separate thread. We can introduce (in the core only) a mutex: power_supply_deferred_register_work() { psy->mutex_lock(); power_supply_changed(psy); psy->mutex_unlock(); } and add it also to all of API: power_supply_get_property() { psy->mutex_lock(); psy->get_property(); psy->mutex_unlock(); } The changes would be limited only to the core but we will introduce strict locking over all of the psy callbacks. 3. We can go back to previous API, leaving the allocation done by the core: some_drv_probe() { err = power_supply_register(&some_drv->psy...); } I think the second solution seems to be the most self-contained and robust. Best regards, Krzysztof -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Krzysztof, On 27/05/16 09:37, Krzysztof Kozlowski wrote: ... > Indeed I was struggling with similar issue in bq27x00_battery. The issue > was introduced by... me :( when moving the ownership of power supply > structure from driver to the core. However IMHO my change exposed the > fundamental problem with power supply. > > Anyway a fix for this issue was: > 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on > early uevent) > AFAIU, this fix no longer fixes all the issues, right? > > As for the fundamental problem, the power supply core should not call > back the driver (get_property()) until the probe ends. Even if the > di->bat was initialized, some other fields of driver could not be set > yet. In general, the probe did not end so we should avoid calling driver > internal functions. For my understanding, can you elaborate why the power-supply core should not call back to the drivers ->get_property() before the probe ends? I assume that registering the power-supply should be the last thing done in the probe and so the power-supply should be configured at that point. The problems with the bq27xxx seem to stem from the periodic update of the bq27xxx status and so it is not clear to me that this is a generic problem for all power-supply devices. Cheers Jon
On 05/27/2016 12:28 PM, Jon Hunter wrote: > Hi Krzysztof, > > On 27/05/16 09:37, Krzysztof Kozlowski wrote: > > ... > >> Indeed I was struggling with similar issue in bq27x00_battery. The issue >> was introduced by... me :( when moving the ownership of power supply >> structure from driver to the core. However IMHO my change exposed the >> fundamental problem with power supply. >> >> Anyway a fix for this issue was: >> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >> early uevent) >> AFAIU, this fix no longer fixes all the issues, right? >> >> As for the fundamental problem, the power supply core should not call >> back the driver (get_property()) until the probe ends. Even if the >> di->bat was initialized, some other fields of driver could not be set >> yet. In general, the probe did not end so we should avoid calling driver >> internal functions. > > For my understanding, can you elaborate why the power-supply core should > not call back to the drivers ->get_property() before the probe ends? I > assume that registering the power-supply should be the last thing done > in the probe and so the power-supply should be configured at that point. It is not only about power supply but other resources allocated by the driver. If the power_supply_register() is a last call, then no problem. But if not, then these resources won't be available. Actually I exaggerated a little bit as a fundamental problem as this is quite common pattern. When driver provides something (like power supply) then after registration it should be ready for calls coming from the core or user space. It does not have to be power supply. It might be exposing sysfs entries or file operations (exposed before calling power_supply_register()). > The problems with the bq27xxx seem to stem from the periodic update of > the bq27xxx status and so it is not clear to me that this is a generic > problem for all power-supply devices. Initially, the generic problem was that the core would call back the driver from power_supply_register() in a synchronous way through power_supply_changed(). The commit 7f1a57fdd6c changed it to an asynchronous call. Here it looks like the same problem - the power_supply_register() calls thermal which calls thermal_zone_device_update() and we are back at the driver... before finishing power_supply_register() call. Best regards, Krzysztof -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 27/05/16 12:46, Krzysztof Kozlowski wrote: > On 05/27/2016 12:28 PM, Jon Hunter wrote: >> Hi Krzysztof, >> >> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >> >> ... >> >>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>> was introduced by... me :( when moving the ownership of power supply >>> structure from driver to the core. However IMHO my change exposed the >>> fundamental problem with power supply. >>> >>> Anyway a fix for this issue was: >>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>> early uevent) >>> AFAIU, this fix no longer fixes all the issues, right? >>> >>> As for the fundamental problem, the power supply core should not call >>> back the driver (get_property()) until the probe ends. Even if the >>> di->bat was initialized, some other fields of driver could not be set >>> yet. In general, the probe did not end so we should avoid calling driver >>> internal functions. >> >> For my understanding, can you elaborate why the power-supply core should >> not call back to the drivers ->get_property() before the probe ends? I >> assume that registering the power-supply should be the last thing done >> in the probe and so the power-supply should be configured at that point. > > It is not only about power supply but other resources allocated by the > driver. If the power_supply_register() is a last call, then no problem. > But if not, then these resources won't be available. > > Actually I exaggerated a little bit as a fundamental problem as this is > quite common pattern. When driver provides something (like power supply) > then after registration it should be ready for calls coming from the > core or user space. It does not have to be power supply. It might be > exposing sysfs entries or file operations (exposed before calling > power_supply_register()). Right, exactly when you register with the power-supply core the device better be ready so that handle any incoming calls. >> The problems with the bq27xxx seem to stem from the periodic update of >> the bq27xxx status and so it is not clear to me that this is a generic >> problem for all power-supply devices. > > Initially, the generic problem was that the core would call back the > driver from power_supply_register() in a synchronous way through > power_supply_changed(). The commit 7f1a57fdd6c changed it to an > asynchronous call. Here it looks like the same problem - the > power_supply_register() calls thermal which calls > thermal_zone_device_update() and we are back at the driver... before > finishing power_supply_register() call. So I am still not convinced this is a generic problem but a problem with the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided if we did something like ... http://marc.info/?l=linux-kernel&m=146425896332433&w=2 AFAICT in most cases, in ->get_property() you should have no need to access a driver's equivalent of di->bat, because you have already been passed a pointer to this via the *psy argument. Cheers Jon
On 05/27/2016 02:17 PM, Jon Hunter wrote: > > On 27/05/16 12:46, Krzysztof Kozlowski wrote: >> On 05/27/2016 12:28 PM, Jon Hunter wrote: >>> Hi Krzysztof, >>> >>> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >>> >>> ... >>> >>>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>>> was introduced by... me :( when moving the ownership of power supply >>>> structure from driver to the core. However IMHO my change exposed the >>>> fundamental problem with power supply. >>>> >>>> Anyway a fix for this issue was: >>>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>>> early uevent) >>>> AFAIU, this fix no longer fixes all the issues, right? >>>> >>>> As for the fundamental problem, the power supply core should not call >>>> back the driver (get_property()) until the probe ends. Even if the >>>> di->bat was initialized, some other fields of driver could not be set >>>> yet. In general, the probe did not end so we should avoid calling driver >>>> internal functions. >>> >>> For my understanding, can you elaborate why the power-supply core should >>> not call back to the drivers ->get_property() before the probe ends? I >>> assume that registering the power-supply should be the last thing done >>> in the probe and so the power-supply should be configured at that point. >> >> It is not only about power supply but other resources allocated by the >> driver. If the power_supply_register() is a last call, then no problem. >> But if not, then these resources won't be available. >> >> Actually I exaggerated a little bit as a fundamental problem as this is >> quite common pattern. When driver provides something (like power supply) >> then after registration it should be ready for calls coming from the >> core or user space. It does not have to be power supply. It might be >> exposing sysfs entries or file operations (exposed before calling >> power_supply_register()). > > Right, exactly when you register with the power-supply core the device > better be ready so that handle any incoming calls. Yes, the unusual thing here is that the device is called back directly from the power_supply_register() call. > >>> The problems with the bq27xxx seem to stem from the periodic update of >>> the bq27xxx status and so it is not clear to me that this is a generic >>> problem for all power-supply devices. >> >> Initially, the generic problem was that the core would call back the >> driver from power_supply_register() in a synchronous way through >> power_supply_changed(). The commit 7f1a57fdd6c changed it to an >> asynchronous call. Here it looks like the same problem - the >> power_supply_register() calls thermal which calls >> thermal_zone_device_update() and we are back at the driver... before >> finishing power_supply_register() call. > > So I am still not convinced this is a generic problem but a problem with > the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided > if we did something like ... > > http://marc.info/?l=linux-kernel&m=146425896332433&w=2 > > AFAICT in most cases, in ->get_property() you should have no need to > access a driver's equivalent of di->bat, because you have already been > passed a pointer to this via the *psy argument. I agree that get_property() shouldn't access di->bat. However if it is not forbidden (at least by documentation) then someone might just do it because he does not know about such requirement. Best regards, Krzysztof -- To unsubscribe from this list: send the line "unsubscribe linux-tegra" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 27/05/16 13:55, Krzysztof Kozlowski wrote: > On 05/27/2016 02:17 PM, Jon Hunter wrote: >> >> On 27/05/16 12:46, Krzysztof Kozlowski wrote: >>> On 05/27/2016 12:28 PM, Jon Hunter wrote: >>>> Hi Krzysztof, >>>> >>>> On 27/05/16 09:37, Krzysztof Kozlowski wrote: >>>> >>>> ... >>>> >>>>> Indeed I was struggling with similar issue in bq27x00_battery. The issue >>>>> was introduced by... me :( when moving the ownership of power supply >>>>> structure from driver to the core. However IMHO my change exposed the >>>>> fundamental problem with power supply. >>>>> >>>>> Anyway a fix for this issue was: >>>>> 7f1a57fdd6cb6e7b (power_supply: Fix possible NULL pointer dereference on >>>>> early uevent) >>>>> AFAIU, this fix no longer fixes all the issues, right? >>>>> >>>>> As for the fundamental problem, the power supply core should not call >>>>> back the driver (get_property()) until the probe ends. Even if the >>>>> di->bat was initialized, some other fields of driver could not be set >>>>> yet. In general, the probe did not end so we should avoid calling driver >>>>> internal functions. >>>> >>>> For my understanding, can you elaborate why the power-supply core should >>>> not call back to the drivers ->get_property() before the probe ends? I >>>> assume that registering the power-supply should be the last thing done >>>> in the probe and so the power-supply should be configured at that point. >>> >>> It is not only about power supply but other resources allocated by the >>> driver. If the power_supply_register() is a last call, then no problem. >>> But if not, then these resources won't be available. >>> >>> Actually I exaggerated a little bit as a fundamental problem as this is >>> quite common pattern. When driver provides something (like power supply) >>> then after registration it should be ready for calls coming from the >>> core or user space. It does not have to be power supply. It might be >>> exposing sysfs entries or file operations (exposed before calling >>> power_supply_register()). >> >> Right, exactly when you register with the power-supply core the device >> better be ready so that handle any incoming calls. > > Yes, the unusual thing here is that the device is called back directly > from the power_supply_register() call. > >> >>>> The problems with the bq27xxx seem to stem from the periodic update of >>>> the bq27xxx status and so it is not clear to me that this is a generic >>>> problem for all power-supply devices. >>> >>> Initially, the generic problem was that the core would call back the >>> driver from power_supply_register() in a synchronous way through >>> power_supply_changed(). The commit 7f1a57fdd6c changed it to an >>> asynchronous call. Here it looks like the same problem - the >>> power_supply_register() calls thermal which calls >>> thermal_zone_device_update() and we are back at the driver... before >>> finishing power_supply_register() call. >> >> So I am still not convinced this is a generic problem but a problem with >> the bq27xxx. In fact, I think that commit 7f1a57fdd6c could be avoided >> if we did something like ... >> >> http://marc.info/?l=linux-kernel&m=146425896332433&w=2 >> >> AFAICT in most cases, in ->get_property() you should have no need to >> access a driver's equivalent of di->bat, because you have already been >> passed a pointer to this via the *psy argument. > > I agree that get_property() shouldn't access di->bat. However if it is > not forbidden (at least by documentation) then someone might just do it > because he does not know about such requirement. In that case, shouldn't the driver should check that di->bat is valid before anyone attempts to dereference it? However, if you and/or Rhyland have a generic fix for preventing this, please go ahead and propose it. Cheers Jon
diff --git a/drivers/power/bq27xxx_battery.c b/drivers/power/bq27xxx_battery.c index 45f6ebf88df6..1334ed522332 100644 --- a/drivers/power/bq27xxx_battery.c +++ b/drivers/power/bq27xxx_battery.c @@ -733,6 +733,9 @@ static void bq27xxx_battery_poll(struct work_struct *work) container_of(work, struct bq27xxx_device_info, work.work); + if (!di->bat) + return; + bq27xxx_battery_update(di); if (poll_interval > 0) {