diff mbox series

[01/18] pwm: Provide devm_pwmchip_alloc() function

Message ID 20230718181849.3947851-2-u.kleine-koenig@pengutronix.de
State Rejected
Headers show
Series pwm: Provide devm_pwmchip_alloc() function | expand

Commit Message

Uwe Kleine-König July 18, 2023, 6:18 p.m. UTC
This function allocates a struct pwm_chip and driver data. Compared to
the status quo the split into pwm_chip and driver data is new, otherwise
it doesn't change anything relevant (yet).

The intention is that after all drivers are switched to use this
allocation function, its possible to add a struct device to struct
pwm_chip to properly track the latter's lifetime without touching all
drivers again. Proper lifetime tracking is a necessary precondition to
introduce character device support for PWMs (that implements atomic
setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
userspace support).

The new function pwmchip_priv() (obviously?) only works for chips
allocated with devm_pwmchip_alloc().

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
 .../driver-api/driver-model/devres.rst        |  1 +
 Documentation/driver-api/pwm.rst              | 10 ++++----
 drivers/pwm/core.c                            | 23 +++++++++++++++++++
 include/linux/pwm.h                           |  4 ++++
 4 files changed, 33 insertions(+), 5 deletions(-)

Comments

Andy Shevchenko July 18, 2023, 8:05 p.m. UTC | #1
On Tue, Jul 18, 2023 at 08:18:32PM +0200, Uwe Kleine-König wrote:
> This function allocates a struct pwm_chip and driver data. Compared to
> the status quo the split into pwm_chip and driver data is new, otherwise
> it doesn't change anything relevant (yet).
> 
> The intention is that after all drivers are switched to use this
> allocation function, its possible to add a struct device to struct
> pwm_chip to properly track the latter's lifetime without touching all
> drivers again. Proper lifetime tracking is a necessary precondition to
> introduce character device support for PWMs (that implements atomic
> setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
> userspace support).
> 
> The new function pwmchip_priv() (obviously?) only works for chips
> allocated with devm_pwmchip_alloc().

...

> +void *pwmchip_priv(struct pwm_chip *chip)
> +{
> +	return (char *)chip + ALIGN(sizeof(*chip), 32);

Why 32? I haven't found any explanation on the choice. I can understand arch
minimum align, but hard coded value is a bit hard to get.

> +}

...

> +struct pwm_chip *devm_pwmchip_alloc(struct device *parent, size_t sizeof_priv)
> +{
> +	struct pwm_chip *chip;
> +	size_t alloc_size;

> +	alloc_size = ALIGN(sizeof(*chip), 32) + sizeof_priv;

Ditto.

Shouldn't it use a macro from overflow.h?

> +	chip = devm_kzalloc(parent, alloc_size, GFP_KERNEL);
> +	if (!chip)
> +		return NULL;
> +
> +	chip->dev = parent;
> +
> +	return chip;
> +}
Thierry Reding July 19, 2023, 7:59 a.m. UTC | #2
On Tue, Jul 18, 2023 at 08:18:32PM +0200, Uwe Kleine-König wrote:
> This function allocates a struct pwm_chip and driver data. Compared to
> the status quo the split into pwm_chip and driver data is new, otherwise
> it doesn't change anything relevant (yet).
> 
> The intention is that after all drivers are switched to use this
> allocation function, its possible to add a struct device to struct
> pwm_chip to properly track the latter's lifetime without touching all
> drivers again. Proper lifetime tracking is a necessary precondition to
> introduce character device support for PWMs (that implements atomic
> setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
> userspace support).
> 
> The new function pwmchip_priv() (obviously?) only works for chips
> allocated with devm_pwmchip_alloc().

If this is supposed to be similar to the GPIO chardev, why doesn't GPIO
require this way of allocating a struct gpio_chip? I'm not a fan of
doing all this upfront work without seeing where this is ultimately
headed. Please hold off on reworking everything until you have a
complete proposal that can be reviewed in full.

Thierry
Uwe Kleine-König July 19, 2023, 8:57 a.m. UTC | #3
Hello Thierry,

On Wed, Jul 19, 2023 at 09:59:29AM +0200, Thierry Reding wrote:
> On Tue, Jul 18, 2023 at 08:18:32PM +0200, Uwe Kleine-König wrote:
> > This function allocates a struct pwm_chip and driver data. Compared to
> > the status quo the split into pwm_chip and driver data is new, otherwise
> > it doesn't change anything relevant (yet).
> > 
> > The intention is that after all drivers are switched to use this
> > allocation function, its possible to add a struct device to struct
> > pwm_chip to properly track the latter's lifetime without touching all
> > drivers again. Proper lifetime tracking is a necessary precondition to
> > introduce character device support for PWMs (that implements atomic
> > setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
> > userspace support).
> > 
> > The new function pwmchip_priv() (obviously?) only works for chips
> > allocated with devm_pwmchip_alloc().
> 
> If this is supposed to be similar to the GPIO chardev, why doesn't GPIO
> require this way of allocating a struct gpio_chip?

For each gpio_chip there is an external struct gpio_device that is
allocated in gpiochip_add(). This works but has some downsides. E.g. 
gpiochip_remove() has to grab a mutex that is also held while userspace
polls a gpioctrl device and so gpiochip_remove blocks. While this works
(as in: it doesn't crash) it's in the category "not as good as
possible".

I'm convinced that getting this right before adding the complexity of
chardev support is a good idea.

> I'm not a fan of doing all this upfront work without seeing where this
> is ultimately headed. Please hold off on reworking everything until
> you have a complete proposal that can be reviewed in full.

I see some benefit of the conversion started in this patch set stand
alone, too. But I agree it's for now a bit theoretic.

With the goal to pick the agreed good approach for pwm this is a bigger
pile of work. Completing it in private and only present the complete
proposal has the downside that I get feedback for the house's seating
only when the roof is assembled. While me spending much time isn't your
problem, I have a problem with it. Splitting the task in chunks that go
in one after another is surely the quicker and more effective approach.

So it would be great if you could understand the issue tackled here even
though there are subsystems that do it differently (and less optimal)
and the usecase isn't complete yet. 

Thanks
Uwe
Uwe Kleine-König July 19, 2023, 10:31 a.m. UTC | #4
Hello Andy,

On Tue, Jul 18, 2023 at 11:05:29PM +0300, Andy Shevchenko wrote:
> On Tue, Jul 18, 2023 at 08:18:32PM +0200, Uwe Kleine-König wrote:
> > This function allocates a struct pwm_chip and driver data. Compared to
> > the status quo the split into pwm_chip and driver data is new, otherwise
> > it doesn't change anything relevant (yet).
> > 
> > The intention is that after all drivers are switched to use this
> > allocation function, its possible to add a struct device to struct
> > pwm_chip to properly track the latter's lifetime without touching all
> > drivers again. Proper lifetime tracking is a necessary precondition to
> > introduce character device support for PWMs (that implements atomic
> > setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
> > userspace support).
> > 
> > The new function pwmchip_priv() (obviously?) only works for chips
> > allocated with devm_pwmchip_alloc().
> 
> ...
> 
> > +void *pwmchip_priv(struct pwm_chip *chip)
> > +{
> > +	return (char *)chip + ALIGN(sizeof(*chip), 32);
> 
> Why 32? I haven't found any explanation on the choice. I can understand arch
> minimum align, but hard coded value is a bit hard to get.

I copied that part from netdev_priv (without introducing the equivalent
of NETDEV_ALIGN). The 32 there isn't motivated in a comment, and i
didn't think much about it. Alternatives that I'm aware of are:

	dma_get_cache_alignment() (spi)
	Use a struct (counter's counter_device_allochelper uses ____cacheline_aligned)

(but I wonder about dma_get_cache_alignment() which might return 1 on
some archs and then doesn't even ensure natural aligning for shorts.)

> > +}
> 
> ...
> 
> > +struct pwm_chip *devm_pwmchip_alloc(struct device *parent, size_t sizeof_priv)
> > +{
> > +	struct pwm_chip *chip;
> > +	size_t alloc_size;
> 
> > +	alloc_size = ALIGN(sizeof(*chip), 32) + sizeof_priv;
> 
> Ditto.

Of course this needs to match the above. spi uses dev_get_drvdata on the
controller's dev. Also a nice idea.

> Shouldn't it use a macro from overflow.h?

Yupp, that would make sense. Something like:

	if (unlikely(check_add_overflow(ALIGN(sizeof(*chip), 32), sizeof_priv, &alloc_size)))
		return NULL;

(with 32 replaced by something with a better name.)

Thanks for your feedback,
Uwe
Uwe Kleine-König July 25, 2023, 9:10 p.m. UTC | #5
On Wed, Jul 19, 2023 at 09:59:29AM +0200, Thierry Reding wrote:
> On Tue, Jul 18, 2023 at 08:18:32PM +0200, Uwe Kleine-König wrote:
> > This function allocates a struct pwm_chip and driver data. Compared to
> > the status quo the split into pwm_chip and driver data is new, otherwise
> > it doesn't change anything relevant (yet).
> > 
> > The intention is that after all drivers are switched to use this
> > allocation function, its possible to add a struct device to struct
> > pwm_chip to properly track the latter's lifetime without touching all
> > drivers again. Proper lifetime tracking is a necessary precondition to
> > introduce character device support for PWMs (that implements atomic
> > setting and doesn't suffer from the sysfs overhead of the /sys/class/pwm
> > userspace support).
> > 
> > The new function pwmchip_priv() (obviously?) only works for chips
> > allocated with devm_pwmchip_alloc().
> 
> If this is supposed to be similar to the GPIO chardev, why doesn't GPIO
> require this way of allocating a struct gpio_chip? I'm not a fan of
> doing all this upfront work without seeing where this is ultimately
> headed. Please hold off on reworking everything until you have a
> complete proposal that can be reviewed in full.

I'm working on that and already have a patch stack with more than 100
patches, many of them are cleanups that I found while working on each
PWM driver (most of these are already posted to the linux-pwm list). The
biggest part is to convert each of the 50+ pwm drivers to
pwmchip_alloc().

Today I managed to trigger the problem I intend to address with this
series. My machine to test this on is an stm32mp157. To be able to
trigger the problem reliably I applied the following patches on top of
v6.5-rc1:

 - pwm: stm32: Don't modify HW state in .remove() callback
   This is a cleanup that I already sent out.
   https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
   The purpose for reproducing the problem is to not trigger further
   calls to the apply callback.

 - The following patch:

diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
index 687967d3265f..c7fc02b0fa3c 100644
--- a/drivers/pwm/pwm-stm32.c
+++ b/drivers/pwm/pwm-stm32.c
@@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
 	int ret;
 
+	dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
+	msleep(5000);
+	dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
+
 	enabled = pwm->state.enabled;
 
 	if (enabled && !state->enabled) {
@@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
 {
 	struct stm32_pwm *priv = platform_get_drvdata(pdev);
 
+	dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
 	pwmchip_remove(&priv->chip);
+	dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
+
+	priv->regmap = NULL;
 }
 
 static int __maybe_unused stm32_pwm_suspend(struct device *dev)

The first hunk is only there to widen the race window. The second is to
give some diagnostics and make stm32_pwm_apply() crash if it continues
to run after the msleep. (Without it it didn't crash reproducibly, don't
understand why. *shrug*)

The device tree contains a pwm-fan device making use of one of the PWMs.

Now I do the following:

	echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind

Unbinding the fan device has two effects:

 - The device link between fan and pwm looses its property to unbind fan
   when pwm gets unbound.
   (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
 - It calls pwm_fan_cleanup() which triggers a call to
   pwm_apply_state().

So when the pwm device gets unbound the first thread is sleeping in
stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
priv->regmap to NULL. Then a few seconds later the first thread wakes up
in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!

This looks as follows:

root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
[  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
[  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
[  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
[  192.240727] 8<--- cut here ---
[  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
...

Even without the crash you can see that stm32_pwm_apply() is still
running after pwmchip_remove() completed.

I'm unsure if the device link could be improved here to ensure that the
fan is completely unbound even if it started unbinding already before
the pwm device gets unbound. (And if it could, would this fit the device
links purpose and so be a sensible improvement?)

If not, the only possible other fix is to make sure that the pwm
callbacks are serialized with pwmchip_remove() and stop calling the
driver callbacks when pwmchip_remove() was called. For that the
pwm_chip struct must stay around until all consumers are really gone. So
the pwm_chip must not be allocated using devm_kzalloc for the pwm's
parent device.

Am I missing something? Is this good enough to convince you that we need
the concept of devm_pwmchip_alloc() from this thread?

My preferred way to continue fixing that would be to get all the
preparing cleanups into next soon to keep my patch stack smaller.
Another pre-condition to serializing the pwm callbacks is (I think) that
all low-level drivers must be fixed to not call pwm-API functions[1].

Then I'll prepare switching all drivers based on this series plus some
more patches to introduce a struct device in struct gpio_chip to track
the lifetime and eventually fix the issue demonstrated above.

While addressing this issue isn't a hard requirement for my final plan
to introduce /dev/pwmchip character devices, fixing it now before the
pwm core is complicated with the character device code should be easier
at least. (Also these character devices introduce another "user" of
pwm_chips and so another reason that these shouldn't be allocated using
devm_kzalloc.)

Best regards
Uwe

[1] first approximation:

	$ git grep -Ew 'pwm_(apply|get_state|is_enabled|set_period|get_period|[sg]et_duty_cycle|get_polarity|enable|disable)' v6.5-rc1 -- drivers/pwm/pwm-*.c | wc -l
	43
Uwe Kleine-König Oct. 10, 2023, 8:05 a.m. UTC | #6
Hello Saravana,

you were pointed out to me as the expert for device links. I found a
problem with these.

On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> Today I managed to trigger the problem I intend to address with this
> series. My machine to test this on is an stm32mp157. To be able to
> trigger the problem reliably I applied the following patches on top of
> v6.5-rc1:
> 
>  - pwm: stm32: Don't modify HW state in .remove() callback
>    This is a cleanup that I already sent out.
>    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
>    The purpose for reproducing the problem is to not trigger further
>    calls to the apply callback.
> 
>  - The following patch:
> 
> diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> index 687967d3265f..c7fc02b0fa3c 100644
> --- a/drivers/pwm/pwm-stm32.c
> +++ b/drivers/pwm/pwm-stm32.c
> @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>  	struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
>  	int ret;
>  
> +	dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> +	msleep(5000);
> +	dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> +
>  	enabled = pwm->state.enabled;
>  
>  	if (enabled && !state->enabled) {
> @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
>  {
>  	struct stm32_pwm *priv = platform_get_drvdata(pdev);
>  
> +	dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
>  	pwmchip_remove(&priv->chip);
> +	dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> +
> +	priv->regmap = NULL;
>  }
>  
>  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> 
> The first hunk is only there to widen the race window. The second is to
> give some diagnostics and make stm32_pwm_apply() crash if it continues
> to run after the msleep. (Without it it didn't crash reproducibly, don't
> understand why. *shrug*)
> 
> The device tree contains a pwm-fan device making use of one of the PWMs.
> 
> Now I do the following:
> 
> 	echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> 
> Unbinding the fan device has two effects:
> 
>  - The device link between fan and pwm looses its property to unbind fan
>    when pwm gets unbound.
>    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
>  - It calls pwm_fan_cleanup() which triggers a call to
>    pwm_apply_state().
> 
> So when the pwm device gets unbound the first thread is sleeping in
> stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> priv->regmap to NULL. Then a few seconds later the first thread wakes up
> in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> 
> This looks as follows:
> 
> root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> [  192.240727] 8<--- cut here ---
> [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> ...
> 
> Even without the crash you can see that stm32_pwm_apply() is still
> running after pwmchip_remove() completed.
> 
> I'm unsure if the device link could be improved here to ensure that the
> fan is completely unbound even if it started unbinding already before
> the pwm device gets unbound. (And if it could, would this fit the device
> links purpose and so be a sensible improvement?)

While I think that there is something to be done in the pwm core that
this doesn't explode (i.e. do proper lifetime tracking such that a
pwm_chip doesn't disappear while still being used---and I'm working on
that) I expected that the device links between pwm consumer and provider
would prevent the above described oops, too. But somehow the fan already
going away (but still using the PWM) when the PWM is unbound, results in
the PWM disappearing before the fan is completely gone.

Is this expected, or a problem that can (and should?) be fixed?

If you need more context or a tester, don't hesitate to ask.

Best regards
Uwe
Saravana Kannan Oct. 13, 2023, 9:42 p.m. UTC | #7
On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello Saravana,
>
> you were pointed out to me as the expert for device links. I found a
> problem with these.
>
> On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > Today I managed to trigger the problem I intend to address with this
> > series. My machine to test this on is an stm32mp157. To be able to
> > trigger the problem reliably I applied the following patches on top of
> > v6.5-rc1:
> >
> >  - pwm: stm32: Don't modify HW state in .remove() callback
> >    This is a cleanup that I already sent out.
> >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> >    The purpose for reproducing the problem is to not trigger further
> >    calls to the apply callback.
> >
> >  - The following patch:
> >
> > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > index 687967d3265f..c7fc02b0fa3c 100644
> > --- a/drivers/pwm/pwm-stm32.c
> > +++ b/drivers/pwm/pwm-stm32.c
> > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> >       int ret;
> >
> > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > +     msleep(5000);
> > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > +
> >       enabled = pwm->state.enabled;
> >
> >       if (enabled && !state->enabled) {
> > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> >  {
> >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> >
> > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> >       pwmchip_remove(&priv->chip);
> > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > +
> > +     priv->regmap = NULL;
> >  }
> >
> >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> >
> > The first hunk is only there to widen the race window. The second is to
> > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > understand why. *shrug*)
> >
> > The device tree contains a pwm-fan device making use of one of the PWMs.
> >
> > Now I do the following:
> >
> >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> >
> > Unbinding the fan device has two effects:
> >
> >  - The device link between fan and pwm looses its property to unbind fan
> >    when pwm gets unbound.
> >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> >  - It calls pwm_fan_cleanup() which triggers a call to
> >    pwm_apply_state().
> >
> > So when the pwm device gets unbound the first thread is sleeping in
> > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> >
> > This looks as follows:
> >
> > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > [  192.240727] 8<--- cut here ---
> > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > ...
> >
> > Even without the crash you can see that stm32_pwm_apply() is still
> > running after pwmchip_remove() completed.
> >
> > I'm unsure if the device link could be improved here to ensure that the
> > fan is completely unbound even if it started unbinding already before
> > the pwm device gets unbound. (And if it could, would this fit the device
> > links purpose and so be a sensible improvement?)
>
> While I think that there is something to be done in the pwm core that
> this doesn't explode (i.e. do proper lifetime tracking such that a
> pwm_chip doesn't disappear while still being used---and I'm working on
> that) I expected that the device links between pwm consumer and provider
> would prevent the above described oops, too. But somehow the fan already
> going away (but still using the PWM) when the PWM is unbound, results in
> the PWM disappearing before the fan is completely gone.
>
> Is this expected, or a problem that can (and should?) be fixed?

I didn't read your full series, but I read this email. With what's in
this email, the problem seems to be in the driver or the pwm
framework. The pwm driver/framework can't tell the driver core that
you successfully unbound (returning from .remove()) before you have
finish all your ongoing transactions with the device. If your
"apply()" is still running, you need to make sure it's complete before
.remove() does any resource releasing/clean up.

Also, how is the consumer driver's .remove() succeeding if it has an
ongoing pwm call()? This all sounds like insufficient locking and
critical region protection in both the consumer and supplier.

Device links can't do anything here because you are giving it wrong
info -- that the unbind was successful before it actually is.

Device links will and can make sure that the consumer is unbound
successfully before the unbind is called on the supplier. And it looks
like that's still true here.

-Saravana

>
> If you need more context or a tester, don't hesitate to ask.
>
> Best regards
> Uwe
>
> --
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |
Uwe Kleine-König Oct. 14, 2023, 4:17 p.m. UTC | #8
On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> >
> > Hello Saravana,
> >
> > you were pointed out to me as the expert for device links. I found a
> > problem with these.
> >
> > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > Today I managed to trigger the problem I intend to address with this
> > > series. My machine to test this on is an stm32mp157. To be able to
> > > trigger the problem reliably I applied the following patches on top of
> > > v6.5-rc1:
> > >
> > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > >    This is a cleanup that I already sent out.
> > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > >    The purpose for reproducing the problem is to not trigger further
> > >    calls to the apply callback.
> > >
> > >  - The following patch:
> > >
> > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > index 687967d3265f..c7fc02b0fa3c 100644
> > > --- a/drivers/pwm/pwm-stm32.c
> > > +++ b/drivers/pwm/pwm-stm32.c
> > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > >       int ret;
> > >
> > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > +     msleep(5000);
> > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > +
> > >       enabled = pwm->state.enabled;
> > >
> > >       if (enabled && !state->enabled) {
> > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > >  {
> > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > >
> > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > >       pwmchip_remove(&priv->chip);
> > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > +
> > > +     priv->regmap = NULL;
> > >  }
> > >
> > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > >
> > > The first hunk is only there to widen the race window. The second is to
> > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > understand why. *shrug*)
> > >
> > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > >
> > > Now I do the following:
> > >
> > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > >
> > > Unbinding the fan device has two effects:
> > >
> > >  - The device link between fan and pwm looses its property to unbind fan
> > >    when pwm gets unbound.
> > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > >  - It calls pwm_fan_cleanup() which triggers a call to
> > >    pwm_apply_state().
> > >
> > > So when the pwm device gets unbound the first thread is sleeping in
> > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > >
> > > This looks as follows:
> > >
> > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > [  192.240727] 8<--- cut here ---
> > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > ...
> > >
> > > Even without the crash you can see that stm32_pwm_apply() is still
> > > running after pwmchip_remove() completed.
> > >
> > > I'm unsure if the device link could be improved here to ensure that the
> > > fan is completely unbound even if it started unbinding already before
> > > the pwm device gets unbound. (And if it could, would this fit the device
> > > links purpose and so be a sensible improvement?)
> >
> > While I think that there is something to be done in the pwm core that
> > this doesn't explode (i.e. do proper lifetime tracking such that a
> > pwm_chip doesn't disappear while still being used---and I'm working on
> > that) I expected that the device links between pwm consumer and provider
> > would prevent the above described oops, too. But somehow the fan already
> > going away (but still using the PWM) when the PWM is unbound, results in
> > the PWM disappearing before the fan is completely gone.
> >
> > Is this expected, or a problem that can (and should?) be fixed?
> 
> I didn't read your full series, but I read this email. With what's in
> this email, the problem seems to be in the driver or the pwm
> framework. The pwm driver/framework can't tell the driver core that
> you successfully unbound (returning from .remove()) before you have
> finish all your ongoing transactions with the device. If your
> "apply()" is still running, you need to make sure it's complete before
> .remove() does any resource releasing/clean up.
> 
> Also, how is the consumer driver's .remove() succeeding if it has an
> ongoing pwm call()?

The thing that works fine and as expected is:

 - trigger unbind of PWM device via sysfs

Because there is a device link PWM provider -> pwm consumer (fan), the
fan is removed and once its gone (and not earlier), the PWM gets unbound.

The failing sequence is:

 - trigger unbind of fan device in userspace thread A via sysfs. The
   fan's remove callback blocks for 5s in pwm_apply_state() and so
   .remove() doesn't complete yet.

 - a second later: trigger unbind of PWM device via sysfs in thread B.
   As before I'd expect that the device link results in waiting for the
   fan to be removed completely, but the PWM is removed immediately.

 - pwm_apply_state's sleep completes (in thread B) and operates on freed
   resources => bang!

> This all sounds like insufficient locking and
> critical region protection in both the consumer and supplier.

My (and I think also Thierry's) expectation was, that the device link
provides the needed synchronisation. But it doesn't as it doesn't block
the PWM provider going away until the fan is completely gone.

> Device links can't do anything here because you are giving it wrong
> info -- that the unbind was successful before it actually is.

The fan's unbind is ongoing, but not complete yet and I'd expect that
the device link blocks unbinding the PWM until the fan is completely
gone. So I think there is no wrong information.

> Device links will and can make sure that the consumer is unbound
> successfully before the unbind is called on the supplier. And it looks
> like that's still true here.

I hope you understood the situation better now and see the problem we
have.

The problem is fixable in the pwm framework (and I'm working on that),
but I think there is also something to improve around devicelink
handling.

Best regards
Uwe
Saravana Kannan Oct. 17, 2023, 11:35 p.m. UTC | #9
On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > <u.kleine-koenig@pengutronix.de> wrote:
> > >
> > > Hello Saravana,
> > >
> > > you were pointed out to me as the expert for device links. I found a
> > > problem with these.
> > >
> > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > Today I managed to trigger the problem I intend to address with this
> > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > trigger the problem reliably I applied the following patches on top of
> > > > v6.5-rc1:
> > > >
> > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > >    This is a cleanup that I already sent out.
> > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > >    The purpose for reproducing the problem is to not trigger further
> > > >    calls to the apply callback.
> > > >
> > > >  - The following patch:
> > > >
> > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > --- a/drivers/pwm/pwm-stm32.c
> > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > >       int ret;
> > > >
> > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > +     msleep(5000);
> > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > +
> > > >       enabled = pwm->state.enabled;
> > > >
> > > >       if (enabled && !state->enabled) {
> > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > >  {
> > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > >
> > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > >       pwmchip_remove(&priv->chip);
> > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > +
> > > > +     priv->regmap = NULL;
> > > >  }
> > > >
> > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > >
> > > > The first hunk is only there to widen the race window. The second is to
> > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > understand why. *shrug*)
> > > >
> > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > >
> > > > Now I do the following:
> > > >
> > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > >
> > > > Unbinding the fan device has two effects:
> > > >
> > > >  - The device link between fan and pwm looses its property to unbind fan
> > > >    when pwm gets unbound.
> > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > >    pwm_apply_state().
> > > >
> > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > >
> > > > This looks as follows:
> > > >
> > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > [  192.240727] 8<--- cut here ---
> > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > ...
> > > >
> > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > running after pwmchip_remove() completed.
> > > >
> > > > I'm unsure if the device link could be improved here to ensure that the
> > > > fan is completely unbound even if it started unbinding already before
> > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > links purpose and so be a sensible improvement?)
> > >
> > > While I think that there is something to be done in the pwm core that
> > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > that) I expected that the device links between pwm consumer and provider
> > > would prevent the above described oops, too. But somehow the fan already
> > > going away (but still using the PWM) when the PWM is unbound, results in
> > > the PWM disappearing before the fan is completely gone.
> > >
> > > Is this expected, or a problem that can (and should?) be fixed?
> >
> > I didn't read your full series, but I read this email. With what's in
> > this email, the problem seems to be in the driver or the pwm
> > framework. The pwm driver/framework can't tell the driver core that
> > you successfully unbound (returning from .remove()) before you have
> > finish all your ongoing transactions with the device. If your
> > "apply()" is still running, you need to make sure it's complete before
> > .remove() does any resource releasing/clean up.
> >
> > Also, how is the consumer driver's .remove() succeeding if it has an
> > ongoing pwm call()?
>
> The thing that works fine and as expected is:
>
>  - trigger unbind of PWM device via sysfs
>
> Because there is a device link PWM provider -> pwm consumer (fan), the
> fan is removed and once its gone (and not earlier), the PWM gets unbound.
>
> The failing sequence is:
>
>  - trigger unbind of fan device in userspace thread A via sysfs. The
>    fan's remove callback blocks for 5s in pwm_apply_state() and so
>    .remove() doesn't complete yet.
>
>  - a second later: trigger unbind of PWM device via sysfs in thread B.
>    As before I'd expect that the device link results in waiting for the
>    fan to be removed completely, but the PWM is removed immediately.
>
>  - pwm_apply_state's sleep completes (in thread B) and operates on freed
>    resources => bang!
>
> > This all sounds like insufficient locking and
> > critical region protection in both the consumer and supplier.
>
> My (and I think also Thierry's) expectation was, that the device link
> provides the needed synchronisation. But it doesn't as it doesn't block
> the PWM provider going away until the fan is completely gone.
>
> > Device links can't do anything here because you are giving it wrong
> > info -- that the unbind was successful before it actually is.
>
> The fan's unbind is ongoing, but not complete yet and I'd expect that
> the device link blocks unbinding the PWM until the fan is completely
> gone. So I think there is no wrong information.
>
> > Device links will and can make sure that the consumer is unbound
> > successfully before the unbind is called on the supplier. And it looks
> > like that's still true here.
>
> I hope you understood the situation better now and see the problem we
> have.
>
> The problem is fixable in the pwm framework (and I'm working on that),
> but I think there is also something to improve around devicelink
> handling.

Thanks for a better explanation of the issue. I agree, this seems like
something device links should be able to take care of.

I'll take a look into this.

-Saravana
Saravana Kannan Oct. 18, 2023, 1:42 a.m. UTC | #10
On Tue, Oct 17, 2023 at 4:35 PM Saravana Kannan <saravanak@google.com> wrote:
>
> On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> >
> > On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > > <u.kleine-koenig@pengutronix.de> wrote:
> > > >
> > > > Hello Saravana,
> > > >
> > > > you were pointed out to me as the expert for device links. I found a
> > > > problem with these.
> > > >
> > > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > > Today I managed to trigger the problem I intend to address with this
> > > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > > trigger the problem reliably I applied the following patches on top of
> > > > > v6.5-rc1:
> > > > >
> > > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > > >    This is a cleanup that I already sent out.
> > > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > > >    The purpose for reproducing the problem is to not trigger further
> > > > >    calls to the apply callback.
> > > > >
> > > > >  - The following patch:
> > > > >
> > > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > > --- a/drivers/pwm/pwm-stm32.c
> > > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > > >       int ret;
> > > > >
> > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > +     msleep(5000);
> > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > +
> > > > >       enabled = pwm->state.enabled;
> > > > >
> > > > >       if (enabled && !state->enabled) {
> > > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > > >  {
> > > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > > >
> > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > >       pwmchip_remove(&priv->chip);
> > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > +
> > > > > +     priv->regmap = NULL;
> > > > >  }
> > > > >
> > > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > > >
> > > > > The first hunk is only there to widen the race window. The second is to
> > > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > > understand why. *shrug*)
> > > > >
> > > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > > >
> > > > > Now I do the following:
> > > > >
> > > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > >
> > > > > Unbinding the fan device has two effects:
> > > > >
> > > > >  - The device link between fan and pwm looses its property to unbind fan
> > > > >    when pwm gets unbound.
> > > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > > >    pwm_apply_state().
> > > > >
> > > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > > >
> > > > > This looks as follows:
> > > > >
> > > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > > [  192.240727] 8<--- cut here ---
> > > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > > ...
> > > > >
> > > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > > running after pwmchip_remove() completed.
> > > > >
> > > > > I'm unsure if the device link could be improved here to ensure that the
> > > > > fan is completely unbound even if it started unbinding already before
> > > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > > links purpose and so be a sensible improvement?)
> > > >
> > > > While I think that there is something to be done in the pwm core that
> > > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > > that) I expected that the device links between pwm consumer and provider
> > > > would prevent the above described oops, too. But somehow the fan already
> > > > going away (but still using the PWM) when the PWM is unbound, results in
> > > > the PWM disappearing before the fan is completely gone.
> > > >
> > > > Is this expected, or a problem that can (and should?) be fixed?
> > >
> > > I didn't read your full series, but I read this email. With what's in
> > > this email, the problem seems to be in the driver or the pwm
> > > framework. The pwm driver/framework can't tell the driver core that
> > > you successfully unbound (returning from .remove()) before you have
> > > finish all your ongoing transactions with the device. If your
> > > "apply()" is still running, you need to make sure it's complete before
> > > .remove() does any resource releasing/clean up.
> > >
> > > Also, how is the consumer driver's .remove() succeeding if it has an
> > > ongoing pwm call()?
> >
> > The thing that works fine and as expected is:
> >
> >  - trigger unbind of PWM device via sysfs
> >
> > Because there is a device link PWM provider -> pwm consumer (fan), the
> > fan is removed and once its gone (and not earlier), the PWM gets unbound.
> >
> > The failing sequence is:
> >
> >  - trigger unbind of fan device in userspace thread A via sysfs. The
> >    fan's remove callback blocks for 5s in pwm_apply_state() and so
> >    .remove() doesn't complete yet.
> >
> >  - a second later: trigger unbind of PWM device via sysfs in thread B.
> >    As before I'd expect that the device link results in waiting for the
> >    fan to be removed completely, but the PWM is removed immediately.
> >
> >  - pwm_apply_state's sleep completes (in thread B) and operates on freed
> >    resources => bang!
> >
> > > This all sounds like insufficient locking and
> > > critical region protection in both the consumer and supplier.
> >
> > My (and I think also Thierry's) expectation was, that the device link
> > provides the needed synchronisation. But it doesn't as it doesn't block
> > the PWM provider going away until the fan is completely gone.
> >
> > > Device links can't do anything here because you are giving it wrong
> > > info -- that the unbind was successful before it actually is.
> >
> > The fan's unbind is ongoing, but not complete yet and I'd expect that
> > the device link blocks unbinding the PWM until the fan is completely
> > gone. So I think there is no wrong information.
> >
> > > Device links will and can make sure that the consumer is unbound
> > > successfully before the unbind is called on the supplier. And it looks
> > > like that's still true here.
> >
> > I hope you understood the situation better now and see the problem we
> > have.
> >
> > The problem is fixable in the pwm framework (and I'm working on that),
> > but I think there is also something to improve around devicelink
> > handling.
>
> Thanks for a better explanation of the issue. I agree, this seems like
> something device links should be able to take care of.
>
> I'll take a look into this.

Took me a while to debug this because I couldn't find the .remove()
function and I was very confused about what's going on.
I'm guessing you started hitting this issue only after moving to the
devm_ variant of the pwm APIs.

I think I have a fix.
https://lore.kernel.org/lkml/20231018013851.3303928-1-saravanak@google.com/T/#u

I didn't even compile test it, but it's a trivial change. Can you
please test it out and give your "tested-by" if it fixes your issue?

Thanks,
Saravana
Uwe Kleine-König Oct. 18, 2023, 11:17 a.m. UTC | #11
On Tue, Oct 17, 2023 at 06:42:40PM -0700, Saravana Kannan wrote:
> On Tue, Oct 17, 2023 at 4:35 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
> > <u.kleine-koenig@pengutronix.de> wrote:
> > >
> > > On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > > > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > >
> > > > > Hello Saravana,
> > > > >
> > > > > you were pointed out to me as the expert for device links. I found a
> > > > > problem with these.
> > > > >
> > > > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > > > Today I managed to trigger the problem I intend to address with this
> > > > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > > > trigger the problem reliably I applied the following patches on top of
> > > > > > v6.5-rc1:
> > > > > >
> > > > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > > > >    This is a cleanup that I already sent out.
> > > > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > > > >    The purpose for reproducing the problem is to not trigger further
> > > > > >    calls to the apply callback.
> > > > > >
> > > > > >  - The following patch:
> > > > > >
> > > > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > > > --- a/drivers/pwm/pwm-stm32.c
> > > > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > > > >       int ret;
> > > > > >
> > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > +     msleep(5000);
> > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > +
> > > > > >       enabled = pwm->state.enabled;
> > > > > >
> > > > > >       if (enabled && !state->enabled) {
> > > > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > > > >  {
> > > > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > > > >
> > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > >       pwmchip_remove(&priv->chip);
> > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > +
> > > > > > +     priv->regmap = NULL;
> > > > > >  }
> > > > > >
> > > > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > > > >
> > > > > > The first hunk is only there to widen the race window. The second is to
> > > > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > > > understand why. *shrug*)
> > > > > >
> > > > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > > > >
> > > > > > Now I do the following:
> > > > > >
> > > > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > >
> > > > > > Unbinding the fan device has two effects:
> > > > > >
> > > > > >  - The device link between fan and pwm looses its property to unbind fan
> > > > > >    when pwm gets unbound.
> > > > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > > > >    pwm_apply_state().
> > > > > >
> > > > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > > > >
> > > > > > This looks as follows:
> > > > > >
> > > > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > > > [  192.240727] 8<--- cut here ---
> > > > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > > > ...
> > > > > >
> > > > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > > > running after pwmchip_remove() completed.
> > > > > >
> > > > > > I'm unsure if the device link could be improved here to ensure that the
> > > > > > fan is completely unbound even if it started unbinding already before
> > > > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > > > links purpose and so be a sensible improvement?)
> > > > >
> > > > > While I think that there is something to be done in the pwm core that
> > > > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > > > that) I expected that the device links between pwm consumer and provider
> > > > > would prevent the above described oops, too. But somehow the fan already
> > > > > going away (but still using the PWM) when the PWM is unbound, results in
> > > > > the PWM disappearing before the fan is completely gone.
> > > > >
> > > > > Is this expected, or a problem that can (and should?) be fixed?
> > > >
> > > > I didn't read your full series, but I read this email. With what's in
> > > > this email, the problem seems to be in the driver or the pwm
> > > > framework. The pwm driver/framework can't tell the driver core that
> > > > you successfully unbound (returning from .remove()) before you have
> > > > finish all your ongoing transactions with the device. If your
> > > > "apply()" is still running, you need to make sure it's complete before
> > > > .remove() does any resource releasing/clean up.
> > > >
> > > > Also, how is the consumer driver's .remove() succeeding if it has an
> > > > ongoing pwm call()?
> > >
> > > The thing that works fine and as expected is:
> > >
> > >  - trigger unbind of PWM device via sysfs
> > >
> > > Because there is a device link PWM provider -> pwm consumer (fan), the
> > > fan is removed and once its gone (and not earlier), the PWM gets unbound.
> > >
> > > The failing sequence is:
> > >
> > >  - trigger unbind of fan device in userspace thread A via sysfs. The
> > >    fan's remove callback blocks for 5s in pwm_apply_state() and so
> > >    .remove() doesn't complete yet.
> > >
> > >  - a second later: trigger unbind of PWM device via sysfs in thread B.
> > >    As before I'd expect that the device link results in waiting for the
> > >    fan to be removed completely, but the PWM is removed immediately.
> > >
> > >  - pwm_apply_state's sleep completes (in thread B) and operates on freed
> > >    resources => bang!
> > >
> > > > This all sounds like insufficient locking and
> > > > critical region protection in both the consumer and supplier.
> > >
> > > My (and I think also Thierry's) expectation was, that the device link
> > > provides the needed synchronisation. But it doesn't as it doesn't block
> > > the PWM provider going away until the fan is completely gone.
> > >
> > > > Device links can't do anything here because you are giving it wrong
> > > > info -- that the unbind was successful before it actually is.
> > >
> > > The fan's unbind is ongoing, but not complete yet and I'd expect that
> > > the device link blocks unbinding the PWM until the fan is completely
> > > gone. So I think there is no wrong information.
> > >
> > > > Device links will and can make sure that the consumer is unbound
> > > > successfully before the unbind is called on the supplier. And it looks
> > > > like that's still true here.
> > >
> > > I hope you understood the situation better now and see the problem we
> > > have.
> > >
> > > The problem is fixable in the pwm framework (and I'm working on that),
> > > but I think there is also something to improve around devicelink
> > > handling.
> >
> > Thanks for a better explanation of the issue. I agree, this seems like
> > something device links should be able to take care of.
> >
> > I'll take a look into this.
> 
> Took me a while to debug this because I couldn't find the .remove()
> function and I was very confused about what's going on.
> I'm guessing you started hitting this issue only after moving to the
> devm_ variant of the pwm APIs.

Ah I see. That problem wouldn't happen if the fan called a pwm API
function in its remove callback but that happens in a devm cleanup call
(registered by devm_add_action_or_reset(dev, pwm_fan_cleanup, ctx) in
pwm_fan_probe()). I first thought you talked about
8c89fd866ad221af037ef0ec3d60b83d0b859c65.

Best regards
Uwe
Saravana Kannan Oct. 18, 2023, 9:01 p.m. UTC | #12
On Wed, Oct 18, 2023 at 4:17 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> On Tue, Oct 17, 2023 at 06:42:40PM -0700, Saravana Kannan wrote:
> > On Tue, Oct 17, 2023 at 4:35 PM Saravana Kannan <saravanak@google.com> wrote:
> > >
> > > On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
> > > <u.kleine-koenig@pengutronix.de> wrote:
> > > >
> > > > On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > > > > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > > >
> > > > > > Hello Saravana,
> > > > > >
> > > > > > you were pointed out to me as the expert for device links. I found a
> > > > > > problem with these.
> > > > > >
> > > > > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > > > > Today I managed to trigger the problem I intend to address with this
> > > > > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > > > > trigger the problem reliably I applied the following patches on top of
> > > > > > > v6.5-rc1:
> > > > > > >
> > > > > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > > > > >    This is a cleanup that I already sent out.
> > > > > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > > > > >    The purpose for reproducing the problem is to not trigger further
> > > > > > >    calls to the apply callback.
> > > > > > >
> > > > > > >  - The following patch:
> > > > > > >
> > > > > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > > > > --- a/drivers/pwm/pwm-stm32.c
> > > > > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > > > > >       int ret;
> > > > > > >
> > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > +     msleep(5000);
> > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > +
> > > > > > >       enabled = pwm->state.enabled;
> > > > > > >
> > > > > > >       if (enabled && !state->enabled) {
> > > > > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > > > > >  {
> > > > > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > > > > >
> > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > >       pwmchip_remove(&priv->chip);
> > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > +
> > > > > > > +     priv->regmap = NULL;
> > > > > > >  }
> > > > > > >
> > > > > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > > > > >
> > > > > > > The first hunk is only there to widen the race window. The second is to
> > > > > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > > > > understand why. *shrug*)
> > > > > > >
> > > > > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > > > > >
> > > > > > > Now I do the following:
> > > > > > >
> > > > > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > >
> > > > > > > Unbinding the fan device has two effects:
> > > > > > >
> > > > > > >  - The device link between fan and pwm looses its property to unbind fan
> > > > > > >    when pwm gets unbound.
> > > > > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > > > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > > > > >    pwm_apply_state().
> > > > > > >
> > > > > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > > > > >
> > > > > > > This looks as follows:
> > > > > > >
> > > > > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > > > > [  192.240727] 8<--- cut here ---
> > > > > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > > > > ...
> > > > > > >
> > > > > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > > > > running after pwmchip_remove() completed.
> > > > > > >
> > > > > > > I'm unsure if the device link could be improved here to ensure that the
> > > > > > > fan is completely unbound even if it started unbinding already before
> > > > > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > > > > links purpose and so be a sensible improvement?)
> > > > > >
> > > > > > While I think that there is something to be done in the pwm core that
> > > > > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > > > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > > > > that) I expected that the device links between pwm consumer and provider
> > > > > > would prevent the above described oops, too. But somehow the fan already
> > > > > > going away (but still using the PWM) when the PWM is unbound, results in
> > > > > > the PWM disappearing before the fan is completely gone.
> > > > > >
> > > > > > Is this expected, or a problem that can (and should?) be fixed?
> > > > >
> > > > > I didn't read your full series, but I read this email. With what's in
> > > > > this email, the problem seems to be in the driver or the pwm
> > > > > framework. The pwm driver/framework can't tell the driver core that
> > > > > you successfully unbound (returning from .remove()) before you have
> > > > > finish all your ongoing transactions with the device. If your
> > > > > "apply()" is still running, you need to make sure it's complete before
> > > > > .remove() does any resource releasing/clean up.
> > > > >
> > > > > Also, how is the consumer driver's .remove() succeeding if it has an
> > > > > ongoing pwm call()?
> > > >
> > > > The thing that works fine and as expected is:
> > > >
> > > >  - trigger unbind of PWM device via sysfs
> > > >
> > > > Because there is a device link PWM provider -> pwm consumer (fan), the
> > > > fan is removed and once its gone (and not earlier), the PWM gets unbound.
> > > >
> > > > The failing sequence is:
> > > >
> > > >  - trigger unbind of fan device in userspace thread A via sysfs. The
> > > >    fan's remove callback blocks for 5s in pwm_apply_state() and so
> > > >    .remove() doesn't complete yet.
> > > >
> > > >  - a second later: trigger unbind of PWM device via sysfs in thread B.
> > > >    As before I'd expect that the device link results in waiting for the
> > > >    fan to be removed completely, but the PWM is removed immediately.
> > > >
> > > >  - pwm_apply_state's sleep completes (in thread B) and operates on freed
> > > >    resources => bang!
> > > >
> > > > > This all sounds like insufficient locking and
> > > > > critical region protection in both the consumer and supplier.
> > > >
> > > > My (and I think also Thierry's) expectation was, that the device link
> > > > provides the needed synchronisation. But it doesn't as it doesn't block
> > > > the PWM provider going away until the fan is completely gone.
> > > >
> > > > > Device links can't do anything here because you are giving it wrong
> > > > > info -- that the unbind was successful before it actually is.
> > > >
> > > > The fan's unbind is ongoing, but not complete yet and I'd expect that
> > > > the device link blocks unbinding the PWM until the fan is completely
> > > > gone. So I think there is no wrong information.
> > > >
> > > > > Device links will and can make sure that the consumer is unbound
> > > > > successfully before the unbind is called on the supplier. And it looks
> > > > > like that's still true here.
> > > >
> > > > I hope you understood the situation better now and see the problem we
> > > > have.
> > > >
> > > > The problem is fixable in the pwm framework (and I'm working on that),
> > > > but I think there is also something to improve around devicelink
> > > > handling.
> > >
> > > Thanks for a better explanation of the issue. I agree, this seems like
> > > something device links should be able to take care of.
> > >
> > > I'll take a look into this.
> >
> > Took me a while to debug this because I couldn't find the .remove()
> > function and I was very confused about what's going on.
> > I'm guessing you started hitting this issue only after moving to the
> > devm_ variant of the pwm APIs.
>
> Ah I see. That problem wouldn't happen if the fan called a pwm API
> function in its remove callback but that happens in a devm cleanup call
> (registered by devm_add_action_or_reset(dev, pwm_fan_cleanup, ctx) in
> pwm_fan_probe()). I first thought you talked about
> 8c89fd866ad221af037ef0ec3d60b83d0b859c65.

Am I not talking about that commit?

Btw, I'm still a bit confused by this thread. In your earlier emails
to me, you said:

>  - trigger unbind of fan device in userspace thread A via sysfs. The
>  fan's remove callback blocks for 5s in pwm_apply_state() and so
>  .remove() doesn't complete yet.

But the latest tree (Tot) didn't have any .remove() function at all.
So I just decided to see if there's any issue in Tot and just fix
that. I'm glad my fix helps fixed the issue with the used of devm_*()
APIs.

However, are you really seeing the issue (supplier freed before
consumer) if you do the clean up in the .remove() function? If so,
there might still be another issue that needs to be fixed.

Thanks,
Saravana
Uwe Kleine-König Oct. 18, 2023, 9:20 p.m. UTC | #13
On Wed, Oct 18, 2023 at 02:01:30PM -0700, Saravana Kannan wrote:
> On Wed, Oct 18, 2023 at 4:17 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> >
> > On Tue, Oct 17, 2023 at 06:42:40PM -0700, Saravana Kannan wrote:
> > > On Tue, Oct 17, 2023 at 4:35 PM Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
> > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > >
> > > > > On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > > > > > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > > > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > > > >
> > > > > > > Hello Saravana,
> > > > > > >
> > > > > > > you were pointed out to me as the expert for device links. I found a
> > > > > > > problem with these.
> > > > > > >
> > > > > > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > > > > > Today I managed to trigger the problem I intend to address with this
> > > > > > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > > > > > trigger the problem reliably I applied the following patches on top of
> > > > > > > > v6.5-rc1:
> > > > > > > >
> > > > > > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > > > > > >    This is a cleanup that I already sent out.
> > > > > > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > > > > > >    The purpose for reproducing the problem is to not trigger further
> > > > > > > >    calls to the apply callback.
> > > > > > > >
> > > > > > > >  - The following patch:
> > > > > > > >
> > > > > > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > > > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > > > > > --- a/drivers/pwm/pwm-stm32.c
> > > > > > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > > > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > > > > > >       int ret;
> > > > > > > >
> > > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > +     msleep(5000);
> > > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > +
> > > > > > > >       enabled = pwm->state.enabled;
> > > > > > > >
> > > > > > > >       if (enabled && !state->enabled) {
> > > > > > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > > > > > >  {
> > > > > > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > > > > > >
> > > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > >       pwmchip_remove(&priv->chip);
> > > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > +
> > > > > > > > +     priv->regmap = NULL;
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > > > > > >
> > > > > > > > The first hunk is only there to widen the race window. The second is to
> > > > > > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > > > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > > > > > understand why. *shrug*)
> > > > > > > >
> > > > > > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > > > > > >
> > > > > > > > Now I do the following:
> > > > > > > >
> > > > > > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > > >
> > > > > > > > Unbinding the fan device has two effects:
> > > > > > > >
> > > > > > > >  - The device link between fan and pwm looses its property to unbind fan
> > > > > > > >    when pwm gets unbound.
> > > > > > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > > > > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > > > > > >    pwm_apply_state().
> > > > > > > >
> > > > > > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > > > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > > > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > > > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > > > > > >
> > > > > > > > This looks as follows:
> > > > > > > >
> > > > > > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > > > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > > > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > > > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > > > > > [  192.240727] 8<--- cut here ---
> > > > > > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > > > > > ...
> > > > > > > >
> > > > > > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > > > > > running after pwmchip_remove() completed.
> > > > > > > >
> > > > > > > > I'm unsure if the device link could be improved here to ensure that the
> > > > > > > > fan is completely unbound even if it started unbinding already before
> > > > > > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > > > > > links purpose and so be a sensible improvement?)
> > > > > > >
> > > > > > > While I think that there is something to be done in the pwm core that
> > > > > > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > > > > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > > > > > that) I expected that the device links between pwm consumer and provider
> > > > > > > would prevent the above described oops, too. But somehow the fan already
> > > > > > > going away (but still using the PWM) when the PWM is unbound, results in
> > > > > > > the PWM disappearing before the fan is completely gone.
> > > > > > >
> > > > > > > Is this expected, or a problem that can (and should?) be fixed?
> > > > > >
> > > > > > I didn't read your full series, but I read this email. With what's in
> > > > > > this email, the problem seems to be in the driver or the pwm
> > > > > > framework. The pwm driver/framework can't tell the driver core that
> > > > > > you successfully unbound (returning from .remove()) before you have
> > > > > > finish all your ongoing transactions with the device. If your
> > > > > > "apply()" is still running, you need to make sure it's complete before
> > > > > > .remove() does any resource releasing/clean up.
> > > > > >
> > > > > > Also, how is the consumer driver's .remove() succeeding if it has an
> > > > > > ongoing pwm call()?
> > > > >
> > > > > The thing that works fine and as expected is:
> > > > >
> > > > >  - trigger unbind of PWM device via sysfs
> > > > >
> > > > > Because there is a device link PWM provider -> pwm consumer (fan), the
> > > > > fan is removed and once its gone (and not earlier), the PWM gets unbound.
> > > > >
> > > > > The failing sequence is:
> > > > >
> > > > >  - trigger unbind of fan device in userspace thread A via sysfs. The
> > > > >    fan's remove callback blocks for 5s in pwm_apply_state() and so
> > > > >    .remove() doesn't complete yet.
> > > > >
> > > > >  - a second later: trigger unbind of PWM device via sysfs in thread B.
> > > > >    As before I'd expect that the device link results in waiting for the
> > > > >    fan to be removed completely, but the PWM is removed immediately.
> > > > >
> > > > >  - pwm_apply_state's sleep completes (in thread B) and operates on freed
> > > > >    resources => bang!
> > > > >
> > > > > > This all sounds like insufficient locking and
> > > > > > critical region protection in both the consumer and supplier.
> > > > >
> > > > > My (and I think also Thierry's) expectation was, that the device link
> > > > > provides the needed synchronisation. But it doesn't as it doesn't block
> > > > > the PWM provider going away until the fan is completely gone.
> > > > >
> > > > > > Device links can't do anything here because you are giving it wrong
> > > > > > info -- that the unbind was successful before it actually is.
> > > > >
> > > > > The fan's unbind is ongoing, but not complete yet and I'd expect that
> > > > > the device link blocks unbinding the PWM until the fan is completely
> > > > > gone. So I think there is no wrong information.
> > > > >
> > > > > > Device links will and can make sure that the consumer is unbound
> > > > > > successfully before the unbind is called on the supplier. And it looks
> > > > > > like that's still true here.
> > > > >
> > > > > I hope you understood the situation better now and see the problem we
> > > > > have.
> > > > >
> > > > > The problem is fixable in the pwm framework (and I'm working on that),
> > > > > but I think there is also something to improve around devicelink
> > > > > handling.
> > > >
> > > > Thanks for a better explanation of the issue. I agree, this seems like
> > > > something device links should be able to take care of.
> > > >
> > > > I'll take a look into this.
> > >
> > > Took me a while to debug this because I couldn't find the .remove()
> > > function and I was very confused about what's going on.
> > > I'm guessing you started hitting this issue only after moving to the
> > > devm_ variant of the pwm APIs.
> >
> > Ah I see. That problem wouldn't happen if the fan called a pwm API
> > function in its remove callback but that happens in a devm cleanup call
> > (registered by devm_add_action_or_reset(dev, pwm_fan_cleanup, ctx) in
> > pwm_fan_probe()). I first thought you talked about
> > 8c89fd866ad221af037ef0ec3d60b83d0b859c65.
> 
> Am I not talking about that commit?

The relevant thing is that the fan (i.e. the consumer) uses
pwm_apply_state() in a devm cleanup. If the pwm provider uses
devm_pwmchip_add or plain pwmchip_add + pwmchip_remove in .remove()
doesn't matter.

> Btw, I'm still a bit confused by this thread. In your earlier emails
> to me, you said:
> 
> >  - trigger unbind of fan device in userspace thread A via sysfs. The
> >  fan's remove callback blocks for 5s in pwm_apply_state() and so
> >  .remove() doesn't complete yet.
> 
> But the latest tree (Tot) didn't have any .remove() function at all.
> So I just decided to see if there's any issue in Tot and just fix
> that. I'm glad my fix helps fixed the issue with the used of devm_*()
> APIs.

I was sloppy here and called it "remove callback" when it was really a
devm cleanup call. Sorry if that confused you. I didn't expect this to
make a difference (and I'm not even sure I was aware this is a devm
cleanup and not directly .remove() when I wrote that).

> However, are you really seeing the issue (supplier freed before
> consumer) if you do the clean up in the .remove() function? If so,
> there might still be another issue that needs to be fixed.

I didn't test that and now having understood the issue you fixed and
seeing the effect, I confidently claim there is nothing to fix for
drivers that use pwm_apply_state in .remove().

Best regards
Uwe
Saravana Kannan Oct. 18, 2023, 10:25 p.m. UTC | #14
On Wed, Oct 18, 2023 at 2:21 PM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> On Wed, Oct 18, 2023 at 02:01:30PM -0700, Saravana Kannan wrote:
> > On Wed, Oct 18, 2023 at 4:17 AM Uwe Kleine-König
> > <u.kleine-koenig@pengutronix.de> wrote:
> > >
> > > On Tue, Oct 17, 2023 at 06:42:40PM -0700, Saravana Kannan wrote:
> > > > On Tue, Oct 17, 2023 at 4:35 PM Saravana Kannan <saravanak@google.com> wrote:
> > > > >
> > > > > On Sat, Oct 14, 2023 at 9:17 AM Uwe Kleine-König
> > > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > > >
> > > > > > On Fri, Oct 13, 2023 at 02:42:20PM -0700, Saravana Kannan wrote:
> > > > > > > On Tue, Oct 10, 2023 at 1:05 AM Uwe Kleine-König
> > > > > > > <u.kleine-koenig@pengutronix.de> wrote:
> > > > > > > >
> > > > > > > > Hello Saravana,
> > > > > > > >
> > > > > > > > you were pointed out to me as the expert for device links. I found a
> > > > > > > > problem with these.
> > > > > > > >
> > > > > > > > On Tue, Jul 25, 2023 at 11:10:04PM +0200, Uwe Kleine-König wrote:
> > > > > > > > > Today I managed to trigger the problem I intend to address with this
> > > > > > > > > series. My machine to test this on is an stm32mp157. To be able to
> > > > > > > > > trigger the problem reliably I applied the following patches on top of
> > > > > > > > > v6.5-rc1:
> > > > > > > > >
> > > > > > > > >  - pwm: stm32: Don't modify HW state in .remove() callback
> > > > > > > > >    This is a cleanup that I already sent out.
> > > > > > > > >    https://lore.kernel.org/r/20230713155142.2454010-2-u.kleine-koenig@pengutronix.de
> > > > > > > > >    The purpose for reproducing the problem is to not trigger further
> > > > > > > > >    calls to the apply callback.
> > > > > > > > >
> > > > > > > > >  - The following patch:
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/pwm/pwm-stm32.c b/drivers/pwm/pwm-stm32.c
> > > > > > > > > index 687967d3265f..c7fc02b0fa3c 100644
> > > > > > > > > --- a/drivers/pwm/pwm-stm32.c
> > > > > > > > > +++ b/drivers/pwm/pwm-stm32.c
> > > > > > > > > @@ -451,6 +451,10 @@ static int stm32_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
> > > > > > > > >       struct stm32_pwm *priv = to_stm32_pwm_dev(chip);
> > > > > > > > >       int ret;
> > > > > > > > >
> > > > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > > +     msleep(5000);
> > > > > > > > > +     dev_info(chip->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > > +
> > > > > > > > >       enabled = pwm->state.enabled;
> > > > > > > > >
> > > > > > > > >       if (enabled && !state->enabled) {
> > > > > > > > > @@ -650,7 +654,11 @@ static void stm32_pwm_remove(struct platform_device *pdev)
> > > > > > > > >  {
> > > > > > > > >       struct stm32_pwm *priv = platform_get_drvdata(pdev);
> > > > > > > > >
> > > > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > >       pwmchip_remove(&priv->chip);
> > > > > > > > > +     dev_info(&pdev->dev, "%s:%d\n", __func__, __LINE__);
> > > > > > > > > +
> > > > > > > > > +     priv->regmap = NULL;
> > > > > > > > >  }
> > > > > > > > >
> > > > > > > > >  static int __maybe_unused stm32_pwm_suspend(struct device *dev)
> > > > > > > > >
> > > > > > > > > The first hunk is only there to widen the race window. The second is to
> > > > > > > > > give some diagnostics and make stm32_pwm_apply() crash if it continues
> > > > > > > > > to run after the msleep. (Without it it didn't crash reproducibly, don't
> > > > > > > > > understand why. *shrug*)
> > > > > > > > >
> > > > > > > > > The device tree contains a pwm-fan device making use of one of the PWMs.
> > > > > > > > >
> > > > > > > > > Now I do the following:
> > > > > > > > >
> > > > > > > > >       echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > > > >
> > > > > > > > > Unbinding the fan device has two effects:
> > > > > > > > >
> > > > > > > > >  - The device link between fan and pwm looses its property to unbind fan
> > > > > > > > >    when pwm gets unbound.
> > > > > > > > >    (Its .status changes from DL_STATE_ACTIVE to DL_STATE_AVAILABLE)
> > > > > > > > >  - It calls pwm_fan_cleanup() which triggers a call to
> > > > > > > > >    pwm_apply_state().
> > > > > > > > >
> > > > > > > > > So when the pwm device gets unbound the first thread is sleeping in
> > > > > > > > > stm32_pwm_apply(). The driver calls pwmchip_remove() and sets
> > > > > > > > > priv->regmap to NULL. Then a few seconds later the first thread wakes up
> > > > > > > > > in stm32_pwm_apply() with the chip freed and priv->regmap = NULL. Bang!
> > > > > > > > >
> > > > > > > > > This looks as follows:
> > > > > > > > >
> > > > > > > > > root@crown:~# echo fan > /sys/bus/platform/drivers/pwm-fan/unbind & sleep 1; echo 40007000.timer:pwm > /sys/bus/platform/drivers/stm32-pwm/unbind
> > > > > > > > > [  187.182113] stm32-pwm 40007000.timer:pwm: stm32_pwm_apply:454
> > > > > > > > > [  188.164769] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:657
> > > > > > > > > [  188.184555] stm32-pwm 40007000.timer:pwm: stm32_pwm_remove:659
> > > > > > > > > root@crown:~# [  192.236423] platform 40007000.timer:pwm: stm32_pwm_apply:456
> > > > > > > > > [  192.240727] 8<--- cut here ---
> > > > > > > > > [  192.243759] Unable to handle kernel NULL pointer dereference at virtual address 0000001c when read
> > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > Even without the crash you can see that stm32_pwm_apply() is still
> > > > > > > > > running after pwmchip_remove() completed.
> > > > > > > > >
> > > > > > > > > I'm unsure if the device link could be improved here to ensure that the
> > > > > > > > > fan is completely unbound even if it started unbinding already before
> > > > > > > > > the pwm device gets unbound. (And if it could, would this fit the device
> > > > > > > > > links purpose and so be a sensible improvement?)
> > > > > > > >
> > > > > > > > While I think that there is something to be done in the pwm core that
> > > > > > > > this doesn't explode (i.e. do proper lifetime tracking such that a
> > > > > > > > pwm_chip doesn't disappear while still being used---and I'm working on
> > > > > > > > that) I expected that the device links between pwm consumer and provider
> > > > > > > > would prevent the above described oops, too. But somehow the fan already
> > > > > > > > going away (but still using the PWM) when the PWM is unbound, results in
> > > > > > > > the PWM disappearing before the fan is completely gone.
> > > > > > > >
> > > > > > > > Is this expected, or a problem that can (and should?) be fixed?
> > > > > > >
> > > > > > > I didn't read your full series, but I read this email. With what's in
> > > > > > > this email, the problem seems to be in the driver or the pwm
> > > > > > > framework. The pwm driver/framework can't tell the driver core that
> > > > > > > you successfully unbound (returning from .remove()) before you have
> > > > > > > finish all your ongoing transactions with the device. If your
> > > > > > > "apply()" is still running, you need to make sure it's complete before
> > > > > > > .remove() does any resource releasing/clean up.
> > > > > > >
> > > > > > > Also, how is the consumer driver's .remove() succeeding if it has an
> > > > > > > ongoing pwm call()?
> > > > > >
> > > > > > The thing that works fine and as expected is:
> > > > > >
> > > > > >  - trigger unbind of PWM device via sysfs
> > > > > >
> > > > > > Because there is a device link PWM provider -> pwm consumer (fan), the
> > > > > > fan is removed and once its gone (and not earlier), the PWM gets unbound.
> > > > > >
> > > > > > The failing sequence is:
> > > > > >
> > > > > >  - trigger unbind of fan device in userspace thread A via sysfs. The
> > > > > >    fan's remove callback blocks for 5s in pwm_apply_state() and so
> > > > > >    .remove() doesn't complete yet.
> > > > > >
> > > > > >  - a second later: trigger unbind of PWM device via sysfs in thread B.
> > > > > >    As before I'd expect that the device link results in waiting for the
> > > > > >    fan to be removed completely, but the PWM is removed immediately.
> > > > > >
> > > > > >  - pwm_apply_state's sleep completes (in thread B) and operates on freed
> > > > > >    resources => bang!
> > > > > >
> > > > > > > This all sounds like insufficient locking and
> > > > > > > critical region protection in both the consumer and supplier.
> > > > > >
> > > > > > My (and I think also Thierry's) expectation was, that the device link
> > > > > > provides the needed synchronisation. But it doesn't as it doesn't block
> > > > > > the PWM provider going away until the fan is completely gone.
> > > > > >
> > > > > > > Device links can't do anything here because you are giving it wrong
> > > > > > > info -- that the unbind was successful before it actually is.
> > > > > >
> > > > > > The fan's unbind is ongoing, but not complete yet and I'd expect that
> > > > > > the device link blocks unbinding the PWM until the fan is completely
> > > > > > gone. So I think there is no wrong information.
> > > > > >
> > > > > > > Device links will and can make sure that the consumer is unbound
> > > > > > > successfully before the unbind is called on the supplier. And it looks
> > > > > > > like that's still true here.
> > > > > >
> > > > > > I hope you understood the situation better now and see the problem we
> > > > > > have.
> > > > > >
> > > > > > The problem is fixable in the pwm framework (and I'm working on that),
> > > > > > but I think there is also something to improve around devicelink
> > > > > > handling.
> > > > >
> > > > > Thanks for a better explanation of the issue. I agree, this seems like
> > > > > something device links should be able to take care of.
> > > > >
> > > > > I'll take a look into this.
> > > >
> > > > Took me a while to debug this because I couldn't find the .remove()
> > > > function and I was very confused about what's going on.
> > > > I'm guessing you started hitting this issue only after moving to the
> > > > devm_ variant of the pwm APIs.
> > >
> > > Ah I see. That problem wouldn't happen if the fan called a pwm API
> > > function in its remove callback but that happens in a devm cleanup call
> > > (registered by devm_add_action_or_reset(dev, pwm_fan_cleanup, ctx) in
> > > pwm_fan_probe()). I first thought you talked about
> > > 8c89fd866ad221af037ef0ec3d60b83d0b859c65.
> >
> > Am I not talking about that commit?
>
> The relevant thing is that the fan (i.e. the consumer) uses
> pwm_apply_state() in a devm cleanup. If the pwm provider uses
> devm_pwmchip_add or plain pwmchip_add + pwmchip_remove in .remove()
> doesn't matter.

Duh! For whatever reason I had forgotten that stm32 was the supplier
and was confusing myself.

>
> > Btw, I'm still a bit confused by this thread. In your earlier emails
> > to me, you said:
> >
> > >  - trigger unbind of fan device in userspace thread A via sysfs. The
> > >  fan's remove callback blocks for 5s in pwm_apply_state() and so
> > >  .remove() doesn't complete yet.
> >
> > But the latest tree (Tot) didn't have any .remove() function at all.
> > So I just decided to see if there's any issue in Tot and just fix
> > that. I'm glad my fix helps fixed the issue with the used of devm_*()
> > APIs.
>
> I was sloppy here and called it "remove callback" when it was really a
> devm cleanup call. Sorry if that confused you. I didn't expect this to
> make a difference (and I'm not even sure I was aware this is a devm
> cleanup and not directly .remove() when I wrote that).
>
> > However, are you really seeing the issue (supplier freed before
> > consumer) if you do the clean up in the .remove() function? If so,
> > there might still be another issue that needs to be fixed.
>
> I didn't test that and now having understood the issue you fixed and
> seeing the effect, I confidently claim there is nothing to fix for
> drivers that use pwm_apply_state in .remove().

Yeah, I'm convinced now too!

-Saravana
diff mbox series

Patch

diff --git a/Documentation/driver-api/driver-model/devres.rst b/Documentation/driver-api/driver-model/devres.rst
index 8be086b3f829..73a9ee074737 100644
--- a/Documentation/driver-api/driver-model/devres.rst
+++ b/Documentation/driver-api/driver-model/devres.rst
@@ -414,6 +414,7 @@  POWER
   devm_reboot_mode_unregister()
 
 PWM
+  devm_pwmchip_alloc()
   devm_pwmchip_add()
   devm_pwm_get()
   devm_fwnode_pwm_get()
diff --git a/Documentation/driver-api/pwm.rst b/Documentation/driver-api/pwm.rst
index 3fdc95f7a1d1..a3824bd58e4c 100644
--- a/Documentation/driver-api/pwm.rst
+++ b/Documentation/driver-api/pwm.rst
@@ -134,11 +134,11 @@  to implement the pwm_*() functions itself. This means that it's impossible
 to have multiple PWM drivers in the system. For this reason it's mandatory
 for new drivers to use the generic PWM framework.
 
-A new PWM controller/chip can be added using pwmchip_add() and removed
-again with pwmchip_remove(). pwmchip_add() takes a filled in struct
-pwm_chip as argument which provides a description of the PWM chip, the
-number of PWM devices provided by the chip and the chip-specific
-implementation of the supported PWM operations to the framework.
+A new PWM controller/chip can be allocated using devm_pwmchip_alloc, then added
+using pwmchip_add() and removed again with pwmchip_remove(). pwmchip_add()
+takes a filled in struct pwm_chip as argument which provides a description of
+the PWM chip, the number of PWM devices provided by the chip and the
+chip-specific implementation of the supported PWM operations to the framework.
 
 When implementing polarity support in a PWM driver, make sure to respect the
 signal conventions in the PWM framework. By definition, normal polarity
diff --git a/drivers/pwm/core.c b/drivers/pwm/core.c
index 81d176142403..3f4c2d940d64 100644
--- a/drivers/pwm/core.c
+++ b/drivers/pwm/core.c
@@ -304,6 +304,29 @@  void pwmchip_remove(struct pwm_chip *chip)
 }
 EXPORT_SYMBOL_GPL(pwmchip_remove);
 
+void *pwmchip_priv(struct pwm_chip *chip)
+{
+	return (char *)chip + ALIGN(sizeof(*chip), 32);
+}
+EXPORT_SYMBOL_GPL(pwmchip_priv);
+
+struct pwm_chip *devm_pwmchip_alloc(struct device *parent, size_t sizeof_priv)
+{
+	struct pwm_chip *chip;
+	size_t alloc_size;
+
+	alloc_size = ALIGN(sizeof(*chip), 32) + sizeof_priv;
+
+	chip = devm_kzalloc(parent, alloc_size, GFP_KERNEL);
+	if (!chip)
+		return NULL;
+
+	chip->dev = parent;
+
+	return chip;
+}
+EXPORT_SYMBOL_GPL(devm_pwmchip_alloc);
+
 static void devm_pwmchip_remove(void *data)
 {
 	struct pwm_chip *chip = data;
diff --git a/include/linux/pwm.h b/include/linux/pwm.h
index 10bea923a1d5..648009c8499b 100644
--- a/include/linux/pwm.h
+++ b/include/linux/pwm.h
@@ -5,6 +5,7 @@ 
 #include <linux/err.h>
 #include <linux/mutex.h>
 #include <linux/of.h>
+#include <linux/compiler_attributes.h>
 
 struct pwm_chip;
 
@@ -385,6 +386,9 @@  int pwm_capture(struct pwm_device *pwm, struct pwm_capture *result,
 int pwmchip_add(struct pwm_chip *chip);
 void pwmchip_remove(struct pwm_chip *chip);
 
+void *pwmchip_priv(struct pwm_chip *chip) __attribute_const__;
+
+struct pwm_chip *devm_pwmchip_alloc(struct device *parent, size_t sizeof_priv);
 int devm_pwmchip_add(struct device *dev, struct pwm_chip *chip);
 
 struct pwm_device *pwm_request_from_chip(struct pwm_chip *chip,