diff mbox series

[2/3] wdt-uclass: prevent multiple cyclic_register calls

Message ID 20240509004714.1394547-3-rasmus.villemoes@prevas.dk
State Superseded
Delegated to: Stefan Roese
Headers show
Series cyclic/watchdog patches | expand

Commit Message

Rasmus Villemoes May 9, 2024, 12:47 a.m. UTC
Currently, the cyclic_register() done in wdt_start() is not undone in
wdt_stop(). Moreover, calling wdt_start multiple times (which is
perfectly allowed on an already started device, e.g. to change the
timeout value) will result in another struct cyclic_info being
registered, referring to the same watchdog device.

This can easily be seen on e.g. a wandboard:

=> cyclic list
function: watchdog@20bc000, cpu-time: 22 us, frequency: 1.01 times/s
=> wdt list
watchdog@20bc000 (imx_wdt)
=> wdt dev watchdog@20bc000
=> wdt start 50000
WDT:   Started watchdog@20bc000 with servicing every 1000ms (50s timeout)
=> cyclic list
function: watchdog@20bc000, cpu-time: 37 us, frequency: 1.03 times/s
function: watchdog@20bc000, cpu-time: 241 us, frequency: 1.01 times/s
=> wdt start 12345
WDT:   Started watchdog@20bc000 with servicing every 1000ms (12s timeout)
=> cyclic list
function: watchdog@20bc000, cpu-time: 36 us, frequency: 1.03 times/s
function: watchdog@20bc000, cpu-time: 100 us, frequency: 1.04 times/s
function: watchdog@20bc000, cpu-time: 299 us, frequency: 1.00 times/s

So properly unregister the watchdog device from the cyclic framework
in wdt_stop(). In wdt_start(), we cannot just skip the registration,
as the (new) timeout value may mean that we have to ask the cyclic
framework to call us more often. So if we're already running,
first unregister the old cyclic instance.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/wdt-uclass.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Stefan Roese May 13, 2024, 10:40 a.m. UTC | #1
On 5/9/24 02:47, Rasmus Villemoes wrote:
> Currently, the cyclic_register() done in wdt_start() is not undone in
> wdt_stop(). Moreover, calling wdt_start multiple times (which is
> perfectly allowed on an already started device, e.g. to change the
> timeout value) will result in another struct cyclic_info being
> registered, referring to the same watchdog device.
> 
> This can easily be seen on e.g. a wandboard:
> 
> => cyclic list
> function: watchdog@20bc000, cpu-time: 22 us, frequency: 1.01 times/s
> => wdt list
> watchdog@20bc000 (imx_wdt)
> => wdt dev watchdog@20bc000
> => wdt start 50000
> WDT:   Started watchdog@20bc000 with servicing every 1000ms (50s timeout)
> => cyclic list
> function: watchdog@20bc000, cpu-time: 37 us, frequency: 1.03 times/s
> function: watchdog@20bc000, cpu-time: 241 us, frequency: 1.01 times/s
> => wdt start 12345
> WDT:   Started watchdog@20bc000 with servicing every 1000ms (12s timeout)
> => cyclic list
> function: watchdog@20bc000, cpu-time: 36 us, frequency: 1.03 times/s
> function: watchdog@20bc000, cpu-time: 100 us, frequency: 1.04 times/s
> function: watchdog@20bc000, cpu-time: 299 us, frequency: 1.00 times/s
> 
> So properly unregister the watchdog device from the cyclic framework
> in wdt_stop(). In wdt_start(), we cannot just skip the registration,
> as the (new) timeout value may mean that we have to ask the cyclic
> framework to call us more often. So if we're already running,
> first unregister the old cyclic instance.

Thanks again for all your valuable work here. I do have a question
regarding this patch though. AFAIU, it's a valid use-case to enable
2 different watchdog devices. And this patch will prevent such a
setup. Or do I misunderstand this?

Thanks,
Stefan

> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>   drivers/watchdog/wdt-uclass.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c
> index 417e8d7eef9..caa1567e0c3 100644
> --- a/drivers/watchdog/wdt-uclass.c
> +++ b/drivers/watchdog/wdt-uclass.c
> @@ -122,10 +122,11 @@ int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags)
>   		struct wdt_priv *priv = dev_get_uclass_priv(dev);
>   		char str[16];
>   
> -		priv->running = true;
> -
>   		memset(str, 0, 16);
>   		if (IS_ENABLED(CONFIG_WATCHDOG)) {
> +			if (priv->running)
> +				cyclic_unregister(priv->cyclic);
> +
>   			/* Register the watchdog driver as a cyclic function */
>   			priv->cyclic = cyclic_register(wdt_cyclic,
>   						       priv->reset_period * 1000,
> @@ -140,6 +141,7 @@ int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags)
>   			}
>   		}
>   
> +		priv->running = true;
>   		printf("WDT:   Started %s with%s servicing %s (%ds timeout)\n",
>   		       dev->name, IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out",
>   		       str, (u32)lldiv(timeout_ms, 1000));
> @@ -160,6 +162,9 @@ int wdt_stop(struct udevice *dev)
>   	if (ret == 0) {
>   		struct wdt_priv *priv = dev_get_uclass_priv(dev);
>   
> +		if (IS_ENABLED(CONFIG_WATCHDOG) && priv->running)
> +			cyclic_unregister(priv->cyclic);
> +
>   		priv->running = false;
>   	}
>   

Viele Grüße,
Stefan Roese
Rasmus Villemoes May 13, 2024, 11:09 a.m. UTC | #2
On 13/05/2024 12.40, Stefan Roese wrote:
> On 5/9/24 02:47, Rasmus Villemoes wrote:
>> Currently, the cyclic_register() done in wdt_start() is not undone in
>> wdt_stop(). Moreover, calling wdt_start multiple times (which is
>> perfectly allowed on an already started device, e.g. to change the
>> timeout value) will result in another struct cyclic_info being
>> registered, referring to the same watchdog device.
>>
>> This can easily be seen on e.g. a wandboard:
>>
>> => cyclic list
>> function: watchdog@20bc000, cpu-time: 22 us, frequency: 1.01 times/s
>> => wdt list
>> watchdog@20bc000 (imx_wdt)
>> => wdt dev watchdog@20bc000
>> => wdt start 50000
>> WDT:   Started watchdog@20bc000 with servicing every 1000ms (50s timeout)
>> => cyclic list
>> function: watchdog@20bc000, cpu-time: 37 us, frequency: 1.03 times/s
>> function: watchdog@20bc000, cpu-time: 241 us, frequency: 1.01 times/s
>> => wdt start 12345
>> WDT:   Started watchdog@20bc000 with servicing every 1000ms (12s timeout)
>> => cyclic list
>> function: watchdog@20bc000, cpu-time: 36 us, frequency: 1.03 times/s
>> function: watchdog@20bc000, cpu-time: 100 us, frequency: 1.04 times/s
>> function: watchdog@20bc000, cpu-time: 299 us, frequency: 1.00 times/s
>>
>> So properly unregister the watchdog device from the cyclic framework
>> in wdt_stop(). In wdt_start(), we cannot just skip the registration,
>> as the (new) timeout value may mean that we have to ask the cyclic
>> framework to call us more often. So if we're already running,
>> first unregister the old cyclic instance.
> 
> Thanks again for all your valuable work here. I do have a question
> regarding this patch though. AFAIU, it's a valid use-case to enable
> 2 different watchdog devices. And this patch will prevent such a
> setup. Or do I misunderstand this?

No, that shouldn't prevent enabling or petting two different watchdogs
at all, that should work just as well as today. All the state I'm
looking at and modifying belongs to one specific device, it's not
wdt-uclass-global state, it's wdt-uclass-per-device state.

But with the existing code, if one calls wdt_start() a second time on an
already started device, the existing priv->cyclic pointer is dropped on
the floor (it gets unconditionally overwritten by the new
cyclic_register() return value), so one gets the situation that the same
device has multiple cyclic callbacks registered. Also, we're not
unregistering the device from the cyclic framework upon ->stop().

Today, that's "mostly harmless". It does mean 'cyclic list' becomes
cluttered, and is a memory leak, so you could exhaust memory by doing
wdt_start in a loop, and cyclic_run() ends up doing extra callbacks to
the same device (all but one will be no-ops due to the per-device
rate-limiting of ->reset callbacks). In practice there's no chance of
memory corruption or other failures. However, that changes with 3/3,
where it would be catastrophic to call cyclic_register() with an already
registered cyclic_info instance, as the hlist would become corrupt. I
think it's worth fixing regardless of whether 3/3 is accepted.

Rasmus
Stefan Roese May 13, 2024, 2:24 p.m. UTC | #3
Hi Rasmus,

On 5/13/24 13:09, Rasmus Villemoes wrote:
> On 13/05/2024 12.40, Stefan Roese wrote:
>> On 5/9/24 02:47, Rasmus Villemoes wrote:
>>> Currently, the cyclic_register() done in wdt_start() is not undone in
>>> wdt_stop(). Moreover, calling wdt_start multiple times (which is
>>> perfectly allowed on an already started device, e.g. to change the
>>> timeout value) will result in another struct cyclic_info being
>>> registered, referring to the same watchdog device.
>>>
>>> This can easily be seen on e.g. a wandboard:
>>>
>>> => cyclic list
>>> function: watchdog@20bc000, cpu-time: 22 us, frequency: 1.01 times/s
>>> => wdt list
>>> watchdog@20bc000 (imx_wdt)
>>> => wdt dev watchdog@20bc000
>>> => wdt start 50000
>>> WDT:   Started watchdog@20bc000 with servicing every 1000ms (50s timeout)
>>> => cyclic list
>>> function: watchdog@20bc000, cpu-time: 37 us, frequency: 1.03 times/s
>>> function: watchdog@20bc000, cpu-time: 241 us, frequency: 1.01 times/s
>>> => wdt start 12345
>>> WDT:   Started watchdog@20bc000 with servicing every 1000ms (12s timeout)
>>> => cyclic list
>>> function: watchdog@20bc000, cpu-time: 36 us, frequency: 1.03 times/s
>>> function: watchdog@20bc000, cpu-time: 100 us, frequency: 1.04 times/s
>>> function: watchdog@20bc000, cpu-time: 299 us, frequency: 1.00 times/s
>>>
>>> So properly unregister the watchdog device from the cyclic framework
>>> in wdt_stop(). In wdt_start(), we cannot just skip the registration,
>>> as the (new) timeout value may mean that we have to ask the cyclic
>>> framework to call us more often. So if we're already running,
>>> first unregister the old cyclic instance.
>>
>> Thanks again for all your valuable work here. I do have a question
>> regarding this patch though. AFAIU, it's a valid use-case to enable
>> 2 different watchdog devices. And this patch will prevent such a
>> setup. Or do I misunderstand this?
> 
> No, that shouldn't prevent enabling or petting two different watchdogs
> at all, that should work just as well as today. All the state I'm
> looking at and modifying belongs to one specific device, it's not
> wdt-uclass-global state, it's wdt-uclass-per-device state.

That's what I was hoping, but was too lazy to figure out myself.

> But with the existing code, if one calls wdt_start() a second time on an
> already started device, the existing priv->cyclic pointer is dropped on
> the floor (it gets unconditionally overwritten by the new
> cyclic_register() return value), so one gets the situation that the same
> device has multiple cyclic callbacks registered. Also, we're not
> unregistering the device from the cyclic framework upon ->stop().

Understood.

> Today, that's "mostly harmless". It does mean 'cyclic list' becomes
> cluttered, and is a memory leak, so you could exhaust memory by doing
> wdt_start in a loop, and cyclic_run() ends up doing extra callbacks to
> the same device (all but one will be no-ops due to the per-device
> rate-limiting of ->reset callbacks). In practice there's no chance of
> memory corruption or other failures. However, that changes with 3/3,
> where it would be catastrophic to call cyclic_register() with an already
> registered cyclic_info instance, as the hlist would become corrupt. I
> think it's worth fixing regardless of whether 3/3 is accepted.

Agreed.

Reviewed-by: Stefan Roese <sr@denx.de>

Thanks,
Stefan
diff mbox series

Patch

diff --git a/drivers/watchdog/wdt-uclass.c b/drivers/watchdog/wdt-uclass.c
index 417e8d7eef9..caa1567e0c3 100644
--- a/drivers/watchdog/wdt-uclass.c
+++ b/drivers/watchdog/wdt-uclass.c
@@ -122,10 +122,11 @@  int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags)
 		struct wdt_priv *priv = dev_get_uclass_priv(dev);
 		char str[16];
 
-		priv->running = true;
-
 		memset(str, 0, 16);
 		if (IS_ENABLED(CONFIG_WATCHDOG)) {
+			if (priv->running)
+				cyclic_unregister(priv->cyclic);
+
 			/* Register the watchdog driver as a cyclic function */
 			priv->cyclic = cyclic_register(wdt_cyclic,
 						       priv->reset_period * 1000,
@@ -140,6 +141,7 @@  int wdt_start(struct udevice *dev, u64 timeout_ms, ulong flags)
 			}
 		}
 
+		priv->running = true;
 		printf("WDT:   Started %s with%s servicing %s (%ds timeout)\n",
 		       dev->name, IS_ENABLED(CONFIG_WATCHDOG) ? "" : "out",
 		       str, (u32)lldiv(timeout_ms, 1000));
@@ -160,6 +162,9 @@  int wdt_stop(struct udevice *dev)
 	if (ret == 0) {
 		struct wdt_priv *priv = dev_get_uclass_priv(dev);
 
+		if (IS_ENABLED(CONFIG_WATCHDOG) && priv->running)
+			cyclic_unregister(priv->cyclic);
+
 		priv->running = false;
 	}