diff mbox series

[resend] bootdev: avoid infinite probe loop

Message ID 20240104160346.285881-1-caleb.connolly@linaro.org
State Accepted
Commit 9d92c418acfb7576e12e2bd53fed294bb9543724
Delegated to: Tom Rini
Series [resend] bootdev: avoid infinite probe loop

Commit Message

Caleb Connolly Jan. 4, 2024, 4:03 p.m. UTC
Sometimes, when only one bootdev is available and it fails to probe, we
end up in an infinite loop calling probe() on the same device over and
over, with only debug-level log output.

Break the loop if we fail to probe the same device twice in a row, and
promote the probe failure message to log_warning().

Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
---
Resend, actually change log message to WARN loglevel.
---
 boot/bootdev-uclass.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)
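
The guard described in the commit message can be modelled outside U-Boot. What follows is a minimal, self-contained sketch, not the code in boot/bootdev-uclass.c: `struct fake_dev`, `fake_probe()` and `scan_single()` are hypothetical stand-ins for `struct udevice`, `device_probe()` and the retry loop in `bootdev_next_prio()`, and the lookup step is reduced to "the only device is found again".

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct udevice */
struct fake_dev {
	const char *name;
	int probe_ret;		/* value fake_probe() returns; 0 means success */
	int probe_calls;	/* how many times fake_probe() was called */
};

/* Hypothetical stand-in for device_probe() */
static int fake_probe(struct fake_dev *dev)
{
	dev->probe_calls++;
	return dev->probe_ret;
}

/*
 * Model of the retry loop with a single candidate: the lookup keeps
 * returning the same device, so a second consecutive failure ends the scan.
 */
static int scan_single(struct fake_dev *only, struct fake_dev **devp)
{
	struct fake_dev *dev = NULL, *last_dev = NULL;
	int ret;

	do {
		dev = only;	/* the lookup finds the only device again */
		ret = fake_probe(dev);
		if (!ret) {
			last_dev = dev;
		} else {
			fprintf(stderr, "Device '%s' failed to probe\n",
				dev->name);
			if (last_dev == dev)
				return ret;	/* failed twice in a row: give up */
			last_dev = dev;
			dev = NULL;	/* go round the loop again */
		}
	} while (!dev);

	*devp = dev;
	return 0;
}
```

In this model a device that always fails is probed exactly twice before the error is returned, whereas without the `last_dev` comparison the `do`/`while` loop would never exit.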

Comments

Simon Glass Jan. 4, 2024, 4:06 p.m. UTC | #1
Hi Caleb,

On Thu, Jan 4, 2024 at 9:03 AM Caleb Connolly <caleb.connolly@linaro.org> wrote:
>
> Sometimes, when only one bootdev is available, and it fails to probe, we
> end up in an infinite loop calling probe() on the same device over and
> over. With only debug level log output.
>
> Break the loop if we fail to probe the same device twice in a row, and
> promote the probe failure message to log_warning().
>
> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
> ---
> Resend, actually change log message to WARN loglevel.
> ---
>  boot/bootdev-uclass.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>

Reviewed-by: Simon Glass <sjg@chromium.org>

Thanks for the fix. Is this something for which a test could be added?
One that fails (hangs) now but works with your patch?

Regards,
Simon
Caleb Connolly Jan. 4, 2024, 4:12 p.m. UTC | #2
Hi Simon,

On 04/01/2024 16:06, Simon Glass wrote:
> Hi Caleb,
> 
> On Thu, Jan 4, 2024 at 9:03 AM Caleb Connolly <caleb.connolly@linaro.org> wrote:
>>
>> Sometimes, when only one bootdev is available, and it fails to probe, we
>> end up in an infinite loop calling probe() on the same device over and
>> over. With only debug level log output.
>>
>> Break the loop if we fail to probe the same device twice in a row, and
>> promote the probe failure message to log_warning().
>>
>> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
>> ---
>> Resend, actually change log message to WARN loglevel.
>> ---
>>  boot/bootdev-uclass.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>>
> 
> Reviewed-by: Simon Glass <sjg@chromium.org>
> 
> Thanks for the fix. Is this something for which a test could be added?
> One that fails (hangs) now but works with your patch?

I'm not sure; I'm not very familiar with the U-Boot testing
infrastructure yet. I guess this should be testable without having to
solve the halting problem :P but I don't know how best to go about
writing one.
> 
> Regards,
> Simon
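
One generic way to test for a hang without actually hanging is to give the failing device a probe budget, so that an unbounded retry loop shows up as a failed assertion. The sketch below is plain C and does not use U-Boot's test framework; `struct budget_dev`, `failing_probe()`, `loop_terminates()` and `PROBE_BUDGET` are all hypothetical names chosen for illustration, and the cap of 1000 is arbitrary.

```c
#include <stdbool.h>
#include <stddef.h>

#define PROBE_BUDGET 1000	/* arbitrary cap, chosen for illustration */

/* Hypothetical device that only records how often it was probed */
struct budget_dev {
	int probe_calls;
};

/* Always fails, like the single broken bootdev; counts the calls */
static int failing_probe(struct budget_dev *dev)
{
	dev->probe_calls++;
	return -1;
}

/*
 * Retry loop under test. With guard == false it models the old behaviour
 * (retry indefinitely); with guard == true it stops on the second
 * consecutive failure of the same device. Returns true if the loop
 * stopped by itself, false if the probe budget ran out first.
 */
static bool loop_terminates(struct budget_dev *dev, bool guard)
{
	struct budget_dev *last_dev = NULL;

	while (dev->probe_calls < PROBE_BUDGET) {
		if (!failing_probe(dev))
			return true;	/* probe succeeded */
		if (guard && last_dev == dev)
			return true;	/* same device failed twice: give up */
		last_dev = dev;
	}

	return false;	/* budget exhausted: the real loop would have hung */
}
```

A test built this way would assert that `loop_terminates()` returns false for the unguarded loop and true for the guarded one. Whether the same idea maps cleanly onto the real bootdev iteration code is an open question here.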
Dragan Simic Jan. 4, 2024, 4:51 p.m. UTC | #3
On 2024-01-04 17:03, Caleb Connolly wrote:
> Sometimes, when only one bootdev is available, and it fails to probe, we
> end up in an infinite loop calling probe() on the same device over and
> over. With only debug level log output.
> 
> Break the loop if we fail to probe the same device twice in a row, and
> promote the probe failure message to log_warning().
> 
> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>

Looks good to me.

Reviewed-by: Dragan Simic <dsimic@manjaro.org>

> ---
> Resend, actually change log message to WARN loglevel.
> ---
>  boot/bootdev-uclass.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/boot/bootdev-uclass.c b/boot/bootdev-uclass.c
> index d01d603700d9..cd1c2bc06774 100644
> --- a/boot/bootdev-uclass.c
> +++ b/boot/bootdev-uclass.c
> @@ -636,7 +636,7 @@ int bootdev_next_label(struct bootflow_iter *iter, struct udevice **devp,
> 
>  int bootdev_next_prio(struct bootflow_iter *iter, struct udevice **devp)
>  {
> -	struct udevice *dev = *devp;
> +	struct udevice *dev = *devp, *last_dev = NULL;
>  	bool found;
>  	int ret;
> 
> @@ -686,9 +686,19 @@ int bootdev_next_prio(struct bootflow_iter *iter, struct udevice **devp)
>  			}
>  		} else {
>  			ret = device_probe(dev);
> +			if (!ret)
> +				last_dev = dev;
>  			if (ret) {
> -				log_debug("Device '%s' failed to probe\n",
> +				log_warning("Device '%s' failed to probe\n",
>  					  dev->name);
> +				if (last_dev == dev) {
> +					/*
> +					 * We have already tried this device
> +					 * and it failed to probe. Give up.
> +					 */
> +					return log_msg_ret("probe", ret);
> +				}
> +				last_dev = dev;
>  				dev = NULL;
>  			}
>  		}
Tom Rini Jan. 19, 2024, 4:08 p.m. UTC | #4
On Thu, Jan 04, 2024 at 04:03:35PM +0000, Caleb Connolly wrote:

> Sometimes, when only one bootdev is available, and it fails to probe, we
> end up in an infinite loop calling probe() on the same device over and
> over. With only debug level log output.
> 
> Break the loop if we fail to probe the same device twice in a row, and
> promote the probe failure message to log_warning().
> 
> Signed-off-by: Caleb Connolly <caleb.connolly@linaro.org>
> Reviewed-by: Simon Glass <sjg@chromium.org>
> Reviewed-by: Dragan Simic <dsimic@manjaro.org>

Applied to u-boot/master, thanks!

Patch

diff --git a/boot/bootdev-uclass.c b/boot/bootdev-uclass.c
index d01d603700d9..cd1c2bc06774 100644
--- a/boot/bootdev-uclass.c
+++ b/boot/bootdev-uclass.c
@@ -636,7 +636,7 @@  int bootdev_next_label(struct bootflow_iter *iter, struct udevice **devp,
 
 int bootdev_next_prio(struct bootflow_iter *iter, struct udevice **devp)
 {
-	struct udevice *dev = *devp;
+	struct udevice *dev = *devp, *last_dev = NULL;
 	bool found;
 	int ret;
 
@@ -686,9 +686,19 @@  int bootdev_next_prio(struct bootflow_iter *iter, struct udevice **devp)
 			}
 		} else {
 			ret = device_probe(dev);
+			if (!ret)
+				last_dev = dev;
 			if (ret) {
-				log_debug("Device '%s' failed to probe\n",
+				log_warning("Device '%s' failed to probe\n",
 					  dev->name);
+				if (last_dev == dev) {
+					/*
+					 * We have already tried this device
+					 * and it failed to probe. Give up.
+					 */
+					return log_msg_ret("probe", ret);
+				}
+				last_dev = dev;
 				dev = NULL;
 			}
 		}