diff mbox series

[v4,05/12] usb: Avoid unbinding devices in use by bootflows

Message ID 20231112130245.v4.5.If206027372f73ce32480223e5626f4b944e281b7@changeid
State Superseded
Delegated to: Bin Meng
Headers show
Series Resolve issues with booting distros on x86 | expand

Commit Message

Simon Glass Nov. 12, 2023, 8:02 p.m. UTC
When a USB device is unbound, it causes any bootflows attached to it to
be removed, via a call to bootdev_clear_bootflows() from
bootdev_pre_unbind(). This obviously makes it impossible to boot the
bootflow.

However, when booting a bootflow that relies on USB, usb_stop() is
called, which unbinds the device. For EFI, this happens in
efi_exit_boot_services() which means that the bootflow disappears
before it is finished with.

There is no need to unbind all the USB devices just to quiesce them.
Add a new usb_pause() call which removes them but leaves them bound.

This resolves a hang on x86 when booting a distro from USB. This was
found using a device with 4 bootflows, the last of which was USB.


Signed-off-by: Simon Glass <sjg@chromium.org>
---

Changes in v4:
- Don't rename the legacy-USB functions
- Add a bit more detail to the comment

Changes in v2:
- Add new patch to avoid unbinding devices in use by bootflows

 boot/bootm.c                  |  2 +-
 common/usb.c                  |  5 +++++
 drivers/usb/host/usb-uclass.c | 14 ++++++++++++--
 include/usb.h                 | 21 ++++++++++++++++++++-
 4 files changed, 38 insertions(+), 4 deletions(-)

Comments

Heinrich Schuchardt Nov. 12, 2023, 8:27 p.m. UTC | #1
Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass <sjg@chromium.org>:
>When a USB device is unbound, it causes any bootflows attached to it to
>be removed, via a call to bootdev_clear_bootflows() from
>bootdev_pre_unbind(). This obviously makes it impossible to boot the
>bootflow.
>
>However, when booting a bootflow that relies on USB, usb_stop() is
>called, which unbinds the device. For EFI, this happens in
>efi_exit_boot_services() which means that the bootflow disappears
>before it is finished with.

After ExitBootServices() no driver of U-Boot can be used as the operating system is in charge.

Any bootflow inside U-Boot is terminated by definition when reaching ExitBootServices.

>
>There is no need to unbind all the USB devices just to quiesce them.
>Add a new usb_pause() call which removes them but leaves them bound.

As U-Boot must not access any device after ExitBootServices() it is unclear to me what you want to achieve.

Best regards

Heinrich

>
>This resolves a hang on x86 when booting a distro from USB. This was
>found using a device with 4 bootflows, the last of which was USB.
>
>
>Signed-off-by: Simon Glass <sjg@chromium.org>
>---
>
>Changes in v4:
>- Don't rename the legacy-USB functions
>- Add a bit more detail to the comment
>
>Changes in v2:
>- Add new patch to avoid unbinding devices in use by bootflows
>
> boot/bootm.c                  |  2 +-
> common/usb.c                  |  5 +++++
> drivers/usb/host/usb-uclass.c | 14 ++++++++++++--
> include/usb.h                 | 21 ++++++++++++++++++++-
> 4 files changed, 38 insertions(+), 4 deletions(-)
>
>diff --git a/boot/bootm.c b/boot/bootm.c
>index cb61485c226c..d9c3fa8dad99 100644
>--- a/boot/bootm.c
>+++ b/boot/bootm.c
>@@ -502,7 +502,7 @@ ulong bootm_disable_interrupts(void)
> 	 * updated every 1 ms within the HCCA structure in SDRAM! For more
> 	 * details see the OpenHCI specification.
> 	 */
>-	usb_stop();
>+	usb_pause();
> #endif
> 	return iflag;
> }
>diff --git a/common/usb.c b/common/usb.c
>index 836506dcd9e9..a486d1c87d4d 100644
>--- a/common/usb.c
>+++ b/common/usb.c
>@@ -144,6 +144,11 @@ int usb_stop(void)
> 	return 0;
> }
> 
>+int usb_pause(void)
>+{
>+	return usb_stop();
>+}
>+
> /******************************************************************************
>  * Detect if a USB device has been plugged or unplugged.
>  */
>diff --git a/drivers/usb/host/usb-uclass.c b/drivers/usb/host/usb-uclass.c
>index a1cd0ad2d669..c26c65d7986b 100644
>--- a/drivers/usb/host/usb-uclass.c
>+++ b/drivers/usb/host/usb-uclass.c
>@@ -173,7 +173,7 @@ int usb_get_max_xfer_size(struct usb_device *udev, size_t *size)
> 	return ops->get_max_xfer_size(bus, size);
> }
> 
>-int usb_stop(void)
>+static int usb_finish(bool unbind_all)
> {
> 	struct udevice *bus;
> 	struct udevice *rh;
>@@ -195,7 +195,7 @@ int usb_stop(void)
> 
> 		/* Locate root hub device */
> 		device_find_first_child(bus, &rh);
>-		if (rh) {
>+		if (rh && unbind_all) {
> 			/*
> 			 * All USB devices are children of root hub.
> 			 * Unbinding root hub will unbind all of its children.
>@@ -222,6 +222,16 @@ int usb_stop(void)
> 	return err;
> }
> 
>+int usb_stop(void)
>+{
>+	return usb_finish(true);
>+}
>+
>+int usb_pause(void)
>+{
>+	return usb_finish(false);
>+}
>+
> static void usb_scan_bus(struct udevice *bus, bool recurse)
> {
> 	struct usb_bus_priv *priv;
>diff --git a/include/usb.h b/include/usb.h
>index 09e3f0cb309c..b964e2e1f6b2 100644
>--- a/include/usb.h
>+++ b/include/usb.h
>@@ -265,7 +265,26 @@ int usb_kbd_deregister(int force);
>  */
> int usb_init(void);
> 
>-int usb_stop(void); /* stop the USB Controller */
>+/**
>+ * usb_stop() - stop the USB Controller and unbind all USB controllers/devices
>+ *
>+ * This unbinds all devices on the USB buses.
>+ *
>+ * Return: 0 if OK, -ve on error
>+ */
>+int usb_stop(void);
>+
>+/**
>+ * usb_pause() - stop the USB Controller DMA, etc.
>+ *
>+ * Note that this does not unbind bus devices, so using usb_init() after this
>+ * call is not permitted. This function is only useful just before booting, to
>+ * ensure that devices are dormant.
>+ *
>+ * Return: 0 if OK, -ve on error
>+ */
>+int usb_pause(void);
>+
> int usb_detect_change(void); /* detect if a USB device has been (un)plugged */
> 
>
Simon Glass Nov. 12, 2023, 9:20 p.m. UTC | #2
Hi Heinrich,

On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
>
>
> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass <sjg@chromium.org>:
> >When a USB device is unbound, it causes any bootflows attached to it to
> >be removed, via a call to bootdev_clear_bootflows() from
> >bootdev_pre_unbind(). This obviously makes it impossible to boot the
> >bootflow.
> >
> >However, when booting a bootflow that relies on USB, usb_stop() is
> >called, which unbinds the device. For EFI, this happens in
> >efi_exit_boot_services() which means that the bootflow disappears
> >before it is finished with.
>
> After ExitBootServices() no driver of U-Boot can be used as the operating system is in charge.
>
> Any bootflow inside U-Boot is terminated by definition when reaching ExitBootServices.
>
> >
> >There is no need to unbind all the USB devices just to quiesce them.
> >Add a new usb_pause() call which removes them but leaves them bound.
>
> As U-Boot must not access any device after ExitBootServices() it is unclear to me what you want to achieve.

I can't remember exactly what is needed from the bootflow, but
something is. Perhaps the kernel, or the cmdline, or fdt? It would
have been better if I had mentioned this explicitly,  But then this
patch has been sitting around for ages...

In any case, the intent of exit-boot-services is not to free all the
memory used, since some of it may have been used to hold data that the
app continues to use.

Also there is U-Boot code between when the devices are unbound and
when U-Boot actually exits back to the app.

This hang was 100% repeatable on brya (an x86 Chromebook) and it took
quite a while to find.

Regards,
Simon


>
> Best regards
>
> Heinrich
>
> >
> >This resolves a hang on x86 when booting a distro from USB. This was
> >found using a device with 4 bootflows, the last of which was USB.
> >
> >
> >Signed-off-by: Simon Glass <sjg@chromium.org>
> >---
> >
> >Changes in v4:
> >- Don't rename the legacy-USB functions
> >- Add a bit more detail to the comment
> >
> >Changes in v2:
> >- Add new patch to avoid unbinding devices in use by bootflows
> >
> > boot/bootm.c                  |  2 +-
> > common/usb.c                  |  5 +++++
> > drivers/usb/host/usb-uclass.c | 14 ++++++++++++--
> > include/usb.h                 | 21 ++++++++++++++++++++-
> > 4 files changed, 38 insertions(+), 4 deletions(-)
> >
> >diff --git a/boot/bootm.c b/boot/bootm.c
> >index cb61485c226c..d9c3fa8dad99 100644
> >--- a/boot/bootm.c
> >+++ b/boot/bootm.c
> >@@ -502,7 +502,7 @@ ulong bootm_disable_interrupts(void)
> >        * updated every 1 ms within the HCCA structure in SDRAM! For more
> >        * details see the OpenHCI specification.
> >        */
> >-      usb_stop();
> >+      usb_pause();
> > #endif
> >       return iflag;
> > }
> >diff --git a/common/usb.c b/common/usb.c
> >index 836506dcd9e9..a486d1c87d4d 100644
> >--- a/common/usb.c
> >+++ b/common/usb.c
> >@@ -144,6 +144,11 @@ int usb_stop(void)
> >       return 0;
> > }
> >
> >+int usb_pause(void)
> >+{
> >+      return usb_stop();
> >+}
> >+
> > /******************************************************************************
> >  * Detect if a USB device has been plugged or unplugged.
> >  */
> >diff --git a/drivers/usb/host/usb-uclass.c b/drivers/usb/host/usb-uclass.c
> >index a1cd0ad2d669..c26c65d7986b 100644
> >--- a/drivers/usb/host/usb-uclass.c
> >+++ b/drivers/usb/host/usb-uclass.c
> >@@ -173,7 +173,7 @@ int usb_get_max_xfer_size(struct usb_device *udev, size_t *size)
> >       return ops->get_max_xfer_size(bus, size);
> > }
> >
> >-int usb_stop(void)
> >+static int usb_finish(bool unbind_all)
> > {
> >       struct udevice *bus;
> >       struct udevice *rh;
> >@@ -195,7 +195,7 @@ int usb_stop(void)
> >
> >               /* Locate root hub device */
> >               device_find_first_child(bus, &rh);
> >-              if (rh) {
> >+              if (rh && unbind_all) {
> >                       /*
> >                        * All USB devices are children of root hub.
> >                        * Unbinding root hub will unbind all of its children.
> >@@ -222,6 +222,16 @@ int usb_stop(void)
> >       return err;
> > }
> >
> >+int usb_stop(void)
> >+{
> >+      return usb_finish(true);
> >+}
> >+
> >+int usb_pause(void)
> >+{
> >+      return usb_finish(false);
> >+}
> >+
> > static void usb_scan_bus(struct udevice *bus, bool recurse)
> > {
> >       struct usb_bus_priv *priv;
> >diff --git a/include/usb.h b/include/usb.h
> >index 09e3f0cb309c..b964e2e1f6b2 100644
> >--- a/include/usb.h
> >+++ b/include/usb.h
> >@@ -265,7 +265,26 @@ int usb_kbd_deregister(int force);
> >  */
> > int usb_init(void);
> >
> >-int usb_stop(void); /* stop the USB Controller */
> >+/**
> >+ * usb_stop() - stop the USB Controller and unbind all USB controllers/devices
> >+ *
> >+ * This unbinds all devices on the USB buses.
> >+ *
> >+ * Return: 0 if OK, -ve on error
> >+ */
> >+int usb_stop(void);
> >+
> >+/**
> >+ * usb_pause() - stop the USB Controller DMA, etc.
> >+ *
> >+ * Note that this does not unbind bus devices, so using usb_init() after this
> >+ * call is not permitted. This function is only useful just before booting, to
> >+ * ensure that devices are dormant.
> >+ *
> >+ * Return: 0 if OK, -ve on error
> >+ */
> >+int usb_pause(void);
> >+
> > int usb_detect_change(void); /* detect if a USB device has been (un)plugged */
> >
> >
Heinrich Schuchardt Nov. 12, 2023, 11:30 p.m. UTC | #3
Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass <sjg@chromium.org>:
>Hi Heinrich,
>
>On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>>
>>
>>
>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass <sjg@chromium.org>:
>> >When a USB device is unbound, it causes any bootflows attached to it to
>> >be removed, via a call to bootdev_clear_bootflows() from
>> >bootdev_pre_unbind(). This obviously makes it impossible to boot the
>> >bootflow.
>> >
>> >However, when booting a bootflow that relies on USB, usb_stop() is
>> >called, which unbinds the device. For EFI, this happens in
>> >efi_exit_boot_services() which means that the bootflow disappears
>> >before it is finished with.
>>
>> After ExitBootServices() no driver of U-Boot can be used as the operating system is in charge.
>>
>> Any bootflow inside U-Boot is terminated by definition when reaching ExitBootServices.
>>
>> >
>> >There is no need to unbind all the USB devices just to quiesce them.
>> >Add a new usb_pause() call which removes them but leaves them bound.
>>
>> As U-Boot must not access any device after ExitBootServices() it is unclear to me what you want to achieve.
>
>I can't remember exactly what is needed from the bootflow, but
>something is. Perhaps the kernel, or the cmdline, or fdt? It would
>have been better if I had mentioned this explicitly,  But then this
>patch has been sitting around for ages...
>
>In any case, the intent of exit-boot-services is not to free all the
>memory used, since some of it may have been used to hold data that the
>app continues to use.

The EFI application reads the memory map and receives an ID which it passes to ExitBootServiced() after this point any memory not marked as EFI runtime can be used by the EFI app. This includes the memory that harbors the U-Boot USB drivers. Therefore no drivers can be used here.

Starting the EFI app via StartImage() must  terminate any running U-Boot "boot flow".

After ExitBootServices() the EFI application cannot return to U-Boot except for SetVirtualAdressMspsetting which relocates the EFI runtime.

Bootflows and U-Boot drivers have no meaning after ExitBootServices().



>
>Also there is U-Boot code between when the devices are unbound and
>when U-Boot actually exits back to the app.
>
>This hang was 100% repeatable on brya (an x86 Chromebook) and it took
>quite a while to find.

We need a better understanding of the problem that you experience to find an adequate solution. Why does removing all devices lead to hanging the system?

Best regards

Heinrich

>
>Regards,
>Simon
>
>
>>
>> Best regards
>>
>> Heinrich
>>
>> >
>> >This resolves a hang on x86 when booting a distro from USB. This was
>> >found using a device with 4 bootflows, the last of which was USB.
>> >
>> >
>> >Signed-off-by: Simon Glass <sjg@chromium.org>
>> >---
>> >
>> >Changes in v4:
>> >- Don't rename the legacy-USB functions
>> >- Add a bit more detail to the comment
>> >
>> >Changes in v2:
>> >- Add new patch to avoid unbinding devices in use by bootflows
>> >
>> > boot/bootm.c                  |  2 +-
>> > common/usb.c                  |  5 +++++
>> > drivers/usb/host/usb-uclass.c | 14 ++++++++++++--
>> > include/usb.h                 | 21 ++++++++++++++++++++-
>> > 4 files changed, 38 insertions(+), 4 deletions(-)
>> >
>> >diff --git a/boot/bootm.c b/boot/bootm.c
>> >index cb61485c226c..d9c3fa8dad99 100644
>> >--- a/boot/bootm.c
>> >+++ b/boot/bootm.c
>> >@@ -502,7 +502,7 @@ ulong bootm_disable_interrupts(void)
>> >        * updated every 1 ms within the HCCA structure in SDRAM! For more
>> >        * details see the OpenHCI specification.
>> >        */
>> >-      usb_stop();
>> >+      usb_pause();
>> > #endif
>> >       return iflag;
>> > }
>> >diff --git a/common/usb.c b/common/usb.c
>> >index 836506dcd9e9..a486d1c87d4d 100644
>> >--- a/common/usb.c
>> >+++ b/common/usb.c
>> >@@ -144,6 +144,11 @@ int usb_stop(void)
>> >       return 0;
>> > }
>> >
>> >+int usb_pause(void)
>> >+{
>> >+      return usb_stop();
>> >+}
>> >+
>> > /******************************************************************************
>> >  * Detect if a USB device has been plugged or unplugged.
>> >  */
>> >diff --git a/drivers/usb/host/usb-uclass.c b/drivers/usb/host/usb-uclass.c
>> >index a1cd0ad2d669..c26c65d7986b 100644
>> >--- a/drivers/usb/host/usb-uclass.c
>> >+++ b/drivers/usb/host/usb-uclass.c
>> >@@ -173,7 +173,7 @@ int usb_get_max_xfer_size(struct usb_device *udev, size_t *size)
>> >       return ops->get_max_xfer_size(bus, size);
>> > }
>> >
>> >-int usb_stop(void)
>> >+static int usb_finish(bool unbind_all)
>> > {
>> >       struct udevice *bus;
>> >       struct udevice *rh;
>> >@@ -195,7 +195,7 @@ int usb_stop(void)
>> >
>> >               /* Locate root hub device */
>> >               device_find_first_child(bus, &rh);
>> >-              if (rh) {
>> >+              if (rh && unbind_all) {
>> >                       /*
>> >                        * All USB devices are children of root hub.
>> >                        * Unbinding root hub will unbind all of its children.
>> >@@ -222,6 +222,16 @@ int usb_stop(void)
>> >       return err;
>> > }
>> >
>> >+int usb_stop(void)
>> >+{
>> >+      return usb_finish(true);
>> >+}
>> >+
>> >+int usb_pause(void)
>> >+{
>> >+      return usb_finish(false);
>> >+}
>> >+
>> > static void usb_scan_bus(struct udevice *bus, bool recurse)
>> > {
>> >       struct usb_bus_priv *priv;
>> >diff --git a/include/usb.h b/include/usb.h
>> >index 09e3f0cb309c..b964e2e1f6b2 100644
>> >--- a/include/usb.h
>> >+++ b/include/usb.h
>> >@@ -265,7 +265,26 @@ int usb_kbd_deregister(int force);
>> >  */
>> > int usb_init(void);
>> >
>> >-int usb_stop(void); /* stop the USB Controller */
>> >+/**
>> >+ * usb_stop() - stop the USB Controller and unbind all USB controllers/devices
>> >+ *
>> >+ * This unbinds all devices on the USB buses.
>> >+ *
>> >+ * Return: 0 if OK, -ve on error
>> >+ */
>> >+int usb_stop(void);
>> >+
>> >+/**
>> >+ * usb_pause() - stop the USB Controller DMA, etc.
>> >+ *
>> >+ * Note that this does not unbind bus devices, so using usb_init() after this
>> >+ * call is not permitted. This function is only useful just before booting, to
>> >+ * ensure that devices are dormant.
>> >+ *
>> >+ * Return: 0 if OK, -ve on error
>> >+ */
>> >+int usb_pause(void);
>> >+
>> > int usb_detect_change(void); /* detect if a USB device has been (un)plugged */
>> >
>> >
Shantur Rathore Nov. 15, 2023, 3:12 p.m. UTC | #4
On Sun, Nov 12, 2023 at 8:04 PM Simon Glass <sjg@chromium.org> wrote:
>
> When a USB device is unbound, it causes any bootflows attached to it to
> be removed, via a call to bootdev_clear_bootflows() from
> bootdev_pre_unbind(). This obviously makes it impossible to boot the
> bootflow.
>
> However, when booting a bootflow that relies on USB, usb_stop() is
> called, which unbinds the device. For EFI, this happens in
> efi_exit_boot_services() which means that the bootflow disappears
> before it is finished with.
>
> There is no need to unbind all the USB devices just to quiesce them.
> Add a new usb_pause() call which removes them but leaves them bound.
>
> This resolves a hang on x86 when booting a distro from USB. This was
> found using a device with 4 bootflows, the last of which was USB.
>
>
> Signed-off-by: Simon Glass <sjg@chromium.org>
Tested-by: Shantur Rathore <i@shantur.com>
> ---
>
> Changes in v4:
> - Don't rename the legacy-USB functions
> - Add a bit more detail to the comment
>
> Changes in v2:
> - Add new patch to avoid unbinding devices in use by bootflows
>
>  boot/bootm.c                  |  2 +-
>  common/usb.c                  |  5 +++++
>  drivers/usb/host/usb-uclass.c | 14 ++++++++++++--
>  include/usb.h                 | 21 ++++++++++++++++++++-
>  4 files changed, 38 insertions(+), 4 deletions(-)
>
> diff --git a/boot/bootm.c b/boot/bootm.c
> index cb61485c226c..d9c3fa8dad99 100644
> --- a/boot/bootm.c
> +++ b/boot/bootm.c
> @@ -502,7 +502,7 @@ ulong bootm_disable_interrupts(void)
>          * updated every 1 ms within the HCCA structure in SDRAM! For more
>          * details see the OpenHCI specification.
>          */
> -       usb_stop();
> +       usb_pause();
>  #endif
>         return iflag;
>  }
> diff --git a/common/usb.c b/common/usb.c
> index 836506dcd9e9..a486d1c87d4d 100644
> --- a/common/usb.c
> +++ b/common/usb.c
> @@ -144,6 +144,11 @@ int usb_stop(void)
>         return 0;
>  }
>
> +int usb_pause(void)
> +{
> +       return usb_stop();
> +}
> +
>  /******************************************************************************
>   * Detect if a USB device has been plugged or unplugged.
>   */
> diff --git a/drivers/usb/host/usb-uclass.c b/drivers/usb/host/usb-uclass.c
> index a1cd0ad2d669..c26c65d7986b 100644
> --- a/drivers/usb/host/usb-uclass.c
> +++ b/drivers/usb/host/usb-uclass.c
> @@ -173,7 +173,7 @@ int usb_get_max_xfer_size(struct usb_device *udev, size_t *size)
>         return ops->get_max_xfer_size(bus, size);
>  }
>
> -int usb_stop(void)
> +static int usb_finish(bool unbind_all)
>  {
>         struct udevice *bus;
>         struct udevice *rh;
> @@ -195,7 +195,7 @@ int usb_stop(void)
>
>                 /* Locate root hub device */
>                 device_find_first_child(bus, &rh);
> -               if (rh) {
> +               if (rh && unbind_all) {
>                         /*
>                          * All USB devices are children of root hub.
>                          * Unbinding root hub will unbind all of its children.
> @@ -222,6 +222,16 @@ int usb_stop(void)
>         return err;
>  }
>
> +int usb_stop(void)
> +{
> +       return usb_finish(true);
> +}
> +
> +int usb_pause(void)
> +{
> +       return usb_finish(false);
> +}
> +
>  static void usb_scan_bus(struct udevice *bus, bool recurse)
>  {
>         struct usb_bus_priv *priv;
> diff --git a/include/usb.h b/include/usb.h
> index 09e3f0cb309c..b964e2e1f6b2 100644
> --- a/include/usb.h
> +++ b/include/usb.h
> @@ -265,7 +265,26 @@ int usb_kbd_deregister(int force);
>   */
>  int usb_init(void);
>
> -int usb_stop(void); /* stop the USB Controller */
> +/**
> + * usb_stop() - stop the USB Controller and unbind all USB controllers/devices
> + *
> + * This unbinds all devices on the USB buses.
> + *
> + * Return: 0 if OK, -ve on error
> + */
> +int usb_stop(void);
> +
> +/**
> + * usb_pause() - stop the USB Controller DMA, etc.
> + *
> + * Note that this does not unbind bus devices, so using usb_init() after this
> + * call is not permitted. This function is only useful just before booting, to
> + * ensure that devices are dormant.
> + *
> + * Return: 0 if OK, -ve on error
> + */
> +int usb_pause(void);
> +
>  int usb_detect_change(void); /* detect if a USB device has been (un)plugged */
>
>
> --
> 2.42.0.869.gea05f2083d-goog
>
Simon Glass Nov. 15, 2023, 3:50 p.m. UTC | #5
Hi Heinrich,

On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
>
>
> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass <sjg@chromium.org>:
> >Hi Heinrich,
> >
> >On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
> >>
> >>
> >>
> >> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass <sjg@chromium.org>:
> >> >When a USB device is unbound, it causes any bootflows attached to it to
> >> >be removed, via a call to bootdev_clear_bootflows() from
> >> >bootdev_pre_unbind(). This obviously makes it impossible to boot the
> >> >bootflow.
> >> >
> >> >However, when booting a bootflow that relies on USB, usb_stop() is
> >> >called, which unbinds the device. For EFI, this happens in
> >> >efi_exit_boot_services() which means that the bootflow disappears
> >> >before it is finished with.
> >>
> >> After ExitBootServices() no driver of U-Boot can be used as the operating system is in charge.
> >>
> >> Any bootflow inside U-Boot is terminated by definition when reaching ExitBootServices.
> >>
> >> >
> >> >There is no need to unbind all the USB devices just to quiesce them.
> >> >Add a new usb_pause() call which removes them but leaves them bound.
> >>
> >> As U-Boot must not access any device after ExitBootServices() it is unclear to me what you want to achieve.
> >
> >I can't remember exactly what is needed from the bootflow, but
> >something is. Perhaps the kernel, or the cmdline, or fdt? It would
> >have been better if I had mentioned this explicitly,  But then this
> >patch has been sitting around for ages...
> >
> >In any case, the intent of exit-boot-services is not to free all the
> >memory used, since some of it may have been used to hold data that the
> >app continues to use.
>
> The EFI application reads the memory map and receives an ID which it passes to ExitBootServiced() after this point any memory not marked as EFI runtime can be used by the EFI app. This includes the memory that harbors the U-Boot USB drivers. Therefore no drivers can be used here.
>
> Starting the EFI app via StartImage() must  terminate any running U-Boot "boot flow".
>
> After ExitBootServices() the EFI application cannot return to U-Boot except for SetVirtualAdressMspsetting which relocates the EFI runtime.
>
> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
>
>
>
> >
> >Also there is U-Boot code between when the devices are unbound and
> >when U-Boot actually exits back to the app.
> >
> >This hang was 100% repeatable on brya (an x86 Chromebook) and it took
> >quite a while to find.
>
> We need a better understanding of the problem that you experience to find an adequate solution. Why does removing all devices lead to hanging the system?

Are you able to test this? With your better knowledge of EFI you might
be able to figure out something else that is going on. But in my case
it causes some memory to be freed (perhaps the memory holding the EFI
app?), which is then overwritten by something else being malloc()'d,
so the boot hangs. It is hard to see what is going on after the app
starts.

This patch was sent almost two months ago and fixes a real bug. The
first few versions attracted no comment now it is being blocked
because you don't understand how it fixes anything.

I can get the hardware up again and try this but it will take a while.

Even without the hang, this still fixes a bug. We should be using
device_remove() to quiesce devices. There is no need to unbind them.

BTW another patch that suffered a similar fate was [1]. I just applied
it based on a review from Bin.

[..]

Regards,
Simon

[1] https://patchwork.ozlabs.org/project/uboot/patch/20231001191444.v3.1.I9f7f373d00947c704aeae0088dfedd8df07fab60@changeid/
Heinrich Schuchardt Nov. 15, 2023, 4:23 p.m. UTC | #6
On 11/15/23 16:50, Simon Glass wrote:
> Hi Heinrich,
>
> On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>>
>>
>>
>> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass <sjg@chromium.org>:
>>> Hi Heinrich,
>>>
>>> On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>>>>
>>>>
>>>>
>>>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass <sjg@chromium.org>:
>>>>> When a USB device is unbound, it causes any bootflows attached to it to
>>>>> be removed, via a call to bootdev_clear_bootflows() from
>>>>> bootdev_pre_unbind(). This obviously makes it impossible to boot the
>>>>> bootflow.
>>>>>
>>>>> However, when booting a bootflow that relies on USB, usb_stop() is
>>>>> called, which unbinds the device. For EFI, this happens in
>>>>> efi_exit_boot_services() which means that the bootflow disappears
>>>>> before it is finished with.
>>>>
>>>> After ExitBootServices() no driver of U-Boot can be used as the operating system is in charge.
>>>>
>>>> Any bootflow inside U-Boot is terminated by definition when reaching ExitBootServices.
>>>>
>>>>>
>>>>> There is no need to unbind all the USB devices just to quiesce them.
>>>>> Add a new usb_pause() call which removes them but leaves them bound.
>>>>
>>>> As U-Boot must not access any device after ExitBootServices() it is unclear to me what you want to achieve.
>>>
>>> I can't remember exactly what is needed from the bootflow, but
>>> something is. Perhaps the kernel, or the cmdline, or fdt? It would
>>> have been better if I had mentioned this explicitly,  But then this
>>> patch has been sitting around for ages...
>>>
>>> In any case, the intent of exit-boot-services is not to free all the
>>> memory used, since some of it may have been used to hold data that the
>>> app continues to use.
>>
>> The EFI application reads the memory map and receives an ID which it passes to ExitBootServiced() after this point any memory not marked as EFI runtime can be used by the EFI app. This includes the memory that harbors the U-Boot USB drivers. Therefore no drivers can be used here.
>>
>> Starting the EFI app via StartImage() must  terminate any running U-Boot "boot flow".
>>
>> After ExitBootServices() the EFI application cannot return to U-Boot except for SetVirtualAdressMspsetting which relocates the EFI runtime.
>>
>> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
>>
>>
>>
>>>
>>> Also there is U-Boot code between when the devices are unbound and
>>> when U-Boot actually exits back to the app.
>>>
>>> This hang was 100% repeatable on brya (an x86 Chromebook) and it took
>>> quite a while to find.
>>
>> We need a better understanding of the problem that you experience to find an adequate solution. Why does removing all devices lead to hanging the system?
>
> Are you able to test this? With your better knowledge of EFI you might
> be able to figure out something else that is going on. But in my case
> it causes some memory to be freed (perhaps the memory holding the EFI
> app?), which is then overwritten by something else being malloc()'d,

The memory for the EFI app is not assigned via malloc(). It is allocated
by AllocatedPages().

Reading "some memory freed" does not give me confidence that the problem
is sufficiently analyzed.

> so the boot hangs. It is hard to see what is going on after the app
> starts.
>
> This patch was sent almost two months ago and fixes a real bug. The
> first few versions attracted no comment now it is being blocked
> because you don't understand how it fixes anything.

Do you understand why unbinding the devices causes the problem?

>
> I can get the hardware up again and try this but it will take a while.

Digging a bit deeper seems to be the right approach.

>
> Even without the hang, this still fixes a bug. We should be using
> device_remove() to quiesce devices. There is no need to unbind them.

Is this what the patch does?

>
> BTW another patch that suffered a similar fate was [1]. I just applied
> it based on a review from Bin.

Please, ping Ilias and me if you don't get the feedback that you deserve.

Best regards

Heinrich

>
> [..]
>
> Regards,
> Simon
>
> [1] https://patchwork.ozlabs.org/project/uboot/patch/20231001191444.v3.1.I9f7f373d00947c704aeae0088dfedd8df07fab60@changeid/
Heinrich Schuchardt Nov. 16, 2023, 1:29 a.m. UTC | #7
On 11/15/23 17:23, Heinrich Schuchardt wrote:
> On 11/15/23 16:50, Simon Glass wrote:
>> Hi Heinrich,
>>
>> On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de>
>> wrote:
>>>
>>>
>>>
>>> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass
>>> <sjg@chromium.org>:
>>>> Hi Heinrich,
>>>>
>>>> On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt
>>>> <xypron.glpk@gmx.de> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass
>>>>> <sjg@chromium.org>:
>>>>>> When a USB device is unbound, it causes any bootflows attached to
>>>>>> it to
>>>>>> be removed, via a call to bootdev_clear_bootflows() from
>>>>>> bootdev_pre_unbind(). This obviously makes it impossible to boot the
>>>>>> bootflow.
>>>>>>
>>>>>> However, when booting a bootflow that relies on USB, usb_stop() is
>>>>>> called, which unbinds the device. For EFI, this happens in
>>>>>> efi_exit_boot_services() which means that the bootflow disappears
>>>>>> before it is finished with.
>>>>>
>>>>> After ExitBootServices() no driver of U-Boot can be used as the
>>>>> operating system is in charge.
>>>>>
>>>>> Any bootflow inside U-Boot is terminated by definition when
>>>>> reaching ExitBootServices.
>>>>>
>>>>>>
>>>>>> There is no need to unbind all the USB devices just to quiesce them.
>>>>>> Add a new usb_pause() call which removes them but leaves them bound.
>>>>>
>>>>> As U-Boot must not access any device after ExitBootServices() it is
>>>>> unclear to me what you want to achieve.
>>>>
>>>> I can't remember exactly what is needed from the bootflow, but
>>>> something is. Perhaps the kernel, or the cmdline, or fdt? It would
>>>> have been better if I had mentioned this explicitly,  But then this
>>>> patch has been sitting around for ages...
>>>>
>>>> In any case, the intent of exit-boot-services is not to free all the
>>>> memory used, since some of it may have been used to hold data that the
>>>> app continues to use.
>>>
>>> The EFI application reads the memory map and receives an ID which it
>>> passes to ExitBootServiced() after this point any memory not marked
>>> as EFI runtime can be used by the EFI app. This includes the memory
>>> that harbors the U-Boot USB drivers. Therefore no drivers can be used
>>> here.
>>>
>>> Starting the EFI app via StartImage() must  terminate any running
>>> U-Boot "boot flow".
>>>
>>> After ExitBootServices() the EFI application cannot return to U-Boot
>>> except for SetVirtualAdressMspsetting which relocates the EFI runtime.
>>>
>>> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
>>>
>>>
>>>
>>>>
>>>> Also there is U-Boot code between when the devices are unbound and
>>>> when U-Boot actually exits back to the app.
>>>>
>>>> This hang was 100% repeatable on brya (an x86 Chromebook) and it took
>>>> quite a while to find.
>>>
>>> We need a better understanding of the problem that you experience to
>>> find an adequate solution. Why does removing all devices lead to
>>> hanging the system?
>>
>> Are you able to test this? With your better knowledge of EFI you might
>> be able to figure out something else that is going on. But in my case
>> it causes some memory to be freed (perhaps the memory holding the EFI
>> app?), which is then overwritten by something else being malloc()'d,
>
> The memory for the EFI app is not assigned via malloc(). It is allocated
> by AllocatedPages().
>
> Reading "some memory freed" does not give me confidence that the problem
> is sufficiently analyzed.
>
>> so the boot hangs. It is hard to see what is going on after the app
>> starts.
>>
>> This patch was sent almost two months ago and fixes a real bug. The
>> first few versions attracted no comment now it is being blocked
>> because you don't understand how it fixes anything.
>
> Do you understand why unbinding the devices causes the problem?
>
>>
>> I can get the hardware up again and try this but it will take a while.
>
> Digging a bit deeper seems to be the right approach.

Re: bug - bootflow: grub efi crashes when bootflow selected explicitly
https://lore.kernel.org/u-boot/CAHc5_t1H39YV=HSSa7TCEdzRctAX+zibkx1vsuwEW-raOL9TcA@mail.gmail.com/

points to a probable root cause:

https://github.com/u-boot/u-boot/blob/master/boot/bootflow.c#L470
free(bflow->buf)

In the EFI boot bflow->buf points to $kernel_addr_r which never was
allocated and therefore must not be freed.

Best regards

Heinrich

>
>>
>> Even without the hang, this still fixes a bug. We should be using
>> device_remove() to quiesce devices. There is no need to unbind them.
>
> Is this what the patch does?
>
>>
>> BTW another patch that suffered a similar fate was [1]. I just applied
>> it based on a review from Bin.
>
> Please, ping Ilias and me if you don't get the feedback that you deserve.
>
> Best regards
>
> Heinrich
>
>>
>> [..]
>>
>> Regards,
>> Simon
>>
>> [1]
>> https://patchwork.ozlabs.org/project/uboot/patch/20231001191444.v3.1.I9f7f373d00947c704aeae0088dfedd8df07fab60@changeid/
>
Simon Glass Nov. 16, 2023, 1:42 a.m. UTC | #8
Hi Heinrich,

On Wed, 15 Nov 2023 at 18:34, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
> On 11/15/23 17:23, Heinrich Schuchardt wrote:
> > On 11/15/23 16:50, Simon Glass wrote:
> >> Hi Heinrich,
> >>
> >> On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de>
> >> wrote:
> >>>
> >>>
> >>>
> >>> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass
> >>> <sjg@chromium.org>:
> >>>> Hi Heinrich,
> >>>>
> >>>> On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt
> >>>> <xypron.glpk@gmx.de> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass
> >>>>> <sjg@chromium.org>:
> >>>>>> When a USB device is unbound, it causes any bootflows attached to
> >>>>>> it to
> >>>>>> be removed, via a call to bootdev_clear_bootflows() from
> >>>>>> bootdev_pre_unbind(). This obviously makes it impossible to boot the
> >>>>>> bootflow.
> >>>>>>
> >>>>>> However, when booting a bootflow that relies on USB, usb_stop() is
> >>>>>> called, which unbinds the device. For EFI, this happens in
> >>>>>> efi_exit_boot_services() which means that the bootflow disappears
> >>>>>> before it is finished with.
> >>>>>
> >>>>> After ExitBootServices() no driver of U-Boot can be used as the
> >>>>> operating system is in charge.
> >>>>>
> >>>>> Any bootflow inside U-Boot is terminated by definition when
> >>>>> reaching ExitBootServices.
> >>>>>
> >>>>>>
> >>>>>> There is no need to unbind all the USB devices just to quiesce them.
> >>>>>> Add a new usb_pause() call which removes them but leaves them bound.
> >>>>>
> >>>>> As U-Boot must not access any device after ExitBootServices() it is
> >>>>> unclear to me what you want to achieve.
> >>>>
> >>>> I can't remember exactly what is needed from the bootflow, but
> >>>> something is. Perhaps the kernel, or the cmdline, or fdt? It would
> >>>> have been better if I had mentioned this explicitly,  But then this
> >>>> patch has been sitting around for ages...
> >>>>
> >>>> In any case, the intent of exit-boot-services is not to free all the
> >>>> memory used, since some of it may have been used to hold data that the
> >>>> app continues to use.
> >>>
> >>> The EFI application reads the memory map and receives an ID which it
> >>> passes to ExitBootServiced() after this point any memory not marked
> >>> as EFI runtime can be used by the EFI app. This includes the memory
> >>> that harbors the U-Boot USB drivers. Therefore no drivers can be used
> >>> here.
> >>>
> >>> Starting the EFI app via StartImage() must  terminate any running
> >>> U-Boot "boot flow".
> >>>
> >>> After ExitBootServices() the EFI application cannot return to U-Boot
> >>> except for SetVirtualAdressMspsetting which relocates the EFI runtime.
> >>>
> >>> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
> >>>
> >>>
> >>>
> >>>>
> >>>> Also there is U-Boot code between when the devices are unbound and
> >>>> when U-Boot actually exits back to the app.
> >>>>
> >>>> This hang was 100% repeatable on brya (an x86 Chromebook) and it took
> >>>> quite a while to find.
> >>>
> >>> We need a better understanding of the problem that you experience to
> >>> find an adequate solution. Why does removing all devices lead to
> >>> hanging the system?
> >>
> >> Are you able to test this? With your better knowledge of EFI you might
> >> be able to figure out something else that is going on. But in my case
> >> it causes some memory to be freed (perhaps the memory holding the EFI
> >> app?), which is then overwritten by something else being malloc()'d,
> >
> > The memory for the EFI app is not assigned via malloc(). It is allocated
> > by AllocatedPages().
> >
> > Reading "some memory freed" does not give me confidence that the problem
> > is sufficiently analyzed.
> >
> >> so the boot hangs. It is hard to see what is going on after the app
> >> starts.
> >>
> >> This patch was sent almost two months ago and fixes a real bug. The
> >> first few versions attracted no comment now it is being blocked
> >> because you don't understand how it fixes anything.
> >
> > Do you understand why unbinding the devices causes the problem?
> >
> >>
> >> I can get the hardware up again and try this but it will take a while.
> >
> > Digging a bit deeper seems to be the right approach.
>
> Re: bug - bootflow: grub efi crashes when bootflow selected explicitly
> https://lore.kernel.org/u-boot/CAHc5_t1H39YV=HSSa7TCEdzRctAX+zibkx1vsuwEW-raOL9TcA@mail.gmail.com/
>
> points to a probable root cause:
>
> https://github.com/u-boot/u-boot/blob/master/boot/bootflow.c#L470
> free(bflow->buf)
>
> In the EFI boot bflow->buf points to $kernel_addr_r which never was
> allocated and therefore must not be freed.

Yes, this is the root cause of the crash.

However, this patch should still be applied. I can update the commit
message if you like.

Freeing the FDT and kernel before booting them is just not safe,
whether EFI or anything else. Freed memory is not guaranteed to hang
around for any length of time. For example, with truetype fonts,
malloc() is called during any console output and will likely corrupt
the images just as they are being booted.

We should leave the devices bound when booting.

Regards,
Simon
Heinrich Schuchardt Nov. 16, 2023, 2:01 a.m. UTC | #9
On 11/16/23 02:42, Simon Glass wrote:
> Hi Heinrich,
>
> On Wed, 15 Nov 2023 at 18:34, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>>
>> On 11/15/23 17:23, Heinrich Schuchardt wrote:
>>> On 11/15/23 16:50, Simon Glass wrote:
>>>> Hi Heinrich,
>>>>
>>>> On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de>
>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass
>>>>> <sjg@chromium.org>:
>>>>>> Hi Heinrich,
>>>>>>
>>>>>> On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt
>>>>>> <xypron.glpk@gmx.de> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass
>>>>>>> <sjg@chromium.org>:
>>>>>>>> When a USB device is unbound, it causes any bootflows attached to
>>>>>>>> it to
>>>>>>>> be removed, via a call to bootdev_clear_bootflows() from
>>>>>>>> bootdev_pre_unbind(). This obviously makes it impossible to boot the
>>>>>>>> bootflow.
>>>>>>>>
>>>>>>>> However, when booting a bootflow that relies on USB, usb_stop() is
>>>>>>>> called, which unbinds the device. For EFI, this happens in
>>>>>>>> efi_exit_boot_services() which means that the bootflow disappears
>>>>>>>> before it is finished with.
>>>>>>>
>>>>>>> After ExitBootServices() no driver of U-Boot can be used as the
>>>>>>> operating system is in charge.
>>>>>>>
>>>>>>> Any bootflow inside U-Boot is terminated by definition when
>>>>>>> reaching ExitBootServices.
>>>>>>>
>>>>>>>>
>>>>>>>> There is no need to unbind all the USB devices just to quiesce them.
>>>>>>>> Add a new usb_pause() call which removes them but leaves them bound.
>>>>>>>
>>>>>>> As U-Boot must not access any device after ExitBootServices() it is
>>>>>>> unclear to me what you want to achieve.
>>>>>>
>>>>>> I can't remember exactly what is needed from the bootflow, but
>>>>>> something is. Perhaps the kernel, or the cmdline, or fdt? It would
>>>>>> have been better if I had mentioned this explicitly,  But then this
>>>>>> patch has been sitting around for ages...
>>>>>>
>>>>>> In any case, the intent of exit-boot-services is not to free all the
>>>>>> memory used, since some of it may have been used to hold data that the
>>>>>> app continues to use.
>>>>>
>>>>> The EFI application reads the memory map and receives an ID which it
>>>>> passes to ExitBootServiced() after this point any memory not marked
>>>>> as EFI runtime can be used by the EFI app. This includes the memory
>>>>> that harbors the U-Boot USB drivers. Therefore no drivers can be used
>>>>> here.
>>>>>
>>>>> Starting the EFI app via StartImage() must  terminate any running
>>>>> U-Boot "boot flow".
>>>>>
>>>>> After ExitBootServices() the EFI application cannot return to U-Boot
>>>>> except for SetVirtualAdressMspsetting which relocates the EFI runtime.
>>>>>
>>>>> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Also there is U-Boot code between when the devices are unbound and
>>>>>> when U-Boot actually exits back to the app.
>>>>>>
>>>>>> This hang was 100% repeatable on brya (an x86 Chromebook) and it took
>>>>>> quite a while to find.
>>>>>
>>>>> We need a better understanding of the problem that you experience to
>>>>> find an adequate solution. Why does removing all devices lead to
>>>>> hanging the system?
>>>>
>>>> Are you able to test this? With your better knowledge of EFI you might
>>>> be able to figure out something else that is going on. But in my case
>>>> it causes some memory to be freed (perhaps the memory holding the EFI
>>>> app?), which is then overwritten by something else being malloc()'d,
>>>
>>> The memory for the EFI app is not assigned via malloc(). It is allocated
>>> by AllocatedPages().
>>>
>>> Reading "some memory freed" does not give me confidence that the problem
>>> is sufficiently analyzed.
>>>
>>>> so the boot hangs. It is hard to see what is going on after the app
>>>> starts.
>>>>
>>>> This patch was sent almost two months ago and fixes a real bug. The
>>>> first few versions attracted no comment now it is being blocked
>>>> because you don't understand how it fixes anything.
>>>
>>> Do you understand why unbinding the devices causes the problem?
>>>
>>>>
>>>> I can get the hardware up again and try this but it will take a while.
>>>
>>> Digging a bit deeper seems to be the right approach.
>>
>> Re: bug - bootflow: grub efi crashes when bootflow selected explicitly
>> https://lore.kernel.org/u-boot/CAHc5_t1H39YV=HSSa7TCEdzRctAX+zibkx1vsuwEW-raOL9TcA@mail.gmail.com/
>>
>> points to a probable root cause:
>>
>> https://github.com/u-boot/u-boot/blob/master/boot/bootflow.c#L470
>> free(bflow->buf)
>>
>> In the EFI boot bflow->buf points to $kernel_addr_r which never was
>> allocated and therefore must not be freed.
>
> Yes, this is the root cause of the crash.
>
> However, this patch should still be applied. I can update the commit
> message if you like.
>
> Freeing the FDT and kernel before booting them is just not safe,
> whether EFI or anything else. Freed memory is not guaranteed to hang
> around for any length of time. For example, with truetype fonts,
> malloc() is called during any console output and will likely corrupt
> the images just as they are being booted.

Please, observe that malloc() and efi_allocate_pages() use completely
separate parts of the memory.

When reaching ExitBootServices() the memory used by the EFI binary
(relocated in efi_load_pe()) and for the configuration table with the
device-tree have been allocated by efi_allocate_pages(). These addresses
cannot neither allocated by malloc() nor freed with free().

Best regards

Heinrich

>
> We should leave the devices bound when booting.
>
> Regards,
> Simon
Simon Glass Nov. 16, 2023, 2:35 a.m. UTC | #10
Hi Heinrich,

On Wed, 15 Nov 2023 at 19:06, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
>
> On 11/16/23 02:42, Simon Glass wrote:
> > Hi Heinrich,
> >
> > On Wed, 15 Nov 2023 at 18:34, Heinrich Schuchardt <xypron.glpk@gmx.de> wrote:
> >>
> >> On 11/15/23 17:23, Heinrich Schuchardt wrote:
> >>> On 11/15/23 16:50, Simon Glass wrote:
> >>>> Hi Heinrich,
> >>>>
> >>>> On Sun, 12 Nov 2023 at 16:35, Heinrich Schuchardt <xypron.glpk@gmx.de>
> >>>> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> Am 12. November 2023 22:20:50 MEZ schrieb Simon Glass
> >>>>> <sjg@chromium.org>:
> >>>>>> Hi Heinrich,
> >>>>>>
> >>>>>> On Sun, 12 Nov 2023 at 13:32, Heinrich Schuchardt
> >>>>>> <xypron.glpk@gmx.de> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Am 12. November 2023 21:02:42 MEZ schrieb Simon Glass
> >>>>>>> <sjg@chromium.org>:
> >>>>>>>> When a USB device is unbound, it causes any bootflows attached to
> >>>>>>>> it to
> >>>>>>>> be removed, via a call to bootdev_clear_bootflows() from
> >>>>>>>> bootdev_pre_unbind(). This obviously makes it impossible to boot the
> >>>>>>>> bootflow.
> >>>>>>>>
> >>>>>>>> However, when booting a bootflow that relies on USB, usb_stop() is
> >>>>>>>> called, which unbinds the device. For EFI, this happens in
> >>>>>>>> efi_exit_boot_services() which means that the bootflow disappears
> >>>>>>>> before it is finished with.
> >>>>>>>
> >>>>>>> After ExitBootServices() no driver of U-Boot can be used as the
> >>>>>>> operating system is in charge.
> >>>>>>>
> >>>>>>> Any bootflow inside U-Boot is terminated by definition when
> >>>>>>> reaching ExitBootServices.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> There is no need to unbind all the USB devices just to quiesce them.
> >>>>>>>> Add a new usb_pause() call which removes them but leaves them bound.
> >>>>>>>
> >>>>>>> As U-Boot must not access any device after ExitBootServices() it is
> >>>>>>> unclear to me what you want to achieve.
> >>>>>>
> >>>>>> I can't remember exactly what is needed from the bootflow, but
> >>>>>> something is. Perhaps the kernel, or the cmdline, or fdt? It would
> >>>>>> have been better if I had mentioned this explicitly,  But then this
> >>>>>> patch has been sitting around for ages...
> >>>>>>
> >>>>>> In any case, the intent of exit-boot-services is not to free all the
> >>>>>> memory used, since some of it may have been used to hold data that the
> >>>>>> app continues to use.
> >>>>>
> >>>>> The EFI application reads the memory map and receives an ID which it
> >>>>> passes to ExitBootServiced() after this point any memory not marked
> >>>>> as EFI runtime can be used by the EFI app. This includes the memory
> >>>>> that harbors the U-Boot USB drivers. Therefore no drivers can be used
> >>>>> here.
> >>>>>
> >>>>> Starting the EFI app via StartImage() must  terminate any running
> >>>>> U-Boot "boot flow".
> >>>>>
> >>>>> After ExitBootServices() the EFI application cannot return to U-Boot
> >>>>> except for SetVirtualAdressMspsetting which relocates the EFI runtime.
> >>>>>
> >>>>> Bootflows and U-Boot drivers have no meaning after ExitBootServices().
> >>>>>
> >>>>>
> >>>>>
> >>>>>>
> >>>>>> Also there is U-Boot code between when the devices are unbound and
> >>>>>> when U-Boot actually exits back to the app.
> >>>>>>
> >>>>>> This hang was 100% repeatable on brya (an x86 Chromebook) and it took
> >>>>>> quite a while to find.
> >>>>>
> >>>>> We need a better understanding of the problem that you experience to
> >>>>> find an adequate solution. Why does removing all devices lead to
> >>>>> hanging the system?
> >>>>
> >>>> Are you able to test this? With your better knowledge of EFI you might
> >>>> be able to figure out something else that is going on. But in my case
> >>>> it causes some memory to be freed (perhaps the memory holding the EFI
> >>>> app?), which is then overwritten by something else being malloc()'d,
> >>>
> >>> The memory for the EFI app is not assigned via malloc(). It is allocated
> >>> by AllocatedPages().
> >>>
> >>> Reading "some memory freed" does not give me confidence that the problem
> >>> is sufficiently analyzed.
> >>>
> >>>> so the boot hangs. It is hard to see what is going on after the app
> >>>> starts.
> >>>>
> >>>> This patch was sent almost two months ago and fixes a real bug. The
> >>>> first few versions attracted no comment now it is being blocked
> >>>> because you don't understand how it fixes anything.
> >>>
> >>> Do you understand why unbinding the devices causes the problem?
> >>>
> >>>>
> >>>> I can get the hardware up again and try this but it will take a while.
> >>>
> >>> Digging a bit deeper seems to be the right approach.
> >>
> >> Re: bug - bootflow: grub efi crashes when bootflow selected explicitly
> >> https://lore.kernel.org/u-boot/CAHc5_t1H39YV=HSSa7TCEdzRctAX+zibkx1vsuwEW-raOL9TcA@mail.gmail.com/
> >>
> >> points to a probable root cause:
> >>
> >> https://github.com/u-boot/u-boot/blob/master/boot/bootflow.c#L470
> >> free(bflow->buf)
> >>
> >> In the EFI boot bflow->buf points to $kernel_addr_r which never was
> >> allocated and therefore must not be freed.
> >
> > Yes, this is the root cause of the crash.
> >
> > However, this patch should still be applied. I can update the commit
> > message if you like.
> >
> > Freeing the FDT and kernel before booting them is just not safe,
> > whether EFI or anything else. Freed memory is not guaranteed to hang
> > around for any length of time. For example, with truetype fonts,
> > malloc() is called during any console output and will likely corrupt
> > the images just as they are being booted.
>
> Please, observe that malloc() and efi_allocate_pages() use completely
> separate parts of the memory.
>
> When reaching ExitBootServices() the memory used by the EFI binary
> (relocated in efi_load_pe()) and for the configuration table with the
> device-tree have been allocated by efi_allocate_pages(). These addresses
> cannot neither allocated by malloc() nor freed with free().

OK...

To my reading, efi_load_pe() pulls the image apart into memory
allocated with efi_allocate_pages(). So that is separate memory? In
any case, it would end up in a 'struct bootflow'.

IMO the EFI memory-allocation functions should be called only when
booting, not before. I do worry about leakage...

[..]

Regards,
Simon
diff mbox series

Patch

diff --git a/boot/bootm.c b/boot/bootm.c
index cb61485c226c..d9c3fa8dad99 100644
--- a/boot/bootm.c
+++ b/boot/bootm.c
@@ -502,7 +502,7 @@  ulong bootm_disable_interrupts(void)
 	 * updated every 1 ms within the HCCA structure in SDRAM! For more
 	 * details see the OpenHCI specification.
 	 */
-	usb_stop();
+	usb_pause();
 #endif
 	return iflag;
 }
diff --git a/common/usb.c b/common/usb.c
index 836506dcd9e9..a486d1c87d4d 100644
--- a/common/usb.c
+++ b/common/usb.c
@@ -144,6 +144,11 @@  int usb_stop(void)
 	return 0;
 }
 
+int usb_pause(void)
+{
+	return usb_stop();
+}
+
 /******************************************************************************
  * Detect if a USB device has been plugged or unplugged.
  */
diff --git a/drivers/usb/host/usb-uclass.c b/drivers/usb/host/usb-uclass.c
index a1cd0ad2d669..c26c65d7986b 100644
--- a/drivers/usb/host/usb-uclass.c
+++ b/drivers/usb/host/usb-uclass.c
@@ -173,7 +173,7 @@  int usb_get_max_xfer_size(struct usb_device *udev, size_t *size)
 	return ops->get_max_xfer_size(bus, size);
 }
 
-int usb_stop(void)
+static int usb_finish(bool unbind_all)
 {
 	struct udevice *bus;
 	struct udevice *rh;
@@ -195,7 +195,7 @@  int usb_stop(void)
 
 		/* Locate root hub device */
 		device_find_first_child(bus, &rh);
-		if (rh) {
+		if (rh && unbind_all) {
 			/*
 			 * All USB devices are children of root hub.
 			 * Unbinding root hub will unbind all of its children.
@@ -222,6 +222,16 @@  int usb_stop(void)
 	return err;
 }
 
+int usb_stop(void)
+{
+	return usb_finish(true);
+}
+
+int usb_pause(void)
+{
+	return usb_finish(false);
+}
+
 static void usb_scan_bus(struct udevice *bus, bool recurse)
 {
 	struct usb_bus_priv *priv;
diff --git a/include/usb.h b/include/usb.h
index 09e3f0cb309c..b964e2e1f6b2 100644
--- a/include/usb.h
+++ b/include/usb.h
@@ -265,7 +265,26 @@  int usb_kbd_deregister(int force);
  */
 int usb_init(void);
 
-int usb_stop(void); /* stop the USB Controller */
+/**
+ * usb_stop() - stop the USB Controller and unbind all USB controllers/devices
+ *
+ * This unbinds all devices on the USB buses.
+ *
+ * Return: 0 if OK, -ve on error
+ */
+int usb_stop(void);
+
+/**
+ * usb_pause() - stop the USB Controller DMA, etc.
+ *
+ * Note that this does not unbind bus devices, so using usb_init() after this
+ * call is not permitted. This function is only useful just before booting, to
+ * ensure that devices are dormant.
+ *
+ * Return: 0 if OK, -ve on error
+ */
+int usb_pause(void);
+
 int usb_detect_change(void); /* detect if a USB device has been (un)plugged */