mbox series

[v3,00/12] fw_devlink improvements

Message ID 20230207014207.1678715-1-saravanak@google.com
Headers show
Series fw_devlink improvements | expand

Message

Saravana Kannan Feb. 7, 2023, 1:41 a.m. UTC
Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
Jean-Philippe,

Can I get your Tested-by's for this v3 series please?

Vladimir,

Ccing you because DSA's and fw_devlink have known/existing problems
(still in my TODOs to fix). But I want to make sure this series doesn't
cause additional problems for DSA.

All,

This patch series improves fw_devlink in the following ways:

1. It no longer cares about a fwnode having a "compatible" property. It
   figures this out more dynamically. The only expectation is that
   fwnodes that are converted to devices actually get probed by a driver
   for the dependencies to be enforced correctly.

2. Finer grained dependency tracking. fw_devlink will now create device
   links from the consumer to the actual resource's device (if it has one,
   Eg: gpio_device) instead of the parent supplier device. This improves
   things like async suspend/resume ordering, potentially remove the need
   for frameworks to create device links, more parallelized async probing,
   and better sync_state() tracking.

3. Handle hardware/software quirks where a child firmware node gets
   populated as a device before its parent firmware node AND actually
   supplies a non-optional resource to the parent firmware node's
   device.

4. Way more robust at cycle handling (see patch for the insane cases).

5. Stops depending on OF_POPULATED to figure out some corner cases.

6. Simplifies the work that needs to be done by the firmware specific
   code.

The v3 series has gone through my usual testing on my end and looks good
to me.

Thanks,
Saravana

[1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
[2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/

v1 -> v2:
- Fixed Patch 1 to handle a corner case discussed in [2].
- New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
- New patch 11 to add fw_devlink support for SCMI devices.

v2 -> v3:
- Addressed most of Andy's comments in v2
- Added Colin and Sudeep's Tested-by for the series (except the imx and
  renesas patches)
- Added Sudeep's Acked-by for the scmi patch.
- Added Geert's Reviewed-by for the renesas patch.
- Fixed gpiolib crash reported by Naresh.
- Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
- New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
- Deleted some stale function doc in Patch 8

Cc: Abel Vesa <abel.vesa@linaro.org>
Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: John Stultz <jstultz@google.com>
Cc: Doug Anderson <dianders@chromium.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Maxim Kiselev <bigunclemax@gmail.com>
Cc: Maxim Kochetkov <fido_max@inbox.ru>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Luca Weiss <luca.weiss@fairphone.com>
Cc: Colin Foster <colin.foster@in-advantage.com>
Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
Cc: Jean-Philippe Brucker <jpb@kernel.org>
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>

Saravana Kannan (12):
  driver core: fw_devlink: Don't purge child fwnode's consumer links
  driver core: fw_devlink: Improve check for fwnode with no
    device/driver
  soc: renesas: Move away from using OF_POPULATED for fw_devlink
  gpiolib: Clear the gpio_device's fwnode initialized flag before adding
  driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
  driver core: fw_devlink: Allow marking a fwnode link as being part of
    a cycle
  driver core: fw_devlink: Consolidate device link flag computation
  driver core: fw_devlink: Make cycle detection more robust
  of: property: Simplify of_link_to_phandle()
  irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
  firmware: arm_scmi: Set fwnode for the scmi_device
  mtd: mtdpart: Don't create platform device that'll never probe

 drivers/base/core.c             | 449 +++++++++++++++++++++-----------
 drivers/firmware/arm_scmi/bus.c |   3 +-
 drivers/gpio/gpiolib.c          |   7 +
 drivers/irqchip/irq-imx-gpcv2.c |   1 +
 drivers/mtd/mtdpart.c           |  10 +
 drivers/of/property.c           |  84 +-----
 drivers/soc/imx/gpcv2.c         |   2 +-
 drivers/soc/renesas/rcar-sysc.c |   2 +-
 include/linux/device.h          |   1 +
 include/linux/fwnode.h          |  12 +-
 10 files changed, 344 insertions(+), 227 deletions(-)

Comments

Luca Weiss Feb. 7, 2023, 9:23 a.m. UTC | #1
On Tue Feb 7, 2023 at 2:41 AM CET, Saravana Kannan wrote:
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
>
> Can I get your Tested-by's for this v3 series please?

Hi Saravana,

Seems to be alright on the same platform where it broke previously.

Tested-by: Luca Weiss <luca.weiss@fairphone.com> # qcom/sm7225-fairphone-fp4

Regards
Luca

>
> Vladimir,
>
> Ccing you because DSA's and fw_devlink have known/existing problems
> (still in my TODOs to fix). But I want to make sure this series doesn't
> cause additional problems for DSA.
>
> All,
>
> This patch series improves fw_devlink in the following ways:
>
> 1. It no longer cares about a fwnode having a "compatible" property. It
>    figures this out more dynamically. The only expectation is that
>    fwnodes that are converted to devices actually get probed by a driver
>    for the dependencies to be enforced correctly.
>
> 2. Finer grained dependency tracking. fw_devlink will now create device
>    links from the consumer to the actual resource's device (if it has one,
>    Eg: gpio_device) instead of the parent supplier device. This improves
>    things like async suspend/resume ordering, potentially remove the need
>    for frameworks to create device links, more parallelized async probing,
>    and better sync_state() tracking.
>
> 3. Handle hardware/software quirks where a child firmware node gets
>    populated as a device before its parent firmware node AND actually
>    supplies a non-optional resource to the parent firmware node's
>    device.
>
> 4. Way more robust at cycle handling (see patch for the insane cases).
>
> 5. Stops depending on OF_POPULATED to figure out some corner cases.
>
> 6. Simplifies the work that needs to be done by the firmware specific
>    code.
>
> The v3 series has gone through my usual testing on my end and looks good
> to me.
>
> Thanks,
> Saravana
>
> [1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
> [2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
>
> v1 -> v2:
> - Fixed Patch 1 to handle a corner case discussed in [2].
> - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> - New patch 11 to add fw_devlink support for SCMI devices.
>
> v2 -> v3:
> - Addressed most of Andy's comments in v2
> - Added Colin and Sudeep's Tested-by for the series (except the imx and
>   renesas patches)
> - Added Sudeep's Acked-by for the scmi patch.
> - Added Geert's Reviewed-by for the renesas patch.
> - Fixed gpiolib crash reported by Naresh.
> - Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
> - New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
> - Deleted some stale function doc in Patch 8
>
> Cc: Abel Vesa <abel.vesa@linaro.org>
> Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
> Cc: Tony Lindgren <tony@atomide.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: John Stultz <jstultz@google.com>
> Cc: Doug Anderson <dianders@chromium.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Cc: Maxim Kiselev <bigunclemax@gmail.com>
> Cc: Maxim Kochetkov <fido_max@inbox.ru>
> Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> Cc: Luca Weiss <luca.weiss@fairphone.com>
> Cc: Colin Foster <colin.foster@in-advantage.com>
> Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
> Cc: Jean-Philippe Brucker <jpb@kernel.org>
> Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> Saravana Kannan (12):
>   driver core: fw_devlink: Don't purge child fwnode's consumer links
>   driver core: fw_devlink: Improve check for fwnode with no
>     device/driver
>   soc: renesas: Move away from using OF_POPULATED for fw_devlink
>   gpiolib: Clear the gpio_device's fwnode initialized flag before adding
>   driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
>   driver core: fw_devlink: Allow marking a fwnode link as being part of
>     a cycle
>   driver core: fw_devlink: Consolidate device link flag computation
>   driver core: fw_devlink: Make cycle detection more robust
>   of: property: Simplify of_link_to_phandle()
>   irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
>   firmware: arm_scmi: Set fwnode for the scmi_device
>   mtd: mtdpart: Don't create platform device that'll never probe
>
>  drivers/base/core.c             | 449 +++++++++++++++++++++-----------
>  drivers/firmware/arm_scmi/bus.c |   3 +-
>  drivers/gpio/gpiolib.c          |   7 +
>  drivers/irqchip/irq-imx-gpcv2.c |   1 +
>  drivers/mtd/mtdpart.c           |  10 +
>  drivers/of/property.c           |  84 +-----
>  drivers/soc/imx/gpcv2.c         |   2 +-
>  drivers/soc/renesas/rcar-sysc.c |   2 +-
>  include/linux/device.h          |   1 +
>  include/linux/fwnode.h          |  12 +-
>  10 files changed, 344 insertions(+), 227 deletions(-)
>
> -- 
> 2.39.1.519.gcb327c4b5f-goog
Douglas Anderson Feb. 7, 2023, 3:27 p.m. UTC | #2
Hi,

On Mon, Feb 6, 2023 at 5:42 PM Saravana Kannan <saravanak@google.com> wrote:
>
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
>
> Can I get your Tested-by's for this v3 series please?
>
> Vladimir,
>
> Ccing you because DSA's and fw_devlink have known/existing problems
> (still in my TODOs to fix). But I want to make sure this series doesn't
> cause additional problems for DSA.
>
> All,
>
> This patch series improves fw_devlink in the following ways:
>
> 1. It no longer cares about a fwnode having a "compatible" property. It
>    figures this out more dynamically. The only expectation is that
>    fwnodes that are converted to devices actually get probed by a driver
>    for the dependencies to be enforced correctly.
>
> 2. Finer grained dependency tracking. fw_devlink will now create device
>    links from the consumer to the actual resource's device (if it has one,
>    Eg: gpio_device) instead of the parent supplier device. This improves
>    things like async suspend/resume ordering, potentially remove the need
>    for frameworks to create device links, more parallelized async probing,
>    and better sync_state() tracking.
>
> 3. Handle hardware/software quirks where a child firmware node gets
>    populated as a device before its parent firmware node AND actually
>    supplies a non-optional resource to the parent firmware node's
>    device.
>
> 4. Way more robust at cycle handling (see patch for the insane cases).
>
> 5. Stops depending on OF_POPULATED to figure out some corner cases.
>
> 6. Simplifies the work that needs to be done by the firmware specific
>    code.
>
> The v3 series has gone through my usual testing on my end and looks good
> to me.
>
> Thanks,
> Saravana
>
> [1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
> [2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
>
> v1 -> v2:
> - Fixed Patch 1 to handle a corner case discussed in [2].
> - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> - New patch 11 to add fw_devlink support for SCMI devices.
>
> v2 -> v3:
> - Addressed most of Andy's comments in v2
> - Added Colin and Sudeep's Tested-by for the series (except the imx and
>   renesas patches)
> - Added Sudeep's Acked-by for the scmi patch.
> - Added Geert's Reviewed-by for the renesas patch.
> - Fixed gpiolib crash reported by Naresh.
> - Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
> - New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
> - Deleted some stale function doc in Patch 8
>
> Cc: Abel Vesa <abel.vesa@linaro.org>
> Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
> Cc: Tony Lindgren <tony@atomide.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: John Stultz <jstultz@google.com>
> Cc: Doug Anderson <dianders@chromium.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Cc: Maxim Kiselev <bigunclemax@gmail.com>
> Cc: Maxim Kochetkov <fido_max@inbox.ru>
> Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> Cc: Luca Weiss <luca.weiss@fairphone.com>
> Cc: Colin Foster <colin.foster@in-advantage.com>
> Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
> Cc: Jean-Philippe Brucker <jpb@kernel.org>
> Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> Saravana Kannan (12):
>   driver core: fw_devlink: Don't purge child fwnode's consumer links
>   driver core: fw_devlink: Improve check for fwnode with no
>     device/driver
>   soc: renesas: Move away from using OF_POPULATED for fw_devlink
>   gpiolib: Clear the gpio_device's fwnode initialized flag before adding
>   driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
>   driver core: fw_devlink: Allow marking a fwnode link as being part of
>     a cycle
>   driver core: fw_devlink: Consolidate device link flag computation
>   driver core: fw_devlink: Make cycle detection more robust
>   of: property: Simplify of_link_to_phandle()
>   irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
>   firmware: arm_scmi: Set fwnode for the scmi_device
>   mtd: mtdpart: Don't create platform device that'll never probe

I tested the whole series together on several devices. I tried to test
on a wide variety since previous versions had broken due to all the
dependency cycles in the display and some of these devices used
different components in their display pipeline. I didn't do massive
testing but did confirm that basic devices came up, including display.

Devices tested with your v3 applied atop v6.2-rc7-11-g05ecb680708a:

* sc7180-trogdor-lazor (with ps8640 bridge), which had failed to bring
up the display on v2.

* sc7180-trogdor-lazor (with sn65dsi86 bridge)

* sc7180-trogdor-pazquel (with sn65dsi86 bridge)

* sc7180-trogdor-homestar (with ps8640 bridge)

* sc7180-trogdor-wormdingler

* sc7280-herobrine-villager

Tested-by: Douglas Anderson <dianders@chromium.org>
Saravana Kannan Feb. 7, 2023, 6:15 p.m. UTC | #3
On Tue, Feb 7, 2023 at 7:28 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Mon, Feb 6, 2023 at 5:42 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> > Jean-Philippe,
> >
> > Can I get your Tested-by's for this v3 series please?
> >
> > Vladimir,
> >
> > Ccing you because DSA's and fw_devlink have known/existing problems
> > (still in my TODOs to fix). But I want to make sure this series doesn't
> > cause additional problems for DSA.
> >
> > All,
> >
> > This patch series improves fw_devlink in the following ways:
> >
> > 1. It no longer cares about a fwnode having a "compatible" property. It
> >    figures this out more dynamically. The only expectation is that
> >    fwnodes that are converted to devices actually get probed by a driver
> >    for the dependencies to be enforced correctly.
> >
> > 2. Finer grained dependency tracking. fw_devlink will now create device
> >    links from the consumer to the actual resource's device (if it has one,
> >    Eg: gpio_device) instead of the parent supplier device. This improves
> >    things like async suspend/resume ordering, potentially remove the need
> >    for frameworks to create device links, more parallelized async probing,
> >    and better sync_state() tracking.
> >
> > 3. Handle hardware/software quirks where a child firmware node gets
> >    populated as a device before its parent firmware node AND actually
> >    supplies a non-optional resource to the parent firmware node's
> >    device.
> >
> > 4. Way more robust at cycle handling (see patch for the insane cases).
> >
> > 5. Stops depending on OF_POPULATED to figure out some corner cases.
> >
> > 6. Simplifies the work that needs to be done by the firmware specific
> >    code.
> >
> > The v3 series has gone through my usual testing on my end and looks good
> > to me.
> >
> > Thanks,
> > Saravana
> >
> > [1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
> > [2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
> >
> > v1 -> v2:
> > - Fixed Patch 1 to handle a corner case discussed in [2].
> > - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> > - New patch 11 to add fw_devlink support for SCMI devices.
> >
> > v2 -> v3:
> > - Addressed most of Andy's comments in v2
> > - Added Colin and Sudeep's Tested-by for the series (except the imx and
> >   renesas patches)
> > - Added Sudeep's Acked-by for the scmi patch.
> > - Added Geert's Reviewed-by for the renesas patch.
> > - Fixed gpiolib crash reported by Naresh.
> > - Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
> > - New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
> > - Deleted some stale function doc in Patch 8
> >
> > Cc: Abel Vesa <abel.vesa@linaro.org>
> > Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
> > Cc: Tony Lindgren <tony@atomide.com>
> > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > Cc: John Stultz <jstultz@google.com>
> > Cc: Doug Anderson <dianders@chromium.org>
> > Cc: Guenter Roeck <linux@roeck-us.net>
> > Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > Cc: Maxim Kiselev <bigunclemax@gmail.com>
> > Cc: Maxim Kochetkov <fido_max@inbox.ru>
> > Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> > Cc: Luca Weiss <luca.weiss@fairphone.com>
> > Cc: Colin Foster <colin.foster@in-advantage.com>
> > Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
> > Cc: Jean-Philippe Brucker <jpb@kernel.org>
> > Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
> >
> > Saravana Kannan (12):
> >   driver core: fw_devlink: Don't purge child fwnode's consumer links
> >   driver core: fw_devlink: Improve check for fwnode with no
> >     device/driver
> >   soc: renesas: Move away from using OF_POPULATED for fw_devlink
> >   gpiolib: Clear the gpio_device's fwnode initialized flag before adding
> >   driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
> >   driver core: fw_devlink: Allow marking a fwnode link as being part of
> >     a cycle
> >   driver core: fw_devlink: Consolidate device link flag computation
> >   driver core: fw_devlink: Make cycle detection more robust
> >   of: property: Simplify of_link_to_phandle()
> >   irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
> >   firmware: arm_scmi: Set fwnode for the scmi_device
> >   mtd: mtdpart: Don't create platform device that'll never probe
>
> I tested the whole series together on several devices. I tried to test
> on a wide variety since previous versions had broken due to all the
> dependency cycles in the display and some of these devices used
> different components in their display pipeline. I didn't do massive
> testing but did confirm that basic devices came up, including display.
>
> Devices tested with your v3 applied atop v6.2-rc7-11-g05ecb680708a:
>
> * sc7180-trogdor-lazor (with ps8640 bridge), which had failed to bring
> up the display on v2.
>
> * sc7180-trogdor-lazor (with sn65dsi86 bridge)
>
> * sc7180-trogdor-pazquel (with sn65dsi86 bridge)
>
> * sc7180-trogdor-homestar (with ps8640 bridge)
>
> * sc7180-trogdor-wormdingler
>
> * sc7280-herobrine-villager
>
> Tested-by: Douglas Anderson <dianders@chromium.org>

Thanks a lot for all this testing and helping me debug the ps8640 issue Doug!

-Saravana
Geert Uytterhoeven Feb. 7, 2023, 9:35 p.m. UTC | #4
Hi Saravana,

On Tue, Feb 7, 2023 at 2:42 AM Saravana Kannan <saravanak@google.com> wrote:
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
>
> Can I get your Tested-by's for this v3 series please?

I have tested this on a variety of Renesas arm32/arm64 platforms,
and on several RISC-V platforms.
Apart from the regression related to dynamic overlays caused by
"[PATCH v3 09/12] of: property: Simplify of_link_to_phandle()"
(which you may decide to ignore for now ;-)
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Saravana Kannan Feb. 7, 2023, 11:12 p.m. UTC | #5
On Tue, Feb 7, 2023 at 1:36 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Saravana,
>
> On Tue, Feb 7, 2023 at 2:42 AM Saravana Kannan <saravanak@google.com> wrote:
> > Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> > Jean-Philippe,
> >
> > Can I get your Tested-by's for this v3 series please?
>
> I have tested this on a variety of Renesas arm32/arm64 platforms,
> and on several RISC-V platforms.
> Apart from the regression related to dynamic overlays caused by
> "[PATCH v3 09/12] of: property: Simplify of_link_to_phandle()"
> (which you may decide to ignore for now ;-)
> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>

Thanks a lot for the extensive testing Geert!

I'll take a look at that issue with the out of tree code separately.

-Saravana


-Saravana
Vladimir Oltean Feb. 10, 2023, 10:13 a.m. UTC | #6
Hi Saravana,

On Mon, Feb 06, 2023 at 05:41:52PM -0800, Saravana Kannan wrote:
> Vladimir,
> 
> Ccing you because DSA's and fw_devlink have known/existing problems
> (still in my TODOs to fix). But I want to make sure this series doesn't
> cause additional problems for DSA.
> 
> All,
> 
> This patch series improves fw_devlink in the following ways:
> 
> 1. It no longer cares about a fwnode having a "compatible" property. It
>    figures this out more dynamically. The only expectation is that
>    fwnodes that are converted to devices actually get probed by a driver
>    for the dependencies to be enforced correctly.
> 
> 2. Finer grained dependency tracking. fw_devlink will now create device
>    links from the consumer to the actual resource's device (if it has one,
>    Eg: gpio_device) instead of the parent supplier device. This improves
>    things like async suspend/resume ordering, potentially remove the need
>    for frameworks to create device links, more parallelized async probing,
>    and better sync_state() tracking.
> 
> 3. Handle hardware/software quirks where a child firmware node gets
>    populated as a device before its parent firmware node AND actually
>    supplies a non-optional resource to the parent firmware node's
>    device.
> 
> 4. Way more robust at cycle handling (see patch for the insane cases).
> 
> 5. Stops depending on OF_POPULATED to figure out some corner cases.
> 
> 6. Simplifies the work that needs to be done by the firmware specific
>    code.
> 
> The v3 series has gone through my usual testing on my end and looks good
> to me.

Booted on an NXP LS1028A (arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts)
and a Turris MOX (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts)
with no observed regressions. Is there something specific you would like
me to test?
Saravana Kannan Feb. 10, 2023, 7:27 p.m. UTC | #7
On Fri, Feb 10, 2023 at 2:13 AM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
>
> Hi Saravana,
>
> On Mon, Feb 06, 2023 at 05:41:52PM -0800, Saravana Kannan wrote:
> > Vladimir,
> >
> > Ccing you because DSA's and fw_devlink have known/existing problems
> > (still in my TODOs to fix). But I want to make sure this series doesn't
> > cause additional problems for DSA.
> >
> > All,
> >
> > This patch series improves fw_devlink in the following ways:
> >
> > 1. It no longer cares about a fwnode having a "compatible" property. It
> >    figures this out more dynamically. The only expectation is that
> >    fwnodes that are converted to devices actually get probed by a driver
> >    for the dependencies to be enforced correctly.
> >
> > 2. Finer grained dependency tracking. fw_devlink will now create device
> >    links from the consumer to the actual resource's device (if it has one,
> >    Eg: gpio_device) instead of the parent supplier device. This improves
> >    things like async suspend/resume ordering, potentially remove the need
> >    for frameworks to create device links, more parallelized async probing,
> >    and better sync_state() tracking.
> >
> > 3. Handle hardware/software quirks where a child firmware node gets
> >    populated as a device before its parent firmware node AND actually
> >    supplies a non-optional resource to the parent firmware node's
> >    device.
> >
> > 4. Way more robust at cycle handling (see patch for the insane cases).
> >
> > 5. Stops depending on OF_POPULATED to figure out some corner cases.
> >
> > 6. Simplifies the work that needs to be done by the firmware specific
> >    code.
> >
> > The v3 series has gone through my usual testing on my end and looks good
> > to me.
>
> Booted on an NXP LS1028A (arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts)
> and a Turris MOX (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts)
> with no observed regressions.

Thanks for testing Vladimir!

> Is there something specific you would like
> me to test?


Not really, I just want to make sure the common DSA architectures
don't hit any regression. In the hardware you tested, are there cases
of PHYs where the supplier is the parent MDIO? I remember that being
the only case where I needed special casing
(FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD) in fw_devlink -- so it'll be
good to make sure I didn't accidentally break anything there.


-Saravana
Vladimir Oltean Feb. 10, 2023, 9:08 p.m. UTC | #8
On Fri, Feb 10, 2023 at 11:27:11AM -0800, Saravana Kannan wrote:
> On Fri, Feb 10, 2023 at 2:13 AM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> >
> > Hi Saravana,
> >
> > On Mon, Feb 06, 2023 at 05:41:52PM -0800, Saravana Kannan wrote:
> > > Vladimir,
> > >
> > > Ccing you because DSA's and fw_devlink have known/existing problems
> > > (still in my TODOs to fix). But I want to make sure this series doesn't
> > > cause additional problems for DSA.
> > >
> > > All,
> > >
> > > This patch series improves fw_devlink in the following ways:
> > >
> > > 1. It no longer cares about a fwnode having a "compatible" property. It
> > >    figures this out more dynamically. The only expectation is that
> > >    fwnodes that are converted to devices actually get probed by a driver
> > >    for the dependencies to be enforced correctly.
> > >
> > > 2. Finer grained dependency tracking. fw_devlink will now create device
> > >    links from the consumer to the actual resource's device (if it has one,
> > >    Eg: gpio_device) instead of the parent supplier device. This improves
> > >    things like async suspend/resume ordering, potentially remove the need
> > >    for frameworks to create device links, more parallelized async probing,
> > >    and better sync_state() tracking.
> > >
> > > 3. Handle hardware/software quirks where a child firmware node gets
> > >    populated as a device before its parent firmware node AND actually
> > >    supplies a non-optional resource to the parent firmware node's
> > >    device.
> > >
> > > 4. Way more robust at cycle handling (see patch for the insane cases).
> > >
> > > 5. Stops depending on OF_POPULATED to figure out some corner cases.
> > >
> > > 6. Simplifies the work that needs to be done by the firmware specific
> > >    code.
> > >
> > > The v3 series has gone through my usual testing on my end and looks good
> > > to me.
> >
> > Booted on an NXP LS1028A (arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts)
> > and a Turris MOX (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts)
> > with no observed regressions.
> 
> Thanks for testing Vladimir!
> 
> > Is there something specific you would like
> > me to test?
> 
> Not really, I just want to make sure the common DSA architectures
> don't hit any regression. In the hardware you tested, are there cases
> of PHYs where the supplier is the parent MDIO? I remember that being
> the only case where I needed special casing
> (FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD) in fw_devlink -- so it'll be
> good to make sure I didn't accidentally break anything there.
> 
> -Saravana

Yes and no (I never had a system which depended on FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD).

Yes, because well, yes, in arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts,
the PHYs will depend on interrupts provided by their (parent) switch. However this
is not explicit in the device tree. To make it explicit, one would need to add:

diff --git a/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts b/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts
index cd0988317623..d789cda49e35 100644
--- a/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts
+++ b/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts
@@ -305,12 +305,14 @@ phy1: ethernet-phy@1 {
 	};
 
 	/* switch nodes are enabled by U-Boot if modules are present */
-	switch0@10 {
+	switch0_peridot: switch0@10 {
 		compatible = "marvell,mv88e6190";
 		reg = <0x10>;
 		dsa,member = <0 0>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_PERIDOT(0)>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -319,34 +321,42 @@ mdio {
 
 			switch0phy1: switch0phy1@1 {
 				reg = <0x1>;
+				interrupts-extended = <&switch0_peridot 1 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy2: switch0phy2@2 {
 				reg = <0x2>;
+				interrupts-extended = <&switch0_peridot 2 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy3: switch0phy3@3 {
 				reg = <0x3>;
+				interrupts-extended = <&switch0_peridot 3 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy4: switch0phy4@4 {
 				reg = <0x4>;
+				interrupts-extended = <&switch0_peridot 4 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy5: switch0phy5@5 {
 				reg = <0x5>;
+				interrupts-extended = <&switch0_peridot 5 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy6: switch0phy6@6 {
 				reg = <0x6>;
+				interrupts-extended = <&switch0_peridot 6 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy7: switch0phy7@7 {
 				reg = <0x7>;
+				interrupts-extended = <&switch0_peridot 7 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy8: switch0phy8@8 {
 				reg = <0x8>;
+				interrupts-extended = <&switch0_peridot 8 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 
@@ -430,12 +440,14 @@ port-sfp@a {
 		};
 	};
 
-	switch0@2 {
+	switch0_topaz: switch0@2 {
 		compatible = "marvell,mv88e6085";
 		reg = <0x2>;
 		dsa,member = <0 0>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_TOPAZ>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -444,18 +456,22 @@ mdio {
 
 			switch0phy1_topaz: switch0phy1@11 {
 				reg = <0x11>;
+				interrupts-extended = <&switch0_topaz 0x11 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy2_topaz: switch0phy2@12 {
 				reg = <0x12>;
+				interrupts-extended = <&switch0_topaz 0x12 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy3_topaz: switch0phy3@13 {
 				reg = <0x13>;
+				interrupts-extended = <&switch0_topaz 0x13 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch0phy4_topaz: switch0phy4@14 {
 				reg = <0x14>;
+				interrupts-extended = <&switch0_topaz 0x14 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 
@@ -497,12 +513,14 @@ port@5 {
 		};
 	};
 
-	switch1@11 {
+	switch1_peridot: switch1@11 {
 		compatible = "marvell,mv88e6190";
 		reg = <0x11>;
 		dsa,member = <0 1>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_PERIDOT(1)>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -511,34 +529,42 @@ mdio {
 
 			switch1phy1: switch1phy1@1 {
 				reg = <0x1>;
+				interrupts-extended = <&switch1_peridot 1 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy2: switch1phy2@2 {
 				reg = <0x2>;
+				interrupts-extended = <&switch1_peridot 2 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy3: switch1phy3@3 {
 				reg = <0x3>;
+				interrupts-extended = <&switch1_peridot 3 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy4: switch1phy4@4 {
 				reg = <0x4>;
+				interrupts-extended = <&switch1_peridot 4 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy5: switch1phy5@5 {
 				reg = <0x5>;
+				interrupts-extended = <&switch1_peridot 5 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy6: switch1phy6@6 {
 				reg = <0x6>;
+				interrupts-extended = <&switch1_peridot 6 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy7: switch1phy7@7 {
 				reg = <0x7>;
+				interrupts-extended = <&switch1_peridot 7 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy8: switch1phy8@8 {
 				reg = <0x8>;
+				interrupts-extended = <&switch1_peridot 8 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 
@@ -622,12 +648,14 @@ port-sfp@a {
 		};
 	};
 
-	switch1@2 {
+	switch1_topaz: switch1@2 {
 		compatible = "marvell,mv88e6085";
 		reg = <0x2>;
 		dsa,member = <0 1>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_TOPAZ>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -636,18 +664,22 @@ mdio {
 
 			switch1phy1_topaz: switch1phy1@11 {
 				reg = <0x11>;
+				interrupts-extended = <&switch1_topaz 0x11 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy2_topaz: switch1phy2@12 {
 				reg = <0x12>;
+				interrupts-extended = <&switch1_topaz 0x12 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy3_topaz: switch1phy3@13 {
 				reg = <0x13>;
+				interrupts-extended = <&switch1_topaz 0x13 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch1phy4_topaz: switch1phy4@14 {
 				reg = <0x14>;
+				interrupts-extended = <&switch1_topaz 0x14 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 
@@ -689,12 +721,14 @@ port@5 {
 		};
 	};
 
-	switch2@12 {
+	switch2_peridot: switch2@12 {
 		compatible = "marvell,mv88e6190";
 		reg = <0x12>;
 		dsa,member = <0 2>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_PERIDOT(2)>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -703,34 +737,42 @@ mdio {
 
 			switch2phy1: switch2phy1@1 {
 				reg = <0x1>;
+				interrupts-extended = <&switch2_peridot 1 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy2: switch2phy2@2 {
 				reg = <0x2>;
+				interrupts-extended = <&switch2_peridot 2 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy3: switch2phy3@3 {
 				reg = <0x3>;
+				interrupts-extended = <&switch2_peridot 3 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy4: switch2phy4@4 {
 				reg = <0x4>;
+				interrupts-extended = <&switch2_peridot 4 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy5: switch2phy5@5 {
 				reg = <0x5>;
+				interrupts-extended = <&switch2_peridot 5 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy6: switch2phy6@6 {
 				reg = <0x6>;
+				interrupts-extended = <&switch2_peridot 6 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy7: switch2phy7@7 {
 				reg = <0x7>;
+				interrupts-extended = <&switch2_peridot 7 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy8: switch2phy8@8 {
 				reg = <0x8>;
+				interrupts-extended = <&switch2_peridot 8 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 
@@ -805,12 +847,14 @@ port-sfp@a {
 		};
 	};
 
-	switch2@2 {
+	switch2_topaz: switch2@2 {
 		compatible = "marvell,mv88e6085";
 		reg = <0x2>;
 		dsa,member = <0 2>;
 		interrupt-parent = <&moxtet>;
 		interrupts = <MOXTET_IRQ_TOPAZ>;
+		interrupt-controller;
+		#interrupt-cells = <2>;
 		status = "disabled";
 
 		mdio {
@@ -819,18 +863,22 @@ mdio {
 
 			switch2phy1_topaz: switch2phy1@11 {
 				reg = <0x11>;
+				interrupts-extended = <&switch2_topaz 0x11 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy2_topaz: switch2phy2@12 {
 				reg = <0x12>;
+				interrupts-extended = <&switch2_topaz 0x12 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy3_topaz: switch2phy3@13 {
 				reg = <0x13>;
+				interrupts-extended = <&switch2_topaz 0x13 IRQ_TYPE_LEVEL_HIGH>;
 			};
 
 			switch2phy4_topaz: switch2phy4@14 {
 				reg = <0x14>;
+				interrupts-extended = <&switch2_topaz 0x14 IRQ_TYPE_LEVEL_HIGH>;
 			};
 		};
 

However, as I had explained in one of the first discussions here:
https://lore.kernel.org/netdev/20210901012826.iuy2bhvkrgahhrl7@skbuf/

it was always hit-or-miss whether the above device tree had an issue
with fw_devlink or not: it depended on how the driver was written (and
the mv88e6xxx switch driver was tricking the fw_devlink logic from that
time to drop the device links because of an unrelated -EPROBE_DEFER).

What I had done to "untrick" fw_devlink so that I could see the issue
(which was originally reported by Alvin Šipraga) was to modify the
mv88e6xxx driver, and change the placement of mv88e6xxx_mdios_register()
to a point after which we will never hit -EPROBE_DEFER (from driver probe()
to the dsa_switch_ops :: setup() method):

diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 0a5d6c7bb128..48650465660d 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3672,11 +3672,18 @@ static int mv88e6390_setup_errata(struct mv88e6xxx_chip *chip)
 	return mv88e6xxx_software_reset(chip);
 }
 
+static int mv88e6xxx_mdios_register(struct mv88e6xxx_chip *chip,
+				    struct device_node *np);
+static void mv88e6xxx_mdios_unregister(struct mv88e6xxx_chip *chip);
+
 static void mv88e6xxx_teardown(struct dsa_switch *ds)
 {
+	struct mv88e6xxx_chip *chip = ds->priv;
+
 	mv88e6xxx_teardown_devlink_params(ds);
 	dsa_devlink_resources_unregister(ds);
 	mv88e6xxx_teardown_devlink_regions_global(ds);
+	mv88e6xxx_mdios_unregister(chip);
 }
 
 static int mv88e6xxx_setup(struct dsa_switch *ds)
@@ -3686,6 +3693,10 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
 	int err;
 	int i;
 
+	err = mv88e6xxx_mdios_register(chip, chip->dev->of_node);
+	if (err)
+		return err;
+
 	chip->ds = ds;
 	ds->slave_mii_bus = mv88e6xxx_default_mdio_bus(chip);
 
@@ -3811,8 +3822,10 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
 unlock:
 	mv88e6xxx_reg_unlock(chip);
 
-	if (err)
+	if (err) {
+		mv88e6xxx_mdios_unregister(chip);
 		return err;
+	}
 
 	/* Have to be called without holding the register lock, since
 	 * they take the devlink lock, and we later take the locks in
@@ -3837,6 +3850,7 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
 	mv88e6xxx_teardown_devlink_params(ds);
 out_resources:
 	dsa_devlink_resources_unregister(ds);
+	mv88e6xxx_mdios_unregister(chip);
 
 	return err;
 }
@@ -7220,18 +7234,12 @@ static int mv88e6xxx_probe(struct mdio_device *mdiodev)
 	if (err)
 		goto out_g1_atu_prob_irq;
 
-	err = mv88e6xxx_mdios_register(chip, np);
-	if (err)
-		goto out_g1_vtu_prob_irq;
-
 	err = mv88e6xxx_register_switch(chip);
 	if (err)
-		goto out_mdio;
+		goto out_g1_vtu_prob_irq;
 
 	return 0;
 
-out_mdio:
-	mv88e6xxx_mdios_unregister(chip);
 out_g1_vtu_prob_irq:
 	mv88e6xxx_g1_vtu_prob_irq_free(chip);
 out_g1_atu_prob_irq:
@@ -7268,7 +7276,6 @@ static void mv88e6xxx_remove(struct mdio_device *mdiodev)
 
 	mv88e6xxx_phy_destroy(chip);
 	mv88e6xxx_unregister_switch(chip);
-	mv88e6xxx_mdios_unregister(chip);
 
 	mv88e6xxx_g1_vtu_prob_irq_free(chip);
 	mv88e6xxx_g1_atu_prob_irq_free(chip);

After applying both of the above changes on top of yours, I confirm that
the PHYs on the mv88e6xxx on Turris MOX still probe with their specific
PHY driver rather than the generic one, and with interrupts (not poll mode):

[    6.125894] mv88e6085 d0032004.mdio-mii:11 lan9 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:01] driver [Marvell 88E6390 Family] (irq=49)
[    6.222024] mv88e6085 d0032004.mdio-mii:11 lan10 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:02] driver [Marvell 88E6390 Family] (irq=50)
[    6.277554] mv88e6085 d0032004.mdio-mii:11 lan11 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:03] driver [Marvell 88E6390 Family] (irq=51)
[    6.361556] mv88e6085 d0032004.mdio-mii:11 lan12 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:04] driver [Marvell 88E6390 Family] (irq=52)
[    6.445574] mv88e6085 d0032004.mdio-mii:11 lan13 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:05] driver [Marvell 88E6390 Family] (irq=53)
[    6.529560] mv88e6085 d0032004.mdio-mii:11 lan14 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:06] driver [Marvell 88E6390 Family] (irq=54)
[    6.589555] mv88e6085 d0032004.mdio-mii:11 lan15 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:07] driver [Marvell 88E6390 Family] (irq=55)
[    6.673559] mv88e6085 d0032004.mdio-mii:11 lan16 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:08] driver [Marvell 88E6390 Family] (irq=56)
[    6.733558] mv88e6085 d0032004.mdio-mii:12 lan17 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:01] driver [Marvell 88E6390 Family] (irq=74)
[    6.817557] mv88e6085 d0032004.mdio-mii:12 lan18 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:02] driver [Marvell 88E6390 Family] (irq=75)
[    6.889628] mv88e6085 d0032004.mdio-mii:12 lan19 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:03] driver [Marvell 88E6390 Family] (irq=76)
[    6.973593] mv88e6085 d0032004.mdio-mii:12 lan20 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:04] driver [Marvell 88E6390 Family] (irq=77)
[    7.057572] mv88e6085 d0032004.mdio-mii:12 lan21 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:05] driver [Marvell 88E6390 Family] (irq=78)
[    7.109547] mv88e6085 d0032004.mdio-mii:12 lan22 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:06] driver [Marvell 88E6390 Family] (irq=79)
[    7.193553] mv88e6085 d0032004.mdio-mii:12 lan23 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:07] driver [Marvell 88E6390 Family] (irq=80)
[    7.277550] mv88e6085 d0032004.mdio-mii:12 lan24 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:08] driver [Marvell 88E6390 Family] (irq=81)
[    7.365556] mv88e6085 d0032004.mdio-mii:10 lan1 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:01] driver [Marvell 88E6390 Family] (irq=109)
[    7.421549] mv88e6085 d0032004.mdio-mii:10 lan2 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:02] driver [Marvell 88E6390 Family] (irq=110)
[    7.505554] mv88e6085 d0032004.mdio-mii:10 lan3 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:03] driver [Marvell 88E6390 Family] (irq=111)
[    7.589571] mv88e6085 d0032004.mdio-mii:10 lan4 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:04] driver [Marvell 88E6390 Family] (irq=112)
[    7.673560] mv88e6085 d0032004.mdio-mii:10 lan5 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:05] driver [Marvell 88E6390 Family] (irq=113)
[    7.733547] mv88e6085 d0032004.mdio-mii:10 lan6 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:06] driver [Marvell 88E6390 Family] (irq=114)
[    7.817555] mv88e6085 d0032004.mdio-mii:10 lan7 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:07] driver [Marvell 88E6390 Family] (irq=115)
[    7.901572] mv88e6085 d0032004.mdio-mii:10 lan8 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:08] driver [Marvell 88E6390 Family] (irq=116)

even though I am seeing these error messages earlier in the boot process (maybe this is something to look into):

[    0.910219] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910228] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910237] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910244] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910251] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910259] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910266] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910273] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10
[    0.910936] mv88e6085 d0032004.mdio-mii:10: switch 0x3900 detected: Marvell 88E6390, revision 1
[    0.928360] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928380] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928388] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928396] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928403] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928410] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928418] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928425] mdio_bus d0032004.mdio-mii:11: Failed to create device link with d0032004.mdio-mii:11
[    0.928993] mv88e6085 d0032004.mdio-mii:11: switch 0x1900 detected: Marvell 88E6190, revision 1
[    0.943306] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943327] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943334] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943342] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943349] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943357] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943364] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943371] mdio_bus d0032004.mdio-mii:12: Failed to create device link with d0032004.mdio-mii:12
[    0.943879] mv88e6085 d0032004.mdio-mii:12: switch 0x1900 detected: Marvell 88E6190, revision 1


If _on top_ of all the above, I also remove the logic that sets FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD:

 drivers/net/phy/mdio_bus.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 00d5bcdf0e6f..4d4c42771d37 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -665,9 +665,9 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner)
 	if (!bus->read && !bus->read_c45)
 		return -EINVAL;
 
-	if (bus->parent && bus->parent->of_node)
-		bus->parent->of_node->fwnode.flags |=
-					FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD;
+//	if (bus->parent && bus->parent->of_node)
+//		bus->parent->of_node->fwnode.flags |=
+//					FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD;
 
 	WARN(bus->state != MDIOBUS_ALLOCATED &&
 	     bus->state != MDIOBUS_UNREGISTERED,

then *finally* I get something approximating Alvin's reported issue.
In my case, one switch out of 3 gets its PHYs bound to the Generic PHY
driver (why not all is a story for another time):

[    6.106204] mv88e6085 d0032004.mdio-mii:11 lan9 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:01] driver [Marvell 88E6390 Family] (irq=49)
[    6.193561] mv88e6085 d0032004.mdio-mii:11 lan10 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:02] driver [Marvell 88E6390 Family] (irq=50)
[    6.249654] mv88e6085 d0032004.mdio-mii:11 lan11 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:03] driver [Marvell 88E6390 Family] (irq=51)
[    6.333580] mv88e6085 d0032004.mdio-mii:11 lan12 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:04] driver [Marvell 88E6390 Family] (irq=52)
[    6.417577] mv88e6085 d0032004.mdio-mii:11 lan13 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:05] driver [Marvell 88E6390 Family] (irq=53)
[    6.501561] mv88e6085 d0032004.mdio-mii:11 lan14 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:06] driver [Marvell 88E6390 Family] (irq=54)
[    6.561655] mv88e6085 d0032004.mdio-mii:11 lan15 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:07] driver [Marvell 88E6390 Family] (irq=55)
[    6.645583] mv88e6085 d0032004.mdio-mii:11 lan16 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch1@11!mdio:08] driver [Marvell 88E6390 Family] (irq=56)
[    6.733557] mv88e6085 d0032004.mdio-mii:12 lan17 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:01] driver [Marvell 88E6390 Family] (irq=74)
[    6.817563] mv88e6085 d0032004.mdio-mii:12 lan18 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:02] driver [Marvell 88E6390 Family] (irq=75)
[    6.873653] mv88e6085 d0032004.mdio-mii:12 lan19 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:03] driver [Marvell 88E6390 Family] (irq=76)
[    6.957582] mv88e6085 d0032004.mdio-mii:12 lan20 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:04] driver [Marvell 88E6390 Family] (irq=77)
[    7.041567] mv88e6085 d0032004.mdio-mii:12 lan21 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:05] driver [Marvell 88E6390 Family] (irq=78)
[    7.093561] mv88e6085 d0032004.mdio-mii:12 lan22 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:06] driver [Marvell 88E6390 Family] (irq=79)
[    7.177476] mv88e6085 d0032004.mdio-mii:12 lan23 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:07] driver [Marvell 88E6390 Family] (irq=80)
[    7.245560] mv88e6085 d0032004.mdio-mii:12 lan24 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch2@12!mdio:08] driver [Marvell 88E6390 Family] (irq=81)
[    7.269178] mv88e6085 d0032004.mdio-mii:10 lan1 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:01] driver [Generic PHY] (irq=POLL)
[    7.288234] mv88e6085 d0032004.mdio-mii:10 lan2 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:02] driver [Generic PHY] (irq=POLL)
[    7.307295] mv88e6085 d0032004.mdio-mii:10 lan3 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:03] driver [Generic PHY] (irq=POLL)
[    7.326342] mv88e6085 d0032004.mdio-mii:10 lan4 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:04] driver [Generic PHY] (irq=POLL)
[    7.345384] mv88e6085 d0032004.mdio-mii:10 lan5 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:05] driver [Generic PHY] (irq=POLL)
[    7.364449] mv88e6085 d0032004.mdio-mii:10 lan6 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:06] driver [Generic PHY] (irq=POLL)
[    7.383496] mv88e6085 d0032004.mdio-mii:10 lan7 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:07] driver [Generic PHY] (irq=POLL)
[    7.402563] mv88e6085 d0032004.mdio-mii:10 lan8 (uninitialized): PHY [!soc!internal-regs@d0000000!mdio@32004!switch0@10!mdio:08] driver [Generic PHY] (irq=POLL)

So I guess that FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD does something.

OTOH, if I apply just my patch to remove the FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD,
but without making the device tree change or the mv88e6xxx driver change, the PHYs
still probe with the correct driver and with interrupts (i.e. they don't get their
probing deferred). This also seems to prove my point that my patches to bring the
Turris MOX to a testable state for this fwnode flag are indeed required.

HTH.
Saravana Kannan Feb. 10, 2023, 9:32 p.m. UTC | #9
On Fri, Feb 10, 2023 at 1:08 PM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
>
> On Fri, Feb 10, 2023 at 11:27:11AM -0800, Saravana Kannan wrote:
> > On Fri, Feb 10, 2023 at 2:13 AM Vladimir Oltean <vladimir.oltean@nxp.com> wrote:
> > >
> > > Hi Saravana,
> > >
> > > On Mon, Feb 06, 2023 at 05:41:52PM -0800, Saravana Kannan wrote:
> > > > Vladimir,
> > > >
> > > > Ccing you because DSA's and fw_devlink have known/existing problems
> > > > (still in my TODOs to fix). But I want to make sure this series doesn't
> > > > cause additional problems for DSA.
> > > >
> > > > All,
> > > >
> > > > This patch series improves fw_devlink in the following ways:
> > > >
> > > > 1. It no longer cares about a fwnode having a "compatible" property. It
> > > >    figures this out more dynamically. The only expectation is that
> > > >    fwnodes that are converted to devices actually get probed by a driver
> > > >    for the dependencies to be enforced correctly.
> > > >
> > > > 2. Finer grained dependency tracking. fw_devlink will now create device
> > > >    links from the consumer to the actual resource's device (if it has one,
> > > >    Eg: gpio_device) instead of the parent supplier device. This improves
> > > >    things like async suspend/resume ordering, potentially remove the need
> > > >    for frameworks to create device links, more parallelized async probing,
> > > >    and better sync_state() tracking.
> > > >
> > > > 3. Handle hardware/software quirks where a child firmware node gets
> > > >    populated as a device before its parent firmware node AND actually
> > > >    supplies a non-optional resource to the parent firmware node's
> > > >    device.
> > > >
> > > > 4. Way more robust at cycle handling (see patch for the insane cases).
> > > >
> > > > 5. Stops depending on OF_POPULATED to figure out some corner cases.
> > > >
> > > > 6. Simplifies the work that needs to be done by the firmware specific
> > > >    code.
> > > >
> > > > The v3 series has gone through my usual testing on my end and looks good
> > > > to me.
> > >
> > > Booted on an NXP LS1028A (arch/arm64/boot/dts/freescale/fsl-ls1028a-rdb.dts)
> > > and a Turris MOX (arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts)
> > > with no observed regressions.
> >
> > Thanks for testing Vladimir!
> >
> > > Is there something specific you would like
> > > me to test?
> >
> > Not really, I just want to make sure the common DSA architectures
> > don't hit any regression. In the hardware you tested, are there cases
> > of PHYs where the supplier is the parent MDIO? I remember that being
> > the only case where I needed special casing
> > (FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD) in fw_devlink -- so it'll be
> > good to make sure I didn't accidentally break anything there.
> >
> > -Saravana
>
> Yes and no (I never had a system which depended on FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD).
>
> Yes, because well, yes, in arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts,
> the PHYs will depend on interrupts provided by their (parent) switch. However this
> is not explicit in the device tree. To make it explicit, one would need to add:
>
> diff --git a/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts b/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts
> index cd0988317623..d789cda49e35 100644
> --- a/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts
> +++ b/arch/arm64/boot/dts/marvell/armada-3720-turris-mox.dts

-----8<---- Snipped DT diff -----

> However, as I had explained in one of the first discussions here:
> https://lore.kernel.org/netdev/20210901012826.iuy2bhvkrgahhrl7@skbuf/
>
> it was always hit-or-miss whether the above device tree had an issue
> with fw_devlink or not: it depended on how the driver was written (and
> the mv88e6xxx switch driver was tricking the fw_devlink logic from that
> time to drop the device links because of an unrelated -EPROBE_DEFER).

Yeah, I never forgot this issue. That's why I used "additional" in my
cover letter :)

So far I've not needed to change fw_devlink in a way that'd break this
unintentional "tricky behavior" but I might be coming up to that wall
soon. So this reply is becoming more relevant to me:
https://lore.kernel.org/lkml/CAGETcx8De_qm9hVtK5CznfWke9nmOfV8OcvAW6kmwyeb7APr=g@mail.gmail.com/

Not sure if you've had a chance to read or think about it.

> What I had done to "untrick" fw_devlink so that I could see the issue
> (which was originally reported by Alvin Šipraga) was to modify the
> mv88e6xxx driver, and change the placement of mv88e6xxx_mdios_register()
> to a point after which we will never hit -EPROBE_DEFER (from driver probe()
> to the dsa_switch_ops :: setup() method):
>
> diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
> index 0a5d6c7bb128..48650465660d 100644
> --- a/drivers/net/dsa/mv88e6xxx/chip.c
> +++ b/drivers/net/dsa/mv88e6xxx/chip.c

-----8<---- snipped

> After applying both of the above changes on top of yours, I confirm that
> the PHYs on the mv88e6xxx on Turris MOX still probe with their specific
> PHY driver rather than the generic one, and with interrupts (not poll mode):
>
-----8<---- snipped

>
> even though I am seeing these error messages earlier in the boot process (maybe this is something to look into):
>
> [    0.910219] mdio_bus d0032004.mdio-mii:10: Failed to create device link with d0032004.mdio-mii:10

-----8<---- snipped

> [    0.943879] mv88e6085 d0032004.mdio-mii:12: switch 0x1900 detected: Marvell 88E6190, revision 1
>
>
> If _on top_ of all the above, I also remove the logic that sets FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD:

> then *finally* I get something approximating Alvin's reported issue.
> In my case, one switch out of 3 gets its PHYs bound to the Generic PHY
> driver (why not all is a story for another time):

-----8<---- snipped

> So I guess that FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD does something.

Thanks for the extensive effort into testing this!

-Saravana
Tony Lindgren Feb. 15, 2023, 7:39 a.m. UTC | #10
Hi,

* Saravana Kannan <saravanak@google.com> [230207 01:42]:
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
> 
> Can I get your Tested-by's for this v3 series please?

Just FYI, the patches in next-20230215 behave for me.

Regards,

Tony
Jean-Philippe Brucker Feb. 15, 2023, 12:34 p.m. UTC | #11
Hi Saravana,

On Mon, Feb 06, 2023 at 05:41:52PM -0800, Saravana Kannan wrote:
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
> 
> Can I get your Tested-by's for this v3 series please?

Sorry for the delay (I misconfigured my inbox). I tested virtio-iommu with
these changes, no regression:

Tested-by: Jean-Philippe Brucker <jpb@kernel.org>


Removing driver_deferred_probe_check_state() by reverting [1] breaks
loading virtio-iommu as a module, as the dependency between PCI devices
and PCI IOMMU is ignored, and the device probed too early [2]. I'll try to
figure out how to make that work.

Thanks,
Jean

[1] https://lore.kernel.org/lkml/20220819221616.2107893-5-saravanak@google.com/
[2] https://lore.kernel.org/lkml/Yv+dpeIPvde7oDHi@myrica/
Dmitry Baryshkov Feb. 16, 2023, 3:12 a.m. UTC | #12
On 07/02/2023 03:41, Saravana Kannan wrote:
> Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> Jean-Philippe,
>
> Can I get your Tested-by's for this v3 series please?
>
> Vladimir,
>
> Ccing you because DSA's and fw_devlink have known/existing problems
> (still in my TODOs to fix). But I want to make sure this series doesn't
> cause additional problems for DSA.
>
> All,
>
> This patch series improves fw_devlink in the following ways:
>
> 1. It no longer cares about a fwnode having a "compatible" property. It
>     figures this out more dynamically. The only expectation is that
>     fwnodes that are converted to devices actually get probed by a driver
>     for the dependencies to be enforced correctly.
>
> 2. Finer grained dependency tracking. fw_devlink will now create device
>     links from the consumer to the actual resource's device (if it has one,
>     Eg: gpio_device) instead of the parent supplier device. This improves
>     things like async suspend/resume ordering, potentially remove the need
>     for frameworks to create device links, more parallelized async probing,
>     and better sync_state() tracking.
>
> 3. Handle hardware/software quirks where a child firmware node gets
>     populated as a device before its parent firmware node AND actually
>     supplies a non-optional resource to the parent firmware node's
>     device.
>
> 4. Way more robust at cycle handling (see patch for the insane cases).
>
> 5. Stops depending on OF_POPULATED to figure out some corner cases.
>
> 6. Simplifies the work that needs to be done by the firmware specific
>     code.
>
> The v3 series has gone through my usual testing on my end and looks good
> to me.

Saravana,

Please excuse me, I was completely overwhelmed with my regular work and
had no time to properly test the series, while doing just the light
test would defeat the purpose of testing.

Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # Qualcomm RB3

Thanks a lot for going through all the troubles and hunting all the issues!

Just a note: on an RB3 device (arm64 qcom/sdm845-db845c.dtsi) extended
with the patch at [3] I got the following messages in dmesg:

[    1.051325] platform ae00000.mdss: Failed to create device link
with ae00000.mdss
[    1.059368] platform ae00000.mdss: Failed to create device link
with ae00000.mdss
[    1.067174] platform ae00000.mdss: Failed to create device link
with ae00000.mdss
[    1.088322] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    1.096019] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    1.103707] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    1.111400] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    1.119141] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    1.126825] platform c440000.spmi: Failed to create device link
with c440000.spmi
[    2.024763] qcom-lab-ibb-regulator c440000.spmi:pmic@3:labibb:
Failed to create device link with c440000.spmi
[    2.035026] qcom-lab-ibb-regulator c440000.spmi:pmic@3:labibb:
Failed to create device link with c440000.spmi

They look to be harmless, but it might be good to filter some of them
out? Especially the ones which tell about creating a device link
pointing back to the same device.

[3] https://lore.kernel.org/linux-arm-msm/20230118082048.2198715-1-dmitry.baryshkov@linaro.org/

>
> Thanks,
> Saravana
>
> [1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
> [2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
>
> v1 -> v2:
> - Fixed Patch 1 to handle a corner case discussed in [2].
> - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> - New patch 11 to add fw_devlink support for SCMI devices.
>
> v2 -> v3:
> - Addressed most of Andy's comments in v2
> - Added Colin and Sudeep's Tested-by for the series (except the imx and
>    renesas patches)
> - Added Sudeep's Acked-by for the scmi patch.
> - Added Geert's Reviewed-by for the renesas patch.
> - Fixed gpiolib crash reported by Naresh.
> - Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
> - New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
> - Deleted some stale function doc in Patch 8
>
> Cc: Abel Vesa <abel.vesa@linaro.org>
> Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
> Cc: Tony Lindgren <tony@atomide.com>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> Cc: John Stultz <jstultz@google.com>
> Cc: Doug Anderson <dianders@chromium.org>
> Cc: Guenter Roeck <linux@roeck-us.net>
> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> Cc: Maxim Kiselev <bigunclemax@gmail.com>
> Cc: Maxim Kochetkov <fido_max@inbox.ru>
> Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> Cc: Luca Weiss <luca.weiss@fairphone.com>
> Cc: Colin Foster <colin.foster@in-advantage.com>
> Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
> Cc: Jean-Philippe Brucker <jpb@kernel.org>
> Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
>
> Saravana Kannan (12):
>    driver core: fw_devlink: Don't purge child fwnode's consumer links
>    driver core: fw_devlink: Improve check for fwnode with no
>      device/driver
>    soc: renesas: Move away from using OF_POPULATED for fw_devlink
>    gpiolib: Clear the gpio_device's fwnode initialized flag before adding
>    driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
>    driver core: fw_devlink: Allow marking a fwnode link as being part of
>      a cycle
>    driver core: fw_devlink: Consolidate device link flag computation
>    driver core: fw_devlink: Make cycle detection more robust
>    of: property: Simplify of_link_to_phandle()
>    irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
>    firmware: arm_scmi: Set fwnode for the scmi_device
>    mtd: mtdpart: Don't create platform device that'll never probe
>
>   drivers/base/core.c             | 449 +++++++++++++++++++++-----------
>   drivers/firmware/arm_scmi/bus.c |   3 +-
>   drivers/gpio/gpiolib.c          |   7 +
>   drivers/irqchip/irq-imx-gpcv2.c |   1 +
>   drivers/mtd/mtdpart.c           |  10 +
>   drivers/of/property.c           |  84 +-----
>   drivers/soc/imx/gpcv2.c         |   2 +-
>   drivers/soc/renesas/rcar-sysc.c |   2 +-
>   include/linux/device.h          |   1 +
>   include/linux/fwnode.h          |  12 +-
>   10 files changed, 344 insertions(+), 227 deletions(-)
>

--
With best wishes

Dmitry
Saravana Kannan Feb. 25, 2023, 6:24 a.m. UTC | #13
On Wed, Feb 15, 2023 at 7:12 PM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On 07/02/2023 03:41, Saravana Kannan wrote:
> > Naresh, Tony, Abel, Geert, Dmitry, Maxim(s), Miquel, Luca, Doug, Martin,
> > Jean-Philippe,
> >
> > Can I get your Tested-by's for this v3 series please?
> >
> > Vladimir,
> >
> > Ccing you because DSA's and fw_devlink have known/existing problems
> > (still in my TODOs to fix). But I want to make sure this series doesn't
> > cause additional problems for DSA.
> >
> > All,
> >
> > This patch series improves fw_devlink in the following ways:
> >
> > 1. It no longer cares about a fwnode having a "compatible" property. It
> >     figures this out more dynamically. The only expectation is that
> >     fwnodes that are converted to devices actually get probed by a driver
> >     for the dependencies to be enforced correctly.
> >
> > 2. Finer grained dependency tracking. fw_devlink will now create device
> >     links from the consumer to the actual resource's device (if it has one,
> >     Eg: gpio_device) instead of the parent supplier device. This improves
> >     things like async suspend/resume ordering, potentially remove the need
> >     for frameworks to create device links, more parallelized async probing,
> >     and better sync_state() tracking.
> >
> > 3. Handle hardware/software quirks where a child firmware node gets
> >     populated as a device before its parent firmware node AND actually
> >     supplies a non-optional resource to the parent firmware node's
> >     device.
> >
> > 4. Way more robust at cycle handling (see patch for the insane cases).
> >
> > 5. Stops depending on OF_POPULATED to figure out some corner cases.
> >
> > 6. Simplifies the work that needs to be done by the firmware specific
> >     code.
> >
> > The v3 series has gone through my usual testing on my end and looks good
> > to me.
>
> Saravana,
>
> Please excuse me, I was completely overwhelmed with my regular work and
> had no time to properly test the series, while doing just the light
> test would defeat the purpose of testing.
>
> Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # Qualcomm RB3
>
> Thanks a lot for going through all the troubles and hunting all the issues!

You are welcome! Thanks for testing it.

> Just a note: on an RB3 device (arm64 qcom/sdm845-db845c.dtsi) extended
> with the patch at [3] I got the following messages in dmesg:
>
> [    1.051325] platform ae00000.mdss: Failed to create device link
> with ae00000.mdss
> [    1.059368] platform ae00000.mdss: Failed to create device link
> with ae00000.mdss
> [    1.067174] platform ae00000.mdss: Failed to create device link
> with ae00000.mdss
> [    1.088322] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    1.096019] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    1.103707] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    1.111400] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    1.119141] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    1.126825] platform c440000.spmi: Failed to create device link
> with c440000.spmi
> [    2.024763] qcom-lab-ibb-regulator c440000.spmi:pmic@3:labibb:
> Failed to create device link with c440000.spmi
> [    2.035026] qcom-lab-ibb-regulator c440000.spmi:pmic@3:labibb:
> Failed to create device link with c440000.spmi
>
> They look to be harmless, but it might be good to filter some of them
> out? Especially the ones which tell about creating a device link
> pointing back to the same device.

I'm sure it's harmless when the supplier == consumer. Agreed on
filtering these out.

I looked at [3], but it's not obvious to me how this is happening for
your specific case. There are a couple of  ways I can think of:
1. A SYNC_STATE_ONLY link being created as a proxy link (I don't do as
many checks here because it can't break anything)
2. __fw_devlink_pickup_dangling_consumers() causing the consumer and
supplier to be the same.

But I want to understand which one is happening in your case. Can you
add a WARN_ON(1) after the error message and give me the list of stack
dumps that are unique?

Thanks,
Saravana


>
> [3] https://lore.kernel.org/linux-arm-msm/20230118082048.2198715-1-dmitry.baryshkov@linaro.org/
>
> >
> > Thanks,
> > Saravana
> >
> > [1] - https://lore.kernel.org/lkml/20220810060040.321697-1-saravanak@google.com/
> > [2] - https://lore.kernel.org/lkml/CAGETcx-JUV1nj8wBJrTPfyvM7=Mre5j_vkVmZojeiumUGG6QZQ@mail.gmail.com/
> >
> > v1 -> v2:
> > - Fixed Patch 1 to handle a corner case discussed in [2].
> > - New patch 10 to handle "fsl,imx8mq-gpc" being initialized by 2 drivers.
> > - New patch 11 to add fw_devlink support for SCMI devices.
> >
> > v2 -> v3:
> > - Addressed most of Andy's comments in v2
> > - Added Colin and Sudeep's Tested-by for the series (except the imx and
> >    renesas patches)
> > - Added Sudeep's Acked-by for the scmi patch.
> > - Added Geert's Reviewed-by for the renesas patch.
> > - Fixed gpiolib crash reported by Naresh.
> > - Patch 6: Fix __fwnode_links_move_consumers() to preserve fwnode link flags.
> > - New Patch 12 to fix nvmem-cells issue reported by Maxim(s)/Miquel.
> > - Deleted some stale function doc in Patch 8
> >
> > Cc: Abel Vesa <abel.vesa@linaro.org>
> > Cc: Alexander Stein <alexander.stein@ew.tq-group.com>
> > Cc: Tony Lindgren <tony@atomide.com>
> > Cc: Sudeep Holla <sudeep.holla@arm.com>
> > Cc: Geert Uytterhoeven <geert@linux-m68k.org>
> > Cc: John Stultz <jstultz@google.com>
> > Cc: Doug Anderson <dianders@chromium.org>
> > Cc: Guenter Roeck <linux@roeck-us.net>
> > Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
> > Cc: Maxim Kiselev <bigunclemax@gmail.com>
> > Cc: Maxim Kochetkov <fido_max@inbox.ru>
> > Cc: Miquel Raynal <miquel.raynal@bootlin.com>
> > Cc: Luca Weiss <luca.weiss@fairphone.com>
> > Cc: Colin Foster <colin.foster@in-advantage.com>
> > Cc: Martin Kepplinger <martin.kepplinger@puri.sm>
> > Cc: Jean-Philippe Brucker <jpb@kernel.org>
> > Cc: Vladimir Oltean <vladimir.oltean@nxp.com>
> >
> > Saravana Kannan (12):
> >    driver core: fw_devlink: Don't purge child fwnode's consumer links
> >    driver core: fw_devlink: Improve check for fwnode with no
> >      device/driver
> >    soc: renesas: Move away from using OF_POPULATED for fw_devlink
> >    gpiolib: Clear the gpio_device's fwnode initialized flag before adding
> >    driver core: fw_devlink: Add DL_FLAG_CYCLE support to device links
> >    driver core: fw_devlink: Allow marking a fwnode link as being part of
> >      a cycle
> >    driver core: fw_devlink: Consolidate device link flag computation
> >    driver core: fw_devlink: Make cycle detection more robust
> >    of: property: Simplify of_link_to_phandle()
> >    irqchip/irq-imx-gpcv2: Mark fwnode device as not initialized
> >    firmware: arm_scmi: Set fwnode for the scmi_device
> >    mtd: mtdpart: Don't create platform device that'll never probe
> >
> >   drivers/base/core.c             | 449 +++++++++++++++++++++-----------
> >   drivers/firmware/arm_scmi/bus.c |   3 +-
> >   drivers/gpio/gpiolib.c          |   7 +
> >   drivers/irqchip/irq-imx-gpcv2.c |   1 +
> >   drivers/mtd/mtdpart.c           |  10 +
> >   drivers/of/property.c           |  84 +-----
> >   drivers/soc/imx/gpcv2.c         |   2 +-
> >   drivers/soc/renesas/rcar-sysc.c |   2 +-
> >   include/linux/device.h          |   1 +
> >   include/linux/fwnode.h          |  12 +-
> >   10 files changed, 344 insertions(+), 227 deletions(-)
> >
>
> --
> With best wishes
>
> Dmitry