diff mbox series

[v1,2/4] driver core: fw_devlink: Link to supplier ancestor if no device

Message ID 20240123014517.5787-3-mcpratt@pm.me
State New
Headers show
Series fw_devlink: generically handle bad links to child fwnodes | expand

Commit Message

Michael Pratt Jan. 23, 2024, 1:46 a.m. UTC
Driver core currently supports linking to the next parent fwnode,
but is not yet handling cases where that parent
is also a firmware child node not representing a real device,
which can lead to an indefinite deferred probe in some cases.
In this case, the fwnode that should actually be linked to
is multiple ancestors up which presents a challenge where
it is unknown how many ancestors up the node that
represents the real probing device is. This makes the usage of
fwnode_get_next_parent_dev() insufficient because the real device's
fwnode may or may not be an ancestor of the next parent fwnode as well.

Introduce flag FWNODE_FLAG_PARENT_IS_DEV
in order to mark child firmware nodes of a device
as having a parent device that can probe.

Allow fwnode link creation to the original supplier fwnode's ancestors
when the original supplier fwnode and any fwnodes in between are flagged
as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
with a new function __fwnode_link_add_parents() which then creates
the fwnode link to a real device that provides the supplier's function.

This depends on other functions to label a supplier fwnode
as not a real device, which must be done before the fwnode links
are created, and if after that, relevant links to the supplier
would have to be deleted and have links recreated, otherwise,
the fwnode link would be dropped before the device link is attempted
or a fwnode link would not be able to become a device link at all,
because they were created before these fwnode flags can have any effect.

It also depends on the supplier device to actually probe first
in order to have the fwnode flags in place to know for certain
which fwnodes are non-probing child nodes
of the fwnode for the supplier device.

The use case of function __fw_devlink_pickup_dangling_consumers()
is designed so that the parameters are always a supplier fwnode
and one of it's parent fwnodes, so it is safer to assume and more specific
that the flag PARENT_IS_DEV should be added there, rather than
declaring the original supplier fwnode as NOT_DEVICE at that point.
Because this function is called when the real supplier device probes
and recursively calls itself for all child nodes of the device's fwnode,
set the new flag here in order to let it propagate down
to all descendant nodes, thereby providing the info needed later
in order to link to the proper fwnode representing the supplier device.

If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
by the time a device link is to be made with it,
but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
the link is dropped, otherwise the device link
is still made with the original supplier fwnode.
Theoretically, we can also handle linking to an ancestor
of the supplier fwnode when forming device links, but there
are still cases where the necessary fwnode flags are still missing
because the real supplier device did not probe yet.

Signed-off-by: Michael Pratt <mcpratt@pm.me>
---
 drivers/base/core.c    | 49 ++++++++++++++++++++++++++++++++++++------
 include/linux/fwnode.h |  4 ++++
 2 files changed, 47 insertions(+), 6 deletions(-)

Comments

Miquel Raynal Feb. 5, 2024, 3 p.m. UTC | #1
Hi Michael,

First, I want to say, this is a great job as fw_devlinks in mtd and
nvmem are really not easy to handle. I am willing to help, despite my
very light understanding of what the core actually does with these
flags.

mcpratt@pm.me wrote on Tue, 23 Jan 2024 01:46:40 +0000:

> Driver core currently supports linking to the next parent fwnode,
> but is not yet handling cases where that parent
> is also a firmware child node not representing a real device,
> which can lead to an indefinite deferred probe in some cases.
> In this case, the fwnode that should actually be linked to
> is multiple ancestors up which presents a challenge where
> it is unknown how many ancestors up the node that
> represents the real probing device is. This makes the usage of
> fwnode_get_next_parent_dev() insufficient because the real device's
> fwnode may or may not be an ancestor of the next parent fwnode as well.
> 
> Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> in order to mark child firmware nodes of a device
> as having a parent device that can probe.
> 
> Allow fwnode link creation to the original supplier fwnode's ancestors
> when the original supplier fwnode and any fwnodes in between are flagged
> as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> with a new function __fwnode_link_add_parents() which then creates
> the fwnode link to a real device that provides the supplier's function.
> 
> This depends on other functions to label a supplier fwnode
> as not a real device, which must be done before the fwnode links
> are created, and if after that, relevant links to the supplier
> would have to be deleted and have links recreated, otherwise,
> the fwnode link would be dropped before the device link is attempted
> or a fwnode link would not be able to become a device link at all,
> because they were created before these fwnode flags can have any effect.
> 
> It also depends on the supplier device to actually probe first
> in order to have the fwnode flags in place to know for certain
> which fwnodes are non-probing child nodes
> of the fwnode for the supplier device.
> 
> The use case of function __fw_devlink_pickup_dangling_consumers()
> is designed so that the parameters are always a supplier fwnode
> and one of it's parent fwnodes, so it is safer to assume and more specific
> that the flag PARENT_IS_DEV should be added there, rather than
> declaring the original supplier fwnode as NOT_DEVICE at that point.
> Because this function is called when the real supplier device probes
> and recursively calls itself for all child nodes of the device's fwnode,
> set the new flag here in order to let it propagate down
> to all descendant nodes, thereby providing the info needed later
> in order to link to the proper fwnode representing the supplier device.
> 
> If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> by the time a device link is to be made with it,
> but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> the link is dropped, otherwise the device link
> is still made with the original supplier fwnode.
> Theoretically, we can also handle linking to an ancestor
> of the supplier fwnode when forming device links, but there
> are still cases where the necessary fwnode flags are still missing
> because the real supplier device did not probe yet.

I am not sure I follow this. In the following case, I would expect any
dependency towards node-c to be made against node-a. But the above
paragraph seems to tell otherwise: that the the link would be dropped
(and thus, not enforced) because recursively searching for a parent
that would be a device could be endless? It feels wrong, so I probably
mis

node-a {
	# IS DEV
	node-b {
		# PARENT IS DEV
		node-c {
			# PARENT IS DEV
		};
	};
};

Besides that, the commit feels like a good idea.

Thanks,
Miquèl
Saravana Kannan Feb. 5, 2024, 7:25 p.m. UTC | #2
On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <mcpratt@pm.me> wrote:
>
> Driver core currently supports linking to the next parent fwnode,
> but is not yet handling cases where that parent
> is also a firmware child node not representing a real device,
> which can lead to an indefinite deferred probe in some cases.
> In this case, the fwnode that should actually be linked to
> is multiple ancestors up which presents a challenge where
> it is unknown how many ancestors up the node that
> represents the real probing device is. This makes the usage of
> fwnode_get_next_parent_dev() insufficient because the real device's
> fwnode may or may not be an ancestor of the next parent fwnode as well.
>
> Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> in order to mark child firmware nodes of a device
> as having a parent device that can probe.
>
> Allow fwnode link creation to the original supplier fwnode's ancestors
> when the original supplier fwnode and any fwnodes in between are flagged
> as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> with a new function __fwnode_link_add_parents() which then creates
> the fwnode link to a real device that provides the supplier's function.
>
> This depends on other functions to label a supplier fwnode
> as not a real device, which must be done before the fwnode links
> are created, and if after that, relevant links to the supplier
> would have to be deleted and have links recreated, otherwise,
> the fwnode link would be dropped before the device link is attempted
> or a fwnode link would not be able to become a device link at all,
> because they were created before these fwnode flags can have any effect.
>
> It also depends on the supplier device to actually probe first
> in order to have the fwnode flags in place to know for certain
> which fwnodes are non-probing child nodes
> of the fwnode for the supplier device.
>
> The use case of function __fw_devlink_pickup_dangling_consumers()
> is designed so that the parameters are always a supplier fwnode
> and one of it's parent fwnodes, so it is safer to assume and more specific
> that the flag PARENT_IS_DEV should be added there, rather than
> declaring the original supplier fwnode as NOT_DEVICE at that point.
> Because this function is called when the real supplier device probes
> and recursively calls itself for all child nodes of the device's fwnode,
> set the new flag here in order to let it propagate down
> to all descendant nodes, thereby providing the info needed later
> in order to link to the proper fwnode representing the supplier device.
>
> If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> by the time a device link is to be made with it,
> but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> the link is dropped, otherwise the device link
> is still made with the original supplier fwnode.
> Theoretically, we can also handle linking to an ancestor
> of the supplier fwnode when forming device links, but there
> are still cases where the necessary fwnode flags are still missing
> because the real supplier device did not probe yet.
>
> Signed-off-by: Michael Pratt <mcpratt@pm.me>

NACK for a bunch of reasons.

1. This logic should not be in fwnode_link_add(). That's supposed to
do the exact linking that the firmware meant. This logic should be
where the fwnode links are converted to device links or deferred
probing decision is done.

2. If we handle one parent above correctly, it should be easy to
handle multiple parents above too. So why not fix it where we handle
multiple parents above?

Btw, I'm happy to fix this or help you fix this once I understand the
issue. Just wanted to give a quick NACK to avoid people spending
cycles on this patch in its current state.

-Saravana


> ---
>  drivers/base/core.c    | 49 ++++++++++++++++++++++++++++++++++++------
>  include/linux/fwnode.h |  4 ++++
>  2 files changed, 47 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index c05a5f6b0641..7f2652cf5edc 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
>         return 0;
>  }
>
> +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> +                                    struct fwnode_handle *sup,
> +                                    u8 flags, bool loop)
> +{
> +       int ret = -EINVAL;
> +       struct fwnode_handle *parent;
> +
> +       fwnode_for_each_parent_node(sup, parent) {
> +               /* Ignore the root node */
> +               if (fwnode_count_parents(parent) &&
> +                   fwnode_device_is_available(parent) &&
> +                 !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> +                 !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> +                       ret = __fwnode_link_add(con, parent, flags);
> +               }
> +
> +               if (!loop && !ret) {
> +                       fwnode_handle_put(parent);
> +                       return 0;
> +               }
> +       }
> +
> +       return ret;
> +}
> +
>  int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
>  {
>         int ret;
>
>         mutex_lock(&fwnode_link_lock);
> -       ret = __fwnode_link_add(con, sup, 0);
> +
> +       if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> +           (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> +               ret = __fwnode_link_add_parents(con, sup, 0, false);
> +       else
> +               ret = __fwnode_link_add(con, sup, 0);
> +
>         mutex_unlock(&fwnode_link_lock);
> +
>         return ret;
>  }
>
> @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
>   * MANAGED device links to this device, so leave @fwnode and its descendant's
>   * fwnode links alone.
>   *
> - * Otherwise, move its consumers to the new supplier @new_sup.
> + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
>   */
>  static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
>                                                    struct fwnode_handle *new_sup)
> @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
>         if (fwnode->dev && fwnode->dev->driver)
>                 return;
>
> -       fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> -       __fwnode_links_move_consumers(fwnode, new_sup);
> +       if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> +               __fwnode_links_move_consumers(fwnode, new_sup);
> +
> +       fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> +       new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
>
>         fwnode_for_each_available_child_node(fwnode, child)
>                 __fw_devlink_pickup_dangling_consumers(child, new_sup);
> @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
>                 device_links_write_unlock();
>         }
>
> -       if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> -               sup_dev = fwnode_get_next_parent_dev(sup_handle);
> +       if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> +          !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> +               return -EINVAL;
>         else
>                 sup_dev = get_dev_from_fwnode(sup_handle);
>
> diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> index 2a72f55d26eb..3ed8ba503062 100644
> --- a/include/linux/fwnode.h
> +++ b/include/linux/fwnode.h
> @@ -30,6 +30,9 @@ struct device;
>   * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
>   *             suppliers. Only enforce ordering with suppliers that have
>   *             drivers.
> + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> + *               struct device, which is more suitable for device links
> + *               than the fwnode child which may never have a struct device.
>   */
>  #define FWNODE_FLAG_LINKS_ADDED                        BIT(0)
>  #define FWNODE_FLAG_NOT_DEVICE                 BIT(1)
> @@ -37,6 +40,7 @@ struct device;
>  #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD   BIT(3)
>  #define FWNODE_FLAG_BEST_EFFORT                        BIT(4)
>  #define FWNODE_FLAG_VISITED                    BIT(5)
> +#define FWNODE_FLAG_PARENT_IS_DEV              BIT(6)
>
>  struct fwnode_handle {
>         struct fwnode_handle *secondary;
> --
> 2.30.2
>
>
Saravana Kannan Feb. 14, 2024, 3:18 a.m. UTC | #3
On Mon, Feb 5, 2024 at 11:25 AM Saravana Kannan <saravanak@google.com> wrote:
>
> On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt <mcpratt@pm.me> wrote:
> >
> > Driver core currently supports linking to the next parent fwnode,
> > but is not yet handling cases where that parent
> > is also a firmware child node not representing a real device,

Can you tell me why you say this? Because as far as I can tell, I did
take into account climbing all the way up to find the actual struct
device that encompasses a fwnode and then create a device link to that
struct device.

> > which can lead to an indefinite deferred probe in some cases.
> > In this case, the fwnode that should actually be linked to
> > is multiple ancestors up which presents a challenge where
> > it is unknown how many ancestors up the node that
> > represents the real probing device is.

The goal is to find the device. Not "the probing device". That device
IS the one providing the functionality of the fwnode. If you never use
that device, don't create it? If you never probe the device, then use
a class instead of a bus?

> > This makes the usage of
> > fwnode_get_next_parent_dev() insufficient because the real device's
> > fwnode may or may not be an ancestor of the next parent fwnode as well.

It is doing what it's supposed to do. Finding the parent device. Not
the next probing parent device. This becomes even more relevant when
we start talking about suspend/resume. Too much to explain about
suspend/resume and it's not relevant to your case. So I'll leave that
as an exercise to the reader.

> > Introduce flag FWNODE_FLAG_PARENT_IS_DEV
> > in order to mark child firmware nodes of a device
> > as having a parent device that can probe.
> >
> > Allow fwnode link creation to the original supplier fwnode's ancestors
> > when the original supplier fwnode and any fwnodes in between are flagged
> > as FWNODE_FLAG_NOT_DEVICE and/or FWNODE_FLAG_PARENT_IS_DEV
> > with a new function __fwnode_link_add_parents() which then creates
> > the fwnode link to a real device that provides the supplier's function.
> >
> > This depends on other functions to label a supplier fwnode
> > as not a real device, which must be done before the fwnode links
> > are created, and if after that, relevant links to the supplier
> > would have to be deleted and have links recreated, otherwise,
> > the fwnode link would be dropped before the device link is attempted
> > or a fwnode link would not be able to become a device link at all,
> > because they were created before these fwnode flags can have any effect.
> >
> > It also depends on the supplier device to actually probe first
> > in order to have the fwnode flags in place to know for certain
> > which fwnodes are non-probing child nodes
> > of the fwnode for the supplier device.
> >
> > The use case of function __fw_devlink_pickup_dangling_consumers()
> > is designed so that the parameters are always a supplier fwnode
> > and one of it's parent fwnodes, so it is safer to assume and more specific
> > that the flag PARENT_IS_DEV should be added there, rather than
> > declaring the original supplier fwnode as NOT_DEVICE at that point.
> > Because this function is called when the real supplier device probes
> > and recursively calls itself for all child nodes of the device's fwnode,
> > set the new flag here in order to let it propagate down
> > to all descendant nodes, thereby providing the info needed later
> > in order to link to the proper fwnode representing the supplier device.
> >
> > If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> > by the time a device link is to be made with it,
> > but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> > the link is dropped, otherwise the device link
> > is still made with the original supplier fwnode.
> > Theoretically, we can also handle linking to an ancestor
> > of the supplier fwnode when forming device links, but there
> > are still cases where the necessary fwnode flags are still missing
> > because the real supplier device did not probe yet.

It's still not clear why you need FWNODE_FLAG_PARENT_IS_DEV, but
considering patch 1 is a definite no-go, I'll stop here.

-Saravana

> >
> > Signed-off-by: Michael Pratt <mcpratt@pm.me>
>
> NACK for a bunch of reasons.
>
> 1. This logic should not be in fwnode_link_add(). That's supposed to
> do the exact linking that the firmware meant. This logic should be
> where the fwnode links are converted to device links or deferred
> probing decision is done.
>
> 2. If we handle one parent above correctly, it should be easy to
> handle multiple parents above too. So why not fix it where we handle
> multiple parents above?
>
> Btw, I'm happy to fix this or help you fix this once I understand the
> issue. Just wanted to give a quick NACK to avoid people spending
> cycles on this patch in its current state.
>
> -Saravana
>
>
> > ---
> >  drivers/base/core.c    | 49 ++++++++++++++++++++++++++++++++++++------
> >  include/linux/fwnode.h |  4 ++++
> >  2 files changed, 47 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c05a5f6b0641..7f2652cf5edc 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
> >         return 0;
> >  }
> >
> > +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> > +                                    struct fwnode_handle *sup,
> > +                                    u8 flags, bool loop)
> > +{
> > +       int ret = -EINVAL;
> > +       struct fwnode_handle *parent;
> > +
> > +       fwnode_for_each_parent_node(sup, parent) {
> > +               /* Ignore the root node */
> > +               if (fwnode_count_parents(parent) &&
> > +                   fwnode_device_is_available(parent) &&
> > +                 !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > +                 !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> > +                       ret = __fwnode_link_add(con, parent, flags);
> > +               }
> > +
> > +               if (!loop && !ret) {
> > +                       fwnode_handle_put(parent);
> > +                       return 0;
> > +               }
> > +       }
> > +
> > +       return ret;
> > +}
> > +
> >  int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> >  {
> >         int ret;
> >
> >         mutex_lock(&fwnode_link_lock);
> > -       ret = __fwnode_link_add(con, sup, 0);
> > +
> > +       if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > +           (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > +               ret = __fwnode_link_add_parents(con, sup, 0, false);
> > +       else
> > +               ret = __fwnode_link_add(con, sup, 0);
> > +
> >         mutex_unlock(&fwnode_link_lock);
> > +
> >         return ret;
> >  }
> >
> > @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> >   * MANAGED device links to this device, so leave @fwnode and its descendant's
> >   * fwnode links alone.
> >   *
> > - * Otherwise, move its consumers to the new supplier @new_sup.
> > + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> > + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
> >   */
> >  static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> >                                                    struct fwnode_handle *new_sup)
> > @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> >         if (fwnode->dev && fwnode->dev->driver)
> >                 return;
> >
> > -       fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> > -       __fwnode_links_move_consumers(fwnode, new_sup);
> > +       if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> > +               __fwnode_links_move_consumers(fwnode, new_sup);
> > +
> > +       fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> > +       new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
> >
> >         fwnode_for_each_available_child_node(fwnode, child)
> >                 __fw_devlink_pickup_dangling_consumers(child, new_sup);
> > @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
> >                 device_links_write_unlock();
> >         }
> >
> > -       if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> > -               sup_dev = fwnode_get_next_parent_dev(sup_handle);
> > +       if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > +          !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > +               return -EINVAL;
> >         else
> >                 sup_dev = get_dev_from_fwnode(sup_handle);
> >
> > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> > index 2a72f55d26eb..3ed8ba503062 100644
> > --- a/include/linux/fwnode.h
> > +++ b/include/linux/fwnode.h
> > @@ -30,6 +30,9 @@ struct device;
> >   * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
> >   *             suppliers. Only enforce ordering with suppliers that have
> >   *             drivers.
> > + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> > + *               struct device, which is more suitable for device links
> > + *               than the fwnode child which may never have a struct device.
> >   */
> >  #define FWNODE_FLAG_LINKS_ADDED                        BIT(0)
> >  #define FWNODE_FLAG_NOT_DEVICE                 BIT(1)
> > @@ -37,6 +40,7 @@ struct device;
> >  #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD   BIT(3)
> >  #define FWNODE_FLAG_BEST_EFFORT                        BIT(4)
> >  #define FWNODE_FLAG_VISITED                    BIT(5)
> > +#define FWNODE_FLAG_PARENT_IS_DEV              BIT(6)
> >
> >  struct fwnode_handle {
> >         struct fwnode_handle *secondary;
> > --
> > 2.30.2
> >
> >
Michael Pratt Feb. 25, 2024, 8:16 p.m. UTC | #4
Hi Miquel,

Thanks for taking a look at what I have so far.

On Monday, February 5th, 2024 at 10:00, Miquel Raynal <miquel.raynal@bootlin.com> wrote:

> 
> 
> Hi Michael,
> 
> mcpratt@pm.me wrote on Tue, 23 Jan 2024 01:46:40 +0000:
> 
> > If a fwnode is flagged as FWNODE_FLAG_NOT_DEVICE
> > by the time a device link is to be made with it,
> > but not flagged as FWNODE_FLAG_PARENT_IS_DEV,
> > the link is dropped, otherwise the device link
> > is still made with the original supplier fwnode.
> > Theoretically, we can also handle linking to an ancestor
> > of the supplier fwnode when forming device links, but there
> > are still cases where the necessary fwnode flags are still missing
> > because the real supplier device did not probe yet.
> 
> 
> I am not sure I follow this. In the following case, I would expect any
> dependency towards node-c to be made against node-a. But the above
> paragraph seems to tell otherwise: that the link would be dropped
> (and thus, not enforced) because recursively searching for a parent
> that would be a device could be endless? It feels wrong, so I probably
> mis
> 
> node-a {
>	# IS DEV
>	node-b {
>		# PARENT IS DEV
>		node-c {
>			# PARENT IS DEV
>		};
>	};
>};
>
> Besides that, the commit feels like a good idea.


The link is dropped _in order to_ make the dependency towards node-c
become a dependency towards node-a instead.

The "recursive searching" happens after links are dropped in order to
create the new fwnode links, and it depends on the new flag
being placed when the supplier device (node-a) probes, which can happen
before the links are re-attempted, or after a single probe defer of the consumer.

I placed all the logic that decides whether to drop links and retry linking
to the consumer immediately before a probe defer of the consumer would occur.
That logic could be distributed around multiple functions for fw_devlink,
but I'm concerned about false positives. The only reason I didn't use or make
a new function in order to "move" the links is that in this position in the driver core
which I believe is the right place to do the fixup function, we don't have direct access
to the fwnode that the links should go to, it would have to be discovered by recursively
walking up the tree looking for the flag in the new fixup function
instead of where the fwnode links are made.

To me, it doesn't matter which order we call the functions,
but if we are starting with fwnode links that refuse
to be converted to device links, we need to do more than just move the fwnode links
because after a probe defer there is no hook to automatically try switching them
to device links again. Driver core expects that to have already happened by then.
I imagine that without having to add a lot more code in a lot of places,
I would have to call fw_devlink_link_device() after "moving" links anyway...

It's possible to call that function only when the bad link is still a fwnode_link
and do a "move" function when the bad link is a device_link, that is,
if "moving" a finished device_link is possible or good practice at all
since it would be skipping quite a few checks that occur before a device_link is made.
It seems to me that making a device_link is a multiple-step process, so a new function
to only move the supplier of a device_link might be a big one as well.
I tend to try to reuse as many core functions as I could.

> 
> Thanks,
> Miquèl

--
MCP
Michael Pratt Feb. 25, 2024, 8:37 p.m. UTC | #5
Hi Saravana,

Thanks for taking a look at the patch series.
I'm going to write a longer reply to the rest of what you said later.

On Monday, February 5th, 2024 at 14:25, Saravana Kannan <saravanak@google.com> wrote:

> 
> 
> On Mon, Jan 22, 2024 at 5:46 PM Michael Pratt mcpratt@pm.me wrote:
> 
> > 
> > Signed-off-by: Michael Pratt mcpratt@pm.me
> 
> 
> NACK for a bunch of reasons.
> 
> 1. This logic should not be in fwnode_link_add(). That's supposed to
> do the exact linking that the firmware meant. This logic should be
> where the fwnode links are converted to device links or deferred
> probing decision is done.

Yeah, I can definitely make this change without any problems.

> 
> 2. If we handle one parent above correctly, it should be easy to
> handle multiple parents above too. So why not fix it where we handle
> multiple parents above?

I can do this too, but since going one parent up would have to occur
multiple times, it would still end up being a recursive search for the flag.
Without finding the right fwnode right away, the new fixup function here
would have to recursively call itself 3 or 4 times instead of once or twice,
and each call would result in going through the entire loop of 
fw_devlink_link_device() for the consumer which would overall take a little more time.


> 
> Btw, I'm happy to fix this or help you fix this once I understand the
> issue. Just wanted to give a quick NACK to avoid people spending
> cycles on this patch in its current state.


I totally understand. I am still new to kernel things, so I imagine that
whenever I think that I'm finished with something,
that I'm actually just halfway done with it, even though it took forever
to get to this point, haha.


> 
> -Saravana
> 
> > ---
> > drivers/base/core.c | 49 ++++++++++++++++++++++++++++++++++++------
> > include/linux/fwnode.h | 4 ++++
> > 2 files changed, 47 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index c05a5f6b0641..7f2652cf5edc 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -92,13 +92,45 @@ static int __fwnode_link_add(struct fwnode_handle *con,
> > return 0;
> > }
> > 
> > +static int __fwnode_link_add_parents(struct fwnode_handle *con,
> > + struct fwnode_handle *sup,
> > + u8 flags, bool loop)
> > +{
> > + int ret = -EINVAL;
> > + struct fwnode_handle parent;
> > +
> > + fwnode_for_each_parent_node(sup, parent) {
> > + / Ignore the root node */
> > + if (fwnode_count_parents(parent) &&
> > + fwnode_device_is_available(parent) &&
> > + !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
> > + ret = __fwnode_link_add(con, parent, flags);
> > + }
> > +
> > + if (!loop && !ret) {
> > + fwnode_handle_put(parent);
> > + return 0;
> > + }
> > + }
> > +
> > + return ret;
> > +}
> > +
> > int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> > {
> > int ret;
> > 
> > mutex_lock(&fwnode_link_lock);
> > - ret = __fwnode_link_add(con, sup, 0);
> > +
> > + if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + ret = __fwnode_link_add_parents(con, sup, 0, false);
> > + else
> > + ret = __fwnode_link_add(con, sup, 0);
> > +
> > mutex_unlock(&fwnode_link_lock);
> > +
> > return ret;
> > }
> > 
> > @@ -218,7 +250,8 @@ static void __fwnode_links_move_consumers(struct fwnode_handle *from,
> > * MANAGED device links to this device, so leave @fwnode and its descendant's
> > * fwnode links alone.
> > *
> > - * Otherwise, move its consumers to the new supplier @new_sup.
> > + * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
> > + * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
> > */
> > static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > struct fwnode_handle *new_sup)
> > @@ -228,8 +261,11 @@ static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
> > if (fwnode->dev && fwnode->dev->driver)
> > return;
> > 
> > - fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
> > - __fwnode_links_move_consumers(fwnode, new_sup);
> > + if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
> > + __fwnode_links_move_consumers(fwnode, new_sup);
> > +
> > + fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
> > + new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
> > 
> > fwnode_for_each_available_child_node(fwnode, child)
> > __fw_devlink_pickup_dangling_consumers(child, new_sup);
> > @@ -2071,8 +2107,9 @@ static int fw_devlink_create_devlink(struct device *con,
> > device_links_write_unlock();
> > }
> > 
> > - if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> > - sup_dev = fwnode_get_next_parent_dev(sup_handle);
> > + if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
> > + !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
> > + return -EINVAL;
> > else
> > sup_dev = get_dev_from_fwnode(sup_handle);
> > 
> > diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
> > index 2a72f55d26eb..3ed8ba503062 100644
> > --- a/include/linux/fwnode.h
> > +++ b/include/linux/fwnode.h
> > @@ -30,6 +30,9 @@ struct device;
> > * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
> > * suppliers. Only enforce ordering with suppliers that have
> > * drivers.
> > + * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
> > + * struct device, which is more suitable for device links
> > + * than the fwnode child which may never have a struct device.
> > */
> > #define FWNODE_FLAG_LINKS_ADDED BIT(0)
> > #define FWNODE_FLAG_NOT_DEVICE BIT(1)
> > @@ -37,6 +40,7 @@ struct device;
> > #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD BIT(3)
> > #define FWNODE_FLAG_BEST_EFFORT BIT(4)
> > #define FWNODE_FLAG_VISITED BIT(5)
> > +#define FWNODE_FLAG_PARENT_IS_DEV BIT(6)
> > 
> > struct fwnode_handle {
> > struct fwnode_handle *secondary;
> > --
> > 2.30.2

--
MCP
diff mbox series

Patch

diff --git a/drivers/base/core.c b/drivers/base/core.c
index c05a5f6b0641..7f2652cf5edc 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -92,13 +92,45 @@  static int __fwnode_link_add(struct fwnode_handle *con,
 	return 0;
 }
 
+static int __fwnode_link_add_parents(struct fwnode_handle *con,
+				     struct fwnode_handle *sup,
+				     u8 flags, bool loop)
+{
+	int ret = -EINVAL;
+	struct fwnode_handle *parent;
+
+	fwnode_for_each_parent_node(sup, parent) {
+		/* Ignore the root node */
+		if (fwnode_count_parents(parent) &&
+		    fwnode_device_is_available(parent) &&
+		  !(parent->flags & FWNODE_FLAG_NOT_DEVICE) &&
+		  !(parent->flags & FWNODE_FLAG_PARENT_IS_DEV)) {
+			ret = __fwnode_link_add(con, parent, flags);
+		}
+
+		if (!loop && !ret) {
+			fwnode_handle_put(parent);
+			return 0;
+		}
+	}
+
+	return ret;
+}
+
 int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
 {
 	int ret;
 
 	mutex_lock(&fwnode_link_lock);
-	ret = __fwnode_link_add(con, sup, 0);
+
+	if ((sup->flags & FWNODE_FLAG_NOT_DEVICE) &&
+	    (sup->flags & FWNODE_FLAG_PARENT_IS_DEV))
+		ret = __fwnode_link_add_parents(con, sup, 0, false);
+	else
+		ret = __fwnode_link_add(con, sup, 0);
+
 	mutex_unlock(&fwnode_link_lock);
+
 	return ret;
 }
 
@@ -218,7 +250,8 @@  static void __fwnode_links_move_consumers(struct fwnode_handle *from,
  * MANAGED device links to this device, so leave @fwnode and its descendant's
  * fwnode links alone.
  *
- * Otherwise, move its consumers to the new supplier @new_sup.
+ * Otherwise, flag descendants of @fwnode as having a parent fwnode for a device
+ * that has probed and move bad fwnode consumer links from @fwnode to @new_sup.
  */
 static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
 						   struct fwnode_handle *new_sup)
@@ -228,8 +261,11 @@  static void __fw_devlink_pickup_dangling_consumers(struct fwnode_handle *fwnode,
 	if (fwnode->dev && fwnode->dev->driver)
 		return;
 
-	fwnode->flags |= FWNODE_FLAG_NOT_DEVICE;
-	__fwnode_links_move_consumers(fwnode, new_sup);
+	if (fwnode->flags & FWNODE_FLAG_NOT_DEVICE)
+		__fwnode_links_move_consumers(fwnode, new_sup);
+
+	fwnode->flags |= FWNODE_FLAG_PARENT_IS_DEV;
+	new_sup->flags &= ~(FWNODE_FLAG_PARENT_IS_DEV);
 
 	fwnode_for_each_available_child_node(fwnode, child)
 		__fw_devlink_pickup_dangling_consumers(child, new_sup);
@@ -2071,8 +2107,9 @@  static int fw_devlink_create_devlink(struct device *con,
 		device_links_write_unlock();
 	}
 
-	if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
-		sup_dev = fwnode_get_next_parent_dev(sup_handle);
+	if ((sup_handle->flags & FWNODE_FLAG_NOT_DEVICE) &&
+	   !(sup_handle->flags & FWNODE_FLAG_PARENT_IS_DEV))
+		return -EINVAL;
 	else
 		sup_dev = get_dev_from_fwnode(sup_handle);
 
diff --git a/include/linux/fwnode.h b/include/linux/fwnode.h
index 2a72f55d26eb..3ed8ba503062 100644
--- a/include/linux/fwnode.h
+++ b/include/linux/fwnode.h
@@ -30,6 +30,9 @@  struct device;
  * BEST_EFFORT: The fwnode/device needs to probe early and might be missing some
  *		suppliers. Only enforce ordering with suppliers that have
  *		drivers.
+ * PARENT_IS_DEV: The fwnode is a child of a fwnode that is or will be populated as a
+ *		  struct device, which is more suitable for device links
+ *		  than the fwnode child which may never have a struct device.
  */
 #define FWNODE_FLAG_LINKS_ADDED			BIT(0)
 #define FWNODE_FLAG_NOT_DEVICE			BIT(1)
@@ -37,6 +40,7 @@  struct device;
 #define FWNODE_FLAG_NEEDS_CHILD_BOUND_ON_ADD	BIT(3)
 #define FWNODE_FLAG_BEST_EFFORT			BIT(4)
 #define FWNODE_FLAG_VISITED			BIT(5)
+#define FWNODE_FLAG_PARENT_IS_DEV		BIT(6)
 
 struct fwnode_handle {
 	struct fwnode_handle *secondary;