diff mbox series

[v2] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default

Message ID 20210116011412.3211292-1-saravanak@google.com
State New
Headers show
Series [v2] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default | expand

Commit Message

Saravana Kannan Jan. 16, 2021, 1:14 a.m. UTC
There are multiple instances of GPIO devictree nodes of the form:

foo {
	compatible = "acme,foo";
	...

	gpio0: gpio0@xxxxxxxx {
		compatible = "acme,bar";
		...
		gpio-controller;
	};

	gpio1: gpio1@xxxxxxxx {
		compatible = "acme,bar";
		...
		gpio-controller;
	};

	...
}

bazz {
	my-gpios = <&gpio0 ...>;
}

Case 1: The driver for "foo" populates struct device for these gpio*
nodes and then probes them using a driver that binds with "acme,bar".
This lines up with how DT nodes with the "compatible" property are
generally converted to struct devices and then registered with driver
core to probe them. This also allows the gpio* devices to hook into all
the driver core capabilities like runtime PM, probe deferral,
suspend/resume ordering, device links, etc.

Case 2: The driver for "foo" doesn't populate its child device nodes
with "compatible" property and instead just loops through its child
nodes and directly registers the GPIOs with gpiolib without ever
populating a struct device or binding a driver to it.

Drivers that follow the case 2 cause problems with fw_devlink=on.  This
is because fw_devlink will prevent bazz from probing until there's a
struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
a GPIO supplier). Once the struct device is available, fw_devlink will
create a device link between with gpio0 as the supplier and bazz as the
consumer. After this point, the device link will prevent bazz from
probing until its supplier (the gpio0 device) has bound to a driver.
Once the supplier is bound to a driver, the probe of bazz is triggered
automatically.

Finding and refactoring all the instances of drivers that follow case 2
will cause a lot of code churn and it not something that can be done in
one shot. Examples of such instances are [1] [2].

This patch works around this problem and avoids all the code churn by
simply creating a stub driver to bind to the gpio_device. Since the
gpio_device already points to the GPIO device tree node, this allows all
the consumers to continue probing when the driver follows case 2.

If/when all the old drivers are refactored, we can revert this patch.

[1] - https://lore.kernel.org/lkml/20201014191235.7f71fcb4@xhacker.debian/
[2] - https://lore.kernel.org/lkml/e28e1f38d87c12a3c714a6573beba6e1@kernel.org/
Cc: Marc Zyngier <maz@kernel.org>
Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Cc: Kever Yang <kever.yang@rock-chips.com>
Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Saravana Kannan <saravanak@google.com>
---
v1 -> v2:
- Fixed up compilation errors that were introduced accidentally
- Fixed a missing put_device()

 drivers/gpio/gpiolib.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

Comments

Andy Shevchenko Jan. 16, 2021, 8:37 p.m. UTC | #1
On Sat, Jan 16, 2021 at 3:15 AM Saravana Kannan <saravanak@google.com> wrote:

> v1 -> v2:
> - Fixed up compilation errors that were introduced accidentally
> - Fixed a missing put_device()

See my comments as per v1 and address.
Marc Zyngier Jan. 17, 2021, 12:02 p.m. UTC | #2
Hi Saravana,

Thanks for posting this, much appreciated.

On Sat, 16 Jan 2021 01:14:11 +0000,
Saravana Kannan <saravanak@google.com> wrote:
> 
> There are multiple instances of GPIO devictree nodes of the form:
> 
> foo {
> 	compatible = "acme,foo";
> 	...
> 
> 	gpio0: gpio0@xxxxxxxx {
> 		compatible = "acme,bar";
> 		...
> 		gpio-controller;
> 	};
> 
> 	gpio1: gpio1@xxxxxxxx {
> 		compatible = "acme,bar";
> 		...
> 		gpio-controller;
> 	};
> 
> 	...
> }
> 
> bazz {
> 	my-gpios = <&gpio0 ...>;
> }
> 
> Case 1: The driver for "foo" populates struct device for these gpio*
> nodes and then probes them using a driver that binds with "acme,bar".
> This lines up with how DT nodes with the "compatible" property are
> generally converted to struct devices and then registered with driver
> core to probe them. This also allows the gpio* devices to hook into all
> the driver core capabilities like runtime PM, probe deferral,
> suspend/resume ordering, device links, etc.
> 
> Case 2: The driver for "foo" doesn't populate its child device nodes
> with "compatible" property and instead just loops through its child
> nodes and directly registers the GPIOs with gpiolib without ever
> populating a struct device or binding a driver to it.

That's not quite an accurate description. The gpiolib subsystem does
create a struct device. It doesn't register a driver though, which is
what causes the issue with fr_devlink (more on that below).

> 
> Drivers that follow the case 2 cause problems with fw_devlink=on.  This
> is because fw_devlink will prevent bazz from probing until there's a
> struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> a GPIO supplier). Once the struct device is available, fw_devlink will
> create a device link between with gpio0 as the supplier and bazz as the
> consumer. After this point, the device link will prevent bazz from
> probing until its supplier (the gpio0 device) has bound to a driver.
> Once the supplier is bound to a driver, the probe of bazz is triggered
> automatically.
> 
> Finding and refactoring all the instances of drivers that follow case 2
> will cause a lot of code churn and it not something that can be done in
> one shot. Examples of such instances are [1] [2].
> 
> This patch works around this problem and avoids all the code churn by
> simply creating a stub driver to bind to the gpio_device. Since the
> gpio_device already points to the GPIO device tree node, this allows all
> the consumers to continue probing when the driver follows case 2.
> 
> If/when all the old drivers are refactored, we can revert this
> patch.

My personal gripe with this approach is that it is an abrupt change in
the way DT and device model are mapped onto each other.

As far as I know (and someone please correct me if I am wrong), there
is zero expectation that a device/subdevice/random IP block described
by a node with a "compatible" property will end-up being managed by a
driver that is bound to that particular node.

The node/subnode division is a good way to express some HW boundaries,
but doesn't say anything about the way this should be handled in the
kernel. Assuming that everything containing a "compatible" string will
eventually be bound to a driver is IMO pretty limiting.

> 
> [1] - https://lore.kernel.org/lkml/20201014191235.7f71fcb4@xhacker.debian/
> [2] - https://lore.kernel.org/lkml/e28e1f38d87c12a3c714a6573beba6e1@kernel.org/
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
> Cc: Kever Yang <kever.yang@rock-chips.com>
> Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Saravana Kannan <saravanak@google.com>
> ---
> v1 -> v2:
> - Fixed up compilation errors that were introduced accidentally
> - Fixed a missing put_device()
> 
>  drivers/gpio/gpiolib.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
> 
> diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> index b02cc2abd3b6..12c579a953b0 100644
> --- a/drivers/gpio/gpiolib.c
> +++ b/drivers/gpio/gpiolib.c
> @@ -574,6 +574,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>  	unsigned	i;
>  	int		base = gc->base;
>  	struct gpio_device *gdev;
> +	struct device_node *of_node;
> +	struct fwnode_handle *fwnode;
> +	struct device *fwnode_dev;
>  
>  	/*
>  	 * First: allocate and populate the internal stat container, and
> @@ -596,6 +599,22 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>  		gdev->dev.of_node = gc->of_node;
>  	else
>  		gc->of_node = gdev->dev.of_node;
> +
> +	of_node = gdev->dev.of_node;
> +	fwnode = of_fwnode_handle(of_node);
> +	fwnode_dev = get_dev_from_fwnode(fwnode);
> +	/*
> +	 * If your driver hits this warning, it's because you are directly
> +	 * parsing a device tree node with "compatible" property and
> +	 * initializing it instead of using the standard DT + device driver
> +	 * model of creating a struct device and then initializing it in the
> +	 * probe function. Please refactor your driver.
> +	 */
> +	if (!fwnode_dev && of_find_property(of_node, "compatible", NULL)) {
> +		chip_warn(gc, "Create a real device for %pOF\n", of_node);

chip_warn() is not a good idea here, as gc->dev hasn't been
initialised yet, and results in the following output:

[    0.113996] gpio (null): (gpio0): Create a real device for /pinctrl/gpio0@ff720000
[    0.114727] gpio (null): (gpio1): Create a real device for /pinctrl/gpio1@ff730000
[    0.115340] gpio (null): (gpio2): Create a real device for /pinctrl/gpio2@ff780000
[    0.115912] gpio (null): (gpio3): Create a real device for /pinctrl/gpio3@ff788000
[    0.116437] gpio (null): (gpio4): Create a real device for /pinctrl/gpio4@ff790000

> +		gdev->dev.fwnode = fwnode;
> +	}
> +	put_device(fwnode_dev);
>  #endif
>  
>  	gdev->id = ida_alloc(&gpio_ida, GFP_KERNEL);
> @@ -4202,6 +4221,17 @@ void gpiod_put_array(struct gpio_descs *descs)
>  }
>  EXPORT_SYMBOL_GPL(gpiod_put_array);
>  
> +static int gpio_drv_probe(struct device *dev)
> +{
> +	return 0;
> +}
> +
> +static struct device_driver gpio_drv = {
> +	.name = "gpio_drv",
> +	.bus = &gpio_bus_type,
> +	.probe = gpio_drv_probe,
> +};
> +
>  static int __init gpiolib_dev_init(void)
>  {
>  	int ret;
> @@ -4213,9 +4243,16 @@ static int __init gpiolib_dev_init(void)
>  		return ret;
>  	}
>  
> +	if (driver_register(&gpio_drv) < 0) {
> +		pr_err("gpiolib: could not register GPIO stub driver\n");
> +		bus_unregister(&gpio_bus_type);
> +		return ret;
> +	}
> +
>  	ret = alloc_chrdev_region(&gpio_devt, 0, GPIO_DEV_MAX, GPIOCHIP_NAME);
>  	if (ret < 0) {
>  		pr_err("gpiolib: failed to allocate char dev region\n");
> +		driver_unregister(&gpio_drv);
>  		bus_unregister(&gpio_bus_type);
>  		return ret;
>  	}

On the positive side, this patch brings my RK3399 system back to life.
However, on a system that doesn't suffer from this problem, I end-up
with the following issue:

$ ls -l /sys/bus/platform/drivers/mb86s70-gpio/51000000.gpio/of_node
lrwxrwxrwx 1 root root 0 Jan 17 11:15 /sys/bus/platform/drivers/mb86s70-gpio/51000000.gpio/of_node -> ../../../firmware/devicetree/base/gpio@51000000
$ ls -l /sys/bus/gpio/drivers/gpio_drv/gpiochip0/of_node
lrwxrwxrwx 1 root root 0 Jan 16 18:30 /sys/bus/gpio/drivers/gpio_drv/gpiochip0/of_node -> ../../../../firmware/devicetree/base/gpio@51000000

where two drivers are now handling the same of_node. Somehow, I don't
think this is a good idea, even if I didn't spot anything problematic
yet (maybe because there isn't anything useful hanging off the GPIOs
on this particular machine).

Overall, I'm concerned that this is a change in semantics that affects
the whole device model, and I wonder whether forcing everyone in the
same mould is the right approach.

An alternative I've been thinking about is to flag the device as
"satisfying existing dependencies" at the point where it is
registered. Because at the end of the day, we don't really care for
the *driver* part of it (case in point, the stub driver you introduced
doesn't do anything). We only care about *someone* saying "this device
is ready to be used".

So how about cutting the middleman?

Thanks,

	M.
Saravana Kannan Jan. 18, 2021, 8:38 p.m. UTC | #3
On Sun, Jan 17, 2021 at 4:02 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Saravana,
>
> Thanks for posting this, much appreciated.
>
> On Sat, 16 Jan 2021 01:14:11 +0000,
> Saravana Kannan <saravanak@google.com> wrote:
> >
> > There are multiple instances of GPIO devictree nodes of the form:
> >
> > foo {
> >       compatible = "acme,foo";
> >       ...
> >
> >       gpio0: gpio0@xxxxxxxx {
> >               compatible = "acme,bar";
> >               ...
> >               gpio-controller;
> >       };
> >
> >       gpio1: gpio1@xxxxxxxx {
> >               compatible = "acme,bar";
> >               ...
> >               gpio-controller;
> >       };
> >
> >       ...
> > }
> >
> > bazz {
> >       my-gpios = <&gpio0 ...>;
> > }
> >
> > Case 1: The driver for "foo" populates struct device for these gpio*
> > nodes and then probes them using a driver that binds with "acme,bar".
> > This lines up with how DT nodes with the "compatible" property are
> > generally converted to struct devices and then registered with driver
> > core to probe them. This also allows the gpio* devices to hook into all
> > the driver core capabilities like runtime PM, probe deferral,
> > suspend/resume ordering, device links, etc.
> >
> > Case 2: The driver for "foo" doesn't populate its child device nodes
> > with "compatible" property and instead just loops through its child
> > nodes and directly registers the GPIOs with gpiolib without ever
> > populating a struct device or binding a driver to it.
>
> That's not quite an accurate description. The gpiolib subsystem does
> create a struct device. It doesn't register a driver though, which is
> what causes the issue with fr_devlink (more on that below).

The devices created by gpiolib care are created for case 1 and case 2.
They are just devices gpiolib uses to represent a virtual software
device to hook different attributes to and expose them in sysfs. I'm
not talking about those devices here. The devices I'm referring to are
devices that represent the actual HW IP -- so what I'm saying is
accurate.

> >
> > Drivers that follow the case 2 cause problems with fw_devlink=on.  This
> > is because fw_devlink will prevent bazz from probing until there's a
> > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> > a GPIO supplier). Once the struct device is available, fw_devlink will
> > create a device link between with gpio0 as the supplier and bazz as the
> > consumer. After this point, the device link will prevent bazz from
> > probing until its supplier (the gpio0 device) has bound to a driver.
> > Once the supplier is bound to a driver, the probe of bazz is triggered
> > automatically.
> >
> > Finding and refactoring all the instances of drivers that follow case 2
> > will cause a lot of code churn and it not something that can be done in
> > one shot. Examples of such instances are [1] [2].
> >
> > This patch works around this problem and avoids all the code churn by
> > simply creating a stub driver to bind to the gpio_device. Since the
> > gpio_device already points to the GPIO device tree node, this allows all
> > the consumers to continue probing when the driver follows case 2.
> >
> > If/when all the old drivers are refactored, we can revert this
> > patch.
>
> My personal gripe with this approach is that it is an abrupt change in
> the way DT and device model are mapped onto each other.
>
> As far as I know (and someone please correct me if I am wrong), there
> is zero expectation that a device/subdevice/random IP block described
> by a node with a "compatible" property will end-up being managed by a
> driver that is bound to that particular node.
>
> The node/subnode division is a good way to express some HW boundaries,
> but doesn't say anything about the way this should be handled in the
> kernel. Assuming that everything containing a "compatible" string will
> eventually be bound to a driver is IMO pretty limiting.

The default behavior of of_platform_populate() is to create a struct
device for every node with "compatible" property. That's how top level
devices (or children of simple-bus devices) are populated. IIRC,
that's what a lot of other busses do too. So, if anything, not having
a struct device for nodes with "compatible" property is an exception.

Honestly, if one has a driver that supports a HW IP, I don't see any
reason it should operate outside of the device-driver model supported
by driver core. The driver code is there for a reason -- to solve all
the common problems faced by drivers. Operating outside of it just
causes reinventing the wheel, things like playing chicken with
initcalls to make sure drivers initialize their device in the right
order, not working with deferred probe, etc. For example, the rockchip
driver in your device (the one that follows case 2) tries to get some
clocks. But if that fails with -EPROBE_DEFER, the driver has no way
for it to recover and just doesn't register the GPIO anymore.

Obviously exceptions are allowed for devices that are needed before
the driver core even comes up -- like early, clocks, irqs and timers
for the kernel/scheduler to kick off.

> >
> > [1] - https://lore.kernel.org/lkml/20201014191235.7f71fcb4@xhacker.debian/
> > [2] - https://lore.kernel.org/lkml/e28e1f38d87c12a3c714a6573beba6e1@kernel.org/
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
> > Cc: Kever Yang <kever.yang@rock-chips.com>
> > Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default")
> > Signed-off-by: Saravana Kannan <saravanak@google.com>
> > ---
> > v1 -> v2:
> > - Fixed up compilation errors that were introduced accidentally
> > - Fixed a missing put_device()
> >
> >  drivers/gpio/gpiolib.c | 37 +++++++++++++++++++++++++++++++++++++
> >  1 file changed, 37 insertions(+)
> >
> > diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
> > index b02cc2abd3b6..12c579a953b0 100644
> > --- a/drivers/gpio/gpiolib.c
> > +++ b/drivers/gpio/gpiolib.c
> > @@ -574,6 +574,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
> >       unsigned        i;
> >       int             base = gc->base;
> >       struct gpio_device *gdev;
> > +     struct device_node *of_node;
> > +     struct fwnode_handle *fwnode;
> > +     struct device *fwnode_dev;
> >
> >       /*
> >        * First: allocate and populate the internal stat container, and
> > @@ -596,6 +599,22 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
> >               gdev->dev.of_node = gc->of_node;
> >       else
> >               gc->of_node = gdev->dev.of_node;
> > +
> > +     of_node = gdev->dev.of_node;
> > +     fwnode = of_fwnode_handle(of_node);
> > +     fwnode_dev = get_dev_from_fwnode(fwnode);
> > +     /*
> > +      * If your driver hits this warning, it's because you are directly
> > +      * parsing a device tree node with "compatible" property and
> > +      * initializing it instead of using the standard DT + device driver
> > +      * model of creating a struct device and then initializing it in the
> > +      * probe function. Please refactor your driver.
> > +      */
> > +     if (!fwnode_dev && of_find_property(of_node, "compatible", NULL)) {
> > +             chip_warn(gc, "Create a real device for %pOF\n", of_node);
>
> chip_warn() is not a good idea here, as gc->dev hasn't been
> initialised yet, and results in the following output:

Thanks, will look into this.

>
> [    0.113996] gpio (null): (gpio0): Create a real device for /pinctrl/gpio0@ff720000
> [    0.114727] gpio (null): (gpio1): Create a real device for /pinctrl/gpio1@ff730000
> [    0.115340] gpio (null): (gpio2): Create a real device for /pinctrl/gpio2@ff780000
> [    0.115912] gpio (null): (gpio3): Create a real device for /pinctrl/gpio3@ff788000
> [    0.116437] gpio (null): (gpio4): Create a real device for /pinctrl/gpio4@ff790000
>
> > +             gdev->dev.fwnode = fwnode;
> > +     }
> > +     put_device(fwnode_dev);
> >  #endif
> >
> >       gdev->id = ida_alloc(&gpio_ida, GFP_KERNEL);
> > @@ -4202,6 +4221,17 @@ void gpiod_put_array(struct gpio_descs *descs)
> >  }
> >  EXPORT_SYMBOL_GPL(gpiod_put_array);
> >
> > +static int gpio_drv_probe(struct device *dev)
> > +{
> > +     return 0;
> > +}
> > +
> > +static struct device_driver gpio_drv = {
> > +     .name = "gpio_drv",
> > +     .bus = &gpio_bus_type,
> > +     .probe = gpio_drv_probe,
> > +};
> > +
> >  static int __init gpiolib_dev_init(void)
> >  {
> >       int ret;
> > @@ -4213,9 +4243,16 @@ static int __init gpiolib_dev_init(void)
> >               return ret;
> >       }
> >
> > +     if (driver_register(&gpio_drv) < 0) {
> > +             pr_err("gpiolib: could not register GPIO stub driver\n");
> > +             bus_unregister(&gpio_bus_type);
> > +             return ret;
> > +     }
> > +
> >       ret = alloc_chrdev_region(&gpio_devt, 0, GPIO_DEV_MAX, GPIOCHIP_NAME);
> >       if (ret < 0) {
> >               pr_err("gpiolib: failed to allocate char dev region\n");
> > +             driver_unregister(&gpio_drv);
> >               bus_unregister(&gpio_bus_type);
> >               return ret;
> >       }
>
> On the positive side, this patch brings my RK3399 system back to life.
> However, on a system that doesn't suffer from this problem, I end-up
> with the following issue:
>
> $ ls -l /sys/bus/platform/drivers/mb86s70-gpio/51000000.gpio/of_node
> lrwxrwxrwx 1 root root 0 Jan 17 11:15 /sys/bus/platform/drivers/mb86s70-gpio/51000000.gpio/of_node -> ../../../firmware/devicetree/base/gpio@51000000
> $ ls -l /sys/bus/gpio/drivers/gpio_drv/gpiochip0/of_node
> lrwxrwxrwx 1 root root 0 Jan 16 18:30 /sys/bus/gpio/drivers/gpio_drv/gpiochip0/of_node -> ../../../../firmware/devicetree/base/gpio@51000000
>
> where two drivers are now handling the same of_node. Somehow, I don't
> think this is a good idea, even if I didn't spot anything problematic
> yet (maybe because there isn't anything useful hanging off the GPIOs
> on this particular machine).

I agree it looks weird, but it doesn't really break anything and I'm
pretty confident about it. IIRC there are other cases (unrelated to
fw_devlink) where this happens too. Also, there are many cases of two
devices pointing to the same of_node. If we really don't like this,
then the solution is to have gpiolib NOT initialize the of_node like
this. I can also have the stub driver not probe this device for case
1. That's pretty easy to do.

> Overall, I'm concerned that this is a change in semantics that affects
> the whole device model, and I wonder whether forcing everyone in the
> same mould is the right approach.
>
> An alternative I've been thinking about is to flag the device as
> "satisfying existing dependencies" at the point where it is
> registered. Because at the end of the day, we don't really care for
> the *driver* part of it (case in point, the stub driver you introduced
> doesn't do anything). We only care about *someone* saying "this device
> is ready to be used".

This might be doable (haven't fully thought of the corner cases it
might cause for device links) if gpiolib had put the device in a class
(instead of a bus) because "class" devices don't probe. "Bus" devices
are expected to probe and there's no way to tell if it's safe to
ignore waiting for this device to probe. That'd break device links
(not just fw_devlink) in a fundamental way too. Also, I don't know the
reasons why gpiolib uses a bus instead of a class, but if we change
that now, it's going to affect the sysfs layout and break UAPI.

So, IMHO, this patch (with the clean up for Case 1) is the best
compromise to get fw_devlink=on working. Considering how trivial this
patch is and fw_devlink=on helping out a lot of people, I'd recommend
we go with this patch and if we find a better solution later, we can
implement that and remove this.

-Saravana
Marc Zyngier Jan. 20, 2021, 3:39 p.m. UTC | #4
On 2021-01-18 20:38, Saravana Kannan wrote:
> On Sun, Jan 17, 2021 at 4:02 AM Marc Zyngier <maz@kernel.org> wrote:
>> 
>> Hi Saravana,
>> 
>> Thanks for posting this, much appreciated.
>> 
>> On Sat, 16 Jan 2021 01:14:11 +0000,
>> Saravana Kannan <saravanak@google.com> wrote:
>> >
>> > There are multiple instances of GPIO devictree nodes of the form:
>> >
>> > foo {
>> >       compatible = "acme,foo";
>> >       ...
>> >
>> >       gpio0: gpio0@xxxxxxxx {
>> >               compatible = "acme,bar";
>> >               ...
>> >               gpio-controller;
>> >       };
>> >
>> >       gpio1: gpio1@xxxxxxxx {
>> >               compatible = "acme,bar";
>> >               ...
>> >               gpio-controller;
>> >       };
>> >
>> >       ...
>> > }
>> >
>> > bazz {
>> >       my-gpios = <&gpio0 ...>;
>> > }
>> >
>> > Case 1: The driver for "foo" populates struct device for these gpio*
>> > nodes and then probes them using a driver that binds with "acme,bar".
>> > This lines up with how DT nodes with the "compatible" property are
>> > generally converted to struct devices and then registered with driver
>> > core to probe them. This also allows the gpio* devices to hook into all
>> > the driver core capabilities like runtime PM, probe deferral,
>> > suspend/resume ordering, device links, etc.
>> >
>> > Case 2: The driver for "foo" doesn't populate its child device nodes
>> > with "compatible" property and instead just loops through its child
>> > nodes and directly registers the GPIOs with gpiolib without ever
>> > populating a struct device or binding a driver to it.
>> 
>> That's not quite an accurate description. The gpiolib subsystem does
>> create a struct device. It doesn't register a driver though, which is
>> what causes the issue with fr_devlink (more on that below).
> 
> The devices created by gpiolib care are created for case 1 and case 2.
> They are just devices gpiolib uses to represent a virtual software
> device to hook different attributes to and expose them in sysfs. I'm
> not talking about those devices here. The devices I'm referring to are
> devices that represent the actual HW IP -- so what I'm saying is
> accurate.

Please read what you wrote : "without ever populating a struct device".
I stand by the "not quite accurate".

> 
>> >
>> > Drivers that follow the case 2 cause problems with fw_devlink=on.  This
>> > is because fw_devlink will prevent bazz from probing until there's a
>> > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
>> > a GPIO supplier). Once the struct device is available, fw_devlink will
>> > create a device link between with gpio0 as the supplier and bazz as the
>> > consumer. After this point, the device link will prevent bazz from
>> > probing until its supplier (the gpio0 device) has bound to a driver.
>> > Once the supplier is bound to a driver, the probe of bazz is triggered
>> > automatically.
>> >
>> > Finding and refactoring all the instances of drivers that follow case 2
>> > will cause a lot of code churn and it not something that can be done in
>> > one shot. Examples of such instances are [1] [2].
>> >
>> > This patch works around this problem and avoids all the code churn by
>> > simply creating a stub driver to bind to the gpio_device. Since the
>> > gpio_device already points to the GPIO device tree node, this allows all
>> > the consumers to continue probing when the driver follows case 2.
>> >
>> > If/when all the old drivers are refactored, we can revert this
>> > patch.
>> 
>> My personal gripe with this approach is that it is an abrupt change in
>> the way DT and device model are mapped onto each other.
>> 
>> As far as I know (and someone please correct me if I am wrong), there
>> is zero expectation that a device/subdevice/random IP block described
>> by a node with a "compatible" property will end-up being managed by a
>> driver that is bound to that particular node.
>> 
>> The node/subnode division is a good way to express some HW boundaries,
>> but doesn't say anything about the way this should be handled in the
>> kernel. Assuming that everything containing a "compatible" string will
>> eventually be bound to a driver is IMO pretty limiting.
> 
> The default behavior of of_platform_populate() is to create a struct
> device for every node with "compatible" property. That's how top level
> devices (or children of simple-bus devices) are populated. IIRC,
> that's what a lot of other busses do too. So, if anything, not having
> a struct device for nodes with "compatible" property is an exception.

Again, that's not a description of the reality. The case we are talking
about here does have a struct device. The misconception you keep
repeating is that binding it to a driver is the only valid approach.

> 
> Honestly, if one has a driver that supports a HW IP, I don't see any
> reason it should operate outside of the device-driver model supported
> by driver core. The driver code is there for a reason -- to solve all
> the common problems faced by drivers.

News flash: this isn't the case. Most of the code I deal with cannot
be represented as a driver, because it is needed way earlier
than the device model.

> Operating outside of it just
> causes reinventing the wheel, things like playing chicken with
> initcalls to make sure drivers initialize their device in the right
> order, not working with deferred probe, etc. For example, the rockchip
> driver in your device (the one that follows case 2) tries to get some
> clocks. But if that fails with -EPROBE_DEFER, the driver has no way
> for it to recover and just doesn't register the GPIO anymore.
> 
> Obviously exceptions are allowed for devices that are needed before
> the driver core even comes up -- like early, clocks, irqs and timers
> for the kernel/scheduler to kick off.

There you go. Exceptions. Tons of. The trouble is, you are rewriting
the rules of the game. Except that you are about 10 year late to
the party. Forcing your changes on everyone by saying that was
perfectly valid a month ago is now illegal doesn't really fly.

Anyway, I said what I had to say. If platforms break with this
change, I'll expect it to be disabled in 5.12.

Thanks,

         M.
Greg Kroah-Hartman Jan. 20, 2021, 3:48 p.m. UTC | #5
On Wed, Jan 20, 2021 at 03:39:30PM +0000, Marc Zyngier wrote:
> On 2021-01-18 20:38, Saravana Kannan wrote:
> > On Sun, Jan 17, 2021 at 4:02 AM Marc Zyngier <maz@kernel.org> wrote:
> > > 
> > > Hi Saravana,
> > > 
> > > Thanks for posting this, much appreciated.
> > > 
> > > On Sat, 16 Jan 2021 01:14:11 +0000,
> > > Saravana Kannan <saravanak@google.com> wrote:
> > > >
> > > > There are multiple instances of GPIO devictree nodes of the form:
> > > >
> > > > foo {
> > > >       compatible = "acme,foo";
> > > >       ...
> > > >
> > > >       gpio0: gpio0@xxxxxxxx {
> > > >               compatible = "acme,bar";
> > > >               ...
> > > >               gpio-controller;
> > > >       };
> > > >
> > > >       gpio1: gpio1@xxxxxxxx {
> > > >               compatible = "acme,bar";
> > > >               ...
> > > >               gpio-controller;
> > > >       };
> > > >
> > > >       ...
> > > > }
> > > >
> > > > bazz {
> > > >       my-gpios = <&gpio0 ...>;
> > > > }
> > > >
> > > > Case 1: The driver for "foo" populates struct device for these gpio*
> > > > nodes and then probes them using a driver that binds with "acme,bar".
> > > > This lines up with how DT nodes with the "compatible" property are
> > > > generally converted to struct devices and then registered with driver
> > > > core to probe them. This also allows the gpio* devices to hook into all
> > > > the driver core capabilities like runtime PM, probe deferral,
> > > > suspend/resume ordering, device links, etc.
> > > >
> > > > Case 2: The driver for "foo" doesn't populate its child device nodes
> > > > with "compatible" property and instead just loops through its child
> > > > nodes and directly registers the GPIOs with gpiolib without ever
> > > > populating a struct device or binding a driver to it.
> > > 
> > > That's not quite an accurate description. The gpiolib subsystem does
> > > create a struct device. It doesn't register a driver though, which is
> > > what causes the issue with fr_devlink (more on that below).
> > 
> > The devices created by gpiolib care are created for case 1 and case 2.
> > They are just devices gpiolib uses to represent a virtual software
> > device to hook different attributes to and expose them in sysfs. I'm
> > not talking about those devices here. The devices I'm referring to are
> > devices that represent the actual HW IP -- so what I'm saying is
> > accurate.
> 
> Please read what you wrote : "without ever populating a struct device".
> I stand by the "not quite accurate".
> 
> > 
> > > >
> > > > Drivers that follow the case 2 cause problems with fw_devlink=on.  This
> > > > is because fw_devlink will prevent bazz from probing until there's a
> > > > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> > > > a GPIO supplier). Once the struct device is available, fw_devlink will
> > > > create a device link between with gpio0 as the supplier and bazz as the
> > > > consumer. After this point, the device link will prevent bazz from
> > > > probing until its supplier (the gpio0 device) has bound to a driver.
> > > > Once the supplier is bound to a driver, the probe of bazz is triggered
> > > > automatically.
> > > >
> > > > Finding and refactoring all the instances of drivers that follow case 2
> > > > will cause a lot of code churn and it not something that can be done in
> > > > one shot. Examples of such instances are [1] [2].
> > > >
> > > > This patch works around this problem and avoids all the code churn by
> > > > simply creating a stub driver to bind to the gpio_device. Since the
> > > > gpio_device already points to the GPIO device tree node, this allows all
> > > > the consumers to continue probing when the driver follows case 2.
> > > >
> > > > If/when all the old drivers are refactored, we can revert this
> > > > patch.
> > > 
> > > My personal gripe with this approach is that it is an abrupt change in
> > > the way DT and device model are mapped onto each other.
> > > 
> > > As far as I know (and someone please correct me if I am wrong), there
> > > is zero expectation that a device/subdevice/random IP block described
> > > by a node with a "compatible" property will end-up being managed by a
> > > driver that is bound to that particular node.
> > > 
> > > The node/subnode division is a good way to express some HW boundaries,
> > > but doesn't say anything about the way this should be handled in the
> > > kernel. Assuming that everything containing a "compatible" string will
> > > eventually be bound to a driver is IMO pretty limiting.
> > 
> > The default behavior of of_platform_populate() is to create a struct
> > device for every node with "compatible" property. That's how top level
> > devices (or children of simple-bus devices) are populated. IIRC,
> > that's what a lot of other busses do too. So, if anything, not having
> > a struct device for nodes with "compatible" property is an exception.
> 
> Again, that's not a description of the reality. The case we are talking
> about here does have a struct device. The misconception you keep
> repeating is that binding it to a driver is the only valid approach.
> 
> > 
> > Honestly, if one has a driver that supports a HW IP, I don't see any
> > reason it should operate outside of the device-driver model supported
> > by driver core. The driver code is there for a reason -- to solve all
> > the common problems faced by drivers.
> 
> News flash: this isn't the case. Most of the code I deal with cannot
> be represented as a driver, because it is needed way earlier
> than the device model.
> 
> > Operating outside of it just
> > causes reinventing the wheel, things like playing chicken with
> > initcalls to make sure drivers initialize their device in the right
> > order, not working with deferred probe, etc. For example, the rockchip
> > driver in your device (the one that follows case 2) tries to get some
> > clocks. But if that fails with -EPROBE_DEFER, the driver has no way
> > for it to recover and just doesn't register the GPIO anymore.
> > 
> > Obviously exceptions are allowed for devices that are needed before
> > the driver core even comes up -- like early, clocks, irqs and timers
> > for the kernel/scheduler to kick off.
> 
> There you go. Exceptions. Tons of. The trouble is, you are rewriting
> the rules of the game. Except that you are about 10 year late to
> the party. Forcing your changes on everyone by saying that was
> perfectly valid a month ago is now illegal doesn't really fly.
> 
> Anyway, I said what I had to say. If platforms break with this
> change, I'll expect it to be disabled in 5.12.

I'm thinking we can not change the default and will probably revert this
patch "soon".

thanks,

greg k-h
Marc Zyngier Jan. 20, 2021, 3:58 p.m. UTC | #6
On 2021-01-20 15:48, Greg Kroah-Hartman wrote:
> On Wed, Jan 20, 2021 at 03:39:30PM +0000, Marc Zyngier wrote:

>> Anyway, I said what I had to say. If platforms break with this
>> change, I'll expect it to be disabled in 5.12.
> 
> I'm thinking we can not change the default and will probably revert 
> this
> patch "soon".

I think there is a lot of value in keeping this enabled for a bit,
so that we can work out what breaks, and find solutions that scale
a bit better.

Thanks,

         M.
Greg Kroah-Hartman Jan. 20, 2021, 5:01 p.m. UTC | #7
On Wed, Jan 20, 2021 at 03:58:29PM +0000, Marc Zyngier wrote:
> On 2021-01-20 15:48, Greg Kroah-Hartman wrote:
> > On Wed, Jan 20, 2021 at 03:39:30PM +0000, Marc Zyngier wrote:
> 
> > > Anyway, I said what I had to say. If platforms break with this
> > > change, I'll expect it to be disabled in 5.12.
> > 
> > I'm thinking we can not change the default and will probably revert this
> > patch "soon".
> 
> I think there is a lot of value in keeping this enabled for a bit,
> so that we can work out what breaks, and find solutions that scale
> a bit better.

Ok, will leave it alone for a few more weeks.

thanks,

greg k-h
Linus Walleij Jan. 21, 2021, 1 p.m. UTC | #8
On Sat, Jan 16, 2021 at 2:14 AM Saravana Kannan <saravanak@google.com> wrote:

> There are multiple instances of GPIO devictree nodes of the form:
>
> foo {
>         compatible = "acme,foo";
>         ...
>
>         gpio0: gpio0@xxxxxxxx {
>                 compatible = "acme,bar";
>                 ...
>                 gpio-controller;
>         };
>
>         gpio1: gpio1@xxxxxxxx {
>                 compatible = "acme,bar";
>                 ...
>                 gpio-controller;
>         };
>
>         ...
> }
>
> bazz {
>         my-gpios = <&gpio0 ...>;
> }
>
> Case 1: The driver for "foo" populates struct device for these gpio*
> nodes and then probes them using a driver that binds with "acme,bar".
> This lines up with how DT nodes with the "compatible" property are
> generally converted to struct devices and then registered with driver
> core to probe them. This also allows the gpio* devices to hook into all
> the driver core capabilities like runtime PM, probe deferral,
> suspend/resume ordering, device links, etc.

Agreed. I usually tried to get driver authors to do this.

Sometimes they have been very strongly convinced that this is not
right for their hardware which leads to case 2.

> Case 2: The driver for "foo" doesn't populate its child device nodes
> with "compatible" property and instead just loops through its child
> nodes and directly registers the GPIOs with gpiolib without ever
> populating a struct device or binding a driver to it.

This usually happens when the registers for these "subdevices"
are mixed up in the address space, so there could not be a clean
"reg" property on the node, i.e. it breaks another implicit assumtion
that each such gpio node is a separate hardware and register map
entity.

They may still have "ports" or "banks" of GPIO that make sense
to separate into logical nodes and this is most often why they
do this.

I bet there are some other oddities as well.

> Drivers that follow the case 2 cause problems with fw_devlink=on.  This
> is because fw_devlink will prevent bazz from probing until there's a
> struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> a GPIO supplier). Once the struct device is available, fw_devlink will
> create a device link between with gpio0 as the supplier and bazz as the
> consumer. After this point, the device link will prevent bazz from
> probing until its supplier (the gpio0 device) has bound to a driver.
> Once the supplier is bound to a driver, the probe of bazz is triggered
> automatically.

I think in many or all cases there will eventually be such a device,
and indeed when you are talking about struct gpio_device
below, that is that device, right?

> Finding and refactoring all the instances of drivers that follow case 2
> will cause a lot of code churn and it not something that can be done in
> one shot. Examples of such instances are [1] [2].

[1] is the DesignWare GPIO controller which is actually pretty important,
a lot of devices synthesize this controller so it would be really nice to
fix properly.

That said I am not sure there is an option to actually refactor these
devices, they look like they do for very good reasons.

> This patch works around this problem and avoids all the code churn by
> simply creating a stub driver to bind to the gpio_device. Since the
> gpio_device already points to the GPIO device tree node, this allows all
> the consumers to continue probing when the driver follows case 2.

That makes sense.

> If/when all the old drivers are refactored, we can revert this patch.

I have a bad feeling about this.

This type of hacks tend to stay around forever.

That said I'm not sure this is entirely wrong either, maybe this
is business as usual and *should* stay around forever. Haven't
made my mind up about that.

> @@ -574,6 +574,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>         unsigned        i;
>         int             base = gc->base;
>         struct gpio_device *gdev;
> +       struct device_node *of_node;
> +       struct fwnode_handle *fwnode;
> +       struct device *fwnode_dev;
>
>         /*
>          * First: allocate and populate the internal stat container, and
> @@ -596,6 +599,22 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
>                 gdev->dev.of_node = gc->of_node;
>         else
>                 gc->of_node = gdev->dev.of_node;
> +
> +       of_node = gdev->dev.of_node;
> +       fwnode = of_fwnode_handle(of_node);
> +       fwnode_dev = get_dev_from_fwnode(fwnode);

This symbol is called by every GPIO driver on the planet, including those
that are not using device tree such as ACPI and board files, or
PCI or USB GPIO device.

So this will not work, it will break a lot of driver.

You need to put code into drivers/gpio/gpiolib-of.c with stubs
for the !OF case in drivers/gpio/gpiolib-of.h so that systems
not using device tree can avoid this code path.

> +       /*
> +        * If your driver hits this warning, it's because you are directly
> +        * parsing a device tree node with "compatible" property and
> +        * initializing it instead of using the standard DT + device driver
> +        * model of creating a struct device and then initializing it in the
> +        * probe function. Please refactor your driver.
> +        */
> +       if (!fwnode_dev && of_find_property(of_node, "compatible", NULL)) {
> +               chip_warn(gc, "Create a real device for %pOF\n", of_node);
> +               gdev->dev.fwnode = fwnode;
> +       }

As discussed in other messages I don't know if this message
is aligned with the device tree ontology. The DT people should
speak about that.

As device tree person FWIW I think it is perfectly legal to have
DT nodes that do not incarnate as struct device.

> +static int gpio_drv_probe(struct device *dev)
> +{
> +       return 0;
> +}
> +
> +static struct device_driver gpio_drv = {
> +       .name = "gpio_drv",
> +       .bus = &gpio_bus_type,
> +       .probe = gpio_drv_probe,
> +};

Well that was curious. It actually looks like something we can
make use of one day. But put in some comments in the code
describing when this gets probed and why we do not do anything
in probe() here.

Yours,
Linus Walleij
Saravana Kannan Jan. 21, 2021, 7:43 p.m. UTC | #9
On Thu, Jan 21, 2021 at 5:01 AM Linus Walleij <linus.walleij@linaro.org> wrote:
>
> On Sat, Jan 16, 2021 at 2:14 AM Saravana Kannan <saravanak@google.com> wrote:
>
> > There are multiple instances of GPIO devictree nodes of the form:
> >
> > foo {
> >         compatible = "acme,foo";
> >         ...
> >
> >         gpio0: gpio0@xxxxxxxx {
> >                 compatible = "acme,bar";
> >                 ...
> >                 gpio-controller;
> >         };
> >
> >         gpio1: gpio1@xxxxxxxx {
> >                 compatible = "acme,bar";
> >                 ...
> >                 gpio-controller;
> >         };
> >
> >         ...
> > }
> >
> > bazz {
> >         my-gpios = <&gpio0 ...>;
> > }
> >
> > Case 1: The driver for "foo" populates struct device for these gpio*
> > nodes and then probes them using a driver that binds with "acme,bar".
> > This lines up with how DT nodes with the "compatible" property are
> > generally converted to struct devices and then registered with driver
> > core to probe them. This also allows the gpio* devices to hook into all
> > the driver core capabilities like runtime PM, probe deferral,
> > suspend/resume ordering, device links, etc.
>
> Agreed. I usually tried to get driver authors to do this.
>
> Sometimes they have been very strongly convinced that this is not
> right for their hardware which leads to case 2.
>
> > Case 2: The driver for "foo" doesn't populate its child device nodes
> > with "compatible" property and instead just loops through its child
> > nodes and directly registers the GPIOs with gpiolib without ever
> > populating a struct device or binding a driver to it.
>
> This usually happens when the registers for these "subdevices"
> are mixed up in the address space, so there could not be a clean
> "reg" property on the node, i.e. it breaks another implicit assumtion
> that each such gpio node is a separate hardware and register map
> entity.
>
> They may still have "ports" or "banks" of GPIO that make sense
> to separate into logical nodes and this is most often why they
> do this.
>
> I bet there are some other oddities as well.

Ah, thanks for the context. But couldn't they just skip the
"compatible" property in the DT if these individual nodes aren't
considered separate devices? It's too late for existing DT device
bindings, but maybe in the future we can ask them to skip the
"compatible" property if they don't consider the sub nodes to be
distinct devices?

> > Drivers that follow the case 2 cause problems with fw_devlink=on.  This
> > is because fw_devlink will prevent bazz from probing until there's a
> > struct device that has gpio0 as its fwnode (because bazz lists gpio0 as
> > a GPIO supplier). Once the struct device is available, fw_devlink will
> > create a device link between with gpio0 as the supplier and bazz as the
> > consumer. After this point, the device link will prevent bazz from
> > probing until its supplier (the gpio0 device) has bound to a driver.
> > Once the supplier is bound to a driver, the probe of bazz is triggered
> > automatically.
>
> I think in many or all cases there will eventually be such a device,
> and indeed when you are talking about struct gpio_device
> below, that is that device, right?
>
> > Finding and refactoring all the instances of drivers that follow case 2
> > will cause a lot of code churn and it not something that can be done in
> > one shot. Examples of such instances are [1] [2].
>
> [1] is the DesignWare GPIO controller which is actually pretty important,
> a lot of devices synthesize this controller so it would be really nice to
> fix properly.
>
> That said I am not sure there is an option to actually refactor these
> devices, they look like they do for very good reasons.
>
> > This patch works around this problem and avoids all the code churn by
> > simply creating a stub driver to bind to the gpio_device. Since the
> > gpio_device already points to the GPIO device tree node, this allows all
> > the consumers to continue probing when the driver follows case 2.
>
> That makes sense.
>
> > If/when all the old drivers are refactored, we can revert this patch.
>
> I have a bad feeling about this.
>
> This type of hacks tend to stay around forever.
>
> That said I'm not sure this is entirely wrong either, maybe this
> is business as usual and *should* stay around forever. Haven't
> made my mind up about that.

Considering your comment about why some (not all) of these nodes
aren't considered separate devices, looks like this has to stay this
way forever? I can drop this line in the commit text.

> > @@ -574,6 +574,9 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
> >         unsigned        i;
> >         int             base = gc->base;
> >         struct gpio_device *gdev;
> > +       struct device_node *of_node;
> > +       struct fwnode_handle *fwnode;
> > +       struct device *fwnode_dev;
> >
> >         /*
> >          * First: allocate and populate the internal stat container, and
> > @@ -596,6 +599,22 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
> >                 gdev->dev.of_node = gc->of_node;
> >         else
> >                 gc->of_node = gdev->dev.of_node;
> > +
> > +       of_node = gdev->dev.of_node;
> > +       fwnode = of_fwnode_handle(of_node);
> > +       fwnode_dev = get_dev_from_fwnode(fwnode);
>
> This symbol is called by every GPIO driver on the planet, including those
> that are not using device tree such as ACPI and board files, or
> PCI or USB GPIO device.
>
> So this will not work, it will break a lot of driver.
>
> You need to put code into drivers/gpio/gpiolib-of.c with stubs
> for the !OF case in drivers/gpio/gpiolib-of.h so that systems
> not using device tree can avoid this code path.

It's not clear in the diff due to lack of sufficient context lines,
but this piece of code is already inside a #ifdef CONFIG_OF_GPIO.

To cover the case where CONFIG_OF_GPIO is enabled but we get here for
non-DT devices, I just need to add a !fwnode check here. Then stuff
will automatically NOP out for non-DT devices. Since the
gdev->dev.of_node is set a few lines above, I think gdev->dev.fwnode
should also be set close to it (which is what the next few lines do).
I'll add this additional check to v3.

>
> > +       /*
> > +        * If your driver hits this warning, it's because you are directly
> > +        * parsing a device tree node with "compatible" property and
> > +        * initializing it instead of using the standard DT + device driver
> > +        * model of creating a struct device and then initializing it in the
> > +        * probe function. Please refactor your driver.
> > +        */
> > +       if (!fwnode_dev && of_find_property(of_node, "compatible", NULL)) {
> > +               chip_warn(gc, "Create a real device for %pOF\n", of_node);
> > +               gdev->dev.fwnode = fwnode;
> > +       }
>
> As discussed in other messages I don't know if this message
> is aligned with the device tree ontology. The DT people should
> speak about that.

Considering what you said earlier, I'll just drop this message.

> As device tree person FWIW I think it is perfectly legal to have
> DT nodes that do not incarnate as struct device.
>
> > +static int gpio_drv_probe(struct device *dev)
> > +{
> > +       return 0;
> > +}
> > +
> > +static struct device_driver gpio_drv = {
> > +       .name = "gpio_drv",
> > +       .bus = &gpio_bus_type,
> > +       .probe = gpio_drv_probe,
> > +};
>
> Well that was curious. It actually looks like something we can
> make use of one day. But put in some comments in the code
> describing when this gets probed and why we do not do anything
> in probe() here.

Ack.

Thanks for the review.

-Saravana
Linus Walleij Jan. 22, 2021, 1:04 p.m. UTC | #10
On Thu, Jan 21, 2021 at 8:43 PM Saravana Kannan <saravanak@google.com> wrote:

> > They may still have "ports" or "banks" of GPIO that make sense
> > to separate into logical nodes and this is most often why they
> > do this.
> >
> > I bet there are some other oddities as well.
>
> Ah, thanks for the context. But couldn't they just skip the
> "compatible" property in the DT if these individual nodes aren't
> considered separate devices? It's too late for existing DT device
> bindings, but maybe in the future we can ask them to skip the
> "compatible" property if they don't consider the sub nodes to be
> distinct devices?

That makes sense and has been done in other cases.

> > > This patch works around this problem and avoids all the code churn by
> > > simply creating a stub driver to bind to the gpio_device. Since the
> > > gpio_device already points to the GPIO device tree node, this allows all
> > > the consumers to continue probing when the driver follows case 2.
> >
> > That makes sense.
> >
> > > If/when all the old drivers are refactored, we can revert this patch.
> >
> > I have a bad feeling about this.
> >
> > This type of hacks tend to stay around forever.
> >
> > That said I'm not sure this is entirely wrong either, maybe this
> > is business as usual and *should* stay around forever. Haven't
> > made my mind up about that.
>
> Considering your comment about why some (not all) of these nodes
> aren't considered separate devices, looks like this has to stay this
> way forever? I can drop this line in the commit text.

Yep looks like so. I think this patch is sound.

> > You need to put code into drivers/gpio/gpiolib-of.c with stubs
> > for the !OF case in drivers/gpio/gpiolib-of.h so that systems
> > not using device tree can avoid this code path.
>
> It's not clear in the diff due to lack of sufficient context lines,
> but this piece of code is already inside a #ifdef CONFIG_OF_GPIO.
>
> To cover the case where CONFIG_OF_GPIO is enabled but we get here for
> non-DT devices, I just need to add a !fwnode check here. Then stuff
> will automatically NOP out for non-DT devices. Since the
> gdev->dev.of_node is set a few lines above, I think gdev->dev.fwnode
> should also be set close to it (which is what the next few lines do).
> I'll add this additional check to v3.

I dunno about that. If you add more than a few lines of DT-specific
code, I prefer that you put that into gpiolib-of.c to keep things
separate, or we will get a mess with more and more hardware
descriptions.

Things that are agnostic fwnode is fine to have in generic code,
it should be used the same by pretty much anything.

A matter of taste I suppose, so no strong opinion.

> > As discussed in other messages I don't know if this message
> > is aligned with the device tree ontology. The DT people should
> > speak about that.
>
> Considering what you said earlier, I'll just drop this message.

Thanks.

Yours,
Linus Walleij
diff mbox series

Patch

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index b02cc2abd3b6..12c579a953b0 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -574,6 +574,9 @@  int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 	unsigned	i;
 	int		base = gc->base;
 	struct gpio_device *gdev;
+	struct device_node *of_node;
+	struct fwnode_handle *fwnode;
+	struct device *fwnode_dev;
 
 	/*
 	 * First: allocate and populate the internal stat container, and
@@ -596,6 +599,22 @@  int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
 		gdev->dev.of_node = gc->of_node;
 	else
 		gc->of_node = gdev->dev.of_node;
+
+	of_node = gdev->dev.of_node;
+	fwnode = of_fwnode_handle(of_node);
+	fwnode_dev = get_dev_from_fwnode(fwnode);
+	/*
+	 * If your driver hits this warning, it's because you are directly
+	 * parsing a device tree node with "compatible" property and
+	 * initializing it instead of using the standard DT + device driver
+	 * model of creating a struct device and then initializing it in the
+	 * probe function. Please refactor your driver.
+	 */
+	if (!fwnode_dev && of_find_property(of_node, "compatible", NULL)) {
+		chip_warn(gc, "Create a real device for %pOF\n", of_node);
+		gdev->dev.fwnode = fwnode;
+	}
+	put_device(fwnode_dev);
 #endif
 
 	gdev->id = ida_alloc(&gpio_ida, GFP_KERNEL);
@@ -4202,6 +4221,17 @@  void gpiod_put_array(struct gpio_descs *descs)
 }
 EXPORT_SYMBOL_GPL(gpiod_put_array);
 
+static int gpio_drv_probe(struct device *dev)
+{
+	return 0;
+}
+
+static struct device_driver gpio_drv = {
+	.name = "gpio_drv",
+	.bus = &gpio_bus_type,
+	.probe = gpio_drv_probe,
+};
+
 static int __init gpiolib_dev_init(void)
 {
 	int ret;
@@ -4213,9 +4243,16 @@  static int __init gpiolib_dev_init(void)
 		return ret;
 	}
 
+	if (driver_register(&gpio_drv) < 0) {
+		pr_err("gpiolib: could not register GPIO stub driver\n");
+		bus_unregister(&gpio_bus_type);
+		return ret;
+	}
+
 	ret = alloc_chrdev_region(&gpio_devt, 0, GPIO_DEV_MAX, GPIOCHIP_NAME);
 	if (ret < 0) {
 		pr_err("gpiolib: failed to allocate char dev region\n");
+		driver_unregister(&gpio_drv);
 		bus_unregister(&gpio_bus_type);
 		return ret;
 	}