diff mbox series

[4/4] net: dsa: remove master santiy check

Message ID 20210224164042.21747-5-michael@walle.cc
State Accepted
Commit 714555374f2ff889cecbde62938a17e9678a0f09
Delegated to: Priyanka Jain
Headers show
Series net: dsa: various fixes | expand

Commit Message

Michael Walle Feb. 24, 2021, 4:40 p.m. UTC
Because we probe the master ourselves (and fail if there is no master),
it is not possible that we don't have a master device.

There is one catch though: device removal. We don't support that. It
wasn't supported neither before this patch. Because the master device
was only set in .pre_probe(), if a device was removed master_dev was a
dangling pointer and transmitting a frame cause a panic. I don't see a
good solution without having some sort of notify machanism when a
udevice is removed.

Signed-off-by: Michael Walle <michael@walle.cc>
---
 net/dsa-uclass.c | 25 ++++---------------------
 1 file changed, 4 insertions(+), 21 deletions(-)

Comments

Vladimir Oltean Feb. 24, 2021, 5:11 p.m. UTC | #1
On Wed, Feb 24, 2021 at 05:40:42PM +0100, Michael Walle wrote:
> Because we probe the master ourselves (and fail if there is no master),
> it is not possible that we don't have a master device.
> 
> There is one catch though: device removal. We don't support that. It
> wasn't supported neither before this patch. Because the master device
> was only set in .pre_probe(), if a device was removed master_dev was a
> dangling pointer and transmitting a frame cause a panic. I don't see a
> good solution without having some sort of notify machanism when a
> udevice is removed.
> 
> Signed-off-by: Michael Walle <michael@walle.cc>
> ---

Ha, this is a tough one.
So the DSA master is not a parent of the DSA switch, instead the SPI
bus/whatever is. Additionally, the checks for NULL are pointless,
because the master is not NULL, just deallocated.

In Linux we solved this by using device links:
https://github.com/torvalds/linux/commit/07b90056cb15ff9877dca0d8f1b6583d1051f724

However it doesn't seem like we have such a generic dependency graph
between udevices in U-Boot, do we? I could just find device_reparent,
which is definitely not what we want.

My best guess as to how to solve the DSA master removal scenario would
be to:
(a) scatter all DSA callbacks with device_probe(master), so even if it
    was unbound, it will be rebound.
(b) introduce DSA specific code in eth_pre_remove and a DSA-specific
    pointer in eth_pdata to manually call device_remove for an attached
    DSA switch, if that exists.

Other ideas are welcome, of course.
Michael Walle Feb. 24, 2021, 5:29 p.m. UTC | #2
Am 2021-02-24 18:11, schrieb Vladimir Oltean:
> On Wed, Feb 24, 2021 at 05:40:42PM +0100, Michael Walle wrote:
>> Because we probe the master ourselves (and fail if there is no 
>> master),
>> it is not possible that we don't have a master device.
>> 
>> There is one catch though: device removal. We don't support that. It
>> wasn't supported neither before this patch. Because the master device
>> was only set in .pre_probe(), if a device was removed master_dev was a
>> dangling pointer and transmitting a frame cause a panic. I don't see a
>> good solution without having some sort of notify machanism when a
>> udevice is removed.
>> 
>> Signed-off-by: Michael Walle <michael@walle.cc>
>> ---
> 
> Ha, this is a tough one.
> So the DSA master is not a parent of the DSA switch, instead the SPI
> bus/whatever is. Additionally, the checks for NULL are pointless,
> because the master is not NULL, just deallocated.
> 
> In Linux we solved this by using device links:
> https://github.com/torvalds/linux/commit/07b90056cb15ff9877dca0d8f1b6583d1051f724
> 
> However it doesn't seem like we have such a generic dependency graph
> between udevices in U-Boot, do we? I could just find device_reparent,
> which is definitely not what we want.

I didn't find anything else either. Thus I just ignored it for the
moment.

> My best guess as to how to solve the DSA master removal scenario would
> be to:
> (a) scatter all DSA callbacks with device_probe(master), so even if it
>     was unbound, it will be rebound.
> (b) introduce DSA specific code in eth_pre_remove and a DSA-specific
>     pointer in eth_pdata to manually call device_remove for an attached
>     DSA switch, if that exists.
> 
> Other ideas are welcome, of course.

What is the reason to remove that device in the first place? Like is
this really a valid scenario? I really don't know when a device is
removed and if its remove, will it still be there or is it rather
a hot-plug type and rebinding it won't work anyways.

I also had (b) in mind, but again. Does it really apply? After all,
this is a bootloader ;)
Vladimir Oltean Feb. 24, 2021, 5:45 p.m. UTC | #3
On Wed, Feb 24, 2021 at 06:29:39PM +0100, Michael Walle wrote:
> What is the reason to remove that device in the first place? Like is
> this really a valid scenario? I really don't know when a device is
> removed and if its remove, will it still be there or is it rather
> a hot-plug type and rebinding it won't work anyways.

Did you get this to crash under any circumstance other than using the
'unbind' command?
Michael Walle Feb. 24, 2021, 6:08 p.m. UTC | #4
Am 2021-02-24 18:45, schrieb Vladimir Oltean:
> On Wed, Feb 24, 2021 at 06:29:39PM +0100, Michael Walle wrote:
>> What is the reason to remove that device in the first place? Like is
>> this really a valid scenario? I really don't know when a device is
>> removed and if its remove, will it still be there or is it rather
>> a hot-plug type and rebinding it won't work anyways.
> 
> Did you get this to crash under any circumstance other than using the
> 'unbind' command?

Nope, thus I was curious about that comment in dsa_port_stop(). Someone
(Alex, Claudiu maybe?) must have something in mind when writing about 
it.
But I couldn't figure out in which case a device is removed.

-michael
Vladimir Oltean Feb. 24, 2021, 6:19 p.m. UTC | #5
On Wed, Feb 24, 2021 at 07:08:51PM +0100, Michael Walle wrote:
> Am 2021-02-24 18:45, schrieb Vladimir Oltean:
> > On Wed, Feb 24, 2021 at 06:29:39PM +0100, Michael Walle wrote:
> > > What is the reason to remove that device in the first place? Like is
> > > this really a valid scenario? I really don't know when a device is
> > > removed and if its remove, will it still be there or is it rather
> > > a hot-plug type and rebinding it won't work anyways.
> > 
> > Did you get this to crash under any circumstance other than using the
> > 'unbind' command?
> 
> Nope, thus I was curious about that comment in dsa_port_stop(). Someone
> (Alex, Claudiu maybe?) must have something in mind when writing about it.
> But I couldn't figure out in which case a device is removed.

I'm pretty sure that the checks that are in place now were once written
so that the sandbox tests would pass. If they still do, we should be
fine.

You can run the sandbox tests using:

make sandbox_defconfig NO_SDL=1
make -j 8 NO_SDL=1
./u-boot -d ./arch/sandbox/dts/test.dtb
setenv ethact swp0
ping 1.2.3.5
ut dm dsa_probe
ut dm dsa
ut dm
ut dm net_retry
Michael Walle Feb. 24, 2021, 7:03 p.m. UTC | #6
Am 2021-02-24 19:19, schrieb Vladimir Oltean:
> On Wed, Feb 24, 2021 at 07:08:51PM +0100, Michael Walle wrote:
>> Am 2021-02-24 18:45, schrieb Vladimir Oltean:
>> > On Wed, Feb 24, 2021 at 06:29:39PM +0100, Michael Walle wrote:
>> > > What is the reason to remove that device in the first place? Like is
>> > > this really a valid scenario? I really don't know when a device is
>> > > removed and if its remove, will it still be there or is it rather
>> > > a hot-plug type and rebinding it won't work anyways.
>> >
>> > Did you get this to crash under any circumstance other than using the
>> > 'unbind' command?
>> 
>> Nope, thus I was curious about that comment in dsa_port_stop(). 
>> Someone
>> (Alex, Claudiu maybe?) must have something in mind when writing about 
>> it.
>> But I couldn't figure out in which case a device is removed.
> 
> I'm pretty sure that the checks that are in place now were once written
> so that the sandbox tests would pass. If they still do, we should be
> fine.

Ah.

> You can run the sandbox tests using:

Just if one is trying to follow this thread: you'll also need to
have the following series applied:
   https://patchwork.ozlabs.org/project/uboot/list/?series=229778

> make sandbox_defconfig NO_SDL=1
> make -j 8 NO_SDL=1
> ./u-boot -d ./arch/sandbox/dts/test.dtb

btw theres a shortcut for this "u-boot -T"

> setenv ethact swp0

"setenv ethact lan0" I guess

> ping 1.2.3.5
> ut dm dsa_probe
> ut dm dsa
> ut dm
> ut dm net_retry

Not more failures than without my patch.

-michael
Vladimir Oltean Feb. 24, 2021, 9:40 p.m. UTC | #7
On Wed, Feb 24, 2021 at 08:03:03PM +0100, Michael Walle wrote:
> Not more failures than without my patch.

Ok, thanks, so could you leave a Tested-by there so we could also make
some progress with that?
Vladimir Oltean Feb. 24, 2021, 10:25 p.m. UTC | #8
On Wed, Feb 24, 2021 at 05:40:42PM +0100, Michael Walle wrote:
> Because we probe the master ourselves (and fail if there is no master),
> it is not possible that we don't have a master device.
> 
> There is one catch though: device removal. We don't support that. It
> wasn't supported neither before this patch. Because the master device
> was only set in .pre_probe(), if a device was removed master_dev was a
> dangling pointer and transmitting a frame cause a panic. I don't see a
> good solution without having some sort of notify machanism when a
> udevice is removed.
> 
> Signed-off-by: Michael Walle <michael@walle.cc>
> ---

U-Boot DM maintainers might want to comment on this as well, but with
the knowledge I have this seems good enough to me:

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Michael Walle Feb. 24, 2021, 10:52 p.m. UTC | #9
Am 2021-02-24 17:40, schrieb Michael Walle:
> Because we probe the master ourselves (and fail if there is no master),
> it is not possible that we don't have a master device.
> 
> There is one catch though: device removal. We don't support that. It
> wasn't supported neither before this patch. Because the master device
> was only set in .pre_probe(), if a device was removed master_dev was a
> dangling pointer and transmitting a frame cause a panic. I don't see a
> good solution without having some sort of notify machanism when a
> udevice is removed.
> 
> Signed-off-by: Michael Walle <michael@walle.cc>

FWIW

Tested-by: Michael Walle <michael@walle.cc> [DSA unit tests]
diff mbox series

Patch

diff --git a/net/dsa-uclass.c b/net/dsa-uclass.c
index d453cc6930..7ea1cb6949 100644
--- a/net/dsa-uclass.c
+++ b/net/dsa-uclass.c
@@ -69,11 +69,6 @@  static int dsa_port_start(struct udevice *pdev)
 	struct dsa_ops *ops = dsa_get_ops(dev);
 	int err;
 
-	if (!master) {
-		dev_err(pdev, "DSA master Ethernet device not found!\n");
-		return -EINVAL;
-	}
-
 	if (ops->port_enable) {
 		struct dsa_port_pdata *port_pdata;
 
@@ -108,13 +103,7 @@  static void dsa_port_stop(struct udevice *pdev)
 		ops->port_disable(dev, priv->cpu_port, NULL);
 	}
 
-	/*
-	 * stop master only if it's active, don't probe it otherwise.
-	 * Under normal usage it would be active because we're using it, but
-	 * during tear-down it may have been removed ahead of us.
-	 */
-	if (master && device_active(master))
-		eth_get_ops(master)->stop(master);
+	eth_get_ops(master)->stop(master);
 }
 
 /*
@@ -133,9 +122,6 @@  static int dsa_port_send(struct udevice *pdev, void *packet, int length)
 	struct dsa_port_pdata *port_pdata;
 	int err;
 
-	if (!master)
-		return -EINVAL;
-
 	if (length + head + tail > PKTSIZE_ALIGN)
 		return -EINVAL;
 
@@ -165,9 +151,6 @@  static int dsa_port_recv(struct udevice *pdev, int flags, uchar **packetp)
 	struct dsa_port_pdata *port_pdata;
 	int length, port_index, err;
 
-	if (!master)
-		return -EINVAL;
-
 	length = eth_get_ops(master)->recv(master, flags, packetp);
 	if (length <= 0)
 		return length;
@@ -201,9 +184,6 @@  static int dsa_port_free_pkt(struct udevice *pdev, uchar *packet, int length)
 	struct udevice *master = dsa_get_master(dev);
 	struct dsa_priv *priv;
 
-	if (!master)
-		return -EINVAL;
-
 	priv = dev_get_uclass_priv(dev);
 	if (eth_get_ops(master)->free_pkt) {
 		/* return the original pointer and length to master Eth */
@@ -284,6 +264,9 @@  static int dsa_port_probe(struct udevice *pdev)
 	/*
 	 * Probe the master device. We depend on the master device for proper
 	 * operation and we also need it for MAC inheritance below.
+	 *
+	 * TODO: we assume the master device is always there and doesn't get
+	 * removed during runtime.
 	 */
 	ret = device_probe(master);
 	if (ret)