diff mbox series

[net-next] net: restore DSA behavior of not overriding ndo_get_phys_port_name if present

Message ID 20200722205348.2688142-1-olteanv@gmail.com
State Superseded
Delegated to: David Miller
Headers show
Series [net-next] net: restore DSA behavior of not overriding ndo_get_phys_port_name if present | expand

Commit Message

Vladimir Oltean July 22, 2020, 8:53 p.m. UTC
Prior to the commit below, dsa_master_ndo_setup() used to avoid
overriding .ndo_get_phys_port_name() unless the callback was empty.

https://elixir.bootlin.com/linux/v5.7.7/source/net/dsa/master.c#L269

Now, it overrides it unconditionally.

This matters for boards where DSA switches are hanging off of other DSA
switches, or switchdev interfaces.
Say a user has these udev rules for the top-level switch:

ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p0", NAME="swp0"
ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p1", NAME="swp1"
ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p2", NAME="swp2"
ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p3", NAME="swp3"
ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p4", NAME="swp4"
ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p5", NAME="swp5"

If the DSA switches below start randomly overriding
ndo_get_phys_port_name with their own CPU port, bad things can happen.
Not only may the CPU port number be not unique among different
downstream DSA switches, but one of the upstream switchdev interfaces
may also happen to have a port with the same number. So, we may even end
up in a situation where all interfaces of the top-level switch end up
having a phys_port_name attribute of "p0". Clearly not ok if the purpose
of the udev rules is to assign unique names.

Fix this by restoring the old behavior, which did not overlay this
operation on top of the DSA master logic, if there was one in place
already.

Fixes: 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
This is brain-dead, please consider killing this and retrieving the CPU
port number from "devlink port"...

pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
pci/0000:00:00.5/4: type notset flavour cpu port 4
spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
spi/spi2.0/4: type notset flavour cpu port 4
spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
spi/spi2.1/4: type notset flavour cpu port 4

 net/core/dev.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

Comments

Florian Fainelli July 22, 2020, 9:53 p.m. UTC | #1
On 7/22/20 1:53 PM, Vladimir Oltean wrote:
> Prior to the commit below, dsa_master_ndo_setup() used to avoid
> overriding .ndo_get_phys_port_name() unless the callback was empty.
> 
> https://elixir.bootlin.com/linux/v5.7.7/source/net/dsa/master.c#L269
> 
> Now, it overrides it unconditionally.
> 
> This matters for boards where DSA switches are hanging off of other DSA
> switches, or switchdev interfaces.
> Say a user has these udev rules for the top-level switch:
> 
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p0", NAME="swp0"
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p1", NAME="swp1"
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p2", NAME="swp2"
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p3", NAME="swp3"
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p4", NAME="swp4"
> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p5", NAME="swp5"
> 
> If the DSA switches below start randomly overriding
> ndo_get_phys_port_name with their own CPU port, bad things can happen.
> Not only may the CPU port number be not unique among different
> downstream DSA switches, but one of the upstream switchdev interfaces
> may also happen to have a port with the same number. So, we may even end
> up in a situation where all interfaces of the top-level switch end up
> having a phys_port_name attribute of "p0". Clearly not ok if the purpose
> of the udev rules is to assign unique names.
> 
> Fix this by restoring the old behavior, which did not overlay this
> operation on top of the DSA master logic, if there was one in place
> already.
> 
> Fixes: 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers")
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> ---
> This is brain-dead, please consider killing this and retrieving the CPU
> port number from "devlink port"...

That is fair enough. Do you want to submit such a change while you are
at it?

> 
> pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
> pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
> pci/0000:00:00.5/4: type notset flavour cpu port 4
> spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
> spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
> spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
> spi/spi2.0/4: type notset flavour cpu port 4
> spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
> spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
> spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
> spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
> spi/spi2.1/4: type notset flavour cpu port 4
> 
>  net/core/dev.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 19f1abc26fcd..60778bd8c3b1 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -8603,15 +8603,20 @@ int dev_get_phys_port_name(struct net_device *dev,
>  	const struct net_device_ops *ops = dev->netdev_ops;
>  	int err;
>  
> -	err  = dsa_ndo_get_phys_port_name(dev, name, len);
> -	if (err == 0 || err != -EOPNOTSUPP)
> -		return err;
> -
>  	if (ops->ndo_get_phys_port_name) {
>  		err = ops->ndo_get_phys_port_name(dev, name, len);
>  		if (err != -EOPNOTSUPP)
>  			return err;
> +	} else {
> +		/* DSA may override this operation, but only if the master
> +		 * isn't a switchdev or another DSA, in that case it breaks
> +		 * their port numbering.
> +		 */
> +		err  = dsa_ndo_get_phys_port_name(dev, name, len);

Extraneous space here.

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Vladimir Oltean July 22, 2020, 10:06 p.m. UTC | #2
On Wed, Jul 22, 2020 at 02:53:28PM -0700, Florian Fainelli wrote:
> On 7/22/20 1:53 PM, Vladimir Oltean wrote:
> > Prior to the commit below, dsa_master_ndo_setup() used to avoid
> > overriding .ndo_get_phys_port_name() unless the callback was empty.
> > 
> > https://elixir.bootlin.com/linux/v5.7.7/source/net/dsa/master.c#L269
> > 
> > Now, it overrides it unconditionally.
> > 
> > This matters for boards where DSA switches are hanging off of other DSA
> > switches, or switchdev interfaces.
> > Say a user has these udev rules for the top-level switch:
> > 
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p0", NAME="swp0"
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p1", NAME="swp1"
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p2", NAME="swp2"
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p3", NAME="swp3"
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p4", NAME="swp4"
> > ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p5", NAME="swp5"
> > 
> > If the DSA switches below start randomly overriding
> > ndo_get_phys_port_name with their own CPU port, bad things can happen.
> > Not only may the CPU port number be not unique among different
> > downstream DSA switches, but one of the upstream switchdev interfaces
> > may also happen to have a port with the same number. So, we may even end
> > up in a situation where all interfaces of the top-level switch end up
> > having a phys_port_name attribute of "p0". Clearly not ok if the purpose
> > of the udev rules is to assign unique names.
> > 
> > Fix this by restoring the old behavior, which did not overlay this
> > operation on top of the DSA master logic, if there was one in place
> > already.
> > 
> > Fixes: 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers")
> > Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> > ---
> > This is brain-dead, please consider killing this and retrieving the CPU
> > port number from "devlink port"...
> 
> That is fair enough. Do you want to submit such a change while you are
> at it?
> 

If I'm getting you right, you mean I should be dropping this patch, and
send another one that deletes dsa_ndo_get_phys_port_name()?
I would expect that to be so - the problem is the fact that we're
retrieving the number of the CPU port through an ndo of the master
interface, it's not something we can fix by just calling into devlink
from kernel side. The user has to call into devlink.

> > 
> > pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0
> > pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2
> > pci/0000:00:00.5/4: type notset flavour cpu port 4
> > spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0
> > spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1
> > spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2
> > spi/spi2.0/4: type notset flavour cpu port 4
> > spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0
> > spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1
> > spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2
> > spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3
> > spi/spi2.1/4: type notset flavour cpu port 4
> > 
> >  net/core/dev.c | 13 +++++++++----
> >  1 file changed, 9 insertions(+), 4 deletions(-)
> > 
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 19f1abc26fcd..60778bd8c3b1 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -8603,15 +8603,20 @@ int dev_get_phys_port_name(struct net_device *dev,
> >  	const struct net_device_ops *ops = dev->netdev_ops;
> >  	int err;
> >  
> > -	err  = dsa_ndo_get_phys_port_name(dev, name, len);
> > -	if (err == 0 || err != -EOPNOTSUPP)
> > -		return err;
> > -
> >  	if (ops->ndo_get_phys_port_name) {
> >  		err = ops->ndo_get_phys_port_name(dev, name, len);
> >  		if (err != -EOPNOTSUPP)
> >  			return err;
> > +	} else {
> > +		/* DSA may override this operation, but only if the master
> > +		 * isn't a switchdev or another DSA, in that case it breaks
> > +		 * their port numbering.
> > +		 */
> > +		err  = dsa_ndo_get_phys_port_name(dev, name, len);
> 
> Extraneous space here.
> 

I understand what you mean, but I was just moving it...
I wouldn't exactly respin for a whitespace. Although I might end up
dropping this patch if I understand your first comment correctly.

> Acked-by: Florian Fainelli <f.fainelli@gmail.com>
> -- 
> Florian

Thanks,
-Vladimir
Florian Fainelli July 22, 2020, 10:11 p.m. UTC | #3
On 7/22/20 3:06 PM, Vladimir Oltean wrote:
> On Wed, Jul 22, 2020 at 02:53:28PM -0700, Florian Fainelli wrote:
>> On 7/22/20 1:53 PM, Vladimir Oltean wrote:
>>> Prior to the commit below, dsa_master_ndo_setup() used to avoid
>>> overriding .ndo_get_phys_port_name() unless the callback was empty.
>>>
>>> https://elixir.bootlin.com/linux/v5.7.7/source/net/dsa/master.c#L269
>>>
>>> Now, it overrides it unconditionally.
>>>
>>> This matters for boards where DSA switches are hanging off of other DSA
>>> switches, or switchdev interfaces.
>>> Say a user has these udev rules for the top-level switch:
>>>
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p0", NAME="swp0"
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p1", NAME="swp1"
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p2", NAME="swp2"
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p3", NAME="swp3"
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p4", NAME="swp4"
>>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p5", NAME="swp5"
>>>
>>> If the DSA switches below start randomly overriding
>>> ndo_get_phys_port_name with their own CPU port, bad things can happen.
>>> Not only may the CPU port number be not unique among different
>>> downstream DSA switches, but one of the upstream switchdev interfaces
>>> may also happen to have a port with the same number. So, we may even end
>>> up in a situation where all interfaces of the top-level switch end up
>>> having a phys_port_name attribute of "p0". Clearly not ok if the purpose
>>> of the udev rules is to assign unique names.
>>>
>>> Fix this by restoring the old behavior, which did not overlay this
>>> operation on top of the DSA master logic, if there was one in place
>>> already.
>>>
>>> Fixes: 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers")
>>> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>>> ---
>>> This is brain-dead, please consider killing this and retrieving the CPU
>>> port number from "devlink port"...
>>
>> That is fair enough. Do you want to submit such a change while you are
>> at it?
>>
> 
> If I'm getting you right, you mean I should be dropping this patch, and
> send another one that deletes dsa_ndo_get_phys_port_name()?
> I would expect that to be so - the problem is the fact that we're
> retrieving the number of the CPU port through an ndo of the master
> interface, it's not something we can fix by just calling into devlink
> from kernel side. The user has to call into devlink.

Yes, that is what I meant, that an user should call the appropriate
devlink command to obtain the port number, this particular change has
caused more harm than good, and the justification for doing it in the
first place was weak to begin with.
Vladimir Oltean July 22, 2020, 10:46 p.m. UTC | #4
On Wed, Jul 22, 2020 at 03:11:09PM -0700, Florian Fainelli wrote:
> On 7/22/20 3:06 PM, Vladimir Oltean wrote:
> > On Wed, Jul 22, 2020 at 02:53:28PM -0700, Florian Fainelli wrote:
> >> On 7/22/20 1:53 PM, Vladimir Oltean wrote:
> >>> Prior to the commit below, dsa_master_ndo_setup() used to avoid
> >>> overriding .ndo_get_phys_port_name() unless the callback was empty.
> >>>
> >>> https://elixir.bootlin.com/linux/v5.7.7/source/net/dsa/master.c#L269
> >>>
> >>> Now, it overrides it unconditionally.
> >>>
> >>> This matters for boards where DSA switches are hanging off of other DSA
> >>> switches, or switchdev interfaces.
> >>> Say a user has these udev rules for the top-level switch:
> >>>
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p0", NAME="swp0"
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p1", NAME="swp1"
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p2", NAME="swp2"
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p3", NAME="swp3"
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p4", NAME="swp4"
> >>> ACTION=="add", SUBSYSTEM=="net", KERNELS=="0000:00:00.5", DRIVERS=="mscc_felix", ATTR{phys_port_name}=="p5", NAME="swp5"
> >>>
> >>> If the DSA switches below start randomly overriding
> >>> ndo_get_phys_port_name with their own CPU port, bad things can happen.
> >>> Not only may the CPU port number be not unique among different
> >>> downstream DSA switches, but one of the upstream switchdev interfaces
> >>> may also happen to have a port with the same number. So, we may even end
> >>> up in a situation where all interfaces of the top-level switch end up
> >>> having a phys_port_name attribute of "p0". Clearly not ok if the purpose
> >>> of the udev rules is to assign unique names.
> >>>
> >>> Fix this by restoring the old behavior, which did not overlay this
> >>> operation on top of the DSA master logic, if there was one in place
> >>> already.
> >>>
> >>> Fixes: 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers")
> >>> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> >>> ---
> >>> This is brain-dead, please consider killing this and retrieving the CPU
> >>> port number from "devlink port"...
> >>
> >> That is fair enough. Do you want to submit such a change while you are
> >> at it?
> >>
> > 
> > If I'm getting you right, you mean I should be dropping this patch, and
> > send another one that deletes dsa_ndo_get_phys_port_name()?
> > I would expect that to be so - the problem is the fact that we're
> > retrieving the number of the CPU port through an ndo of the master
> > interface, it's not something we can fix by just calling into devlink
> > from kernel side. The user has to call into devlink.
> 
> Yes, that is what I meant, that an user should call the appropriate
> devlink command to obtain the port number, this particular change has
> caused more harm than good, and the justification for doing it in the
> first place was weak to begin with.
> -- 
> Florian

If the patch at this link is considered appropriate:
https://patchwork.ozlabs.org/project/netdev/patch/20200722224312.2719813-1-olteanv@gmail.com/
then this can be considered superseded by that.

If not, please apply this.

Thanks,
-Vladimir
diff mbox series

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index 19f1abc26fcd..60778bd8c3b1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8603,15 +8603,20 @@  int dev_get_phys_port_name(struct net_device *dev,
 	const struct net_device_ops *ops = dev->netdev_ops;
 	int err;
 
-	err  = dsa_ndo_get_phys_port_name(dev, name, len);
-	if (err == 0 || err != -EOPNOTSUPP)
-		return err;
-
 	if (ops->ndo_get_phys_port_name) {
 		err = ops->ndo_get_phys_port_name(dev, name, len);
 		if (err != -EOPNOTSUPP)
 			return err;
+	} else {
+		/* DSA may override this operation, but only if the master
+		 * isn't a switchdev or another DSA, in that case it breaks
+		 * their port numbering.
+		 */
+		err  = dsa_ndo_get_phys_port_name(dev, name, len);
+		if (err == 0 || err != -EOPNOTSUPP)
+			return err;
 	}
+
 	return devlink_compat_phys_port_name_get(dev, name, len);
 }
 EXPORT_SYMBOL(dev_get_phys_port_name);